mcaceresb / stata-gtools

Faster implementation of Stata's collapse, reshape, xtile, egen, isid, and more using C plugins
https://gtools.readthedocs.io
MIT License

gegen max does not properly evaluate string expressions #85

Closed adamreir closed 2 years ago

adamreir commented 2 years ago

I just updated gtools to version 1.8.1, and some code that used to work now issues an error message.

The command

gegen newvar = max(stringvar=="value"), by(group)

now gives the error message

warning: gegen is NOT parsing the expression '(stringvar==value")' by group." invalid name

Here is a minimal example that produces the error:

clear
set seed 1
set obs 10 
g cat=round(runiform()*5)
g stringvar = "1"
replace stringvar="2" if runiform()>.7
gegen newvar = max(stringvar=="2"), by(cat)

I've managed to work around the problem by creating a temporary variable and evaluating that in egen max. But I thought you should know in case it indicates a more significant issue.
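A minimal sketch of that workaround, reusing the data from the example above. The temporary variable name `is_two` is hypothetical, and the sketch assumes gtools is installed so that gegen is available:

```stata
clear
set seed 1
set obs 10
g cat = round(runiform()*5)
g stringvar = "1"
replace stringvar = "2" if runiform() > .7

* Evaluate the string comparison once, into a temporary numeric variable
* (is_two is a hypothetical name chosen for this sketch)
tempvar is_two
gen byte `is_two' = (stringvar == "2")

* Pass the numeric variable to gegen instead of the quoted string expression
gegen newvar = max(`is_two'), by(cat)
```

Because the argument to max() no longer contains any double quotes, the warning has nothing to trip over.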

(and since I'm here: thanks for making this awesome package)

mcaceresb commented 2 years ago

@adamreir I put that warning there to make it plain that the expression is not parsed the same way as with by varlist: egen (i.e. it's evaluated by creating the temporary variable first, rather than by group). However, I forgot about this Stata quirk with strings, which has gotten me many times over the years. You can replace lines 646 and 647 in egen.ado with

                mata printf("{bf:warning}: gegen is {bf:NOT} parsing the expression '%s' by group.\n", st_local("args"))
                mata printf("To parse this expression by group, call gegen using the -by:- prefix.\n")

or delete them if you understand the warning.
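For readers unfamiliar with the quirk, here is a small sketch of what is likely going on. It assumes the quirk is the usual one where a local macro containing double quotes is expanded inside a double-quoted string. The local name `args` matches the replacement lines above, but its contents here are made up for illustration:

```stata
* Hypothetical local holding an expression that itself contains double quotes
local args `"(stringvar=="2")"'

* Fragile: after macro expansion the inner quotes end the string early,
* so Stata misparses the rest of the line (left commented out on purpose)
* display "expression: `args'"

* Compound double quotes tolerate embedded double quotes
display `"expression: `args'"'

* Reading the macro from Mata, as the replacement lines do, avoids expanding
* it inside a quoted Stata string at all
mata: printf("expression: %s\n", st_local("args"))
```

Either form prints the expression intact. The Mata route is the one used in the suggested fix.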

adamreir commented 2 years ago

Oh, right. I didn't consider the possibility that the bug was in the error message and not in the parsing. That does fix the issue.

Thanks again!