worldbank / iefieldkit

Stata commands designed for Impact Evaluations field work. These are tools that are used during/after a survey in the field for data quality monitoring.
MIT License
38 stars 18 forks source link

[WIP] Revisions for SJ R2 #128

Closed bbdaniels closed 4 years ago

MRuzzante commented 4 years ago

This fixes

variable __000001 not found

but I still run in the following error regarding scalars

something not found

as in issue #116

MRuzzante commented 4 years ago

I haven't looked at the ado-file but isn't there a way you could first get the list of variables mentioned in the do-files and then, before keeping those vars, you actually verify that those vars exist in the data-set with something like conf var. If they don't, you remove them from the final list.

Otherwise, another way would be to rule out they are scalars, before including them in the list of vars to keep, with conf scalar.

luizaandrade commented 4 years ago

The match option is now partially working with 3 rounds: it merges the third round to varnames is the first one. But if the variables are only common to the 2nd and 3rd round, they won't be matched

bbdaniels commented 4 years ago

UPDATE @luizaandrade logic is hard. Implemented in c0e0a9485d8b6d4f69b9deef838613eadd2dfa22.

That's correct, and intentional at this stage; merging twice creates a hellscape of duplicates and order changes such that I'm not sure I can implement it without worse bugs. The fourth dataset, for example, would have to be sequentially matched to all preceding three etc. Is achieving this behavior a bugfix in your opinion or a new feature? If the former do you have thoughts on the best way to implement -- the current code is below:

https://github.com/worldbank/iefieldkit/blob/58c3dc9a1d57c8dbe4775d6fc3ca6fa18d367717/src/ado_files/iecodebook.ado#L312-L330

bbdaniels commented 4 years ago

I haven't looked at the ado-file but isn't there a way you could first get the list of variables mentioned in the do-files and then, before keeping those vars, you actually verify that those vars exist in the data-set with something like conf var. If they don't, you remove them from the final list.

Otherwise, another way would be to rule out they are scalars, before including them in the list of vars to keep, with conf scalar.

@MRuzzante is this still happening? It does now check that everything is a variable, by the following code (which conveniently also makes abbreviated variables and varlists work, although it still fails for some syntaxes like [pw=var]). If it is still breaking on scalars could you provide the example?

https://github.com/worldbank/iefieldkit/blob/58c3dc9a1d57c8dbe4775d6fc3ca6fa18d367717/src/ado_files/iecodebook.ado#L229-L235

MRuzzante commented 4 years ago

Hi @bbdaniels,

I am not getting anymore the error

variable __000001 not found

as in https://github.com/worldbank/iefieldkit/issues/116#issue-479599700, but I still have

something not found

where something is a scalar used in one of the do-files.

Also, a new error is:

no variables defined invalid syntax local macro `something2' not found

where something2 is a local used in one of the do-files.

Note that, despite the error, the codebook files are still created in the indicated folder.

bbdaniels commented 4 years ago

@MRuzzante Can you provide a zip with a minimal working example to get these errors?

bbdaniels commented 4 years ago

@MRuzzante I've run the code you provided. I fixed it with kind of a nuclear option but I don't think it breaks anything new 🤪 I could not for the life of me figure out how to suppress that particular message from unab even though it wasn't breaking and was already done quietly!

Note that with the fix, the command no longer accepts variable abbreviations (it is not possible to correctly differentiate between many common short strings and at least one variable), but it still works with wildcards.

MRuzzante commented 4 years ago

That's great, @bbdaniels: the scalar issue is now solved and the code is running without breaking! :raised_hands: