thomas-rasmussen / sas_macros

SAS macros
Creative Commons Zero v1.0 Universal

hash_match weird irreproducible bug #42

Closed thomas-rasmussen closed 2 years ago

thomas-rasmussen commented 3 years ago

Figure out a way to make a minimal reproducible example that recreates the weird bug Trine has in her study: even though matching was done without replacement, there were cases where the same person was selected as a control multiple times for the same case. This ONLY happened within a single case's matched set, i.e. there was no replacement between matched sets.

The unwanted behavior could be fixed by putting an extra set of seemingly superfluous parentheses around an expression in the match_inexact parameter, but it is unclear why the macro does not work as it should in this case. It seems impossible that this is a coding error, but it seems equally impossible that any replacement could happen, given

https://github.com/thomas-rasmussen/sas_macros/blob/8c5d4b37c70320d2a62c016ba986670bf026743d/hash_match.sas#L840-L842

that removes the match from the hash table as soon as it has been selected.

Is something going wrong because of the use of %str()? Or is this some strange bug in SAS?

thomas-rasmussen commented 2 years ago

The problem is in https://github.com/thomas-rasmussen/sas_macros/blob/672983edc0f5fb5cafa6d0fa3bdeb9069b4a26a2/hash_match.sas#L834. There need to be parentheses around &match_inexact so that the inexact matching conditions are evaluated as a single unit before being combined with the other conditions using the and operator. Without them, if the inexact matching criteria use e.g. the or operator, the combined set of conditions might not be evaluated as intended, because and has higher precedence than or in SAS.
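To illustrate the precedence problem outside the macro, here is a minimal sketch in a plain data step. The variable names (exact_ok, age_diff, same_region) and the condition itself are made up for the example and are not taken from hash_match.sas or from the study; the only assumption is that the macro variable is inserted as text next to another condition joined with and.

```sas
/* Hypothetical stand-in for a user-supplied inexact matching condition. */
%let match_inexact = age_diff <= 1 or same_region = 1;

data _null_;
  /* A candidate that fails the other condition but satisfies
     the second half of the inexact condition. */
  exact_ok    = 0;
  age_diff    = 5;
  same_region = 1;

  /* Without parentheses the expression is parsed as
     (exact_ok = 1 and age_diff <= 1) or same_region = 1 */
  unwrapped = (exact_ok = 1 and &match_inexact);

  /* With parentheses the expression is parsed as
     exact_ok = 1 and (age_diff <= 1 or same_region = 1) */
  wrapped = (exact_ok = 1 and (&match_inexact));

  put unwrapped= wrapped=;
run;
```

The log should show unwrapped=1 wrapped=0: without the parentheses, the candidate is accepted on the strength of the or branch alone, even though the other condition fails.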

minimal_repex.txt

thomas-rasmussen commented 2 years ago

Fix implemented in hash_match_dev. It will be included in hash_match version 0.4.0.