Closed kbjarkefur closed 5 months ago
Ah they must have updated the text. I took that from an older Statalist posting. https://www.statalist.org/forums/forum/general-stata-discussion/general/1551609-merge-m-m-vs-joinby
The reason m:m
is not just a bad practice but should be flagged as a reproducibility issue is as the documentation warns. m:m
specifically is a command that is known to depend on the sort order of data and produces garbage results whenever it is used. I am not aware of any other such built-in command that will not trigger either our existing sort flag or our existing randomization flag; and I don't believe m:m
will trigger either of them because it does not actually rearrange or randomize the data in the process of destroying it.
How can it not show up in our flags if it creates garbage data? Either it is stable or reprun
needs to be improved to catch this? I don't think the command should be opinionated in this way as it assumes we can imagine all possible cases where this command can be used.
If we want to keep the warning, I think we should just refer to the helpfile of merge
and not say something that quickly can be outdated.
It is always stable, it is just sort-dependent on two datasets and the results are meaningless. I agree with you that it should not be flagged as an error. However, we wanted to flag it in the rep checks, so I enabled the warning in this command. It could be moved to an option, but I think it should be the default given the frequency with which we are seeing m:m
in submitted work. All results after m:m
are necessarily invalid. I also agree with referring to the wording of the current helpfile.
The more you read the funnier it gets...
Anyway -- this is fixed in 40bb89199ef96963cb96887f3ee4d191f7465ba1 with a link to the documentation. Closing this one.
This makes this text display correctly regardless if screen size and settings.
@bbdaniels , do we have a reference for this as we claim this is what Stata says. I find this in the documentation, which is similar but not word by word: