Open davidskalinder opened 8 months ago
I believe the `na.fail` option would help the analyst be more aware of what is in the dataset. On the other hand, I assume this holds for IVs as well, and it forces the analyst to impute or drop `NA`s.
So the question is: 1) do we keep this as is and document the behavior, or 2) do we make it stricter, forcing the analyst to supply a 'clean' dataset?
Will think about this. Comments are appreciated.
Well, like I say, couldn't we add an `na.action` argument to `OaxacaBlinderDecomp()` and then just pass that to the `lm()` calls?
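A minimal sketch of what that pass-through could look like, using base R only. The helper `fit_with_na_action()` is hypothetical and is not part of the package; it only illustrates forwarding an `na.action` argument to `lm()` while still falling back to the global option when the caller does not set it.

```r
# Hypothetical helper (not part of OaxacaBlinder): forwards na.action to lm().
fit_with_na_action <- function(formula, data,
                               na.action = getOption("na.action")) {
  # An explicit na.action overrides the global option for this call only;
  # if the caller omits it, the global option is used as before.
  lm(formula, data = data, na.action = na.action)
}

# Toy data with one missing value in the DV.
d <- data.frame(y = c(1, 2, NA, 4, 5), x = c(1, 2, 3, 4, 5))

# Lenient: drop incomplete rows, regardless of the global setting.
fit_omit <- fit_with_na_action(y ~ x, d, na.action = na.omit)
nobs(fit_omit)  # 4 of the 5 rows are used

# Strict: stop as soon as the model frame contains an NA.
res <- tryCatch(
  fit_with_na_action(y ~ x, d, na.action = na.fail),
  error = function(e) conditionMessage(e)
)
res  # the error message raised by na.fail
```

Defaulting the argument to `getOption("na.action")` would keep the current behavior for existing users, so adding it should not change any results unless someone opts in.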
I think that would be the best option. Whether or not we do that, I think documenting the behavior would be enough -- no need to make the default more strict I think. After all, this package is just defaulting to the global settings, which is probably what users should expect anyway (part of the reason the behavior was less clear to me was because of #5 and #6).
Come to think of it, this whole issue might not even really be a bug unless you share my opinion that it's silly for R to have a global `na.action` option at all and that it should be set at the function level. But I do think my opinion is right of course. :)
This reprex shows how to change `OaxacaBlinderDecomp()`'s handling of `NA`s (in the DV, in this case, but the same holds for `NA`s in IVs):

Created on 2024-03-01 with reprex v2.1.0
It looks like `OaxacaBlinderDecomp()` usually runs when there are `NA`s in the DV because it fits the model using `lm()` with no options:

https://github.com/sinanpl/OaxacaBlinder/blob/5109b3c6a2f6491fa64ec13c4dedf1344023bc70/R/oaxaca.R#L93
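For anyone who wants to see the underlying `lm()` behavior in isolation, here is a small base-R sketch. It uses made-up toy data and does not call the package at all; it only shows how a bare `lm()` call follows the global `na.action` option.

```r
# Toy data with one missing value in the DV.
d <- data.frame(y = c(1, 2, NA, 4, 5), x = c(1, 2, 3, 4, 5))

# Set R's usual default explicitly and remember the previous setting.
old <- options(na.action = "na.omit")

# With na.omit, lm() silently drops the incomplete row and fits on the rest.
fit <- lm(y ~ x, data = d)
n_used <- nobs(fit)  # 4 of the 5 rows are used

# With na.fail, the same bare lm() call stops with an error instead.
options(na.action = "na.fail")
msg <- tryCatch(
  {
    lm(y ~ x, data = d)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Restore whatever the option was before this example.
options(old)
```

Nothing in the `lm()` call itself changes between the two fits; only the global option does.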
Since `lm()` is called without its `na.action` argument, its handling of `NA`s can (only) be set by changing the global `na.action` option. It seems like it might be worth making it so that `OaxacaBlinderDecomp()` can pass arguments like `na.action`, or perhaps others, to `lm()`? Or at least documenting the behavior?

Probably not an urgent fix for me now that I know that it does this, but it took some investigating to clarify what was happening under the hood...