ankur-tutlani opened this issue 5 days ago
I don't immediately see anything wrong. I'm not super familiar with Databricks, but I wonder if maybe they don't guarantee that rows are returned in the same order, or if it's possible that additional rows are added over time?
Thanks for the response. I sorted the dataframe to ensure the order remains consistent before running DML or KernelDML:

data1 = data1.sort_values(['Y', 'X']).reset_index(drop=True)
I am not sure what you mean by additional rows being added over time; can you clarify? data1 is a pandas dataframe.
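One quick way to rule out the data itself changing between runs (the "additional rows" concern above) is to hash the sorted frame; a minimal sketch, assuming data1 as above:

```python
import hashlib

import pandas as pd

# Deterministic row order, as above.
data1 = data1.sort_values(['Y', 'X']).reset_index(drop=True)

# Hash the full contents (values plus index). If this digest differs
# between runs, the underlying data is changing, not the estimator.
row_hashes = pd.util.hash_pandas_object(data1, index=True).values
print(hashlib.md5(row_hashes.tobytes()).hexdigest())
```

If the digest is identical across runs, the instability has to come from somewhere else (fold assignment, model seeds, etc.).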
One thing I observed is that nuisance_scores_y and nuisance_scores_t are not consistent. If I run 10-fold cross-validation along with 10 mc_iters, I expect the output (nuisance_scores_y and nuisance_scores_t) to be a list of length 10, but sometimes it is 10 and sometimes it is less than 10, like 3 or 7. Also, the values in the list elements (nuisance_scores_y and nuisance_scores_t) vary significantly across different runs for the same seed and treatment/control combination.

What can explain this behavior?
That behavior is very strange: the nuisance scores should always be a nested list where the length of the outer list is mc_iters and the length of each element is the number of folds, and the logic which creates those lists is straightforward (and covered by our tests).
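For reference, here is a minimal self-contained check of that structure; the synthetic data and LinearDML are stand-ins for your actual setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

from econml.dml import LinearDML

# Synthetic data matching the shape of the problem described in this
# thread: one continuous confounder, treatment, and outcome.
rng = np.random.default_rng(0)
n = 500
W = rng.normal(size=(n, 1))                   # confounder
T = W[:, 0] + rng.normal(size=n)              # continuous treatment
Y = 2.0 * T + W[:, 0] + rng.normal(size=n)    # continuous outcome

est = LinearDML(
    model_y=RandomForestRegressor(random_state=0),
    model_t=RandomForestRegressor(random_state=0),
    cv=10,        # 10-fold cross-fitting
    mc_iters=10,  # 10 Monte Carlo iterations
    random_state=123,
)
est.fit(Y, T, W=W)

# Outer length should equal mc_iters; each inner list should have one
# score per fold, on every run.
assert len(est.nuisance_scores_y) == 10
assert len(est.nuisance_scores_t) == 10
assert all(len(scores) == 10 for scores in est.nuisance_scores_y)
```

If an equivalent check fails in your environment, that would point to something unusual about how the job is being executed.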
I have a situation where I am getting different ATE estimates with the same input dataset and the same random seed. If I run it today, the average ATE is around 1; if I run it again after a few hours, it increases to 4 or even more. This is for the same treatment (T1) and control (T0) value combination. What could potentially be wrong here?

I have one confounder, one treatment, and one outcome column, all continuous. I tried manually passing folds in the cv argument, but the estimates are still not stable. I have also tried passing the input dataset in a specific order, but again the results are not the same. I observed this with both DoubleML and KernelDML. What should be changed here to get more stable ATE estimates?
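For reference, a fully pinned-down variant of what I am trying looks roughly like this (a sketch: the column names, data1, and the nuisance models are placeholders, not my exact code):

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

from econml.dml import LinearDML

# Deterministic row order first.
data1 = data1.sort_values(['Y', 'X']).reset_index(drop=True)
Y = data1['Y'].values
T = data1['T'].values      # treatment column (placeholder name)
X = data1[['X']].values    # the single continuous confounder

# Materialize the folds once so the exact same index splits are reused
# on every run (and every mc iteration).
folds = list(KFold(n_splits=10, shuffle=True, random_state=0).split(X))

est = LinearDML(
    model_y=RandomForestRegressor(random_state=0),
    model_t=RandomForestRegressor(random_state=0),
    cv=folds,      # fixed folds instead of an integer
    mc_iters=10,   # with fixed folds and seeds the mc iterations become
                   # identical, which is fine while isolating the instability
    random_state=123,
)
est.fit(Y, T, X=X)
```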
'common_causes2' contains one continuous variable.
For DoubleML, I am using the following final model.
The average value differs a lot, although there is not much variation within "results": e.g., sometimes I get "results" in the range of 1 to 3, and other times in the range of 8 to 9, which drives the "average_result" value to differ significantly between runs. This is for the same treatment (T1) and control (T0) value combination: at one instance with T0=20 and T1=25 the average value was 2, while a few hours later, with the same T0 and T1 values of 20 and 25, it was 10. I am running this on a Databricks cluster. Is there anything wrong with the arguments specified above?
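The "results" / "average_result" step is essentially this (simplified sketch, using est and X from the snippet above):

```python
# Per-row effects of moving the treatment from T0=20 to T1=25.
results = est.effect(X, T0=20, T1=25)

# With the same data, folds, and seeds, this average should be
# reproducible across runs.
average_result = results.mean()
print(average_result)
```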
econml version: 0.15.1