pzivich / zEpid

Epidemiology analysis package
http://zepid.readthedocs.org
MIT License

Check if Donsker class #124

Closed pzivich closed 3 years ago

pzivich commented 5 years ago

As noted in the updated #109, I still need to write a function that checks the user inputs to determine whether the nuisance function estimator is Donsker class. This is where the difficult part (and the help request) comes in. Checking whether something is Donsker or not is not necessarily straightforward.

Convergence Heuristic

Rather than trying to determine whether each estimator is Donsker (which is beyond my mathematical ability), there is a heuristic I can use. To assess whether an estimator achieves approximately uniform root-n convergence, I need to select various points on the domain. For each of those points, I need to plot(x=sqrt(n), y=estimate_n - truth) over a range of sample sizes n. The points should lie in a straight line that goes to zero, and the slope of the line should be approximately -0.50. A similar procedure is described in Westreich et al. 2012 for IPW.
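A minimal sketch of how this heuristic could be simulated, using NumPy only. Everything here is an assumption for illustration rather than zEpid code: the helper names (`simulate_rmse`, `loglog_slopes`, `linear_fit_predict`), the data-generating process, and the choice to read the -0.5 slope from a log-log regression of the root-mean-squared error on n. A simple least-squares line stands in for the nuisance estimator; any function with the same `fit_predict(x, y, points)` signature could be swapped in.

```python
import numpy as np


def truth(x):
    # Hypothetical true regression function used to generate the data.
    return 1.0 + 0.5 * x


def linear_fit_predict(x, y, points):
    # Stand-in estimator: ordinary least-squares line, predicted at `points`.
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope * points + intercept


def simulate_rmse(fit_predict, points, sample_sizes, n_sims=200, seed=0):
    # For each sample size n, repeatedly simulate data, fit the estimator,
    # and record the error (estimate_n - truth) at each evaluation point.
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    rmse = np.empty((len(sample_sizes), len(points)))
    for i, n in enumerate(sample_sizes):
        errors = np.empty((n_sims, len(points)))
        for s in range(n_sims):
            x = rng.uniform(-2.0, 2.0, size=n)
            y = truth(x) + rng.normal(size=n)
            errors[s] = fit_predict(x, y, points) - truth(points)
        rmse[i] = np.sqrt(np.mean(errors ** 2, axis=0))
    return rmse


def loglog_slopes(sample_sizes, rmse):
    # Regress log(RMSE) on log(n) separately for each evaluation point.
    # Root-n convergence corresponds to a slope near -0.5.
    log_n = np.log(np.asarray(sample_sizes, dtype=float))
    return np.polyfit(log_n, np.log(rmse), deg=1)[0]


sample_sizes = [100, 400, 1600, 6400]
points = [-1.0, 0.0, 1.0]
rmse = simulate_rmse(linear_fit_predict, points, sample_sizes)
slopes = loglog_slopes(sample_sizes, rmse)
print(slopes)  # one slope per evaluation point
```

Slopes that sit well above -0.5 (closer to zero) at some points would suggest slower-than-root-n convergence there, which is the pattern the heuristic is meant to flag.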

What I need help with

Basically, I need help checking the prediction functions available in sklearn. This needs to be done for both continuous and categorical functions. I can write the code and choose the points at which to run everything, but it would help to have a second set of eyes assessing all the functions.

Estimators

...more to add that I will write out later...

pzivich commented 3 years ago

I am not going to try to figure out what is non-Donsker. It is a lot of work. With the cross-fit estimators being added, I will instead generate a custom warning when custom_model is specified for AIPTW and TMLE. Users can hide the custom warning if they want, but I will recommend the cross-fit estimators.
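A rough sketch of what such a hideable warning could look like with Python's standard `warnings` module. The class name `NonDonskerWarning` and the helper `warn_if_custom_model` are hypothetical and are not taken from zEpid; the message wording is likewise only illustrative.

```python
import warnings


class NonDonskerWarning(UserWarning):
    """Hypothetical warning category for user-supplied nuisance models."""


def warn_if_custom_model(custom_model, estimator_name):
    # Hypothetical helper: warn whenever a custom model is supplied,
    # since its Donsker status is not checked.
    if custom_model is not None:
        warnings.warn(
            f"{estimator_name} received a custom_model. Valid inference "
            "relies on the nuisance estimator being Donsker class, which "
            "is not checked. Consider the cross-fit estimators instead.",
            NonDonskerWarning,
            stacklevel=2,
        )


# Default behaviour: the warning is raised and can be recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_custom_model(object(), "TMLE")
n_shown = len(caught)

# A user who wants to hide it filters on the dedicated category.
with warnings.catch_warnings(record=True) as caught_hidden:
    warnings.simplefilter("always")
    warnings.simplefilter("ignore", category=NonDonskerWarning)
    warn_if_custom_model(object(), "TMLE")
n_hidden = len(caught_hidden)

print(n_shown, n_hidden)
```

Using a dedicated warning category means users can silence this one message without suppressing unrelated warnings.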