py-why / dodiscover

[Experimental] Global causal discovery algorithms
https://www.pywhy.org/dodiscover/
MIT License
87 stars 18 forks

Data type specification and checking #103

Open robertness opened 1 year ago

robertness commented 1 year ago

Is your feature request related to a problem? Please describe.

The type and domain of the variables in the data should be first-class citizens.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

This is also related to making assumptions first-class citizens.

adam2392 commented 1 year ago

I think we can follow scikit-learn's approach: assume variables are continuous by default and let users pass in a categorical mask (e.g. the categorical_features parameter of https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingClassifier.html).

I'm not sure whether the range of each variable is important, though?

Then we could have private attributes on each method (_supports_categorical, _supports_mixed, _supports_continuous) that are checked during fit(...).
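To make the proposal concrete, here is a minimal sketch of how the capability flags and a categorical mask might interact. Everything except the three _supports_* attribute names is an assumption made for illustration: the class names BaseDiscoveryMethod, ContinuousOnlyMethod and MixedMethod, the categorical_mask keyword, and the categorical_mask_ attribute are hypothetical and not part of dodiscover's actual API. Data is a plain list of rows to keep the example dependency-free.

```python
from typing import Optional, Sequence


class BaseDiscoveryMethod:
    """Hypothetical base class, for illustration only (not dodiscover API)."""

    # Capability flags from the proposal; subclasses override them.
    _supports_continuous = True
    _supports_categorical = False
    _supports_mixed = False

    def fit(
        self,
        data: Sequence[Sequence[float]],
        categorical_mask: Optional[Sequence[bool]] = None,
    ) -> "BaseDiscoveryMethod":
        if not data:
            raise ValueError("data must contain at least one row")
        n_columns = len(data[0])

        # Assume every variable is continuous unless the user says otherwise.
        if categorical_mask is None:
            categorical_mask = [False] * n_columns
        if len(categorical_mask) != n_columns:
            raise ValueError(
                f"categorical_mask has {len(categorical_mask)} entries, "
                f"but data has {n_columns} columns"
            )

        self._check_data_types(categorical_mask)
        self.categorical_mask_ = list(categorical_mask)
        # A real method would run its discovery algorithm here.
        return self

    def _check_data_types(self, mask: Sequence[bool]) -> None:
        has_categorical = any(mask)
        has_continuous = not all(mask)
        name = type(self).__name__

        if has_categorical and has_continuous:
            if not self._supports_mixed:
                raise TypeError(f"{name} does not support mixed data")
        elif has_categorical:
            if not self._supports_categorical:
                raise TypeError(f"{name} does not support categorical data")
        elif not self._supports_continuous:
            raise TypeError(f"{name} does not support continuous data")


class ContinuousOnlyMethod(BaseDiscoveryMethod):
    """Hypothetical method that keeps the default flags (continuous only)."""


class MixedMethod(BaseDiscoveryMethod):
    """Hypothetical method that accepts any combination of column types."""

    _supports_categorical = True
    _supports_mixed = True
```

With this layout, ContinuousOnlyMethod().fit(data) succeeds when no mask is given, while passing a mask that marks any column as categorical raises a TypeError before any computation starts. MixedMethod accepts the same mask. Whether the check belongs in fit(...) itself or in a shared validation helper is an open design choice.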