guillermo-navas-palencia / optbinning

Optimal binning: monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
http://gnpalencia.org/optbinning/
Apache License 2.0
434 stars 98 forks source link

Read/save bins Issue #96 #277

Closed alexliap closed 6 months ago

alexliap commented 8 months ago

This PR aims to resolve Issue #96 by adding to class methods to OptimalBinning object, to_json & read_json. Continuous and Multiclass binning inherit from that class, so no other methods were needed. The to_json method also saves the transformation of the training data as requested in the Issue.

guillermo-navas-palencia commented 7 months ago

Hi @alexliap.

Thanks for taking the time to develop this PR. However, this is not what I had in mind (sorry if the description was lacking details). The json file should contain all the information needed to build the binning table.

alexliap commented 7 months ago

Thank you for the response. I will work on it, but could you be a little more specific in order to avoid refactoring again?

alexliap commented 6 months ago

What about now, i save all the data required for the binning table, but is this the way you want it to behave? should the self._check_is_fitted() change in order to be able to call self.binning_table instead of self._binning_table ?

guillermo-navas-palencia commented 6 months ago

Yes, this looks good to me. Regarding self._is_fitted, yes, I think it should be set to True when reading from JSON.

guillermo-navas-palencia commented 6 months ago

To complete the PR, it would be nice to have unit tests for all binning classes. I mention this because the binning table classes require different arguments depending on the target type, so continuous and multiclass targets will not work automatically.

alexliap commented 6 months ago

Yes of course, I'm on it

alexliap commented 6 months ago

uuum question, why MulticlassOptBinning doesn't require dtype of variable as input?

guillermo-navas-palencia commented 6 months ago

uuum question, why MulticlassOptBinning doesn't require dtype of variable as input?

MulticlassOptBinning only accepts numerical dtype.

guillermo-navas-palencia commented 6 months ago

Thanks!