Non-reproducible confidence

mindsdb / mindsdb_native

Machine Learning in one line of code

http://mindsdb.com

GNU General Public License v3.0


Non-reproducible confidence #442

Closed: paxcema closed this issue 3 years ago

paxcema commented 3 years ago

This is a very minor issue at the moment, but training several predictors on the same dataset and target column can yield prediction ranges of slightly different widths (which, in turn, means slightly different confidence values and reported accuracies). For more context, see #436.

This could be caused by an unseeded parameter in nonconformist, or by another as-yet-unidentified bug in our model analysis phase.

paxcema commented 3 years ago

The underlying issue here is that our LightGBMMixer is currently not reproducible, because the Optuna tuner it relies on is not reproducible (see this issue). Different models yield different calibration scores, so it makes sense that the prediction regions output by the ICP (inductive conformal predictor) vary between runs. I will close this and file an issue in Lightwood about the mixer.

EDIT: This behavior has actually already been reported here.
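To illustrate the explanation above, here is a minimal sketch of why a non-reproducible model leads to prediction regions of varying width. It does not use mindsdb_native, Lightwood, LightGBM, Optuna, or nonconformist: `fit_slope` is a hypothetical stand-in for a trainer whose result depends on an RNG seed, and `icp_half_width` is a simplified absolute-error conformal calibration written for this example, not the library's implementation.

```python
import math
import random


def make_data(n, seed):
    """Synthetic data (assumption for this sketch): y = 2x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_slope(xs, ys, seed):
    """Hypothetical non-deterministic trainer: a least-squares slope
    (through the origin) fitted on a seed-dependent random subsample."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(xs)), k=len(xs) // 2)
    num = sum(xs[i] * ys[i] for i in idx)
    den = sum(xs[i] ** 2 for i in idx)
    return num / den


def icp_half_width(slope, cal_xs, cal_ys, confidence=0.9):
    """Simplified inductive conformal calibration: the half-width is the
    ceil((n + 1) * confidence)-th smallest absolute calibration error."""
    scores = sorted(abs(y - slope * x) for x, y in zip(cal_xs, cal_ys))
    n = len(scores)
    k = min(n, math.ceil((n + 1) * confidence))
    return scores[k - 1]


train_xs, train_ys = make_data(200, seed=0)
cal_xs, cal_ys = make_data(100, seed=1)

# Same data, same target, different training seeds: the calibration
# scores depend on the fitted model, and so does the interval width.
widths = {}
for seed in (1, 2, 3):
    slope = fit_slope(train_xs, train_ys, seed)
    widths[seed] = 2 * icp_half_width(slope, cal_xs, cal_ys)
    print(f"seed={seed} slope={slope:.4f} interval width={widths[seed]:.4f}")

# Re-training with a fixed seed reproduces the model, hence the width.
same_seed_width = 2 * icp_half_width(fit_slope(train_xs, train_ys, 1), cal_xs, cal_ys)
print("reproducible with fixed seed:", same_seed_width == widths[1])
```

In this sketch the calibration step itself is fully deterministic; the only source of variation is the trained model. That is consistent with the diagnosis above: fixing the mixer's reproducibility should make the reported ranges stable as well.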