Open nanophyto opened 1 week ago
Hello @nanophyto ,
Thank you for bringing this issue to our attention.
You're testing the `MapieRegressor` with `cv=10` on a zero-inflated dataset and are dissatisfied with the results. You'd like to use a non-BaseCrossValidator, but the MAPIE package doesn't currently support this.
To help you, consider using `MapieQuantileRegressor` to capture the heteroscedasticity of your data and obtain more realistic lower bounds for prediction intervals. Unlike `MapieRegressor`, which produces prediction intervals of constant size, `MapieQuantileRegressor` might provide more satisfying results.
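To illustrate why quantile-based intervals can vary in width, here is a minimal sketch using plain scikit-learn quantile regressors. It is not MAPIE's conformalized method (it omits the calibration step that `MapieQuantileRegressor` adds), and the synthetic data, seed, and variable names are assumptions made up for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic, zero-inflated, heteroscedastic target (illustrative data only):
# roughly half the values are exactly zero, and the spread of the rest grows with x.
X = rng.uniform(0.0, 1.0, size=(1000, 1))
y = 10.0 * X[:, 0] * np.abs(rng.normal(size=1000))
y[rng.uniform(size=1000) < 0.5] = 0.0

# One model per quantile; together they bound a nominal 90% band (uncalibrated).
lower = GradientBoostingRegressor(loss="quantile", alpha=0.05, random_state=0).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95, random_state=0).fit(X, y)

# Compare the band width at a low-spread point and a high-spread point.
X_new = np.array([[0.1], [0.9]])
width = upper.predict(X_new) - lower.predict(X_new)
print(width)  # the band should be wider where the data are more spread out
```

Because each bound is learned as a function of the features, the interval can tighten where the target is mostly zero and widen elsewhere, which a constant-width interval cannot do.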
Please share more details about your issue so we can better assist you. If you still need to use all your data in a cross-validation setup, provide more information about the properties of your non-BaseCrossValidator. This will help us determine if there are other solutions to adapt it to be BaseCrossValidator-compatible.
Looking forward to your response.
I'm working with a highly zero-inflated dataset. Because of this, I'm using a custom-defined, zero-stratified CV splitter in my sklearn pipeline.
Currently, MAPIE does not seem to support non-BaseCrossValidator CV methods, which means that if I try to run e.g. `MapieRegressor` with `cv=10`, some of the folds are nearly all zeros, resulting in some very unrealistic lower bounds for the PI estimates.
Would it be possible to add support for user-defined CV splitters?
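For reference, a minimal sketch of how a zero-stratified splitter might be written as a `BaseCrossValidator` subclass, so that it passes an `isinstance` check. The class name `ZeroStratifiedKFold` and the toy data are hypothetical, and the approach (delegating to scikit-learn's `StratifiedKFold` on a zero / non-zero label) is an assumption about what "zero-stratified" means here, not the splitter actually used in this issue.

```python
import numpy as np
from sklearn.model_selection import BaseCrossValidator, StratifiedKFold


class ZeroStratifiedKFold(BaseCrossValidator):
    """Hypothetical splitter: K-fold stratified on whether the target is zero."""

    def __init__(self, n_splits=10, shuffle=True, random_state=None):
        self.n_splits = n_splits
        self.shuffle = shuffle
        self.random_state = random_state

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        if y is None:
            raise ValueError("y is required to stratify on zero / non-zero.")
        # Stratify on a binary zero / non-zero label rather than on y itself.
        is_zero = (np.asarray(y) == 0).astype(int)
        inner = StratifiedKFold(
            n_splits=self.n_splits,
            shuffle=self.shuffle,
            random_state=self.random_state,
        )
        yield from inner.split(X, is_zero)


# Toy zero-inflated target: 160 zeros and 40 strictly positive values.
rng = np.random.default_rng(0)
y = np.zeros(200)
y[:40] = rng.uniform(1.0, 5.0, size=40)
rng.shuffle(y)
X = rng.normal(size=(200, 3))

cv = ZeroStratifiedKFold(n_splits=10, random_state=0)
# Count the non-zero targets that land in each test fold.
nonzero_per_fold = [int((y[test] != 0).sum()) for _, test in cv.split(X, y)]
print(nonzero_per_fold)
```

If MAPIE accepts any `BaseCrossValidator` instance, an object like this could presumably be passed as `cv=ZeroStratifiedKFold(n_splits=10)` in place of `cv=10`; that has not been verified here.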