Open miranov25 opened 3 years ago
In order to include new regressors and classifiers, the MLpipeline code has to be restructured: https://alice.its.cern.ch/jira/browse/ATO-459
Quantiles have to be obtained at training time, using an appropriate cost function. From the scikit-learn documentation:
loss : {'ls', 'lad', 'huber', 'quantile'}, default='ls'
Loss function to be optimized. 'ls' refers to least squares regression. 'lad' (least absolute deviation) is a highly robust loss function solely based on order information of the input variables. 'huber' is a combination of the two. 'quantile' allows quantile regression (use alpha to specify the quantile).
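To make the role of alpha concrete, here is a minimal numpy sketch of the quantile (pinball) loss; the function name `pinball_loss` is illustrative, not part of any of the libraries mentioned here. Minimizing this loss over a constant predictor recovers the alpha-quantile of the data:

```python
import numpy as np

def pinball_loss(y_true, y_pred, alpha):
    """Quantile (pinball) loss: under-prediction is penalized with
    weight alpha, over-prediction with weight (1 - alpha)."""
    residual = y_true - y_pred
    return np.mean(np.maximum(alpha * residual, (alpha - 1) * residual))

# Minimizing over a constant predictor yields the empirical alpha-quantile:
rng = np.random.default_rng(0)
y = rng.normal(size=100_000)
candidates = np.linspace(-3, 3, 601)
losses = [pinball_loss(y, c, 0.9) for c in candidates]
best = candidates[np.argmin(losses)]
# best lands near the 0.9-quantile of N(0, 1), approximately 1.28
```

This is why the quantiles must be known before fitting: each quantile corresponds to a different loss surface.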
https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b
https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0
https://www.evergreeninnovations.co/blog-quantile-loss-function-for-machine-learning/
The neural-network version is based on the cost function discussed in the links above, which describe the scalar version: one quantile per neural network.
Quantile vector implementation in Jupyter notebook: https://github.com/strongio/quantile-regression-tensorflow/blob/master/Quantile%20Loss.ipynb
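The notebook above implements the vectorized loss in TensorFlow; the same idea can be sketched framework-independently in numpy (the function name `tilted_loss_vector` is illustrative). The network gets one output column per requested quantile, and the tilted losses are averaged jointly:

```python
import numpy as np

def tilted_loss_vector(quantiles, y_true, y_pred):
    """Vectorized tilted (pinball) loss for several quantiles at once.

    quantiles: shape (Q,)   - quantile levels fitted jointly
    y_true:    shape (N,)   - targets
    y_pred:    shape (N, Q) - one network output column per quantile
    """
    q = np.asarray(quantiles)[np.newaxis, :]        # (1, Q)
    e = np.asarray(y_true)[:, np.newaxis] - y_pred  # (N, Q) residuals
    return np.mean(np.maximum(q * e, (q - 1.0) * e))
```

For a single quantile this reduces to the scalar pinball loss, e.g. for q = 0.5 it is half the mean absolute error.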
In general, quantiles should be defined before fitting (this is not needed in scikit-garden, but skgarden is not maintained anymore).
BDTs and neural nets should be constructed knowing which quantiles are needed.
A GradientBoostingRegressor wrapper should be added to the list of wrappers in RootInteractive:
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html
https://medium.com/@qucit/a-simple-technique-to-estimate-prediction-intervals-for-any-regression-model-2dd73f630bcb
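A minimal sketch of how such a wrapper would use the existing scikit-learn API (the toy data and hyperparameters are illustrative, not taken from RootInteractive): one GradientBoostingRegressor per requested quantile, fixed at construction time via loss="quantile" and alpha.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=2000)

# One model per quantile: the quantiles must be fixed before fitting,
# which is the constraint discussed above for BDTs.
quantiles = [0.1, 0.5, 0.9]
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q,
                                 n_estimators=200, max_depth=3).fit(X, y)
    for q in quantiles
}

X_test = np.array([[5.0]])
lo, med, hi = (models[q].predict(X_test)[0] for q in quantiles)
# [lo, hi] is then an (approximate) 80% prediction interval around the median
```

Note that nothing enforces lo <= med <= hi (quantile crossing); a production wrapper would have to decide how to handle that.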
A QuantileRegressionForest clone should be integrated in a similar way:
https://scikit-garden.github.io/examples/QuantileRegressionForests/
https://jmlr.org/papers/volume7/meinshausen06a/meinshausen06a.pdf
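Until a full clone exists, the simple technique from the Medium article above can approximate it with a plain scikit-learn forest: take empirical quantiles of the per-tree predictions. This is weaker than Meinshausen's QRF (which keeps all training targets in each leaf rather than leaf means), but it allows choosing quantiles after fitting; the helper name `forest_quantiles` is illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=2000)

forest = RandomForestRegressor(n_estimators=200, min_samples_leaf=20,
                               random_state=0).fit(X, y)

def forest_quantiles(forest, X_query, quantiles):
    """Empirical quantiles of the per-tree predictions.
    Approximation only: each tree contributes its leaf mean, not the
    full leaf sample as in a true quantile regression forest."""
    per_tree = np.stack([t.predict(X_query) for t in forest.estimators_])
    return np.quantile(per_tree, quantiles, axis=0)

lo, hi = forest_quantiles(forest, np.array([[5.0]]), [0.1, 0.9])
```

Because the quantiles are computed at prediction time, this matches the scikit-garden behaviour of not fixing quantiles before fitting.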