my brain is starting to hurt after spending 2 days figuring out the following:
(i) let's say you have predictions that follow a certain distribution, e.g. e-commerce revenue that follows some type of Tweedie distribution; (ii) let's say you have the definition of this distribution, e.g. Tweedie; how the hell do you transform the distribution into a loss function?
I noticed that the Negative Log Likelihood can be used in cases like this, but it operates on likelihood values in the interval (0,1], which is not the case for the distribution deviance
I noticed some posts about quasi-likelihood, e.g. this post; it is probably also the same idea implemented in the discussion above under the abbreviation QLL
how do you do it? do you somehow scale the distribution deviance to (0,1]?
after playing around with it I am not sure this is really helpful; the idea seems plausible, but it is almost impossible for me to follow the proof in the original paper because:
- the authors use some 0.68 number on the KDD dataset that I could not figure out
- the metrics are hard to understand - they are split into deciles, etc.
nevertheless, I implemented the authors' Keras code in PyTorch, including their test, and compared the original and pytorch_widedeep implementation results in the [notebook]()
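For reference, my reading is that the paper in question is the zero-inflated lognormal (ZILN) LTV one, in which case the core of the loss looks roughly like the sketch below. This is an illustration of the idea only, not the notebook code; the three-column logits layout and the clamping constants are my assumptions:

```python
import torch
import torch.nn.functional as F

def ziln_loss(y_true, logits):
    """Sketch of a zero-inflated lognormal (ZILN) loss.

    logits: (batch, 3) columns = [p(positive) logit, mu, raw sigma];
    y_true: (batch,) non-negative, zero-inflated targets.
    """
    positive = (y_true > 0).float()
    # classification head: is the value zero or positive?
    cls_loss = F.binary_cross_entropy_with_logits(
        logits[:, 0], positive, reduction="none"
    )
    # regression head: lognormal NLL, applied to positive samples only
    mu = logits[:, 1]
    sigma = F.softplus(logits[:, 2]).clamp(min=1e-6)
    safe_y = y_true.clamp(min=1e-7)  # log_prob needs strictly positive support
    reg_loss = -positive * torch.distributions.LogNormal(mu, sigma).log_prob(safe_y)
    return (cls_loss + reg_loss).mean()
```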
QuantileLoss
I decided to add a new method - "multiregression" - as the QuantileLoss function regresses multiple values at the same time
@jrzaurin could you please recheck it? it passes the unit tests but produces "strange" values that do not adhere to the idea that the quantile values should be increasing within each sample - maybe it is just a lack of training, etc.
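On the "strange" values: quantile crossing (per-sample quantiles that are not monotone) is a known artifact when each quantile is regressed independently with the pinball loss, so it is not necessarily a bug. For reference, a minimal sketch of the pinball loss over several quantiles at once (shapes and the quantile grid are illustrative, not the exact library code):

```python
import torch

def quantile_loss(y_true, y_pred, quantiles=(0.2, 0.5, 0.8)):
    """Pinball loss over several quantiles regressed simultaneously.

    y_pred: (batch, n_quantiles); y_true: (batch,).
    """
    losses = []
    for i, q in enumerate(quantiles):
        err = y_true - y_pred[:, i]
        # under-prediction costs q, over-prediction costs (1 - q)
        losses.append(torch.max(q * err, (q - 1) * err))
    return torch.stack(losses, dim=1).mean()
```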
FINAL UPDATE:
the branch is ready to be merged, and I can figure out later how to add the whole family of Tweedie distributions, i.e. their losses, to the library
I also added an enforce_positive parameter to wide_deep to fight a possible issue with negative model outputs during initial training with the RMSLE or Tweedie losses, which require positive or non-negative input
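The idea behind enforce_positive is just to squash the raw network output into a positive range before the loss sees it; a minimal sketch of the concept (the actual wiring in the branch may differ):

```python
import torch
import torch.nn.functional as F

def rmsle(y_true, y_pred):
    # log1p blows up for y_pred <= -1, and RMSLE assumes non-negative values
    return torch.sqrt(torch.mean((torch.log1p(y_pred) - torch.log1p(y_true)) ** 2))

raw_out = torch.randn(8)       # stand-in for the raw network output
pos_out = F.softplus(raw_out)  # maps R -> (0, inf), safe for RMSLE/Tweedie
loss = rmsle(torch.rand(8), pos_out)
```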
this post only briefly summarizes this paper, which derives the Probability Density Function and then expresses the Negative Log Likelihood, a common choice for losses in deep nets thanks to its nice characteristics on the interval (0,1)
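The recipe boils down to: take the PDF, evaluate it at the observed target, and minimize its negative log. With torch.distributions that pattern is nearly a one-liner; Normal is used below purely as an example distribution:

```python
import torch

y = torch.tensor([0.3, 1.2, 2.1])
mu = torch.zeros(3, requires_grad=True)

# generic pattern: distribution -> log_prob -> negate -> mean
nll = -torch.distributions.Normal(mu, torch.ones(3)).log_prob(y).mean()
nll.backward()  # differentiable, so it works directly as a training loss
```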
this post does, in my opinion, a slightly better job of summarizing the information
using the NLL is usually preferred in deep nets, but another possibility is to use the distribution deviance, which is not defined as nicely as the NLL on the interval (0,1) - i.e. the sklearn approach
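For reference, the sklearn approach I mean is (I believe) mean_tweedie_deviance, which scores raw non-negative values directly instead of going through a (0,1] likelihood:

```python
import numpy as np
from sklearn.metrics import mean_tweedie_deviance

y_true = np.array([0.0, 2.0, 10.0, 0.0])  # zero-inflated, e-commerce-like
y_pred = np.array([0.5, 1.5, 8.0, 0.2])   # predictions must be strictly positive
# power in (1, 2) -> compound Poisson-gamma Tweedie
print(mean_tweedie_deviance(y_true, y_pred, power=1.5))
```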
this post gives an additional summary of some (not all) other Tweedie-family Probability Density Functions; the next step is to derive their NLLs and we have the other loss functions
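Concretely, for the compound Poisson-gamma branch of the family (power 1 < p < 2) the NLL, after dropping terms that do not depend on the prediction, reduces to two power terms; a sketch of what such a loss could look like (the function name and the p=1.5 default are mine):

```python
import torch

def tweedie_nll(y_true, mu, p=1.5):
    """Tweedie NLL up to terms constant in mu; valid for 1 < p < 2.

    y_true: non-negative targets; mu: strictly positive predictions.
    """
    return (-y_true * mu.pow(1 - p) / (1 - p) + mu.pow(2 - p) / (2 - p)).mean()
```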
Add losses that show promising results for regression in scenarios with highly imbalanced datasets, used e.g. in Lifetime Value (LTV) prediction