Closed SergeySakharovskiy closed 1 year ago
Yes, this is a known bug. Until the official fix lands, please see issue #438 for how to work around it.
@Optimox thank you, that works. I will define the custom rmsle as you suggested:
import numpy as np
from sklearn.metrics import mean_squared_log_error
from pytorch_tabnet.metrics import Metric


class my_RMSLE(Metric):
    """
    Root mean squared logarithmic error regression loss.
    Scikit-implementation:
    https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_log_error.html
    Note: In order to avoid error, negative predictions are clipped to 0.
    This means that you should clip negative predictions manually after calling predict.
    """

    def __init__(self):
        self._name = "working_rmsle"
        self._maximize = False

    def __call__(self, y_true, y_score):
        """
        Compute RMSLE of predictions.

        Parameters
        ----------
        y_true : np.ndarray
            Target matrix or vector
        y_score : np.ndarray
            Score matrix or vector

        Returns
        -------
        float
            RMSLE of predictions vs targets.
        """
        # Clip negative predictions to 0 so the logarithm is defined.
        y_score = np.clip(y_score, a_min=0, a_max=None)
        # Square root of MSLE gives RMSLE (same result as squared=False where supported).
        return np.sqrt(mean_squared_log_error(y_true, y_score))
TabNetRegressor calculates MSLE rather than RMSLE when eval_metric is set to ['rmsle']
epoch 28 | loss: 844.81883| train_rmsle: 0.09658 | valid_rmsle: 0.09734 | 0:03:29s
epoch 29 | loss: 843.59653| train_rmsle: 0.09634 | valid_rmsle: 0.09707 | 0:03:36s
Expected behavior
It seems this line of code https://github.com/dreamquark-ai/tabnet/blob/fc59ea61139228440d2063ead9db42f656d84ff7/pytorch_tabnet/metrics.py#L403 should have squared=False.
Taking the square root of clf.best_cost gives the correct score:
TabNet VALID SCORE RMSLE: 0.09658693470083708
SKLEARN VALID SCORE RMSLE: 0.309868387923609
TabNet best score + numpy square root RMSLE: [0.31078439]
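To make the MSLE/RMSLE relationship concrete, here is a minimal standard-library sketch. It is not the pytorch-tabnet or scikit-learn implementation; the helper names `msle` and `rmsle` and the sample values are made up for illustration. It only shows that RMSLE is the square root of MSLE, which is why a metric that reports MSLE under the name "rmsle" looks too small whenever the error is below 1.

```python
import math


def msle(y_true, y_pred):
    """Mean squared logarithmic error; negative predictions are clipped to 0."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = max(p, 0.0)  # clip so that log1p is defined
        total += (math.log1p(t) - math.log1p(p)) ** 2
    return total / len(y_true)


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error: the square root of MSLE."""
    return math.sqrt(msle(y_true, y_pred))


# Made-up targets and predictions, purely illustrative.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

msle_value = msle(y_true, y_pred)
rmsle_value = rmsle(y_true, y_pred)

print(f"MSLE:  {msle_value:.5f}")
print(f"RMSLE: {rmsle_value:.5f}")
```

Because the MSLE here is below 1, its square root is larger than the MSLE itself, the same pattern as in the scores reported above.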