dreamquark-ai / tabnet

PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
https://dreamquark-ai.github.io/tabnet/
MIT License

Loss goes to -inf #515

Closed CTO-yai closed 1 year ago

CTO-yai commented 1 year ago

Describe the bug

During the pretraining phase, the UnsupervisedLoss calculation normalizes the reconstruction error per feature over the batch. If a feature's std is 0, the normalization falls back to that feature's mean value. In my training run the mean ended up being small and negative, so the loss became extremely large in absolute value and negative, pushing the network to exploit this: it enlarges the reconstruction error on features with a negative mean in order to drive the loss down.
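For reference, here is a minimal sketch of the per-feature normalization being described (my paraphrase of the pretraining loss with hypothetical names, not a verbatim copy of the library code):

```python
import torch

def unsupervised_loss_sketch(y_pred, embedded_x, obf_vars, eps=1e-9):
    # Squared reconstruction error, counted only on the obfuscated (masked) features
    reconstruction_errors = ((y_pred - embedded_x) * obf_vars) ** 2
    # Per-feature normalization: divide each feature's error by its variance over the batch
    batch_stds = torch.std(embedded_x, dim=0) ** 2
    batch_means = torch.mean(embedded_x, dim=0)
    batch_means[batch_means == 0] = 1  # features constant at 0 fall back to 1
    # The fallback at issue: a zero-variance feature is normalized by its mean instead.
    # If that mean is negative, the corresponding loss terms become negative.
    batch_stds[batch_stds == 0] = batch_means[batch_stds == 0]
    features_loss = torch.matmul(reconstruction_errors, 1 / batch_stds)
    # Average over the number of reconstructed (masked) features, then over the batch
    nb_reconstructed = torch.sum(obf_vars, dim=1)
    return torch.mean(features_loss / (nb_reconstructed + eps))
```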

From my experience this is not what the training process intends. A negative loss is not a problem by definition, but once this behavior appears, a lower loss no longer means a better reconstruction, which misses the point.

What is the current behavior?

The loss goes to around -1e17 without good predictions.
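A toy illustration of the sign flip (hypothetical numbers, using the sketch above): a constant feature with a small negative mean turns its positive squared error into an arbitrarily large negative loss term.

```python
# Feature 0 varies normally; feature 1 is constant at -0.01 (std == 0, mean < 0)
embedded_x = torch.tensor([[1.0, -0.01],
                           [2.0, -0.01],
                           [3.0, -0.01]])
y_pred = torch.tensor([[1.1, 5.0],
                       [2.1, 5.0],
                       [2.9, 5.0]])
obf_vars = torch.ones_like(embedded_x)  # pretend every feature was masked

print(unsupervised_loss_sketch(y_pred, embedded_x, obf_vars))
# ~ -1255: the squared error on feature 1 is divided by -0.01, so the network is
# rewarded for making that reconstruction error as large as possible.
```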

If the current behavior is a bug, please provide the steps to reproduce.

Expected behavior

I don't think filling the normalization denominator with the mean value when the std is zero is the best choice. At minimum, we could apply torch.abs to the mean values to keep the loss components positive, but maybe dropping the normalization entirely (using 1) for those features is better. I'm not sure; it's debatable.
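Continuing the sketch above, the two options mentioned would amount to something like this (an illustration, not the maintainers' final fix):

```python
# Option 1: keep the normalization scale positive for zero-variance features
batch_stds[batch_stds == 0] = torch.abs(batch_means[batch_stds == 0])

# Option 2: skip the normalization entirely for zero-variance features
batch_stds[batch_stds == 0] = 1.0
```

Either way the loss terms stay non-negative, so a lower loss again corresponds to a better reconstruction.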

Screenshots

(screenshot attached in the original issue)

Other relevant information:
poetry version:
python version:
Operating System:
Additional tools:

Additional context

Optimox commented 1 year ago

Why would you keep a feature with only one modality (a constant column) in your training data?
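If the root cause is simply a constant column slipping into the training set, one generic way to drop such columns beforehand (a pandas sketch with a hypothetical file name, unrelated to this repo's API):

```python
import pandas as pd

df = pd.read_csv("train.csv")  # hypothetical input
constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
df = df.drop(columns=constant_cols)  # remove features with a single modality
```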