zhuzzjlu closed this issue 1 year ago
Yeah, this is indeed an issue, and I feel that cross-entropy loss is not a good choice. For the classification loss, you can try optimizing the NLL of the goal point's mixture of Laplace (detaching the gradient of y_hat so that only the classification scores are optimized). For the regression part, you can still use the winner-take-all Laplace NLL loss. I found that this improves the Brier score a lot. Ultimately, though, ensembling is the best way to improve the classification score; some top-ranking methods like WayFormer even ensemble 15 models. However, ensembling is impractical on a real autonomous car, and it also makes comparisons unfair.
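A minimal PyTorch sketch of what I mean, assuming K predicted goal modes with Laplace locations `y_hat`, scales `b`, and mixture logits `pi_logits` (all tensor names and shapes here are illustrative, not from the repo's actual code):

```python
import torch
import torch.nn.functional as F


def laplace_log_prob(y, mu, b):
    # Log density of a factorized Laplace distribution, summed over coordinates.
    return (-torch.log(2 * b) - (y - mu).abs() / b).sum(-1)


def goal_losses(y_hat, b, pi_logits, y):
    """
    y_hat:     [K, 2] predicted goal points (K modes)
    b:         [K, 2] predicted Laplace scales
    pi_logits: [K]    mixture logits (classification scores)
    y:         [2]    ground-truth goal point
    """
    # Classification: NLL of the mixture of Laplace, with the regression
    # outputs detached so that only the mixture scores receive gradient.
    log_prob = laplace_log_prob(y, y_hat.detach(), b.detach())  # [K]
    cls_loss = -torch.logsumexp(F.log_softmax(pi_logits, dim=-1) + log_prob, dim=-1)

    # Regression: winner-take-all Laplace NLL on the mode closest to the GT.
    with torch.no_grad():
        best = (y_hat - y).norm(dim=-1).argmin()
    reg_loss = -laplace_log_prob(y, y_hat[best], b[best])
    return cls_loss, reg_loss
```

Because `y_hat` and `b` are detached inside the classification term, `cls_loss.backward()` pushes gradient only into the mixture logits, which is the point of using this loss instead of cross-entropy on a hard target.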
Thanks for your code! I noticed that you did not log the classification loss. After adding logging for it, I noticed that the classification loss does not converge, which leads to poor top-1 performance. Do you have any suggestions on the non-convergence of the classification loss? Looking forward to your reply~