about loss

sangmichaelxie / doremi

PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets

https://arxiv.org/abs/2305.10429

MIT License

286 stars, 32 forks

Closed by ywb2018 1 year ago

ywb2018 commented 1 year ago

Help, please.
1. Why does the excess loss in the code not follow the paper, which uses max(excess loss, 0)?

sangmichaelxie commented 1 year ago