GCYZSL / MoLA


Why was load_balancing_loss_func changed from the original version to the current layer-wise accumulated computation? #6

Closed 2018211801 closed 6 months ago

2018211801 commented 6 months ago

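Hello, while converting the code to distributed data-parallel training, I hit an error in the aux_loss computation: the tensor dimensions do not match. If you could give any advice, I would greatly appreciate it!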

GCYZSL commented 6 months ago

Hi, could you provide more details about the error, e.g., the error message, your specific modifications, and the current and previous dimensions? That would help me understand the problem and give better suggestions. Thanks!

You can also fall back to the original load_balancing_loss_func. I modified it because the balance loss written by Hugging Face did not seem to match the original paper, so it may have a problem.
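For readers following the thread, here is a minimal sketch of what a layer-wise accumulated load-balancing loss can look like. It is not the MoLA code or the Hugging Face code. It assumes a Switch Transformer style auxiliary loss (number of experts times the sum over experts of f_i * P_i) computed separately for each layer and then summed. The names `layer_balance_loss` and `layerwise_load_balancing_loss`, the nested-list input layout, and the choice to normalize the dispatch fraction by `num_tokens * top_k` are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one token's router logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_balance_loss(logits, top_k):
    """Auxiliary loss for ONE layer (hypothetical helper).

    logits: nested list of shape [num_tokens][num_experts].
    """
    num_tokens = len(logits)
    num_experts = len(logits[0])
    probs = [softmax(row) for row in logits]

    # f_i: fraction of routing slots dispatched to expert i (top-k routing).
    counts = [0] * num_experts
    for p in probs:
        ranked = sorted(range(num_experts), key=lambda i: p[i], reverse=True)
        for i in ranked[:top_k]:
            counts[i] += 1
    f = [c / (num_tokens * top_k) for c in counts]

    # P_i: mean router probability assigned to expert i.
    P = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]

    return num_experts * sum(fi * Pi for fi, Pi in zip(f, P))


def layerwise_load_balancing_loss(all_layer_logits, top_k=2):
    """Sum the per-layer losses; each layer may have its own expert count."""
    return sum(layer_balance_loss(layer, top_k) for layer in all_layer_logits)


if __name__ == "__main__":
    layer_a = [[0.0] * 4 for _ in range(4)]  # 4 tokens, 4 experts
    layer_b = [[0.0] * 8 for _ in range(3)]  # 3 tokens, 8 experts
    print(layerwise_load_balancing_loss([layer_a, layer_b], top_k=2))
```

Because each layer is reduced to a scalar before accumulation, this form does not require every layer to share the same token count or number of experts. When debugging a dimension mismatch, printing the shape of each layer's router logits before the loss is computed is a reasonable first check.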