Open LZUzgw opened 4 hours ago
I successfully reproduced this code. However, the loss is ‘nan’. What should I do? Is there anyone having the same problem as me?
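The training code is not configured with the author's network architecture. The parameters have been scaled down a lot, mainly to show that the code runs correctly. You need to find a large GPU, simply set those few parameters to the same sizes the author used, and then train.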
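Before scaling the parameters up, it may help to find out at which step the loss first turns into NaN. Below is a minimal sketch in plain Python, not taken from this repository: the helper `first_non_finite` and the example loss values are hypothetical, and it assumes you can collect the per-step loss as ordinary floats.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Hypothetical per-step loss values, for illustration only.
example_losses = [2.31, 1.87, 1.52, float("nan"), float("nan")]
bad_step = first_non_finite(example_losses)
print(bad_step)
```

If the loss is already non-finite at the very first step, the cause is more likely the inputs or initialization than the training length, though that is only a rule of thumb.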
thanks