Pre-training question - Githubissues

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

MIT License

Pre-training question #780

Open LLLiHaotian opened 5 months ago

LLLiHaotian commented 5 months ago

If I want to pre-train bart or t5 series models with RetroMAE, how should I go about it?

bart-base-chinese-cluecorpussmall-retromae_batch256_max350.log

staoxiao commented 5 months ago

Currently, this script doesn't support encoder-decoder architecture.

LLLiHaotian commented 5 months ago

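OK, thank you. One more question: the report during pre-training looks like this. In your pre-training experiments, how did you decide when to stop? Was it based only on changes in the loss curve?

{'loss': 2.8222, 'learning_rate': 1.1313075087080098e-05, 'step': 103000, 'epoch': 1.3}

I also noticed that the loss occasionally goes up during training (not noticeably, only by a very small amount). Did you run into this during your pre-training, and if so, how did you decide when to stop?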
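One possible way to make the "watch the loss curve" check less sensitive to small upward blips is to smooth the logged loss and stop once the smoothed value is no longer improving. The sketch below is only an illustration, not the maintainers' method: it assumes the log contains dict-style lines like the one quoted above, and the helper names (`parse_log_lines`, `ema`, `has_plateaued`) as well as the `window`, `alpha`, and `min_rel_improvement` values are hypothetical choices that would need tuning.

```python
import ast


def parse_log_lines(lines):
    """Collect dict records like {'loss': ..., 'step': ...} found in log lines."""
    records = []
    for line in lines:
        start, end = line.find("{"), line.rfind("}")
        if start == -1 or end <= start:
            continue  # no dict-like span on this line
        try:
            rec = ast.literal_eval(line[start:end + 1])
        except (ValueError, SyntaxError):
            continue  # not a Python literal, skip it
        if isinstance(rec, dict) and "loss" in rec and "step" in rec:
            records.append(rec)
    return records


def ema(values, alpha=0.1):
    """Exponential moving average; damps small step-to-step fluctuations."""
    smoothed, avg = [], None
    for v in values:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed


def has_plateaued(losses, window=50, min_rel_improvement=0.001, alpha=0.1):
    """True if the smoothed loss improved by less than `min_rel_improvement`
    (relative) over the last `window` logged points. Thresholds are assumptions."""
    smoothed = ema(losses, alpha)
    if len(smoothed) <= window:
        return False  # not enough history to judge
    old, new = smoothed[-window - 1], smoothed[-1]
    if old <= 0:
        return False  # relative change is undefined; do not claim a plateau
    return (old - new) / old < min_rel_improvement
```

For example, `has_plateaued([r["loss"] for r in parse_log_lines(open(path))])` could be run periodically on the training log, where `path` is whatever log file the run writes. A flat smoothed loss is only a heuristic signal; evaluating checkpoints on a downstream retrieval task is a separate check that this sketch does not cover.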