PaddlePaddle / RocketQA

🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
Apache License 2.0
764 stars 130 forks

Training the dual network: why does the loss keep increasing? #101

Open archwolf118 opened 1 year ago

archwolf118 commented 1 year ago

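Hello! Thank you for your work on RocketQA. I have successfully integrated RocketQA into our system. To improve RocketQA's recall, I tried fine-tuning zh_dureader_de_v2 on the training set that ships with the repo. Here is the training code: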

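import rocketqa
# Load the pretrained zh_dureader_de_v2 dual encoder on GPU 0
dual_encoder = rocketqa.load_model(model="zh_dureader_de_v2", use_cuda=True, device_id=0, batch_size=4)
# Fine-tune for 20 epochs, saving a checkpoint to 'de_models' every 500 steps
dual_encoder.train('./examples/data/dual.train.tsv', 20, 'de_models', save_steps=500, learning_rate=1e-5, log_folder='log_de')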

I found that the loss keeps increasing over the course of training, as shown in the figure below, and I don't know why. [image] Thanks!
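With a batch size of 4, per-step loss values can fluctuate a lot, so it may help to check whether the curve is really trending upward before looking for a cause. The following is a minimal sketch, not part of RocketQA: it assumes the per-step loss values have already been collected into a plain Python list by some means (RocketQA's log format is not assumed here), and the helper names `moving_average` and `trend_slope` are hypothetical.

```python
from statistics import mean


def moving_average(values, window):
    """Return the simple moving average of `values` over `window` steps."""
    if window < 1:
        raise ValueError("window must be >= 1")
    # One output per full window; the first window - 1 points are dropped.
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def trend_slope(values):
    """Return the least-squares slope of `values` against their step index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator
```

Calling `trend_slope(moving_average(losses, 50))` on the collected values gives a rough indication: a clearly positive slope suggests the loss is genuinely rising, while a slope near zero suggests the apparent increase may be noise. The window size of 50 is an arbitrary choice for illustration.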

zhangpeng-HEBUT commented 1 year ago

Hello, how many GPUs are you using for training?