Closed zhiweihu1103 closed 1 year ago
As you can see on Wandb, no changes were made to the model or data-processing code. We have verified that the pre-trained model, environment configuration, code, and data are all consistent. I don't think the CUDA version and NVIDIA driver version should have such a significant impact.
This is a very vexing question because I can't think of any other reason that could cause this problem.
My thought is, perhaps you can make some adjustments to the parameters and then observe if there's a decrease in loss within a few iterations (compared to your previous loss). Maybe consider replacing the optimizer with stochastic gradient descent? I'm not sure.
I'll try adjusting the parameters tomorrow and keep you posted.
I discovered that the folder for the mention images should be named mention_images instead of mention_image. With the wrong name, no image data is loaded during training, which could also be why you were unable to reproduce the original results.
You need to manually modify the following line in the config file for RichpediaMEL.
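Separately from the config change itself, a quick sanity check before training can catch this kind of silent failure. The snippet below is only a minimal sketch and not part of the repository: the helper name check_image_dir and the list of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_image_dir(path):
    """Return the number of image files under `path`.

    Raises FileNotFoundError if the folder does not exist (for example,
    a config pointing at mention_image when the folder on disk is
    mention_images) and ValueError if it contains no image files.
    """
    folder = Path(path)
    if not folder.is_dir():
        raise FileNotFoundError(f"image folder not found: {folder}")
    count = sum(
        1
        for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if count == 0:
        raise ValueError(f"no image files found in: {folder}")
    return count
```

Calling check_image_dir on the mention image path taken from the config (wherever that path is in your setup) before starting a run should fail loudly if the folder name is wrong, instead of training without image data.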
That is indeed the cause, and a very subtle problem. Thank you very much. I will re-run the code and report the final result.
As a reminder, would you mind uploading the numerical results of Figure 4?
I have just updated the detailed results in the README file.
Great, Thx.
Hi, Pengfei. I have reproduced the results, thanks for your solution. Good luck. I will close the issue.
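I can't seem to reproduce the results on WikiMEL. The hyperparameter file I'm using is the one the author provided on GitHub. Could you tell me whether there was anything you had to watch out for when reproducing the results?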
Hi, Pengfei. Nice work. I find that I cannot reproduce the results on the RichpediaMEL dataset. I am using the same yaml that you provided. Can you help me? The training logs are attached: richpediamel.txt