Questions about model training to replicate results

BorgDiven commented 9 months ago

Hello,

I am working with your paper and trying to reproduce the results. I have trained the model as described, but I am unable to achieve the same test metrics reported in your paper when training the model weights myself.

I am reaching out to ask for any additional details on how you trained your models to achieve your reported results. Are there other training parameters or configuration details that may explain the difference in results?

Any insights you can provide around the model training process would be greatly appreciated. Please let me know if you have any other suggestions on what might explain or close the gap between my training results and what is reported. I look forward to hearing your thoughts!

Best regards

xiaoye-hhh commented 9 months ago

Thank you for your attention. Network performance can fluctuate somewhat between runs. I retrained the model today. You can download the log file for comparison and inspection: https://pan.baidu.com/s/1UK3tVK4rXneIhuVMjXtH2A?pwd=icei (extraction code: icei)

yuanc3 commented 7 months ago

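Hello, I compared the results: my training result is about 5 points lower than the paper. Looking at the log from your network-drive link above, every epoch has 150 iterations. However, after the dataset processing in the code, len(trainloader) is 138, while len(engine.state.dataloader) is 292. I am not sure where the problem is.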
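A minimal sketch for lining up the three counts quoted above. The helper names `batches_per_epoch` and `find_disagreements` are hypothetical and not part of the SAAI code. The formula covers only a plain map-style loader with a fixed batch size; a custom sampler would define its own length, and this thread does not say which case applies here.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Batches per epoch for a plain loader with a fixed batch size.

    Assumption: no custom sampler or batch sampler is involved.
    """
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def find_disagreements(counts):
    """Return {name: difference from the first entry} for every count that differs.

    `counts` maps a label to an iterations-per-epoch value; the first
    entry is treated as the reference.
    """
    reference = next(iter(counts.values()))
    return {
        name: value - reference
        for name, value in counts.items()
        if value != reference
    }


# The three values reported in this thread.
observed = {
    "iterations per epoch in the shared log": 150,
    "len(trainloader)": 138,
    "len(engine.state.dataloader)": 292,
}
print(find_disagreements(observed))
```

Printing the same three values at the start of training in both environments would show which of them changes between setups.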

xiaoye-hhh commented 7 months ago

Did you make any modifications?

yuanc3 commented 7 months ago

No modifications. It is the version currently on GitHub.

xiaoye-hhh commented 7 months ago

You can send me your logs at 1318137184@qq.com and I will take a look. For now, I do not know what the problem is either.

xiaoye-hhh commented 7 months ago

If you run into the same problem, check your environment: pytorch-ignite==0.1.2
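A minimal sketch of one way to check that pin from Python, using only the standard library (Python 3.8+). The helper name `check_pin` is hypothetical. It reads the installed distribution metadata and does not import the package.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, expected):
    """Return (matches, installed_version).

    installed_version is None when the package is not installed.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False, None
    return installed == expected, installed


matches, installed = check_pin("pytorch-ignite", "0.1.2")
if installed is None:
    print("pytorch-ignite is not installed")
elif matches:
    print("pytorch-ignite matches the pinned version 0.1.2")
else:
    print(f"pytorch-ignite {installed} is installed, expected 0.1.2")
```

If the versions differ, `pip install pytorch-ignite==0.1.2` in a fresh environment would be the usual way to match the pin.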

xiaoye-hhh / SAAI, issue #6