qhfan / RMT

(CVPR2024)RMT: Retentive Networks Meet Vision Transformer

Results cannot be reproduced #25

Closed LMMMEng closed 4 months ago

LMMMEng commented 4 months ago

Thank you for your excellent work.

RMT has demonstrated remarkable performance on ImageNet-1K. For instance, the RMT-S model achieves an outstanding top-1 accuracy of 84.1% with only 27M parameters and 4.5G FLOPs. In my experience, attaining such high performance at this level of complexity is exceedingly challenging. However, when reproducing the results with the provided code, I observed a considerable disparity: the reproduced RMT-S model reached a top-1 accuracy of only 83.2%, which differs significantly from the results reported in the paper. Could you kindly provide a detailed training log and the pre-trained model as a reference? Sharing comprehensive open-source information would greatly benefit the community, especially considering the exceptional results reported in your paper.

qhfan commented 4 months ago

Thank you for your interest in our work. I have just uploaded the complete training logs; you can refer to the test_ema_acc1 column for our experimental results, except for RMT-T, whose best results are in the test_acc1 column. Regarding the replication issue, since your results differ so significantly from ours, I am inclined to believe there may be a problem with the ImageNet dataset you downloaded. For instance, when one of my junior colleagues tried to replicate the results, he used a corrupted dataset, which led to poor results. As for the training weights, we are currently developing RMT++ based on the original trained models, and we will open-source the RMT weights together when it is ready.

LMMMEng commented 4 months ago

Thank you so much for your prompt response. I'm certain that the dataset is not corrupted, as some classical models like Swin and ConvNeXt can be successfully reproduced. I will investigate the issues further.

LMMMEng commented 4 months ago

Hi, upon reviewing the code, training log, and paper, I've identified some discrepancies. For example, the provided RMT_S model has approximately 25.71M parameters, whereas the paper and training log report around 27M. Could you please verify whether the provided code is accurate?
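For reference, this is how I check parameter counts. The helper below is a generic PyTorch sketch (not the authors' script); the toy `nn.Sequential` stands in for a model built from the repo's code:

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Return the number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy stand-in; in practice, instantiate RMT_S from the repo's model file.
toy = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))
print(f"{count_params(toy) / 1e6:.2f}M")  # RMT_S prints ~25.71M here
```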

[screenshot: parameter count of the provided RMT_S model]

Snowwwwwwwwwww commented 4 months ago

You may look at the author's other projects, but none of them provide complete details, such as pre-trained weights. Despite numerous GitHub issues, the authors have chosen to disregard them😅. This raises doubts about whether the authors actually conducted the experiments for their published works🤨.

Authors should prioritize taking responsibility for their open-source projects, particularly in the context of papers published in prestigious conferences. @qhfan

qhfan commented 4 months ago

Thank you for your attention to our work. Since the code and weights were organized by someone else, some discrepancies were inevitable. I have updated the repository with the original code and weights. Furthermore, I do not believe it is polite to casually question someone else's work, especially when we have already made the training logs publicly available. @Snowwwwwwwwwww

LMMMEng commented 4 months ago

Thanks a lot! @qhfan

YuHengsss commented 3 months ago

Thanks a lot! @qhfan

May I ask whether you have been able to reproduce the reported accuracy?

LMMMEng commented 2 months ago

@YuHengsss Apologies for the late reply. I've stopped trying to replicate the results.