Reproducing code indicator problems

Dear Author,

First of all, thank you for providing a new branch for training the model. I evaluated the trained model you provided and got results consistent with the paper. However, the provided model appears to be named CEM rather than IAMM, the model in the paper. Is this a mistake?

I then trained the model myself on the dataset in the ED file you provided, keeping the parameters the same, and the cross-validation results looked normal. In the testing and evaluation stage, however, using the dataset you provided, the ACC and DIST metrics differed considerably from those in the paper: PPL was 35.69, ACC was only 51.28, DIST-1 was 1.03, and DIST-2 was 3.39. Is there a problem with the dataset used for testing and evaluation? Could you tell me what might cause this and give me some suggestions?

I look forward to your reply and hope to have a chance to learn more from your work.
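
For reference, the DIST-1 and DIST-2 figures quoted above are usually computed as the ratio of unique n-grams to total n-grams over all generated responses. The snippet below is a minimal sketch of that common corpus-level definition, assuming scores are reported as percentages and responses are already tokenised. The function name `distinct_n` is illustrative only; the IAMM repository may tokenise or aggregate differently, which would change the numbers.

```python
from typing import Iterable, List


def distinct_n(responses: Iterable[List[str]], n: int) -> float:
    """Corpus-level Distinct-n: unique n-grams / total n-grams, in percent.

    Assumption: this is the common definition, not necessarily the exact
    one used by the IAMM evaluation script.
    """
    unique = set()
    total = 0
    for tokens in responses:
        # Slide a window of length n over each tokenised response.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return 100.0 * len(unique) / total if total else 0.0


if __name__ == "__main__":
    sample = [["i", "am", "sorry"], ["i", "am", "happy"]]
    print(distinct_n(sample, 1), distinct_n(sample, 2))
```

Running the same function over the outputs of the provided model and the retrained model would show whether the gap comes from the generated text itself or from the evaluation step.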

Thank you for your question.

(1) The 'main' branch contains the latest code. Due to code changes, the 'main' branch cannot load older trained models. To help reproduce the model, we created an 'IAMM_bak' branch. This branch is for reference only; we recommend using the 'main' branch.

(2) Given the large scale of this dataset, neither previous models nor IAMM used cross-validation. Data splitting in cross-validation can affect performance, leading to differences. Additionally, we didn't use early stopping. Based on experience, we set the number of iterations to 14,400. You can save checkpoints around this iteration and use them to load the model.

Please feel free to ask if you have any questions. Thank you.
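
The suggestion to save checkpoints around iteration 14,400 could be sketched as follows. This is an illustration only, not the repository's training loop: the names `checkpoint_iterations` and `train`, the window of 1,000 iterations, and the spacing of 200 are all assumptions, and the actual training step and save call are passed in as callbacks because the real ones are not shown in this thread.

```python
from typing import Callable, List

TARGET_ITERATION = 14_400  # iteration count quoted in the reply


def checkpoint_iterations(target: int = TARGET_ITERATION,
                          window: int = 1_000,   # assumed, not from the repo
                          every: int = 200) -> List[int]:  # assumed spacing
    """Iterations within [target - window, target + window] to save at."""
    start = max(every, target - window)
    # Round up to the first multiple of `every` at or after `start`.
    first = ((start + every - 1) // every) * every
    return list(range(first, target + window + 1, every))


def train(total_iterations: int,
          step: Callable[[int], None],
          save: Callable[[int], None],
          save_at: List[int]) -> None:
    """Run `step` each iteration and call `save` at the chosen iterations."""
    wanted = set(save_at)
    for iteration in range(1, total_iterations + 1):
        step(iteration)
        if iteration in wanted:
            save(iteration)


if __name__ == "__main__":
    points = checkpoint_iterations()
    print(points[0], points[-1], len(points))
```

Keeping several checkpoints on either side of the target makes it possible to evaluate each one on the test set and compare the metrics against the paper, since no early stopping is used to pick a checkpoint automatically.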

zhouzhouyang520 / IAMM
