[Closed] TornadoXuRocket closed this issue 1 year ago.
Hi Jing, I can't give any suggestions directly from the information you pasted. I also noticed that you are not using the complete code we provided, so please share more details about your reimplementation (learning rate, optimizer type, etc.). I am happy to help.
Thank you very much for your response. I used your complete code and ran the script run_nyt.sh exactly as you provided it; my only change was to disable the wandb service. Could you give me some suggestions, please? Thanks again.
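For reference, the usual ways to disable wandb with the Hugging Face Trainer look roughly like this (a minimal sketch; the exact mechanism your scripts read is my assumption):

```python
import os

from transformers import TrainingArguments

# Option 1: environment variable checked by the transformers wandb
# integration (it can also be exported in the shell before run_nyt.sh).
os.environ["WANDB_DISABLED"] = "true"

# Option 2: ask the Trainer not to report to any experiment tracker.
args = TrainingArguments(
    output_dir="./outputs",  # hypothetical path
    report_to=[],            # no wandb/tensorboard reporting
)
```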
Sorry, can you give me more details? I cannot tell what caused this from the information you provided, but judging from the results, especially the loss, it looks like the model has not converged well.
Sorry again. I didn't notice that the learning rate was shown in your screenshot, and I forgot that the runtime summary is the output of the Transformers Trainer (I had thought it was generated by you).
The learning rate in your screenshot does not look correct; please double-check it.
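If it helps, a small callback can print the learning rate the Trainer is actually using; a rough sketch (the trainer object built by our script is assumed here):

```python
from transformers import TrainerCallback

class LrLogger(TrainerCallback):
    """Print the learning rate the Trainer reports at every logging step."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs and "learning_rate" in logs:
            print(f"step {state.global_step}: lr = {logs['learning_rate']:.3e}")

# Hypothetical usage with the trainer created by the repo's script:
# trainer.add_callback(LrLogger())
```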
Thank you very much. I will check them carefully. Best regards.
I ran the script run_webnlg.sh and the performance is also much worse than the results reported in the paper. I have carefully checked the learning rate and the training losses. [Screenshots of the training logs at the beginning of training, around epoch 20, and at the end of training were attached here.] By the end of training, the learning rate has decayed to a very small value under the learning-rate schedule. May I ask whether this training process is normal? Thank you again.
This is strange. I tried bash run_nyt.sh and got a fine result at 18 epochs; maybe there is something wrong with your experimental setup.
Hi, the training procedure looks fine. Did you use the dataset downloaded directly from TPLinker?
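Also, the learning rate shrinking to a very small value at the end of training is expected when a linear decay schedule is used (which I assume here, as it is the Trainer's common default); a minimal illustration:

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Toy parameter/optimizer purely to illustrate the schedule; the real
# ones come from the training script. 3e-5 is an assumed value.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=3e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=1000
)

for _ in range(1000):
    optimizer.step()
    scheduler.step()

# With a linear schedule the learning rate reaches (almost) zero at the
# final step, so tiny values near the end of training are normal.
print(scheduler.get_last_lr())  # -> [0.0]
```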
Thank you. I will check my experimental setup again.
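To make the comparison concrete, a quick sketch for capturing the environment (package versions are a common culprit in reproduction gaps; the repo's requirements file remains the authoritative reference):

```python
import torch
import transformers

# Record the environment so it can be compared with the authors' setup.
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```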
Yes, I followed TPLinker to download and preprocess the dataset provided by CasRel. Maybe something went wrong in my data preprocessing. Would it be convenient for you to send me your preprocessed dataset by email? Thank you very much.
Of course. Here is the link. The uploaded data is taken directly from TPLinker's preprocessed data.
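To confirm the download is intact, a rough count over the JSON files can help; a sketch assuming TPLinker's preprocessed format (field names and the filename are assumptions and may differ):

```python
import json

# "text" and "relation_list" follow TPLinker's preprocessed format;
# adjust if the downloaded files differ. The path is hypothetical.
with open("webnlg/train_data.json") as f:
    data = json.load(f)

print("samples:", len(data))
print("triples:", sum(len(s.get("relation_list", [])) for s in data))
print("first text:", data[0]["text"][:80])
```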
Thank you again. Best regards!
Hello, thank you very much for releasing the code. I used your code to train and test on nyt_star. Why is the performance much worse than that reported in your paper?