By the way, I tried to retrain the bert-base-cased model, and the F1 and Ign F1 scores on the dev set are 60.934 and 59.023, respectively. But when I change the line below:
https://github.com/wzhouad/ATLOP/blob/9c6f7585042e689e7e6b2293e065f89dc52176f8/train.py#L220
to
dev_score, dev_output = evaluate(args, model, test_features, tag="test")
to evaluate the saved model on the test set, the result I get on test is really low: {'test_F1': 0.10317699153088863, 'test_F1_ign': 0.10317699148653238} (the two numbers are almost identical).
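For reference, this is the change in context (the original line is reconstructed from the permalink above, so the exact wording in train.py may differ slightly):

```python
# original train.py line (evaluates the saved model on the dev split)
dev_score, dev_output = evaluate(args, model, dev_features, tag="dev")
# my change (points the same call at the test split instead)
dev_score, dev_output = evaluate(args, model, test_features, tag="test")
```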
Did I make a mistake by changing the code like that? And do you know why this happens?
Thank you so much.
I've released the trained models.
60.934 is a normal result. I report the bert-base result as 61.09 +- 0.16, and 60.934 is about one standard deviation below the average.
The only way to get test results is to upload result.json to CodaLab. The downloaded DocRED test data does not provide ground-truth labels, so evaluating against it locally just compares your predictions to an empty label set, which is why the scores come out near zero.
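For anyone who hits the same issue, here is a minimal sketch of how a submission file can be produced from a saved checkpoint. The helpers (evaluate, report) and the args/feature variables are assumed to behave as in the repo's train.py, so treat this as an outline rather than the exact released code:

```python
import json
import torch

# model, args, dev_features and test_features are assumed to be set up
# exactly as in train.py's main(); only the training step is skipped.
model.load_state_dict(torch.load(args.load_path))  # load the saved checkpoint

# Sanity check on the dev split, which still has gold labels locally.
dev_score, dev_output = evaluate(args, model, dev_features, tag="dev")
print(dev_output)

# Generate predictions on the unlabeled test split and dump them in the
# DocRED submission format.
pred = report(args, model, test_features)
with open("result.json", "w") as fh:
    json.dump(pred, fh)
```

The resulting result.json is then submitted to the DocRED CodaLab competition to obtain the official test F1 / Ign F1.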
Thank you so much for releasing the trained models! I am truly grateful for that. I was able to get the correct results for the model on the test set from CodaLab (by the way, I think CodaLab is misspelled as Colab in the readme.md). I will close this issue.
Once again, thank you so much!
Hi. Thank you for releasing the code for your model; it is really helpful.
However, I tried to retrain ATLOP with the bert-base-cased model on the DocRED dataset, but I cannot reproduce the results reported in the paper. I also cannot retrain the roberta-large model because I do not have a powerful enough GPU (the strongest GPU on Google Colab is a V100). So could you please release your trained models? I would be very happy if you could, and I believe it would help many other people as well.
Thank you so much.