wzhouad / ATLOP

Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

Can you please release the trained model? #10

Closed · nguyenhuuthuat09 closed this issue 3 years ago

nguyenhuuthuat09 commented 3 years ago

Hi. Thank you for releasing the code for your model; it is really helpful.

However, I tried to retrain ATLOP with the bert-base-cased model on the DocRED dataset, and I can't reach the results reported in your paper. I also can't retrain the roberta-large model because I don't have a strong enough GPU (the best GPU on Google Colab is a V100). Could you please release your trained models? I would be very happy if you did, and I believe it would help many other people too.

Thank you so much.

nguyenhuuthuat09 commented 3 years ago

By the way, when I retrained the bert-base-cased model, the F1 and Ign F1 scores on the dev set were 60.934 and 59.023, respectively. But then I changed the line below:

https://github.com/wzhouad/ATLOP/blob/9c6f7585042e689e7e6b2293e065f89dc52176f8/train.py#L220

to

dev_score, dev_output = evaluate(args, model, test_features, tag="test")

to evaluate the saved model on the test set. The result I got on test is really low: {'test_F1': 0.10317699153088863, 'test_F1_ign': 0.10317699148653238} (the two numbers are almost identical).

Did I make a mistake by changing the code like that? And do you know why this happens?

Thank you so much.
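
For context on why the number collapses: the public DocRED test file ships without gold relation labels, so any local scorer is effectively matching predictions against an empty (or placeholder) gold set. Below is a minimal sketch of the effect, using hypothetical names and a simplified DocRED-style F1 over (title, h_idx, t_idx, r) tuples; it is not the repo's actual evaluation code.

```python
# Sketch only: why the score collapses when gold labels are missing.
# Hypothetical, simplified DocRED-style F1 over (title, h_idx, t_idx, r) tuples.

def f1_against_gold(predictions, gold):
    """predictions, gold: sets of (title, h_idx, t_idx, relation) tuples."""
    tp = len(predictions & gold)
    precision = tp / len(predictions) if predictions else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The released DocRED test split has no gold annotations, so every
# prediction counts as a false positive and F1 drops to ~0:
preds = {("Doc A", 0, 1, "P131"), ("Doc A", 2, 0, "P17")}
print(f1_against_gold(preds, gold=set()))  # 0.0 -- not a real test score
```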

wzhouad commented 3 years ago

I've released the trained models.

60.934 is a normal result. I report the bert-base result as 61.09 ± 0.16, so 60.934 is about one standard deviation below the mean ((61.09 − 60.934) / 0.16 ≈ 0.98).

The only way to get test results is to upload result.json to CodaLab. The public DocRED test data does not include the ground-truth labels, so evaluating on it locally produces meaningless scores.
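
For later readers, the submission flow is roughly as follows. This is a hedged sketch, not the repo's exact code: it assumes predictions have already been converted to the official DocRED submission format, a JSON list of {"title", "h_idx", "t_idx", "r"} records (the conversion this repo performs before writing result.json), which is what the CodaLab server expects.

```python
# Hedged sketch of writing result.json for the DocRED CodaLab server.
# Assumes predictions are already in the official submission format:
# one record per predicted relation, with "r" a Wikidata property ID.
import json

predictions = [
    {"title": "Some Document Title", "h_idx": 0, "t_idx": 2, "r": "P17"},
]

with open("result.json", "w") as fh:
    json.dump(predictions, fh)

# Upload result.json to the DocRED CodaLab competition (check the
# competition page for the exact packaging); the server holds the
# hidden test labels and reports test F1 / Ign F1.
```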

nguyenhuuthuat09 commented 3 years ago

Thank you so much for releasing the trained models! I am truly grateful for that. I was able to get the model's correct results on the test set from CodaLab (note that it is CodaLab, not Google Colab; I think the readme.md has that small typo). I will close this issue.

Once again, thank you so much!