PlusLabNLP / DEGREE

Code for our NAACL-2022 paper DEGREE: A Data-Efficient Generation-Based Event Extraction Model.
Apache License 2.0
72 stars 12 forks

Unable to reproduce? #5

Closed OPilgrim closed 1 year ago

OPilgrim commented 1 year ago

Dear author, I used the DEGREE_eae_ace05e.mdl you provided and ran the script `degree/eval_pipelineEE.py -ceae config/config_degree_eae_ace05e.json -eae [eae_model] -g` to evaluate on ace05e, and the result is below. Why is it so low? Did I make a mistake? (screenshot of results attached)

ihungalexhsu commented 1 year ago

Can you please check whether your environment setting is the same as we stated and whether the data is generated correctly?

I just re-evaluated and got this:

Test: 100%|████████████████████████████████| 70/70 [01:13<00:00,  1.06s/it]
---------------------------------------------------------------------
Trigger I  - P: 100.00 ( 403/ 403), R: 100.00 ( 403/ 403), F: 100.00
Trigger C  - P: 100.00 ( 403/ 403), R: 100.00 ( 403/ 403), F: 100.00
---------------------------------------------------------------------
Role I     - P:  75.93 ( 429/ 565), R:  76.47 ( 429/ 561), F:  76.20
Role C     - P:  73.63 ( 416/ 565), R:  74.02 ( 416/ 562), F:  73.82
---------------------------------------------------------------------

It seems that both the gold trigger instance count (424) and the gold argument instance count (671) in your screenshot are different from mine.

OPilgrim commented 1 year ago

Sorry, the picture above is for ace05ep; this picture is for ace05e. (screenshot of results attached)

And this is my environment:

Package                 Version
----------------------- -----------
absl-py                 1.2.0
asttokens               2.0.8
backcall                0.2.0
beautifulsoup4          4.9.3
bs4                     0.0.1
cachetools              5.2.0
certifi                 2022.9.24
charset-normalizer      2.1.1
click                   8.1.3
decorator               5.1.1
executing               1.1.1
filelock                3.8.0
google-auth             2.12.0
google-auth-oauthlib    0.4.6
grpcio                  1.49.1
idna                    3.4
importlib-metadata      5.0.0
ipdb                    0.13.9
ipython                 8.5.0
jedi                    0.18.1
joblib                  1.2.0
lxml                    4.6.3
Markdown                3.4.1
MarkupSafe              2.1.1
matplotlib-inline       0.1.6
numpy                   1.23.3
oauthlib                3.2.1
packaging               21.3
parso                   0.8.3
pexpect                 4.8.0
pickleshare             0.7.5
Pillow                  9.2.0
pip                     22.2.2
prompt-toolkit          3.0.31
protobuf                3.19.6
ptyprocess              0.7.0
pure-eval               0.2.2
pyasn1                  0.4.8
pyasn1-modules          0.2.8
Pygments                2.13.0
pyparsing               3.0.9
regex                   2022.9.13
requests                2.28.1
requests-oauthlib       1.3.1
rsa                     4.9
sacremoses              0.0.53
sentencepiece           0.1.95
setuptools              63.4.1
six                     1.16.0
soupsieve               2.3.2.post1
stack-data              0.5.1
stanza                  1.2
tensorboard             2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
tensorboardX            2.4
tokenizers              0.8.1rc2
toml                    0.10.2
torch                   1.8.0+cu111
torchaudio              0.8.0
torchvision             0.9.0+cu111
tqdm                    4.64.1
traitlets               5.4.0
transformers            3.1.0
typing_extensions       4.4.0
urllib3                 1.26.12
wcwidth                 0.2.5
Werkzeug                2.2.2
wheel                   0.37.1
zipp                    3.9.0
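Since the usual culprit in mismatched scores is a version mismatch (notably transformers 3.1.0 and torch 1.8.0+cu111 in the list above), one way to compare an installed environment against the expected versions is a small diff helper. This is a minimal sketch, not part of the DEGREE repo; the function name and the choice of key packages are mine:

```python
import importlib.metadata as md  # Python 3.8+

def env_mismatches(installed, expected):
    """Return (package, installed_version, expected_version) for each package
    whose installed version differs from the expected one (None if missing)."""
    return [(pkg, installed.get(pkg), want)
            for pkg, want in expected.items()
            if installed.get(pkg) != want]

# Key versions reported in this thread, checked against the live environment.
expected = {"torch": "1.8.0+cu111", "transformers": "3.1.0", "tokenizers": "0.8.1rc2"}
installed = {}
for pkg in expected:
    try:
        installed[pkg] = md.version(pkg)
    except md.PackageNotFoundError:
        pass  # leave missing packages out; they show up as None in the report
for pkg, have, want in env_mismatches(installed, expected):
    print(f"{pkg}: installed {have}, expected {want}")
```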
ihungalexhsu commented 1 year ago

Not sure what happens on your end. Here are the results I got from re-inference:

Test: 100%|████████████████████████████████| 34/34 [00:38<00:00,  1.14s/it]
---------------------------------------------------------------------
Trigger I  - P: 100.00 ( 424/ 424), R: 100.00 ( 424/ 424), F: 100.00
Trigger C  - P: 100.00 ( 424/ 424), R: 100.00 ( 424/ 424), F: 100.00
---------------------------------------------------------------------
Role I     - P:  75.04 ( 508/ 677), R:  75.93 ( 508/ 669), F:  75.48
Role C     - P:  72.27 ( 490/ 678), R:  73.03 ( 490/ 671), F:  72.65
---------------------------------------------------------------------
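For reference, the numbers in these tables follow the standard micro-averaged definitions: precision is correct over predicted, recall is correct over gold, and F1 is their harmonic mean. A minimal sketch that reproduces the Role C row above (the function is mine, not from the repo's scorer):

```python
def prf(correct_p, predicted, correct_r, gold):
    """Precision, recall, and F1 from raw counts, as percentages.

    The scorer reports correct matches against predictions and against gold
    separately (the counts can differ), so two 'correct' counts are accepted.
    """
    p = 100.0 * correct_p / predicted
    r = 100.0 * correct_r / gold
    f = 100.0 * (correct_p + correct_r) / (predicted + gold)  # micro F1
    return round(p, 2), round(r, 2), round(f, 2)

# Role C row from the output above: P: 72.27 (490/678), R: 73.03 (490/671)
print(prf(490, 678, 490, 671))  # (72.27, 73.03, 72.65)
```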

I also updated the README a bit; maybe you can reference the environment I put there to check whether there's any issue with your environment installation.

ihungalexhsu commented 1 year ago

@OPilgrim Are you able to reproduce the results now? If yes, I'll close this issue.

OPilgrim commented 1 year ago

OK, thanks very much! I'll try again later.

kk19990709 commented 2 months ago

> Dear author, I used the DEGREE_eae_ace05e.mdl you provided and ran the script `degree/eval_pipelineEE.py -ceae config/config_degree_eae_ace05e.json -eae [eae_model] -g` to evaluate on ace05e, and the result is below. Why is it so low? Did I make a mistake?

Which model did you use, base or large?

simon-p-j-r commented 2 months ago

I solved this problem, but I forgot how I fixed it, lol. Thanks

ihungalexhsu commented 2 months ago

@kk19990709 Can you please double-check your execution environment? Thanks