shtoshni / fast-coref

Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award)

Reproducing results with pre-trained models from google drive #9

Closed QipengGuo closed 2 years ago

QipengGuo commented 2 years ago

Thanks for your great work. I have a few questions about reproducing the results. I followed the steps in "Install Requirements" and downloaded your pre-trained models and processed data from your Google Drive. The code ran smoothly, but the pre-trained models behave somewhat strangely, and I hope you can help.

I tested the pre-trained models downloaded from your Google Drive on OntoNotes and LitBank, but the latter gives strange results.

My run command is

python main.py experiment=litbank paths.model_dir=../models/onto_best/  model/doc_encoder/transformer=longformer_ontonotes override_encoder=True train=False

The F1 score (58.7) is pretty low, but the more interesting thing is that the Oracle F-score is 0.825. If I understand correctly, this is the upper bound on the F1 score given the mention detection results. I wonder whether something changed in the Hugging Face models, but that is hard to track.
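For what it's worth, my mental model of that oracle number is the sketch below (my own illustration, not the code in this repo, and the function name is made up): every predicted mention that matches a gold mention is placed into its gold cluster, spurious mentions are dropped, and the resulting clusters are scored against the gold clusters, which gives the best score any clustering of the detected mentions could reach.

def oracle_clusters(predicted_mentions, gold_clusters):
    # predicted_mentions: list of (start, end) spans from the mention detector
    # gold_clusters: list of gold clusters, each a list of (start, end) spans
    span_to_gold = {span: idx
                    for idx, cluster in enumerate(gold_clusters)
                    for span in cluster}
    oracle = {}
    for span in predicted_mentions:
        if span in span_to_gold:  # correctly detected mention
            oracle.setdefault(span_to_gold[span], []).append(span)
    return list(oracle.values())  # score these against gold_clusters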

By the way, I also tried the command without override_encoder, but the result is below 10 points.

[2022-05-10 02:59:40,882][HYDRA] Test
[2022-05-10 02:59:40,882][HYDRA] Dataset: LitBank

[2022-05-10 02:59:40,882][HYDRA] Dataset: litbank, Cluster Threshold: 1
[2022-05-10 02:59:40,882][HYDRA] Evaluating on 10 examples
[2022-05-10 02:59:56,783][HYDRA] F-score: 58.7 , MUC: 80.7, Bcub: 60.8, CEAFE: 34.6
[2022-05-10 02:59:56,785][HYDRA] Oracle F-score: 0.825
[2022-05-10 02:59:56,785][HYDRA] /home/ec2-user/git/incremental-coref/fast-coref/models/onto_best/litbank/test.log.jsonl
[2022-05-10 02:59:56,785][HYDRA] Inference time: 15.83
[2022-05-10 02:59:56,785][HYDRA] Max inference memory: 3.5 GB
[2022-05-10 02:59:56,786][HYDRA] Final performance summary at /home/ec2-user/git/incremental-coref/fast-coref/models/onto_best/litbank/perf.json
[2022-05-10 02:59:56,786][HYDRA] Performance summary file: /home/ec2-user/git/incremental-coref/fast-coref/models/onto_best/perf.json
shtoshni commented 2 years ago

The command looks right to me, but the OntoNotes model is not expected to work well on LitBank data (see our CRAC paper). Change experiment=litbank to experiment=ontonotes.
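That is, keeping the rest of your command unchanged, something along the lines of:

python main.py experiment=ontonotes paths.model_dir=../models/onto_best/ model/doc_encoder/transformer=longformer_ontonotes override_encoder=True train=False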

If you want to experiment with LitBank, I would recommend using model/doc_encoder/transformer=longformer_joint and downloading the joint model parameters from Google Drive. The joint model is trained on OntoNotes, LitBank, and PreCo, and performs quite well on LitBank.
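For example, something like the following should work (the model directory here is a placeholder; point it at wherever you unpack the downloaded joint checkpoint):

python main.py experiment=litbank paths.model_dir=<path-to-joint-model> model/doc_encoder/transformer=longformer_joint train=False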

Note that I have updated the codebase a bit, and the new models are slightly better than the results reported earlier. I plan to release them by this weekend.

QipengGuo commented 2 years ago

Thanks a lot, it works now. I would recommend adding more example commands to the README :)