Open songwang41 opened 4 years ago
Are you sure models/longformer-large-4096 is the triviaqa pretrained checkpoint or the vanilla longformer?
I'm running into the same issue, did anyone find what was wrong?
according to cheatsheet.txt:
--save_prefix triviaqa-longformer-large \ # pretrained pytorch-lighting checkpoint
--model_path path/to/pretrained/longformer-large-4096 \ # loaded but not used
but from @ibeltagy's comment, it seems like the checkpoint should be in --model_path. Which one is correct?
--model_path expects a path to a directory containing a config.json file, so I can't provide the downloaded checkpoint to that flag. But if I provide it to --save_prefix, following the cheatsheet's instructions, I get very low results, similar to @songwanguw
Ok, it seems like --save_prefix is ignored, not --model_path.
Now I tried grabbing the downloaded triviaqa-longformer-large/checkpoints/_ckpt_epoch_4_v2.ckpt, renaming it to pytorch_model.bin, and placing it in the folder of --model_path (which overwrites the vanilla model I had there). However, I still get very low values. Any ideas of what to try?
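One way to see why the renamed .ckpt doesn't load correctly as pytorch_model.bin: a pytorch-lightning checkpoint is not a bare state dict — it wraps the weights under a "state_dict" key, with each weight name prefixed by the LightningModule's attribute name (here "model.", as in the "model.embeddings.position_ids" key mentioned later in this thread). A minimal sketch of inspecting it (the path is a placeholder for your download):

```python
import torch

def lightning_state_dict_keys(ckpt_path):
    """Load a pytorch-lightning checkpoint on CPU and return its weight keys.

    The weights live under ckpt["state_dict"], with keys prefixed by the
    LightningModule attribute name (e.g. "model.embeddings..."), so simply
    renaming the .ckpt to pytorch_model.bin gives HuggingFace a dict it
    cannot match against the model's parameter names.
    """
    ckpt = torch.load(ckpt_path, map_location="cpu")
    return list(ckpt["state_dict"].keys())

# e.g. lightning_state_dict_keys(
#     "triviaqa-longformer-large/checkpoints/_ckpt_epoch_4_v2.ckpt")
```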
Solved. Indeed, those very low results come from a model that wasn't fine-tuned for triviaqa. To properly load the provided checkpoint, follow cheatsheet.txt with these exceptions:
--save_prefix choose-a-name-for-output-dir
--model_path path/to/pretrained/longformer-large-4096 # path to folder of downloaded model pretrained with Masked LM, creating your own roberta-large-4096 following "convert_model_to_long.ipynb" will not work here
--resume_ckpt path/to/triviaqa-longformer-large/checkpoints/fixed_ckpt_epoch_4_v2.ckpt # path to downloaded model finetuned for triviaqa
However, fixed_ckpt_epoch_4_v2.ckpt will fail to load. To fix, use a Python console to load the file (with torch.load()), apply these changes, and save it again (with torch.save()):
checkpoint["state_dict"]["model.embeddings.position_ids"] = torch.arange(4098).to('cuda').unsqueeze(0)
checkpoint["checkpoint_callback_best_model_path"]="" # some versions of pytorch lightning may not need this
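The two lines above can be wrapped into a small patch script — a sketch following this thread, not an official tool; max_pos=4098 comes from the line above, and the paths are placeholders. Loading on CPU works fine here (the original used .to('cuda'), but the buffer is moved with the model when the checkpoint is later loaded):

```python
import torch

def patch_triviaqa_checkpoint(in_path, out_path, max_pos=4098):
    """Apply the two fixes from this thread to a downloaded
    triviaqa-longformer-large checkpoint and save the result."""
    checkpoint = torch.load(in_path, map_location="cpu")
    # Add the position_ids buffer that newer code expects to find
    # in the embeddings; shape (1, max_pos).
    checkpoint["state_dict"]["model.embeddings.position_ids"] = (
        torch.arange(max_pos).unsqueeze(0)
    )
    # Some versions of pytorch-lightning may not need this, but it is
    # harmless: blank out the stored best-model path.
    checkpoint["checkpoint_callback_best_model_path"] = ""
    torch.save(checkpoint, out_path)
```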
I'll create a pull request adding these comments to cheatsheet.txt.
During evaluation, I only got this score:
{'exact_match': 0.025021894157387713, 'f1': 4.5473948151449575, 'common': 7993, 'denominator': 7993, 'pred_len': 7993, 'gold_len': 7993}