Can't download bart model

bjascob commented 3 years ago

When I run bash run/run_experiment.sh configs/amr2.0-structured-bart-large-sep-voc.sh (or bash tests/minimal_test.sh) I get the following error:

Downloading: "https://github.com/pytorch/fairseq/archive/master.zip" to /home/bjascob/.cache/torch/hub/master.zip
Traceback (most recent call last):
  File "fairseq_ext/preprocess_bartsv.py", line 331, in <module>
    cli_main()
  File "fairseq_ext/preprocess_bartsv.py", line 327, in cli_main
    main(args)
  File "fairseq_ext/preprocess_bartsv.py", line 290, in main
    make_bart_encodings(args, tokenize=tokenize)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/binarize_encodings.py", line 152, in make_bart_encodings
    make_binary_bert_features(args, args.trainpref, "train", tokenize)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/binarize_encodings.py", line 62, in make_binary_bert_features
    pretrained_embeddings = SentenceEncodingBART(
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/fairseq_ext/extract_bart/sentence_encoding.py", line 111, in __init__
    self.model = torch.hub.load('pytorch/fairseq', name)
  File "/home/bjascob/DataRepoTemp/venv_ibm_parser/lib/python3.8/site-packages/torch/hub.py", line 358, in load
    repo_dir = _get_cache_or_reload(github, force_reload, verbose)
...etc...

The URL https://github.com/pytorch/fairseq/archive/master.zip is not a valid location. Since this is buried deep in the torch hub download code, I'm thinking this is a compatibility issue between the old torch 1.4.0 version being used and the current torch 1.10 version. Any idea what the best way to get this to work is?

bjascob commented 3 years ago

A hack that seems to work is to edit the hub.py file of the torch installed in the venv (i.e. /lib/python3.8/site-packages/torch/hub.py) and change line 59 from MASTER_BRANCH = 'master' to MASTER_BRANCH = 'main'. Doing this allows bash tests/minimal_test.sh to download the models and finish successfully. Per the pytorch/fairseq readme, they renamed the branch from master to main in September 2021.

Hacking the torch library doesn't seem like the "right" fix. Maybe there is a better place to override this?
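One possible place to override this without touching the installed library: torch.hub.load accepts the repo as 'repo_owner/repo_name[:tag_name]', so the branch can be named explicitly in the call itself. Below is a minimal sketch of that idea. The helper names with_branch and load_from_hub are hypothetical (they are not part of this project or of torch), and the approach is an assumption that has not been verified against this project's torch 1.4 setup.

```python
def with_branch(repo, branch="main"):
    """Return 'owner/name:branch'; a spec that already names a ref is left as-is."""
    return repo if ":" in repo else f"{repo}:{branch}"


def load_from_hub(name, repo="pytorch/fairseq", branch="main"):
    """Load a hub model from an explicit branch rather than torch's default one."""
    # torch is imported lazily so with_branch() can be used without it installed.
    import torch
    return torch.hub.load(with_branch(repo, branch), name)
```

Under the same assumption, the call shown in the traceback (sentence_encoding.py, line 111) could pass 'pytorch/fairseq:main' instead of 'pytorch/fairseq', which keeps the fix inside this project rather than in site-packages.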

ramon-astudillo commented 3 years ago

This is for the current master, i.e. v0.5.1, right? This should use torch 1.4. Are you using this?

bjascob commented 3 years ago

Yes, my torch==1.4 and I'm using the latest (0.5.1) branch of this project. Doing a little digging, here's what has happened. Torch 1.4 defines MASTER_BRANCH = 'master' in hub.py. Newer torch versions (the latest is 1.10) added logic in hub.py that tries branch main first and then branch master.
In September of this year fairseq renamed their GitHub project branch from master to main, so the torch 1.4 logic will not work with that repo, but newer versions (e.g. torch 1.10) will.
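The "try main first, then master" behaviour described above can be illustrated with a small sketch. This is a simplified illustration only, not the actual code in torch's hub.py. The function pick_default_branch and its branch_exists callback are hypothetical names; a real check would have to query GitHub.

```python
def pick_default_branch(branch_exists, candidates=("main", "master")):
    """Return the first candidate branch for which branch_exists(name) is true."""
    for name in candidates:
        if branch_exists(name):
            return name
    # Nothing matched: fall back to the last candidate (the legacy default here).
    return candidates[-1]


# A repo that has only 'main' (the situation described for fairseq after the
# rename) resolves to 'main'; a repo that still uses 'master' keeps working.
renamed_repo = {"main"}
legacy_repo = {"master"}
print(pick_default_branch(lambda b: b in renamed_repo))
print(pick_default_branch(lambda b: b in legacy_repo))
```

With a hard-coded 'master' and no fallback, as in torch 1.4, the first case has no way to succeed, which matches the failed download of master.zip.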

IBM / transition-amr-parser

Can't download bart model #18