dmis-lab / biobert

Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
http://doi.org/10.1093/bioinformatics/btz682

BioBert pre-trained weights and models are out of reach and download #185

Closed vahab-mspour closed 1 year ago

vahab-mspour commented 1 year ago

The BioBERT pre-trained weights and models are unreachable and cannot be downloaded. All links to the pre-trained weights are deactivated.

Downloading them from Google Drive fails with this error:

--2022-12-07 07:59:22-- https://docs.google.com/uc?export=download&confirm=&id=1R84voFKHfWV9xjzeLzWBbmY1uOMYpnyD
Resolving docs.google.com (docs.google.com)... 142.251.162.138, 142.251.162.102, 142.251.162.113, ...
Connecting to docs.google.com (docs.google.com)|142.251.162.138|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-12-07 07:59:22 ERROR 404: Not Found.

My download code is:

!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1R84voFKHfWV9xjzeLzWBbmY1uOMYpnyD' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1R84voFKHfWV9xjzeLzWBbmY1uOMYpnyD" -O biobert_weights && rm -rf /tmp/cookies.txt

wonjininfo commented 1 year ago

Hi, it seems that my school has stopped providing Google Drive access.

We have made a mirror at the following URLs:
http://nlp.dmis.korea.edu/projects/biobert-2020-checkpoints/biobert_v1.1_pubmed.tar.gz
http://nlp.dmis.korea.edu/projects/biobert-2020-checkpoints/NERdata.zip
http://nlp.dmis.korea.edu/projects/biobert-2020-checkpoints/REdata.zip
http://nlp.dmis.korea.edu/projects/biobert-2020-checkpoints/QA.zip

Thank you. Best, Wonjin

wonjininfo commented 1 year ago

Hi @vahab-mspour, is your local repo up to date? I think I updated the link in download.sh to a new (valid) one a few weeks ago.

JaskaranKaurGill commented 1 year ago

I'm not sure how to use download.sh in a Colab notebook. @vahab-mspour, were you able to fix the issue? Could you help me with replacement code to download the weights, instead of the following?

!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1R84voFKHfWV9xjzeLzWBbmY1uOMYpnyD' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1R84voFKHfWV9xjzeLzWBbmY1uOMYpnyD" -O biobert_weights && rm -rf /tmp/cookies.txt

vahab-mspour commented 1 year ago

@wonjininfo Thank you, it solved my problem. @JaskaranKaurGill Yes, it fixed my problem in Colab.

To download and unzip, you can use the following two commands:
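
For anyone who prefers to stay in Python inside the notebook, here is a minimal sketch of such a download-and-extract step. It is not the original pair of commands. It assumes the mirror URL for biobert_v1.1_pubmed.tar.gz posted earlier in this thread is still reachable, and the helper name download_and_extract is hypothetical (it is not part of the BioBERT repo).

```python
import tarfile
import urllib.request
from pathlib import Path

# Mirror URL quoted from the maintainer's comment earlier in this thread.
BIOBERT_URL = (
    "http://nlp.dmis.korea.edu/projects/biobert-2020-checkpoints/"
    "biobert_v1.1_pubmed.tar.gz"
)


def download_and_extract(url: str, dest_dir: str) -> Path:
    """Download a .tar.gz archive into dest_dir and unpack it there.

    Hypothetical helper for illustration only; not part of the BioBERT repo.
    Returns the path of the downloaded archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last component of the URL.
    archive_path = dest / url.rsplit("/", 1)[-1]

    # Step 1: fetch the archive to disk.
    urllib.request.urlretrieve(url, archive_path)

    # Step 2: unpack it next to the archive (only do this for archives you trust).
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)

    return archive_path
```

In a Colab cell this would be called as download_and_extract(BIOBERT_URL, "/content/"). The .zip files on the mirror (NERdata.zip, REdata.zip, QA.zip) would need zipfile instead of tarfile.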

"/content/" is the download and working path of my colab environment. should replace with your download path