pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

unzip error when using glove embeddings from torchtext #2185

Open lasgel opened 1 year ago

lasgel commented 1 year ago

🐛 Bug

Describe the bug When loading GloVe vectors from torchtext.vocab, the following error occurs:

zipfile.BadZipFile: File is not a zip file
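The error can be reproduced without any network access, which may help when triaging. The following is a minimal sketch, not torchtext's own code: it assumes the cached download contains an HTML error page rather than an archive, and the file name and helper `open_fake_archive` are hypothetical, chosen only for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def open_fake_archive() -> str:
    """Write an HTML error page to a .zip path and try to open it as an archive."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name; stands in for whatever was saved to .vector_cache.
        fake = Path(tmp) / "glove.6B.zip"
        fake.write_text("<html><body>404 Not Found</body></html>")
        try:
            with zipfile.ZipFile(fake):
                return "opened"
        except zipfile.BadZipFile as exc:
            # The standard library rejects the file because it has no zip end record.
            return str(exc)


print(open_fake_archive())
```

This suggests the failure comes from the contents of the cached file rather than from the unzip step itself.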

To Reproduce Steps to reproduce the behavior:

  1. Go to a directory where torchtext has not been used (meaning that there is no .vector_cache)
  2. from torchtext.vocab import GloVe
  3. glove = GloVe(name='6B', dim=50) or use any other valid combination of name and dim
  4. See error

Expected behavior The Stanford University server that hosts the vectors is down until 3 July, so the downloaded file contains only a 404 page and is not a zip archive. Instead of trying to unzip that file, torchtext should report that the download could not be completed.
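One way the expected behavior could look is sketched below. This is an assumption about a possible fix, not torchtext's implementation: the helper `extract_checked` and its error message are hypothetical, and only standard-library calls (`zipfile.is_zipfile`, `ZipFile.extractall`, `Path.unlink`) are used.

```python
import zipfile
from pathlib import Path


def extract_checked(archive: Path, dest: Path) -> None:
    """Extract `archive` into `dest`, failing clearly if it is not a zip file."""
    if not zipfile.is_zipfile(archive):
        # Remove the bad file so a later run downloads it again instead of
        # reusing the cached error page (missing_ok needs Python 3.8+).
        archive.unlink(missing_ok=True)
        raise RuntimeError(
            f"Download of {archive.name} could not be completed: "
            "the server did not return a zip archive."
        )
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
```

Until something like this exists upstream, a user hitting the error would probably also need to delete the stale file from .vector_cache by hand once the server is back, since the cached error page would otherwise be reused.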

Environment torchtext Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
python -c "import torchtext; print(\"torchtext version is \", torchtext.__version__)"
