pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

Unable to vocab_from_file from a HTTPResponse #1004

Open astaff opened 4 years ago

astaff commented 4 years ago

🐛 Bug

To Reproduce

import urllib.request
from torchtext.experimental.vocab import vocab_from_file

with urllib.request.urlopen("https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt") as f:
    vocab_from_file(f)

This raises AttributeError: 'HTTPResponse' object has no attribute 'name'

Expected behavior

vocab_from_file should accept any file-like object, including an HTTPResponse.

zhangguanheng66 commented 4 years ago

Those functions were written to process local files, and we haven't tested them against HTTP responses. This is worth investigating later. As a workaround, you can fetch the file to disk first with the download function or wget, then open the local copy.
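A minimal sketch of that workaround, using only the standard library: copy the response body into a named temporary file, then open that file normally. The helper name download_to_local_file is hypothetical and not part of torchtext, and the commented-out usage assumes a torchtext build that still ships torchtext.experimental.vocab.

```python
import shutil
import tempfile
import urllib.request


def download_to_local_file(url):
    """Copy the body at `url` into a named temporary file and return its path.

    Hypothetical helper for illustration; it is not part of torchtext.
    """
    with urllib.request.urlopen(url) as response:
        # delete=False keeps the file on disk after the handle is closed,
        # so the caller is responsible for removing it afterwards.
        with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
            shutil.copyfileobj(response, tmp)
            return tmp.name


# Usage (needs network access and the experimental torchtext module):
# from torchtext.experimental.vocab import vocab_from_file
# path = download_to_local_file(
#     "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt"
# )
# with open(path, "r") as f:
#     vocab = vocab_from_file(f)
```

A file opened from disk this way has a name attribute, which is the attribute the traceback says is missing on HTTPResponse.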

astaff commented 4 years ago

We can clarify the docstring, then. It currently says "file_object (FileObject): a file like object to read data from.", which refers to https://docs.python.org/3/glossary.html#term-file-object.
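To illustrate the gap: the glossary definition covers in-memory streams and network responses as well as on-disk files, and only some of those carry a name attribute. The sketch below checks this with standard-library objects. That vocab_from_file relies on file_object.name is an inference from the traceback above, not something confirmed against the source here; has_name is a hypothetical helper.

```python
import io
import tempfile


def has_name(file_object):
    """Return True if the file-like object exposes a `name` attribute."""
    return hasattr(file_object, "name")


# An in-memory text stream is a file object in the glossary sense,
# but it is not backed by a path and has no `name`.
in_memory = io.StringIO("hello\nworld\n")
in_memory_has_name = has_name(in_memory)

# A file backed by a path on disk does carry the name it was opened with.
with tempfile.NamedTemporaryFile("w+", suffix=".txt") as on_disk:
    on_disk_has_name = has_name(on_disk)
```

A docstring that says something like "a file object opened from a local path" would match this behavior more closely than the current wording.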