allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

Unable to open file #5359

Closed by lenyabloko 3 years ago

lenyabloko commented 3 years ago

Checklist

Description

Python traceback:

```
OSError Traceback (most recent call last)
in ()
      1 from allennlp.predictors.predictor import Predictor
      2 import allennlp_models.structured_prediction
----> 3 predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/elmo-constituency-parser-2020.02.10.tar.gz")

25 frames
/usr/local/lib/python3.7/dist-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    188         if swmr and swmr_support:
    189             flags |= h5f.ACC_SWMR_READ
--> 190         fid = h5f.open(name, flags, fapl=fapl)
    191     elif mode == 'r+':
    192         fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

OSError: Unable to open file (truncated file: eof = 335544320, sblock->base_addr = 0, stored_eof = 374434792)
```

Related issues or possible duplicates

Environment

OS: Google Colab

Python version: 3.7

Output of pip freeze:

``` absl-py==0.12.0 alabaster==0.7.12 albumentations==0.1.12 allennlp==2.6.0 allennlp-models==2.6.0 altair==4.1.0 appdirs==1.4.4 argcomplete==1.12.3 argon2-cffi==20.1.0 arviz==0.11.2 astor==0.8.1 astropy==4.3.1 astunparse==1.6.3 async-generator==1.10 atari-py==0.2.9 atomicwrites==1.4.0 attrs==21.2.0 audioread==2.1.9 autograd==1.3 Babel==2.9.1 backcall==0.2.0 backports.csv==1.0.7 beautifulsoup4==4.6.3 bleach==4.0.0 blis==0.4.1 bokeh==2.3.3 boto3==1.18.21 botocore==1.21.21 Bottleneck==1.3.2 branca==0.4.2 bs4==0.0.1 CacheControl==0.12.6 cached-property==1.5.2 cachetools==4.2.2 catalogue==1.0.0 certifi==2021.5.30 cffi==1.14.6 cftime==1.5.0 chardet==3.0.4 charset-normalizer==2.0.4 checklist==0.0.11 cheroot==8.5.2 CherryPy==18.6.1 click==7.1.2 cloudpickle==1.3.0 cmake==3.12.0 cmdstanpy==0.9.5 colorcet==2.0.6 colorlover==0.3.0 community==1.0.0b1 configparser==5.0.2 conllu==4.4 contextlib2==0.5.5 convertdate==2.3.2 coverage==3.7.1 coveralls==0.5 crcmod==1.7 cryptography==3.4.7 cufflinks==0.17.3 cupy-cuda101==9.1.0 cvxopt==1.2.6 cvxpy==1.0.31 cycler==0.10.0 cymem==2.0.5 Cython==0.29.24 daft==0.0.4 dask==2.12.0 datascience==0.10.6 debugpy==1.0.0 decorator==4.4.2 defusedxml==0.7.1 descartes==1.1.0 dill==0.3.4 distributed==1.25.3 dlib @ file:///dlib-19.18.0-cp37-cp37m-linux_x86_64.whl dm-tree==0.1.6 docker-pycreds==0.4.0 docopt==0.6.2 docutils==0.17.1 dopamine-rl==1.0.5 earthengine-api==0.1.277 easydict==1.9 ecos==2.0.7.post1 editdistance==0.5.3 en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz entrypoints==0.3 ephem==4.0.0.2 et-xmlfile==1.1.0 fa2==0.3.5 fastai==1.0.61 fastdtw==0.3.4 fastprogress==1.0.0 fastrlock==0.6 fbprophet==0.7.1 feather-format==0.4.1 feedparser==6.0.8 filelock==3.0.12 firebase-admin==4.4.0 fix-yahoo-finance==0.0.22 Flask==1.1.4 flatbuffers==1.12 folium==0.8.3 ftfy==6.0.3 future==0.16.0 gast==0.4.0 GDAL==2.2.2 gdown==3.6.4 gensim==3.6.0 geographiclib==1.52 geopy==1.17.0 
gin-config==0.4.0 gitdb==4.0.7 GitPython==3.1.18 glob2==0.7 google==2.0.3 google-api-core==1.26.3 google-api-python-client==1.12.8 google-auth==1.34.0 google-auth-httplib2==0.0.4 google-auth-oauthlib==0.4.5 google-cloud-bigquery==1.21.0 google-cloud-bigquery-storage==1.1.0 google-cloud-core==1.7.2 google-cloud-datastore==1.8.0 google-cloud-firestore==1.7.0 google-cloud-language==1.2.0 google-cloud-storage==1.41.1 google-cloud-translate==1.5.0 google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz google-crc32c==1.1.2 google-pasta==0.2.0 google-resumable-media==1.3.3 googleapis-common-protos==1.53.0 googledrivedownloader==0.4 graphql-core==3.1.5 graphviz==0.10.1 greenlet==1.1.1 grpcio==1.34.1 gspread==3.0.1 gspread-dataframe==3.0.8 gym==0.17.3 h5py==3.1.0 HeapDict==1.0.1 hijri-converter==2.1.3 holidays==0.10.5.2 holoviews==1.14.5 html5lib==1.0.1 httpimport==0.5.18 httplib2==0.17.4 httplib2shim==0.0.3 huggingface-hub==0.0.12 humanize==0.5.1 hyperopt==0.1.2 ideep4py==2.0.0.post3 idna==2.10 imageio==2.4.1 imagesize==1.2.0 imbalanced-learn==0.4.3 imblearn==0.0 imgaug==0.2.9 importlib-metadata==4.6.3 importlib-resources==5.2.2 imutils==0.5.4 inflect==2.1.0 iniconfig==1.1.1 install==1.3.4 intel-openmp==2021.3.0 intervaltree==2.1.0 ipykernel==4.10.1 ipython==5.5.0 ipython-genutils==0.2.0 ipython-sql==0.3.9 ipywidgets==7.6.3 iso-639==0.4.5 itsdangerous==1.1.0 jaraco.classes==3.2.1 jaraco.collections==3.4.0 jaraco.functools==3.3.0 jaraco.text==3.5.1 jax==0.2.17 jaxlib @ https://storage.googleapis.com/jax-releases/cuda110/jaxlib-0.1.69+cuda110-cp37-none-manylinux2010_x86_64.whl jdcal==1.4.1 jedi==0.18.0 jieba==0.42.1 Jinja2==2.11.3 jmespath==0.10.0 joblib==1.0.1 jpeg4py==0.1.4 jsonnet==0.17.0 jsonschema==2.6.0 jupyter==1.0.0 jupyter-client==5.3.5 jupyter-console==5.2.0 jupyter-core==4.7.1 jupyterlab-pygments==0.1.2 jupyterlab-widgets==1.0.0 kaggle==1.5.12 kapre==0.3.5 Keras==2.4.3 keras-nightly==2.5.0.dev2021032900 Keras-Preprocessing==1.1.2 keras-vis==0.4.1 
kiwisolver==1.3.1 korean-lunar-calendar==0.2.1 librosa==0.8.1 lightgbm==2.2.3 llvmlite==0.34.0 lmdb==0.99 LunarCalendar==0.0.9 lxml==4.2.6 Markdown==3.3.4 MarkupSafe==2.0.1 matplotlib==3.2.2 matplotlib-inline==0.1.2 matplotlib-venn==0.11.6 missingno==0.5.0 mistune==0.8.4 mizani==0.6.0 mkl==2019.0 mlxtend==0.14.0 more-itertools==8.8.0 moviepy==0.2.3.5 mpmath==1.2.1 msgpack==1.0.2 multiprocess==0.70.12.2 multitasking==0.0.9 munch==2.5.0 murmurhash==1.0.5 music21==5.5.0 natsort==5.5.0 nbclient==0.5.3 nbconvert==5.6.1 nbformat==5.1.3 nest-asyncio==1.5.1 netCDF4==1.5.7 networkx==2.6.2 nibabel==3.0.2 nltk==3.2.5 notebook==5.3.1 numba==0.51.2 numexpr==2.7.3 numpy==1.19.5 nvidia-ml-py3==7.352.0 oauth2client==4.1.3 oauthlib==3.1.1 okgrade==0.4.3 opencv-contrib-python==4.1.2.30 opencv-python==4.1.2.30 openpyxl==2.5.9 opt-einsum==3.3.0 osqp==0.6.2.post0 overrides==3.1.0 packaging==21.0 palettable==3.3.0 pandas==1.1.5 pandas-datareader==0.9.0 pandas-gbq==0.13.3 pandas-profiling==1.4.1 pandocfilters==1.4.3 panel==0.12.1 param==1.11.1 parso==0.8.2 pathlib==1.0.1 pathtools==0.1.2 patsy==0.5.1 patternfork-nosql==3.6 pdfminer.six==20201018 pep517==0.11.0 pexpect==4.8.0 pickleshare==0.7.5 Pillow==7.1.2 pip-tools==6.2.0 plac==1.1.3 plotly==4.4.1 plotnine==0.6.0 pluggy==0.7.1 pooch==1.4.0 portend==2.7.1 portpicker==1.3.9 prefetch-generator==1.0.1 preshed==3.0.5 prettytable==2.1.0 progressbar2==3.38.0 prometheus-client==0.11.0 promise==2.3 prompt-toolkit==1.0.18 protobuf==3.17.3 psutil==5.4.8 psycopg2==2.7.6.1 ptyprocess==0.7.0 py==1.10.0 py-rouge==1.1 pyarrow==3.0.0 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycocotools==2.0.2 pycparser==2.20 pyct==0.4.8 pydata-google-auth==1.2.0 pydot==1.3.0 pydot-ng==2.0.0 pydotplus==2.0.2 PyDrive==1.3.1 pyemd==0.5.1 pyerfa==2.0.0 pyglet==1.5.0 Pygments==2.6.1 pygobject==3.26.1 pymc3==3.11.2 PyMeeus==0.5.11 pymongo==3.12.0 pymystem3==0.2.0 PyOpenGL==3.1.5 pyparsing==2.4.7 pyrsistent==0.18.0 pysndfile==1.3.8 PySocks==1.7.1 pystan==2.19.1.1 pytest==3.6.4 
python-apt==0.0.0 python-chess==0.23.11 python-dateutil==2.8.2 python-docx==0.8.11 python-louvain==0.15 python-slugify==5.0.2 python-utils==2.5.6 pytz==2018.9 pyviz-comms==2.1.0 PyWavelets==1.1.1 PyYAML==3.13 pyzmq==22.2.1 qdldl==0.1.5.post0 qtconsole==5.1.1 QtPy==1.9.0 regex==2019.12.20 requests==2.23.0 requests-oauthlib==1.3.0 resampy==0.2.2 retrying==1.3.3 rpy2==3.4.5 rsa==4.7.2 s3transfer==0.5.0 sacremoses==0.0.45 scikit-image==0.16.2 scikit-learn==0.22.2.post1 scipy==1.4.1 screen-resolution-extra==0.0.0 scs==2.1.4 seaborn==0.11.1 semver==2.13.0 Send2Trash==1.8.0 sentencepiece==0.1.96 sentry-sdk==1.3.1 setuptools-git==1.2 sgmllib3k==1.0.0 Shapely==1.7.1 shortuuid==1.0.1 simplegeneric==0.8.1 six==1.15.0 sklearn==0.0 sklearn-pandas==1.8.0 smart-open==5.1.0 smmap==4.0.0 snowballstemmer==2.1.0 sortedcontainers==2.4.0 SoundFile==0.10.3.post1 spacy==2.2.4 Sphinx==1.8.5 sphinxcontrib-serializinghtml==1.1.5 sphinxcontrib-websupport==1.2.4 SQLAlchemy==1.4.22 sqlparse==0.4.1 srsly==1.0.5 statsmodels==0.10.2 sympy==1.7.1 tables==3.4.4 tabulate==0.8.9 tblib==1.7.0 tempora==4.1.1 tensorboard==2.5.0 tensorboard-data-server==0.6.1 tensorboard-plugin-wit==1.8.0 tensorboardX==2.4 tensorflow @ file:///tensorflow-2.5.0-cp37-cp37m-linux_x86_64.whl tensorflow-datasets==4.0.1 tensorflow-estimator==2.5.0 tensorflow-gcs-config==2.5.0 tensorflow-hub==0.12.0 tensorflow-metadata==1.2.0 tensorflow-probability==0.13.0 termcolor==1.1.0 terminado==0.10.1 testpath==0.5.0 text-unidecode==1.3 textblob==0.15.3 Theano-PyMC==1.1.2 thinc==7.4.0 tifffile==2021.8.8 tokenizers==0.10.3 toml==0.10.2 tomli==1.2.1 toolz==0.11.1 torch @ https://download.pytorch.org/whl/cu102/torch-1.9.0%2Bcu102-cp37-cp37m-linux_x86_64.whl torchsummary==1.5.1 torchtext==0.10.0 torchvision @ https://download.pytorch.org/whl/cu102/torchvision-0.10.0%2Bcu102-cp37-cp37m-linux_x86_64.whl tornado==5.1.1 tqdm==4.62.0 traitlets==5.0.5 transformers==4.8.2 tweepy==3.10.0 typeguard==2.7.1 typing-extensions==3.7.4.3 tzlocal==1.5.1 
uritemplate==3.0.1 urllib3==1.25.11 vega-datasets==0.9.0 wandb==0.11.1 wasabi==0.8.2 wcwidth==0.2.5 webencodings==0.5.1 Werkzeug==1.0.1 widgetsnbextension==3.5.1 word2number==1.1 wordcloud==1.5.0 wrapt==1.12.1 xarray==0.18.2 xgboost==0.90 xkit==0.0.0 xlrd==1.1.0 xlwt==1.3.0 yellowbrick==0.9.1 zc.lockfile==2.0 zict==2.0.0 zipp==3.5.0 ```

Steps to reproduce

Example source:

```
!pip install allennlp allennlp-models
!pip install nltk
!pip install tqdm
!pip install spacy
!python -m spacy download en_core_web_sm

from allennlp.predictors.predictor import Predictor
import allennlp_models.structured_prediction

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/elmo-constituency-parser-2020.02.10.tar.gz")
```

dirkgr commented 3 years ago

I think your file got truncated during download. Can you clear the cache (rm -r ~/.allennlp/cache/) and try again?
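For anyone who prefers to do this from a notebook cell, here is a minimal sketch of the same idea in Python. The numbers in the error message support the truncation theory: the file on disk ends at 335544320 bytes (exactly 320 MiB), while the HDF5 header expects 374434792 bytes, so 38890472 bytes are missing. The helper names `is_truncated` and `clear_cache` are hypothetical and not part of AllenNLP, and the cache path is simply the one mentioned above, assumed to be the default.

```python
import shutil
from pathlib import Path

# Assumption: the default cache location mentioned in the comment above.
DEFAULT_CACHE_DIR = Path.home() / ".allennlp" / "cache"


def is_truncated(path, expected_size):
    """Return True if the file on disk is smaller than the size it should have."""
    return Path(path).stat().st_size < expected_size


def clear_cache(cache_dir=DEFAULT_CACHE_DIR):
    """Delete the cache directory so the next load downloads everything again.

    Returns True if a directory was removed, False if there was nothing to remove.
    """
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Calling `clear_cache()` with no arguments has the same effect as the `rm -r` command; rerunning `Predictor.from_path(...)` afterwards should trigger a fresh download. If the download stops at the same size again, free disk space on the Colab instance may be worth checking, though that is only a guess.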