allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.68k stars, 225 forks

zipfile.BadZipFile: File is not a zip file #443

Closed lzxlin closed 2 years ago

lzxlin commented 2 years ago

When I run the following line, I get the error "zipfile.BadZipFile: File is not a zip file": nlp.add_pipe("scispacy_linker", config={"resolve_abbreviations": True, "linker_name": "umls"})

dakinggg commented 2 years ago

Please post the full error

lzxlin commented 2 years ago

I encounter the following error when executing the fifth line of the example of EntityLinker:

```
import spacy
import scispacy
from scispacy.linking import EntityLinker
nlp = spacy.load("en_core_sci_sm")
nlp.add_pipe("scispacy_linker", config={"resolve_abbreviations": True, "linker_name": "umls"})

=======================

Finished download, copying /tmp/tmp2uhkewjb to cache at /root/.scispacy/datasets/e9f7327283e43f0482f7c0c71b71dec278a58ccb3ffdd03c2c2350159e7ef146.f2a350ad19015b2591545f7feeed6a6d6d2fffcd635d868a5d7fc0dfc3cadfd8.tfidf_vectors_sparse.npz
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/spacy/language.py", line 797, in add_pipe
    validate=validate,
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/spacy/language.py", line 674, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/thinc/config.py", line 747, in resolve
    config, schema=schema, overrides=overrides, validate=validate, resolve=True
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/thinc/config.py", line 796, in _make
    config, schema, validate=validate, overrides=overrides, resolve=resolve
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/thinc/config.py", line 867, in _fill
    getter_result = getter(*args, **kwargs)
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/scispacy/linking.py", line 85, in __init__
    name=linker_name
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/scispacy/candidate_generation.py", line 223, in __init__
    linker_paths=linker_paths, ef_search=ef_search
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/scispacy/candidate_generation.py", line 133, in load_approximate_nearest_neighbours_index
    cached_path(linker_paths.tfidf_vectors)
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/scipy/sparse/_matrix_io.py", line 123, in load_npz
    with np.load(file, **PICKLE_KWARGS) as loaded:
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/numpy/lib/npyio.py", line 432, in load
    pickle_kwargs=pickle_kwargs)
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/numpy/lib/npyio.py", line 186, in __init__
    _zip = zipfile_factory(fid)
  File "/root/anaconda3/envs/test/lib/python3.6/site-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "/root/anaconda3/envs/test/lib/python3.6/zipfile.py", line 1131, in __init__
    self._RealGetContents()
  File "/root/anaconda3/envs/test/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
```

dakinggg commented 2 years ago

Could you please share the output of pip/conda list? I will try to reproduce, but first it would be great if you could try in a new, clean environment.
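
For reference, a minimal sketch of collecting the relevant versions from inside Python, as a complement to the pip/conda list output. The package names are the ones that appear in the traceback paths; the helper name `package_versions` is made up for illustration. It assumes Python 3.8+ for `importlib.metadata` (the Python 3.6 environment in the traceback would need the `importlib_metadata` backport instead).

```python
import platform
from importlib import metadata  # stdlib from Python 3.8 onward


def package_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Packages that show up in the traceback above.
    wanted = ["spacy", "scispacy", "thinc", "numpy", "scipy"]
    for name, version in package_versions(wanted).items():
        print(name, version or "not installed")
```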

lzxlin commented 2 years ago

I created a new, clean environment and that solved the problem. Thanks.
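
For anyone who lands here with the same traceback: the root cause was never confirmed in this thread, but the error is raised while loading the cached `.npz` file, and `.npz` files are zip archives. One possibility (an assumption, not something verified here) is that the cached download was incomplete or corrupted. A minimal sketch for checking the cache, where the helper name `find_invalid_npz` is made up for illustration and the cache directory is taken from the log line in the traceback:

```python
import zipfile
from pathlib import Path
from typing import List


def find_invalid_npz(cache_dir: Path) -> List[Path]:
    """Return the .npz files in cache_dir that are not readable zip archives."""
    invalid = []
    for path in sorted(cache_dir.glob("*.npz")):
        # An .npz file is a zip archive, so a truncated or otherwise
        # damaged download is expected to fail this check.
        if not zipfile.is_zipfile(path):
            invalid.append(path)
    return invalid


if __name__ == "__main__":
    # ~/.scispacy/datasets matches the cache path printed in the log above;
    # adjust it if your cache lives somewhere else.
    cache = Path.home() / ".scispacy" / "datasets"
    for bad in find_invalid_npz(cache):
        print(f"not a valid zip archive: {bad}")
```

If a file is reported, deleting it and re-running the `add_pipe` call may trigger a fresh download; that behavior is assumed here rather than confirmed by the maintainers.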