facebookresearch / DrQA

Reading Wikipedia to Answer Open-Domain Questions

`ValueError: Object arrays cannot be loaded when allow_pickle=False` #247

Open hughperkins opened 4 years ago

hughperkins commented 4 years ago
$ python scripts/pipeline/interactive.py --tokenizer regexp
04/22/2020 12:16:52 PM: [ Running on CPU only. ]
04/22/2020 12:16:52 PM: [ Initializing pipeline... ]
04/22/2020 12:16:52 PM: [ Initializing document ranker... ]
04/22/2020 12:16:52 PM: [ Loading /persist/git/DrQA/data/wikipedia/docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz ]
Traceback (most recent call last):
  File "scripts/pipeline/interactive.py", line 70, in <module>
    tokenizer=args.tokenizer
  File "/persist/git/DrQA/drqa/pipeline/drqa.py", line 109, in __init__
    self.ranker = ranker_class(**ranker_opts)
  File "/persist/git/DrQA/drqa/retriever/tfidf_doc_ranker.py", line 37, in __init__
    matrix, metadata = utils.load_sparse_csr(tfidf_path)
  File "/persist/git/DrQA/drqa/retriever/utils.py", line 36, in load_sparse_csr
    return matrix, loader['metadata'].item(0) if 'metadata' in loader else None
  File "/persist/git/DrQA/conda/lib/python3.6/_collections_abc.py", line 666, in __contains__
    self[key]
  File "/persist/git/DrQA/conda/lib/python3.6/site-packages/numpy/lib/npyio.py", line 262, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/persist/git/DrQA/conda/lib/python3.6/site-packages/numpy/lib/format.py", line 739, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False
(base) (env) ubuntu@hughg4:~/git/DrQA$ pip freeze
asn1crypto==1.3.0
certifi==2020.4.5.1
cffi==1.14.0
chardet==3.0.4
conda==4.8.3
conda-package-handling==1.6.0
cryptography==2.8
drqa==0.1.0
elasticsearch==7.0.4
idna==2.9
joblib==0.14.1
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
nltk==3.4.5
numpy==1.18.1
pexpect==4.8.0
prettytable==0.7.2
ptyprocess==0.6.0
pycosat==0.6.3
pycparser==2.20
pyOpenSSL==19.1.0
PySocks==1.7.1
regex==2020.4.4
requests==2.23.0
ruamel-yaml==0.15.87
scikit-learn==0.22.1
scipy==1.4.1
six==1.14.0
termcolor==1.1.0
torch==1.4.0
tqdm==4.42.1
urllib3==1.25.8
$ conda list
# packages in environment at /persist/git/DrQA/conda:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
_pytorch_select           0.1                       cpu_0
asn1crypto                1.3.0                    py36_0
blas                      1.0                         mkl
ca-certificates           2020.1.1                      0
certifi                   2020.4.5.1               py36_0
cffi                      1.14.0           py36h2e261b9_0
chardet                   3.0.4                 py36_1003
conda                     4.8.3                    py36_0
conda-package-handling    1.6.0            py36h7b6447c_0
cryptography              2.8              py36h1ba5d50_0
cudatoolkit               10.2.89              hfd86e86_0
elasticsearch             7.0.4                      py_0
idna                      2.9                        py_1
intel-openmp              2020.0                      166
joblib                    0.14.1                     py_0
ld_impl_linux-64          2.33.1               h53a641e_7
libedit                   3.1.20181209         hc058e9b_0
libffi                    3.2.1                hd88cf55_4
libgcc-ng                 9.1.0                hdf63c60_0
libgfortran-ng            7.3.0                hdf63c60_0
libstdcxx-ng              9.1.0                hdf63c60_0
mkl                       2020.0                      166
mkl-service               2.3.0            py36he904b0f_0
mkl_fft                   1.0.15           py36ha843d7b_0
mkl_random                1.1.0            py36hd6b4f25_0
ncurses                   6.2                  he6710b0_0
ninja                     1.9.0            py36hfd86e86_0
nltk                      3.4.5                    py36_0
numpy                     1.18.1           py36h4f9e942_0
numpy-base                1.18.1           py36hde5b4d6_1
openssl                   1.1.1g               h7b6447c_0
pexpect                   4.8.0                    py36_0
pip                       20.0.2                   py36_1
prettytable               0.7.2                    pypi_0    pypi
ptyprocess                0.6.0                    py36_0
pycosat                   0.6.3            py36h7b6447c_0
pycparser                 2.20                       py_0
pyopenssl                 19.1.0                   py36_0
pysocks                   1.7.1                    py36_0
python                    3.6.10               h191fe78_1
pytorch                   1.4.0           cpu_py36h7e40bad_0
readline                  7.0                  h7b6447c_5
regex                     2020.4.4         py36h7b6447c_0
requests                  2.23.0                   py36_0
ruamel_yaml               0.15.87          py36h7b6447c_0
scikit-learn              0.22.1           py36hd81dba3_0
scipy                     1.4.1            py36h0b6359f_0
setuptools                46.1.3                   py36_0
six                       1.14.0                   py36_0
sqlite                    3.31.1               h7b6447c_0
termcolor                 1.1.0                    py36_1
tk                        8.6.8                hbc83047_0
tqdm                      4.42.1                     py_0
urllib3                   1.25.8                   py36_0
wheel                     0.34.2                   py36_0
xz                        5.2.4                h14c3975_4
yaml                      0.1.7                had09818_2
zlib                      1.2.11               h7b6447c_3
hughperkins commented 4 years ago

It seems the dependency versions in use when this last worked are neither pinned nor documented, and the latest versions are incompatible.

Idea: if someone has a working venv or conda env, could you please run pip freeze (and also conda list if it is a conda environment) and post the output? I can then simply install those specific package versions.

hughperkins commented 4 years ago

Fix: change line 33 of drqa/retriever/utils.py to:

    loader = np.load(filename, allow_pickle=True)
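
For anyone who wants to see the behaviour in isolation, here is a minimal sketch that does not need the DrQA data files. It assumes only NumPy; the helper name `load_metadata` and the demo archive are hypothetical and are not part of DrQA. It writes a small .npz whose 'metadata' entry is a dict (which NumPy stores as an object array), then shows that loading it fails with allow_pickle=False and works with allow_pickle=True. Note that allow_pickle=True unpickles data from the file, so it should only be used on files you trust.

```python
import os
import tempfile

import numpy as np


def load_metadata(filename, allow_pickle=True):
    """Hypothetical helper (not DrQA code): read the 'metadata' entry of an .npz."""
    with np.load(filename, allow_pickle=allow_pickle) as loader:
        # loader.files lists the stored keys without reading any array.
        if 'metadata' not in loader.files:
            return None
        # The dict was saved as a 0-d object array; item(0) unwraps it.
        return loader['metadata'].item(0)


# Build a tiny demo archive with a dict-valued 'metadata' entry.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo.npz')
np.savez(path, data=np.arange(3), metadata={'ngram': 2, 'tokenizer': 'simple'})

# With pickling disabled, reading the object array raises ValueError.
try:
    load_metadata(path, allow_pickle=False)
    default_load_failed = False
except ValueError:
    default_load_failed = True

# With pickling enabled, the dict is recovered.
metadata = load_metadata(path, allow_pickle=True)
print(default_load_failed, metadata)
```

The error in the traceback above has the same cause: the 'metadata' entry in the tf-idf .npz file is an object array, and np.load in this NumPy version refuses to unpickle it unless allow_pickle=True is passed.
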
hughperkins commented 4 years ago

I will send a PR

Emmanuel-Messulam commented 3 years ago

Was this fixed?