stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Update evaluate.py for python 3 compatibility #122

Closed by akanshajainn 4 years ago

akanshajainn commented 6 years ago

This file throws an error when run in a Python 3 environment. I made changes so that it runs successfully on Python 3.5.

jmikedupont2 commented 4 years ago

Is anyone reviewing these patches for Python 3? This one seems to be a year old.

AngledLuffa commented 4 years ago

As I've commented elsewhere, at a minimum, the application of this change for fixing issue #132 can't possibly be right. Dict objects have length:

>>> len({})
0

If anyone has a way to reproduce the error, that would be useful for tracking down the underlying problem. Otherwise, ¯\_(ツ)_/¯
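For reference, a slightly fuller version of the check above, as a minimal sketch. The thread does not identify which object actually raised the error, so the contrast with `map` below is only an illustration of a common Python 2 to Python 3 difference, not a diagnosis of issue #132. The names `word_to_index`, `values`, and `vector` are hypothetical and are not taken from evaluate.py.

```python
# Illustration only: the names here are hypothetical, not from evaluate.py.
word_to_index = {"king": 0, "queen": 1}

# Dicts and their views support len() in Python 3.
dict_len = len(word_to_index)
keys_len = len(word_to_index.keys())

# map() returns a lazy iterator in Python 3, and iterators have no len().
values = map(float, ["0.1", "0.2", "0.3"])
try:
    len(values)
    map_has_len = True
except TypeError:
    map_has_len = False

# Wrapping the iterator in list() restores the Python 2 behavior.
vector = list(map(float, ["0.1", "0.2", "0.3"]))
vector_len = len(vector)
```

If a Python 2 script relied on `map` returning a list, wrapping the call in `list()` is usually enough to keep `len()` and indexing working under Python 3.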

aolney commented 4 years ago

@AngledLuffa Here is a conda environment that reproduces this bug:

name: iis4011
channels:
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - attrs=19.3.0=py_0
  - backcall=0.1.0=py_0
  - blas=1.0=mkl
  - bleach=3.1.0=py_0
  - ca-certificates=2019.11.28=hecc5488_0
  - certifi=2019.11.28=py37_0
  - cycler=0.10.0=py_2
  - decorator=4.4.1=py_0
  - defusedxml=0.6.0=py_0
  - entrypoints=0.3=py37_1000
  - freetype=2.10.0=he983fc9_1
  - icu=58.2=hf484d3e_1000
  - importlib_metadata=1.3.0=py37_0
  - intel-openmp=2019.4=243
  - ipykernel=5.1.3=py37h5ca1d4c_0
  - ipympl=0.4.1=py_0
  - ipython=7.11.1=py37h5ca1d4c_0
  - ipython_genutils=0.2.0=py_1
  - ipywidgets=7.5.1=py_0
  - jedi=0.15.2=py37_0
  - jinja2=2.10.3=py_0
  - json5=0.8.5=py_0
  - jsonschema=3.2.0=py37_0
  - jupyter_client=5.3.3=py37_1
  - jupyter_core=4.6.1=py37_0
  - jupyterlab=1.2.4=py_0
  - jupyterlab_server=1.0.6=py_0
  - kiwisolver=1.1.0=py37hc9558a2_0
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hed695b0_0
  - libsodium=1.0.17=h516909a_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libuuid=2.32.1=h14c3975_1000
  - markupsafe=1.1.1=py37h516909a_0
  - matplotlib-base=3.1.3=py37hef1b27d_0
  - mistune=0.8.4=py37h516909a_1000
  - mkl=2019.4=243
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.0.15=py37ha843d7b_0
  - mkl_random=1.1.0=py37hd6b4f25_0
  - more-itertools=8.0.2=py_0
  - nbconvert=5.6.1=py37_0
  - nbformat=4.4.0=py_1
  - ncurses=6.1=he6710b0_1
  - notebook=6.0.1=py37_0
  - numpy-base=1.18.1=py37hde5b4d6_1
  - openssl=1.1.1d=h516909a_0
  - pandas=1.0.0=py37h0573a6f_0
  - pandoc=2.9.1=0
  - pandocfilters=1.4.2=py_1
  - parso=0.5.2=py_0
  - pexpect=4.7.0=py37_0
  - pickleshare=0.7.5=py37_1000
  - pip=19.3.1=py37_0
  - prometheus_client=0.7.1=py_0
  - prompt_toolkit=3.0.2=py_0
  - ptvsd=4.3.2=py37h516909a_0
  - ptyprocess=0.6.0=py_1001
  - pygments=2.5.2=py_0
  - pyparsing=2.4.6=py_0
  - pyrsistent=0.15.6=py37h516909a_0
  - python=3.7.5=h0371630_0
  - python-dateutil=2.8.1=py_0
  - pytz=2019.3=py_0
  - pyzmq=18.1.1=py37h1768529_0
  - readline=7.0=h7b6447c_5
  - send2trash=1.5.0=py_0
  - setuptools=42.0.2=py37_0
  - six=1.13.0=py37_0
  - sqlite=3.30.1=h7b6447c_0
  - terminado=0.8.3=py37_0
  - testpath=0.4.4=py_0
  - tk=8.6.8=hbc83047_0
  - tornado=6.0.3=py37h516909a_0
  - traitlets=4.3.3=py37_0
  - wcwidth=0.1.8=py_0
  - webencodings=0.5.1=py_1
  - wheel=0.33.6=py37_0
  - widgetsnbextension=3.5.1=py37_0
  - xeus=0.23.3=h4d8c418_0
  - xeus-python=0.6.8=py37hc9558a2_0
  - xz=5.2.4=h14c3975_4
  - zeromq=4.3.2=he1b5a44_2
  - zipp=0.6.0=py_0
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - absl-py==0.9.0
    - astor==0.8.1
    - beautifulsoup4==4.8.2
    - blis==0.4.1
    - boto==2.49.0
    - boto3==1.10.46
    - botocore==1.13.46
    - bs4==0.0.1
    - cachetools==4.0.0
    - catalogue==0.2.0
    - chardet==3.0.4
    - click==7.1.1
    - cymem==2.0.3
    - docutils==0.15.2
    - filelock==3.0.12
    - funcy==1.14
    - future==0.18.2
    - gast==0.2.2
    - gensim==3.8.1
    - google-auth==1.11.3
    - google-auth-oauthlib==0.4.1
    - google-pasta==0.2.0
    - grpcio==1.27.2
    - h5py==2.10.0
    - httplib2==0.17.0
    - idna==2.8
    - jmespath==0.9.4
    - joblib==0.14.1
    - keras-applications==1.0.8
    - keras-preprocessing==1.1.0
    - markdown==3.2.1
    - matplotlib==3.1.2
    - murmurhash==1.0.2
    - mypy==0.761
    - mypy-extensions==0.4.3
    - networkx==2.4
    - nlpnet==1.2.4
    - nltk==3.4.5
    - numexpr==2.7.1
    - numpy==1.18.0
    - oauthlib==3.1.0
    - opt-einsum==3.2.0
    - packaging==20.3
    - pillow==7.0.0
    - plac==1.1.3
    - pluggy==0.13.1
    - preshed==3.0.2
    - protobuf==3.11.3
    - py==1.8.1
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - pyldavis==2.1.2
    - pytest==5.4.1
    - regex==2020.2.20
    - requests==2.22.0
    - requests-oauthlib==1.3.0
    - rsa==4.0
    - s3transfer==0.2.1
    - sacremoses==0.0.38
    - scikit-learn==0.22.1
    - scipy==1.4.1
    - seaborn==0.10.0
    - sentencepiece==0.1.85
    - smart-open==1.9.0
    - soupsieve==1.9.5
    - spacy==2.2.3
    - tensorboard==2.1.1
    - tensorflow==2.1.0
    - tensorflow-estimator==2.1.0
    - termcolor==1.1.0
    - thinc==7.3.1
    - tokenizers==0.5.2
    - torch==1.3.1
    - torchvision==0.4.2
    - tqdm==4.41.1
    - transformers==2.5.1
    - typed-ast==1.4.0
    - typing-extensions==3.7.4.1
    - urllib3==1.25.7
    - wasabi==0.6.0
    - webrtcvad==2.0.10
    - werkzeug==1.0.0
    - wrapt==1.12.1
prefix: /z/aolney/software/miniconda3/envs/iis4011

[Screenshot from 2020-04-08 20-24-54]

AngledLuffa commented 4 years ago

Can you try the current git code instead of the released version? I just realized this may be the source of the problem.

aolney commented 4 years ago

Apologies for the noise. I confirm that it runs without error in the same conda environment specified above with the latest commit https://github.com/stanfordnlp/GloVe/commit/9c7bbced4ab813e9f7ba3eb57db16f03a7cb92bf

I suggest removing the release on the main page, which appears to match the 2015 release on GitHub, and replacing it with instructions to always clone the repo.

AngledLuffa commented 4 years ago

Not noise! As you point out, it was faulty documentation.

AngledLuffa commented 4 years ago

Thanks for pointing that out. It should now be fixed.