ixobert / birds-generation

Please add installation/environment set up instructions #7

Closed sammlapp closed 5 months ago

sammlapp commented 5 months ago

Thank you for providing open source code and model check point for this project!

For me, installation failed on the first three attempts with Python 3.9 in a fresh conda environment. The first failure was related to issue #5, and now I'm getting: error: can't find Rust compiler

I followed the installation instructions from https://rustup.rs/ but still get the error. I'm not a Rust user and don't know how to proceed. Thanks for any help!

ixobert commented 5 months ago

Hi @sammlapp, this is a strange error considering that we don't use Rust in this repo. I wonder which package in requirements.txt caused it. Do you mind sharing a few details about your environment?

sammlapp commented 5 months ago

Sorry for the lack of detail above. Further inspection of the error message shows that building a wheel for tokenizers is failing (apparently this package uses Rust internally). I have tried installing Rust, updating conda, and creating a fresh environment.

My Python version is 3.10.13, and I previously also tried 3.9.18 and 3.8.18.

My operating system is macOS 12.6 (Apple M1 chip).

The error output is quite long so I put it in a gist (not including the many lines of "using cached"... at the beginning): https://gist.github.com/sammlapp/2f47bccff6aefaa5c9de8f388e85646c

ixobert commented 5 months ago

Hi @sammlapp, sorry for the late reply. Unfortunately I couldn't reproduce the error you got. I replicated your environment (on an Intel Mac), and the installation worked fine. Here is the end of the logs:

Building wheels for collected packages: antlr4-python3-runtime, lmdb
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144554 sha256=236016281b8bd10ded0e72bae31f61180af2b89145cd34d01652cef517ab225e
  Stored in directory: /Users/test/Library/Caches/pip/wheels/23/cf/80/f3efa822e6ab23277902ee9165fe772eeb1dfb8014f359020a
  Building wheel for lmdb (setup.py) ... done
  Created wheel for lmdb: filename=lmdb-1.4.1-cp39-cp39-macosx_10_9_x86_64.whl size=96425 sha256=dc6ba5a92ba13e82b41d2819144f2e91c0070fd16b8402e65978d711295a46bd
  Stored in directory: /Users/test/Library/Caches/pip/wheels/84/55/b3/bf057744f7438df5fdd827e608375a671404240663202e5ff0
Successfully built antlr4-python3-runtime lmdb
Installing collected packages: tokenizers, pytz, lmdb, appdirs, antlr4-python3-runtime, zipp, validators, urllib3, tzlocal, tzdata, typing-extensions, tqdm, tornado, toolz, tomli, toml, threadpoolctl, tenacity, smmap, six, setuptools, setproctitle, rpds-py, regex, PyYAML, pyparsing, pygments, pyDeprecate, pycparser, psutil, protobuf, pluggy, platformdirs, pillow, packaging, numpy, networkx, natsort, multidict, msgpack, mdurl, MarkupSafe, llvmlite, lazy-loader, kiwisolver, joblib, iniconfig, idna, fsspec, frozenlist, fonttools, filelock, exceptiongroup, docstring-parser, decorator, cycler, click, charset-normalizer, certifi, cachetools, blinker, audioread, attrs, async-timeout, yarl, torch, tifffile, tensorboardX, soxr, sentry-sdk, scipy, sacremoses, requests, referencing, python-dateutil, pytest, pyarrow, opencv-python-headless, opencv-python, omegaconf, numba, markdown-it-py, lightning-utilities, jsonargparse, jinja2, importlib-resources, importlib-metadata, imageio, gitdb, docker-pycreds, contourpy, cffi, aiosignal, torchvision, torchmetrics, torchaudio, soundfile, scikit-learn, scikit-image, rich, pydeck, pooch, pandas, matplotlib, jsonschema-specifications, hydra-core, huggingface-hub, GitPython, aiohttp, wandb, transformers, qudida, librosa, jsonschema, pytorch_lightning, altair, albumentations, streamlit, lightning-flash
  Attempting uninstall: setuptools
    Found existing installation: setuptools 68.2.2
    Uninstalling setuptools-68.2.2:
      Successfully uninstalled setuptools-68.2.2
Successfully installed GitPython-3.1.41 MarkupSafe-2.1.4 PyYAML-6.0.1 aiohttp-3.9.1 aiosignal-1.3.1 albumentations-1.3.1 altair-5.2.0 antlr4-python3-runtime-4.9.3 appdirs-1.4.4 async-timeout-4.0.3 attrs-23.2.0 audioread-3.0.1 blinker-1.7.0 cachetools-5.3.2 certifi-2023.11.17 cffi-1.16.0 charset-normalizer-3.3.2 click-8.1.7 contourpy-1.2.0 cycler-0.12.1 decorator-5.1.1 docker-pycreds-0.4.0 docstring-parser-0.15 exceptiongroup-1.2.0 filelock-3.13.1 fonttools-4.47.2 frozenlist-1.4.1 fsspec-2023.12.2 gitdb-4.0.11 huggingface-hub-0.20.2 hydra-core-1.3.2 idna-3.6 imageio-2.33.1 importlib-metadata-7.0.1 importlib-resources-6.1.1 iniconfig-2.0.0 jinja2-3.1.3 joblib-1.3.2 jsonargparse-4.9.0 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 kiwisolver-1.4.5 lazy-loader-0.3 librosa-0.10.1 lightning-flash-0.8.1 lightning-utilities-0.10.1 llvmlite-0.41.1 lmdb-1.4.1 markdown-it-py-3.0.0 matplotlib-3.8.2 mdurl-0.1.2 msgpack-1.0.7 multidict-6.0.4 natsort-8.4.0 networkx-3.2.1 numba-0.58.1 numpy-1.26.3 omegaconf-2.3.0 opencv-python-4.9.0.80 opencv-python-headless-4.9.0.80 packaging-23.2 pandas-2.2.0 pillow-10.2.0 platformdirs-4.1.0 pluggy-1.3.0 pooch-1.8.0 protobuf-3.20.1 psutil-5.9.8 pyDeprecate-0.3.2 pyarrow-15.0.0 pycparser-2.21 pydeck-0.8.1b0 pygments-2.17.2 pyparsing-3.1.1 pytest-7.4.4 python-dateutil-2.8.2 pytorch_lightning-1.8.6 pytz-2023.3.post1 qudida-0.0.4 referencing-0.32.1 regex-2023.12.25 requests-2.31.0 rich-13.7.0 rpds-py-0.17.1 sacremoses-0.1.1 scikit-image-0.22.0 scikit-learn-1.3.2 scipy-1.12.0 sentry-sdk-1.39.2 setproctitle-1.3.3 setuptools-59.5.0 six-1.16.0 smmap-5.0.1 soundfile-0.12.1 soxr-0.3.7 streamlit-1.30.0 tenacity-8.2.3 tensorboardX-2.6.2.2 threadpoolctl-3.2.0 tifffile-2023.12.9 tokenizers-0.10.3 toml-0.10.2 tomli-2.0.1 toolz-0.12.0 torch-1.13.1 torchaudio-0.13.1 torchmetrics-0.9.3 torchvision-0.14.1 tornado-6.4 tqdm-4.66.1 transformers-4.10.2 typing-extensions-4.9.0 tzdata-2023.4 tzlocal-5.2 urllib3-2.1.0 validators-0.22.0 wandb-0.16.2 yarl-1.9.4 
zipp-3.17.0
(tmp_ecogen) 

One thing worth trying is to make sure that your pip install uses the right Python environment. You can check which pip is being used by running which pip, and then run pip install -r requirements.txt.
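A minimal sketch of that check in Python, assuming the goal is simply to confirm that the pip found on PATH lives inside the active environment. The helper name pip_matches_interpreter is made up for illustration and is not part of this repo.

```python
import shutil
import sys
from pathlib import Path


def pip_matches_interpreter(pip_path, prefix):
    """Return True if pip_path is located inside the environment rooted at prefix.

    pip_matches_interpreter is a hypothetical helper, not part of birds-generation.
    """
    if pip_path is None:
        # No pip executable was found on PATH at all.
        return False
    pip_resolved = Path(pip_path).resolve()
    prefix_resolved = Path(prefix).resolve()
    # The environment root must be one of the ancestors of the pip executable.
    return prefix_resolved in pip_resolved.parents


if __name__ == "__main__":
    pip_path = shutil.which("pip")  # same lookup that `which pip` performs
    print("python:", sys.executable)
    print("pip:   ", pip_path)
    print("same environment:", pip_matches_interpreter(pip_path, sys.prefix))
```

Running python -m pip install -r requirements.txt instead of a bare pip install sidesteps the question, since pip is then run by the interpreter that is currently active.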

Nevertheless, I have updated the transformers version in requirements.txt. You can pull the code and try again; hopefully that will address your issue. Otherwise, let me know.

sammlapp commented 5 months ago

I checked that I'm using the correct pip. I also tried installing tokenizers in my environment before running pip install -r requirements.txt. Weirdly, tokenizers installs on its own without any problem, but I still get the same error when installing the requirements. From searching around, it seems M1 Macs have sometimes had similar tokenizers issues with other packages (e.g. here).

I noticed that the specific requirement "transformers==4.10.2" leads to

Collecting tokenizers<0.11,>=0.10.1 (from transformers==4.10.2->-r requirements.txt (line 8))
  Using cached tokenizers-0.10.3.tar.gz (212 kB)

(building the wheel from tokenizers-0.10.3.tar.gz is the step that eventually fails)

When I remove the version requirement for transformers in requirements.txt (i.e., replace transformers==4.10.2 with just transformers), everything installs fine. My environment then has transformers-4.37.0 and tokenizers-0.15.0.
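The edit described above can be sketched as a small text transformation. This assumes one requirement per line with no environment markers or extras; relax_pin is a hypothetical helper and the sample file contents are made up for illustration.

```python
import re


def relax_pin(requirements_text, package):
    """Drop the version specifier for `package`, leaving all other lines untouched.

    relax_pin is a hypothetical helper, not part of birds-generation.
    """
    # Match the package name followed directly by a version operator,
    # so that e.g. "transformers-foo==1.0" is not affected.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(==|>=|<=|~=|!=|<|>).*$",
        re.IGNORECASE,
    )
    relaxed = []
    for line in requirements_text.splitlines():
        relaxed.append(package if pattern.match(line) else line)
    return "\n".join(relaxed)


if __name__ == "__main__":
    sample = "numpy\ntransformers==4.10.2"  # made-up sample, not the real file
    print(relax_pin(sample, "transformers"))
```

With the pin removed, pip is free to pick a newer transformers release, which in the run above pulled in tokenizers-0.15.0 instead of the tokenizers-0.10.3 source archive.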

Do you think that allowing newer versions of transformers will break the birds-generation code? I'll try it out and report back.

ixobert commented 5 months ago

@sammlapp I have reverted the transformers version to 4.10.2. This is the version used to develop and test the tool, and I'm not sure whether future transformers releases will be backward compatible. I believe your case is a bit special since you have an M1 and, as you said above, there are a few discrepancies with the Intel architecture on Mac. Happy to hear that transformers-4.37.0 and tokenizers-0.15.0 work fine on your M1. I have added a new requirements.txt specifically for the M1 series.
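A rough sketch of how a setup script might choose between the two requirements files, assuming detection via the standard platform module. The filename requirements-m1.txt is a placeholder, since the actual name of the M1 file is not stated in this thread.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when running natively on an Apple Silicon Mac."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def requirements_file(system=None, machine=None):
    """Pick a requirements file for the current (or given) platform."""
    # "requirements-m1.txt" is a hypothetical filename used for illustration.
    if is_apple_silicon(system, machine):
        return "requirements-m1.txt"
    return "requirements.txt"


if __name__ == "__main__":
    print(requirements_file())
```

Note that a Python interpreter running under Rosetta reports x86_64 from platform.machine(), so this check would select the default file in that case.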