Open qolina opened 1 year ago
I tested on mce, ecp, and NSCC:
1. mce: got the same warnings as in your records, but SciAssist seems to work well; you can import SciAssist in the Python console. (The latest PyTorch is incompatible with mce's GPU, so pin it to 1.12.0.)
2. ecp: no problems except "DEPRECATION" warnings from pytorch-lightning. When importing SciAssist, there are some "Import Error: No module named xx" errors. It seems that the default Python version is 2.x, and all of the errors come from Linxiao's from transformers import *. I'm not sure whether this is related to the server's settings, but python3 -m pip install SciAssist works well.
3. NSCC: same as 2.
Todo:
[ ] 1. Change Linxiao's code
[ ] 2. Pin the torch version to 1.12.0 explicitly in requirements.txt. We may add a note to the README reminding users to install a version compatible with their machine.
Pytorch-lightning 1.7 still works well in our toolkit. I don't recommend updating it now because we are not yet sure of the impact.
I think the problems probably lie with the servers themselves, as many of the files raising errors are under "/usr/share", and without a root account it's hard to find the causes.
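The Python 2.x default on ecp is why python3 -m pip behaves differently from a bare pip there. A minimal sketch of the same idea from inside Python, assuming the script is already running under the interpreter you want to install into (the helper name pip_install_command is hypothetical, not part of SciAssist):

```python
import sys
from typing import List


def pip_install_command(package: str) -> List[str]:
    """Build a pip command bound to the interpreter running this script."""
    if sys.version_info < (3,):
        raise RuntimeError("Python 3 is required, found %d.%d" % sys.version_info[:2])
    # sys.executable is the path of the current interpreter, so "-m pip"
    # installs into this interpreter's site-packages rather than whatever
    # "pip" happens to resolve to on PATH.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is installed here.
    print(" ".join(pip_install_command("SciAssist")))
```

Printing the command rather than executing it keeps the sketch side-effect free; passing the list to subprocess.run would perform the install.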
With Sciassist=0.1.1
~$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
~$ nvidia-smi
NVIDIA-SMI 470.199.02 Driver Version: 470.199.02 CUDA Version: 11.4
conda create --name assist python=3.8
conda activate assist
pip install sciassist
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
requests-oauthlib 1.3.1 requires oauthlib>=3.0.0, which is not installed.
Successfully installed PyYAML-6.0.1 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 certifi-2023.11.17 chardet-3.0.4 click-8.1.7 exceptiongroup-1.2.0 idna-2.8 iniconfig-2.0.0 jinja2-3.1.2 lightning-utilities-0.10.0 multiprocess-0.70.12.2 numpy-1.24.4 packaging-23.2 pluggy-1.3.0 protobuf-3.20.3 pyparsing-3.1.1 pytest-7.4.3 python-magic-0.4.27 pytorch-lightning-2.0.9.post0 requests-2.22.0 responses-0.18.0 safetensors-0.4.1 sciassist-0.1.1 sentry-sdk-1.9.0 six-1.16.0 tomli-2.0.1 transformers-4.30.2 urllib3-1.25.11
In response to (https://github.com/WING-NUS/SciAssist/issues/32#issuecomment-1765840118): 1) on "no torch installed": here torch is installed together with pytorch-lightning, and torch.version is '2.1.0+cu121'; 2) pytorch-lightning is a recent version, 2.0.9; 3) the mentioned oauthlib is installed.
from SciAssist import Summarization
# text: the document to summarize, as a plain string
summarizer = Summarization(device="gpu")
res = summarizer.predict(text, type="str")
print(res)
Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback): tokenizers>=0.13.3 is required for a normal functioning of this module, but found tokenizers==0.12.1.
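The message above is a plain minimum-version mismatch (tokenizers 0.12.1 found, >=0.13.3 required). A minimal sketch of how such a check could be run before importing SciAssist, using only the standard library; the helper names are hypothetical, and the parser deliberately handles only simple dotted numeric versions:

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.13.3' into (0, 13, 3); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tokenizers")
    if found is None:
        print("tokenizers is not installed")
    elif not meets_minimum(found, "0.13.3"):
        print(f"tokenizers {found} is older than the required 0.13.3")
    else:
        print(f"tokenizers {found} is new enough")
```

For anything beyond simple x.y.z strings (pre-releases, local tags such as +cu121), a real version parser would be a safer choice than this tuple comparison.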
pip install pytorch-lightning==1.7.1
Inference again
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False
The torch version is wrong, so it cannot recognize CUDA.
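Before reinstalling torch, it can help to see what the installed build reports. A minimal diagnostic sketch, assuming nothing about SciAssist itself (the function name torch_cuda_report is hypothetical); it returns early if torch is not installed:

```python
import importlib.util
from typing import Dict


def torch_cuda_report() -> Dict[str, object]:
    """Collect the torch version and CUDA status without assuming torch exists."""
    report: Dict[str, object] = {
        "torch_installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": None,
    }
    if importlib.util.find_spec("torch") is None:
        return report

    import torch  # imported lazily so the function works without torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version the wheel was built against; None for CPU-only builds.
    report["built_for_cuda"] = torch.version.cuda
    # False when the driver is too old for this build or no GPU is visible.
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

Comparing built_for_cuda with the CUDA version shown by nvidia-smi makes a mismatch like the one in this comment (a cu121 build on a CUDA 11.4 driver) visible at a glance.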
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
Do 'pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113'
Inference is now correct for both reference string parsing and summarization!
In summary, as you mentioned, torch should match the local machine (1.12.0 in our case), and lightning==1.7.1 works for SciAssist.
I agree that we should recommend that users install their own torch before installing SciAssist.
Todo:
Try different lightning versions on top of the correct torch.
Lightning 1.8.0, 1.9.0, 2.0.0, and 2.1.0 all succeed in inference.
Try on macOS and Windows.
'pip3 install torch torchvision torchaudio'
Successfully installed MarkupSafe-2.1.3 certifi-2023.11.17 charset-normalizer-3.3.2 filelock-3.13.1 fsspec-2023.10.0 idna-3.6 jinja2-3.1.2 mpmath-1.3.0 networkx-3.1 numpy-1.24.4 pillow-10.1.0 requests-2.31.0 sympy-1.12 torch-2.1.1 torchaudio-2.1.1 torchvision-0.16.1 typing-extensions-4.8.0 urllib3-2.1.0
Successfully installed GitPython-3.1.40 PyPDF2-2.10.9 PyYAML-6.0.1 aiohttp-3.9.1 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 cffi-1.16.0 chardet-3.0.4 click-8.1.7 commonmark-0.9.1 cryptography-41.0.7 cycler-0.12.1 datasets-2.2.2 dill-0.3.4 docker-pycreds-0.4.0 exceptiongroup-1.2.0 fonttools-4.45.1 frozenlist-1.4.0 gitdb-4.0.11 huggingface-hub-0.19.4 hydra-core-1.3.2 idna-2.8 importlib-resources-6.1.1 iniconfig-2.0.0 joblib-1.3.2 kiwisolver-1.4.5 lightning-utilities-0.10.0 lxml-4.9.3 matplotlib-3.5.3 multidict-6.0.4 multiprocess-0.70.12.2 nltk-3.8.1 omegaconf-2.2.3 packaging-23.2 pandas-1.4.4 pathtools-0.1.2 pdfminer.six-20221105 pluggy-1.3.0 promise-2.3 protobuf-3.20.3 psutil-5.9.6 pyarrow-14.0.1 pycparser-2.21 pygments-2.17.2 pyparsing-3.1.1 pytest-7.4.3 python-dateutil-2.8.2 python-magic-0.4.27 pytorch-crf-0.7.2 pytorch-lightning-2.0.9.post0 pytz-2023.3.post1 regex-2023.10.3 requests-2.22.0 responses-0.18.0 rich-12.4.4 sacremoses-0.1.1 safetensors-0.4.1 sciassist-0.1.1 scikit-learn-1.3.2 scipy-1.10.1 seaborn-0.11.2 sentry-sdk-1.9.0 seqeval-1.2.2 setproctitle-1.3.3 shortuuid-1.0.11 six-1.16.0 smmap-5.0.1 soupsieve-2.5 threadpoolctl-3.2.0 tokenizers-0.13.3 tomli-2.0.1 torchcrf-1.1.0 torchmetrics-0.11.4 tqdm-4.66.1 transformers-4.30.2 urllib3-1.25.11 wandb-0.12.21 xxhash-3.4.1 yarl-1.9.3 zipp-3.17.0
Reference string parsing and summarization test passed!
Miniconda cache: 1.5G
Model checkpoints cache: 2.7G
Memory: 803MB for reference string parsing, 1.3G for summarization
~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
nvidia-smi
NVIDIA-SMI 535.112 Driver Version: 537.42 CUDA Version: 12.2
python3 -m venv SciAssist
source SciAssist/bin/activate
pip install SciAssist
Successfully installed GitPython-3.1.40 MarkupSafe-2.1.3 PyPDF2-2.10.9 PyYAML-6.0.1 aiohttp-3.9.1 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.3 attrs-23.1.0 beautifulsoup4-4.9.3 certifi-2023.11.17 cffi-1.16.0 chardet-3.0.4 charset-normalizer-3.3.2 click-8.1.7 commonmark-0.9.1 cryptography-41.0.7 cycler-0.12.1 datasets-2.2.2 dill-0.3.4 docker-pycreds-0.4.0 exceptiongroup-1.2.0 filelock-3.13.1 fonttools-4.45.1 frozenlist-1.4.0 fsspec-2023.10.0 gitdb-4.0.11 huggingface-hub-0.19.4 hydra-core-1.3.2 idna-2.8 iniconfig-2.0.0 jinja2-3.1.2 joblib-1.3.2 kiwisolver-1.4.5 lightning-utilities-0.10.0 lxml-4.9.3 matplotlib-3.5.3 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.12.2 networkx-3.2.1 nltk-3.8.1 numpy-1.26.2 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.3.101 nvidia-nvtx-cu12-12.1.105 omegaconf-2.2.3 packaging-23.2 pandas-1.4.4 pathtools-0.1.2 pdfminer.six-20221105 pillow-10.1.0 pluggy-1.3.0 promise-2.3 protobuf-3.20.3 psutil-5.9.6 pyarrow-14.0.1 pycparser-2.21 pygments-2.17.2 pyparsing-3.1.1 pytest-7.4.3 python-dateutil-2.8.2 python-magic-0.4.27 pytorch-crf-0.7.2 pytorch-lightning-2.0.9.post0 pytz-2023.3.post1 regex-2023.10.3 requests-2.22.0 responses-0.18.0 rich-12.4.4 sacremoses-0.1.1 safetensors-0.4.1 sciassist-0.1.1 scikit-learn-1.3.2 scipy-1.11.4 seaborn-0.11.2 sentry-sdk-1.9.0 seqeval-1.2.2 setproctitle-1.3.3 setuptools-69.0.2 shortuuid-1.0.11 six-1.16.0 smmap-5.0.1 soupsieve-2.5 sympy-1.12 threadpoolctl-3.2.0 tokenizers-0.13.3 tomli-2.0.1 torch-2.1.1 torchcrf-1.1.0 torchmetrics-0.11.4 tqdm-4.66.1 transformers-4.30.2 triton-2.1.0 typing-extensions-4.8.0 urllib3-1.25.11 wandb-0.12.21 xxhash-3.4.1 yarl-1.9.3
setup_grobid
BUILD SUCCESSFUL in 54s
30 actionable tasks: 25 executed, 5 up-to-date
Grobid is installed.
run_grobid
environments/SciAssist/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
  warnings.warn(
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
Grobid is running now.
from SciAssist import Summarization
pipeline = Summarization()
res = pipeline.predict('examples_H01-1042.pdf', type="pdf", num_beams=4, num_return_sequences=2)
print(res["summary"])
from SciAssist import ReferenceStringParsing
ref_parser = ReferenceStringParsing()
res = ref_parser.predict("examples_H01-1042.pdf", type="pdf")
print(res)
environments/SciAssist/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
  warnings.warn(
Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
Loading the model...
...
summarization and rsp test passed!
Even though I did not install torch or pytorch-lightning before installing SciAssist, it still ran properly. Hence I believe users can run pip install SciAssist straightaway. However, note that when testing I ran into a FutureWarning, along with a message telling me to pip install xformers.
I tried pip install xformers:
Successfully installed torch-2.1.0 xformers-0.0.22.post7
The test then runs successfully, but the same FutureWarning still appears:
environments/SciAssist2/lib/python3.10/site-packages/transformers/generation_utils.py:24: FutureWarning: Importing GenerationMixin from src/transformers/generation_utils.py is deprecated and will be removed in Transformers v5. Import as from transformers import GenerationMixin instead.
  warnings.warn(
Loading the model...
Thanks for testing on different versions of Ubuntu and CUDA, and for testing Grobid, which I forgot.
Description: Ubuntu 22.04.1 LTS
NVIDIA-SMI 535.112 Driver Version: 537.42 CUDA Version: 12.2
Installation (Python 3.10.12):
setup_grobid
I notice your machine has a recent CUDA version (12.2), which matches the default PyTorch installed by 'pip install sciassist'. Errors occur when the machine has an older CUDA version and an incompatible PyTorch. And yes, the pytorch-lightning version is not the cause of the errors. I also get these warnings, which we have ignored so far.
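If the GenerationMixin FutureWarning is to stay ignored, it can be hidden explicitly rather than left in the output. A minimal sketch using the standard warnings module; the function name is hypothetical, and the message pattern is an assumption based on the log text quoted in this thread:

```python
import warnings


def silence_generation_mixin_warning() -> None:
    """Hide only the GenerationMixin import FutureWarning, nothing else."""
    warnings.filterwarnings(
        "ignore",
        # Matched against the start of the warning text; ".?" tolerates an
        # optional quoting character before the class name.
        message=r"Importing .?GenerationMixin",
        category=FutureWarning,
    )


if __name__ == "__main__":
    # Call this before importing SciAssist so the filter is already in place.
    silence_generation_mixin_warning()
```

Filtering by message keeps other FutureWarnings visible, which matters if a later dependency upgrade introduces a deprecation that does need attention.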
nvidia-smi
NVIDIA-SMI 536.99 Driver Version: 536.99 CUDA Version: 12.2
Tried three setups, each in a fresh virtual environment:

python -m venv .env
.env\Scripts\activate
pip install SciAssist

python -m venv .env
.env\Scripts\activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

python -m venv .env
.env\Scripts\activate
pip3 install torch torchvision torchaudio

Got the same error each time:
AppData\Local\Temp\pip-build-env-bcle4ruo\overlay\Lib\site-packages\setuptools\dist.py:674: SetuptoolsDeprecationWarning: The namespace_packages parameter is deprecated. !!
********************************************************************************
Please replace its usage with implicit namespaces (PEP 420).
See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages for details.
********************************************************************************
!!
ep.load()(self, ep.name, value)
Edit mplsetup.cfg to change the build options; suppress output with --quiet.
BUILDING MATPLOTLIB
python: yes [3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023,
13:26:23) [MSC v.1916 64 bit (AMD64)]]
platform: yes [win32]
tests: no [skipping due to configuration]
macosx: no [Mac OS-X only]
running build_ext
Extracting /project/freetype/freetype2/2.6.1/freetype-2.6.1.tar.gz
Building freetype in build\freetype-2.6.1
msbuild build\freetype-2.6.1\builds\windows\vc2010\freetype.sln /t:Clean;Build /p:Configuration=Release;Platform=x64
error: command 'msbuild' failed: None
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for matplotlib
Failed to build matplotlib
ERROR: Could not build wheels for matplotlib, which is required to install pyproject.toml-based projects
I upgraded pip and setuptools to pip 23.3.1 and setuptools 69.0.2, but still got the same error.
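One visible difference in the logs is the interpreter: the failing Windows build ran on Python 3.11.5, while the successful Ubuntu installs in this thread used Python 3.8 and 3.10. Whether that is the cause of the matplotlib build failure is not established here; the sketch below only flags interpreters outside the versions reported as working, and its names are hypothetical:

```python
import sys
from typing import Tuple

# (major, minor) versions reported as working in this thread. This set is an
# assumption drawn from the comments above, not an official support matrix.
VERSIONS_REPORTED_WORKING = {(3, 8), (3, 10)}


def reported_working(version: Tuple[int, ...]) -> bool:
    """Return True if the (major, minor) pair was reported as working."""
    return tuple(version[:2]) in VERSIONS_REPORTED_WORKING


if __name__ == "__main__":
    current = tuple(sys.version_info[:2])
    if reported_working(current):
        print("Python %d.%d was reported as working in this thread" % current)
    else:
        print("Python %d.%d is untested here; a 3.8 or 3.10 environment may be worth trying" % current)
```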
Commands used
Error message
sciassist is not installed!