CambridgeMolecularEngineering / chemdataextractor2

ChemDataExtractor Version 2.0

Unable to install cde 2.1.2 on a Google Colab/Kaggle notebook #13

Open Spadet opened 2 years ago

Spadet commented 2 years ago

Hi,

I recently tried to test cde 2.1.2 on Google Colab and Kaggle notebooks without success. Installing the library is troublesome and ends with the following error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
yellowbrick 1.4 requires scikit-learn>=1.0.0, but you have scikit-learn 0.22.1 which is incompatible.
pip-tools 6.2.0 requires click>=7, but you have click 6.7 which is incompatible.
imbalanced-learn 0.8.1 requires scikit-learn>=0.24, but you have scikit-learn 0.22.1 which is incompatible.
google-colab 1.0.0 requires requests~=2.23.0, but you have requests 2.21.0 which is incompatible.
en-core-web-sm 2.2.5 requires spacy>=2.2.2, but you have spacy 2.1.9 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.

Python was 3.7.13 on Colab and 3.7.12 on Kaggle. I thought these Python versions were compatible with cde 2.1.2. Is there something else that could cause the problem?

Thank you in advance!
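
Editorial note: pip's conflict messages follow a regular pattern, so a log like the one above can be split into its individual conflicts mechanically. The following is a minimal sketch and not part of ChemDataExtractor. The function name `parse_pip_conflicts` is made up for illustration, and the regular expression assumes the exact message wording shown in the error above.

```python
import re

# Assumed wording, taken from the log above:
# "<pkg> <ver> requires <requirement>, but you have <name> <ver> which is incompatible."
_CONFLICT = re.compile(
    r"(?P<package>\S+) (?P<version>\S+) requires (?P<requirement>.+?), "
    r"but you have (?P<installed>\S+ \S+) which is incompatible\."
)


def parse_pip_conflicts(log):
    """Return (package, requirement, installed) triples found in a pip log."""
    return [
        (m.group("package"), m.group("requirement"), m.group("installed"))
        for m in _CONFLICT.finditer(log)
    ]
```

Grouping the resulting triples by the installed package shows which pinned versions (here scikit-learn, click, requests, spacy and folium) collide with what the notebook image already ships.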

ViktorWeissenborn commented 2 years ago

Hello Spadet, I'm also still struggling with dependency conflicts, mostly from click, requests, spacy, urllib3, and others, so a solution to your problem would be super helpful. A colleague of mine also tried installing 2.1.1 and 2.1.2 on Ubuntu and Windows, without success.

Greetings, Viktor

Spadet commented 2 years ago

Good to know I'm not the only one facing problems with the latest version! I managed to install 2.1.2 on Ubuntu 20.04 using a Python 3.7.11 environment. I also installed jsonnet beforehand and installed GCC at one point too. Hope it will help!
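
Editorial note: the working setup described above can be turned into a quick check to run before attempting the install. This is a minimal sketch based only on what is reported in this thread, not an official requirements list. The function name `preflight` is hypothetical. It does not check for jsonnet, because the comment does not say how jsonnet was installed.

```python
import shutil
import sys


def preflight(version_info=sys.version_info, which=shutil.which):
    """List likely problems, judged only by the reports in this thread."""
    problems = []
    major_minor = tuple(version_info[:2])
    # The successful install reported above used Python 3.7.11.
    if major_minor != (3, 7):
        problems.append(
            "Python %d.%d detected; the working install reported here used 3.7"
            % major_minor
        )
    # GCC was installed at some point during the successful attempt.
    if which("gcc") is None:
        problems.append("gcc not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

An empty list means the environment resembles the one reported to work, which is no guarantee that the install will succeed.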

ViktorWeissenborn commented 2 years ago

Thanks for the info. I also have exactly one working cde 2.1.1 installation, in a venv environment on Ubuntu 20.04, but I couldn't replicate the installation in conda or in another venv environment.

ti250 commented 2 years ago

Sorry for the late reply. Does this happen when CDE is installed in a clean environment, or only if you already have other things installed? The difficulty is that the NER code uses allennlp 0.9.0, so we can't use newer versions of some libraries, which probably leads to these conflicts. If the issue happens in a clean environment, I'm happy to look into it; otherwise I don't think there's a solution in the near term. If anyone wants to make a PR migrating the code, and the NER in particular, to a more recent version of allennlp, it would be much appreciated. I would love to do this but currently don't have the time.

Spadet commented 2 years ago

Indeed, I had a clean environment when trying the two notebooks so it was surprising!

ti250 commented 2 years ago

I'll try installing on a Colab notebook then. My suspicion is that it comes with some libraries pre-installed for the average use case, but those are newer versions than what CDE wants due to its dependence on allennlp 0.9.0.

Spadet commented 2 years ago

That could be the reason, you're right! However, I did encounter similar installation issues in a clean Python 3.9.x environment, although that version is said to be compatible (OS: Ubuntu 20.04).

ti250 commented 2 years ago

Yeah, there seem to be issues with Python 3.9.x. I'll have to remove the claim in the README that it's compatible...

regtm commented 1 year ago

What environment is known to work? After trying in Colab and receiving the above dependency conflicts, I tried a virtual environment with Python 3.6 and ran into build issues, as the tokenizers package seems to rely on pyo3, which requires at least Python 3.7:

error: failed to run custom build command for pyo3-ffi v0.16.6

Caused by:
  process didn't exit successfully: /tmp/pip-install-vnmpfxfz/tokenizers_32f224e543a24d05a5fefeb3c8a79c1b/target/release/build/pyo3-ffi-31b85eb77d016a47/build-script-build (exit status: 1)
  --- stdout
  cargo:rerun-if-env-changed=PYO3_CROSS
  cargo:rerun-if-env-changed=PYO3_CROSS_LIB_DIR
  cargo:rerun-if-env-changed=PYO3_CROSS_PYTHON_VERSION
  cargo:rerun-if-env-changed=PYO3_CROSS_PYTHON_IMPLEMENTATION
  cargo:rerun-if-env-changed=PYO3_PRINT_CONFIG

  --- stderr
  error: the configured Python interpreter version (3.6) is lower than PyO3's minimum supported version (3.7)

warning: build failed, waiting for other jobs to finish...
error: build failed
error: cargo failed with code: 101

ERROR: Failed building wheel for tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

With Python 3.7 and 3.8, I ran into another dependency issue, this time for click:

Could not find a version that matches click==6.7,>=8.0
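
Editorial note: the message above means the resolver was asked for click==6.7 and click>=8.0 at the same time, and no single version can satisfy both. The following is a minimal sketch of why. The helpers `parse_version` and `satisfies` are hypothetical, handle only the `==` and `>=` operators on plain numeric versions, and are no substitute for a full PEP 440 implementation such as the `packaging` library.

```python
def parse_version(text):
    """Turn '8.0.1' into (8, 0, 1); trailing zeros are dropped so 6.7 == 6.7.0."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version, specifiers):
    """Check a version against a comma-separated list of == and >= specifiers."""
    candidate = parse_version(version)
    for spec in specifiers.split(","):
        spec = spec.strip()
        if spec.startswith("=="):
            if candidate != parse_version(spec[2:]):
                return False
        elif spec.startswith(">="):
            if candidate < parse_version(spec[2:]):
                return False
        else:
            raise ValueError("unsupported specifier: " + spec)
    return True


if __name__ == "__main__":
    for candidate in ("6.7", "8.0"):
        print(candidate, satisfies(candidate, "==6.7,>=8.0"))
```

Both candidates fail the combined constraint, so the only way out is to change one of the packages that imposes it. The thread does not say which packages those are.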