bowang-lab / scGPT

https://scgpt.readthedocs.io/en/latest/
MIT License

Install issue with scvi dependency #260

Closed by ERSchultz 3 weeks ago

ERSchultz commented 1 month ago

Hi,

I'm having trouble installing scGPT, largely because of issues with scvi.

I'm using the following script to install:

envName=scgpt_simple
conda create --name $envName -y
conda activate $envName
conda install -y python==3.9
pip install scgpt "flash-attn<1.0.5" "numpy<2" anndata==0.10.8 wandb IPython

This installs scvi-tools-0.20.3. (I'm pinning anndata==0.10.8 to work around the issue discussed here: https://github.com/scverse/scvi-tools/issues/2953)

I encounter the following runtime issue:

Traceback (most recent call last):
  File "/home/erschultz/scGPT/examples/finetune_integration.py", line 109, in <module>
    adata = scvi.data.pbmc_dataset()  # 11990 × 3346
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/scvi/data/_datasets.py", line 57, in pbmc_dataset
    return _load_pbmc_dataset(
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/scvi/data/_built_in_data/_pbmc.py", line 81, in _load_pbmc_dataset
    barcodes_metadata = pbmc_metadata["barcodes"].index.values.ravel().astype(np.str)
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/numpy/__init__.py", line 324, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'str'.
`np.str` was a deprecated alias for the builtin `str`. To avoid this error in existing code, use `str` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.str_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
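
For reference, the alias was removed in NumPy 1.24 (following the 1.20 deprecation that the message mentions), so a `numpy<2` pin still admits affected releases. Below is a minimal sketch for checking a version string. The helper names `parse_release` and `numpy_lacks_np_str` are hypothetical and not part of NumPy or scGPT.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.26.4' -> (1, 26, 4)."""
    parts = []
    # Drop a local suffix such as '+cpu' before splitting on dots.
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def numpy_lacks_np_str(version: str) -> bool:
    """True if this NumPy release no longer provides the `np.str` alias."""
    # Assumption: the aliases deprecated in 1.20 were removed in 1.24.
    return parse_release(version) >= (1, 24)


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed in this environment")
    else:
        print(f"numpy {installed}: np.str removed = {numpy_lacks_np_str(installed)}")
```

If the check returns True for the installed version, pinning `numpy<1.24` is one possible workaround. Whether that pin is compatible with the rest of this environment is untested here.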

I tried reverting to the version of scvi-tools suggested by docs/requirements.txt: pip install scvi-tools==0.16.4

However, I then encounter the following runtime issue:

Traceback (most recent call last):
  File "/home/erschultz/scGPT/examples/finetune_integration.py", line 16, in <module>
    import scvi
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/scvi/__init__.py", line 7, in <module>
    from ._settings import settings
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/scvi/_settings.py", line 6, in <module>
    import pytorch_lightning as pl
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/pytorch_lightning/callbacks/__init__.py", line 14, in <module>
    from pytorch_lightning.callbacks.base import Callback
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/pytorch_lightning/callbacks/base.py", line 26, in <module>
    from pytorch_lightning.utilities.types import STEP_OUTPUT
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/pytorch_lightning/utilities/__init__.py", line 18, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device  # noqa: F401
  File "/home/erschultz/anaconda3/envs/scgpt_simple/lib/python3.9/site-packages/pytorch_lightning/utilities/apply_func.py", line 30, in <module>
    from torchtext.legacy.data import Batch
ModuleNotFoundError: No module named 'torchtext.legacy'

I have torchtext-0.16.2+cpu with torch-2.1.2+cu121. The requirements suggest torchtext==0.14.0 with torch-1.13.0.
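
Because the traceback depends on what pip actually resolved, it can help to print the installed versions and test whether `torchtext.legacy` is importable before running the example. Below is a minimal sketch using only the standard library. `installed_versions` and `module_available` are hypothetical helper names, and the package list is just the set of names mentioned in this thread.

```python
from importlib import metadata
from importlib.util import find_spec


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def module_available(module_name: str) -> bool:
    """True if `module_name` can be located without raising an import error."""
    try:
        # For dotted names, find_spec imports the parent package first,
        # so a missing parent surfaces as ModuleNotFoundError here.
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    packages = ["torch", "torchtext", "pytorch-lightning", "scvi-tools", "numpy", "anndata"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
    print("torchtext.legacy importable:", module_available("torchtext.legacy"))
```

Running this inside the activated conda environment shows at a glance whether the versions match what was requested on the command line.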

As an alternative, I tried setting up an environment specifying all the requirements in docs/requirements.txt.

envName=scgpt_minimal2
conda deactivate
conda remove -n $envName --all -y
conda create --name $envName -y
conda activate $envName
conda install -y python==3.9
conda install -y pytorch==1.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install "matplotlib<3.7" anndata==0.8.0 datasets==2.3.2 leidenalg==0.8.10 llvmlite==0.38.1 networkx==2.6.3 numba==0.55.2 numpy==1.21.6 pandas==1.3.5 scanpy==1.9.1 scikit-learn==1.0.2 scikit-misc==0.1.4 scipy==1.7.3 scvi-tools==0.16.4 setuptools==59.5.0 scib torchtext==0.14.0 tqdm==4.64.0 transformers==4.20.1 typing-extensions==4.2.0 umap-learn==0.5.3 "flash-attn<1.0.5"
pip install -U --no-deps scgpt

I encounter the same ModuleNotFoundError as above, despite pinning the versions of scvi-tools and torchtext. This suggests that pytorch_lightning may be the issue. I have pytorch-lightning-1.5.3. Newer versions of pytorch-lightning seem to have resolved this error: https://github.com/Lightning-AI/pytorch-lightning/issues/10597. However, according to pip, "scvi-tools 0.16.4 requires pytorch-lightning<1.6,>=1.5".
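
The pin quoted by pip can also be read directly from the installed package metadata, which makes it easier to see which package constrains which. Below is a minimal sketch, assuming the distributions are installed under the names used in this thread. `normalize`, `requirement_name`, `filter_requirements`, and `declared_pins` are hypothetical helpers.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a distribution name (case-insensitive; '-', '_', '.' are equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the leading project name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0) if match else ""


def filter_requirements(requirements, name):
    """Keep only the requirement strings that refer to the project `name`."""
    wanted = normalize(name)
    return [r for r in requirements if normalize(requirement_name(r)) == wanted]


def declared_pins(dist: str, dependency: str):
    """Return the requirement strings that `dist` declares for `dependency`."""
    try:
        # metadata.requires returns None when no requirements are declared.
        requirements = metadata.requires(dist) or []
    except metadata.PackageNotFoundError:
        return []
    return filter_requirements(requirements, dependency)


if __name__ == "__main__":
    print(declared_pins("scvi-tools", "pytorch-lightning"))
```

An empty list means either that the distribution is not installed or that it declares no requirement on the named dependency.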

If the developers could provide a complete .yml file, that would be much appreciated!

Please advise, thanks!

subercui commented 1 month ago

Hi, thank you for the question. We only have a small dependency on scvi, and I will try to remove that package dependency today or tomorrow. In the meantime, try relaxing some of the constraints in your pip install command. For example, you mentioned a mismatch with torchtext; our project's pyproject.toml (https://github.com/bowang-lab/scGPT/blob/7301b51a72f5db321fccebb51bc4dd1380d99023/pyproject.toml#L16-L17) does not require any specific version of it. I suggest installing with pip install torch==version_you_need torchtext. pip should be able to pick up the version of torchtext that matches the torch version.

Let me know if there are any updates.