KwanLab / Autometa

Autometa: Automated Extraction of Genomes from Shotgun Metagenomes
https://autometa.readthedocs.io

Pin scipy release to `1.8.*` until scikit-bio uses scipy 1.9 #285

Closed evanroyrees closed 1 year ago

evanroyrees commented 1 year ago

Scipy recently released version 1.9, which introduced breaking changes for scikit-bio. As a result, the autometa-kmers module now fails when Autometa is installed from conda. The fix is to pin scipy to version 1.8 until scikit-bio updates its code.

Scipy release: https://github.com/scipy/scipy/releases/tag/v1.9.0

Steps to reproduce the error

  1. Install autometa using conda (or mamba)
mamba create -n autometa-test-env autometa -y
  2. Try the autometa-kmers entrypoint
(autometa-test-env) evan@userserver:$ autometa-kmers -h
Traceback (most recent call last):
  File "/home/evan/miniconda3/envs/am-test/bin/autometa-kmers", line 7, in <module>
    from autometa.common.kmers import main
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/autometa/common/kmers.py", line 23, in <module>
    from skbio.stats.composition import ilr, clr, multiplicative_replacement
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/skbio/__init__.py", line 11, in <module>
    import skbio.io  # noqa
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/skbio/io/__init__.py", line 247, in <module>
    import_module('skbio.io.format.lsmat')
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/skbio/io/format/lsmat.py", line 77, in <module>
    from skbio.stats.distance import DissimilarityMatrix, DistanceMatrix
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/skbio/stats/distance/__init__.py", line 197, in <module>
    from ._mantel import mantel, pwmantel
  File "/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/skbio/stats/distance/_mantel.py", line 16, in <module>
    from scipy.stats import PearsonRConstantInputWarning
ImportError: cannot import name 'PearsonRConstantInputWarning' from 'scipy.stats' (/home/evan/miniconda3/envs/am-test/lib/python3.9/site-packages/scipy/stats/__init__.py)
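
The traceback shows scikit-bio importing `PearsonRConstantInputWarning` from `scipy.stats`, a name that is no longer importable in scipy 1.9. One way downstream code could tolerate both releases is a fallback lookup. The sketch below is only an illustration: `import_first_available` is a hypothetical helper, and the replacement name `ConstantInputWarning` is an assumption taken from the scipy 1.9 release notes, not something confirmed in this issue or in scikit-bio's actual fix.

```python
from importlib import import_module


def import_first_available(module_name, candidate_names):
    """Return the first attribute in candidate_names that module_name provides.

    Hypothetical helper, not part of scipy, scikit-bio or Autometa.
    """
    module = import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        f"none of {list(candidate_names)} could be found in {module_name!r}"
    )


# Intended use (requires scipy; the newer name is an assumption):
# ConstantInputWarning = import_first_available(
#     "scipy.stats", ["ConstantInputWarning", "PearsonRConstantInputWarning"]
# )
```

Listing the newer name first means the lookup keeps working if the old name is removed entirely, while older scipy releases fall through to the second candidate.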

To fix the behavior in the meantime

Change scipy version to 1.8 instead of 1.9

(base) evan@userserver$ mamba create -n autometa-test-env autometa scipy==1.8.* -y
(autometa-test-env) evan@userserver:$ autometa-kmers -h
usage: autometa-kmers [-h] [--fasta filepath] [--kmers filepath] [--size int] [--norm-output filepath] [--norm-method {ilr,clr,am_clr}] [--pca-dimensions int] [--embedding-output filepath] [--embedding-method {sksne,bhsne,umap,densmap,trimap}]
                      [--embedding-dimensions int] [--force] [--cpus int] [--seed int]

Count k-mer frequencies of given `fasta`

optional arguments:
  -h, --help            show this help message and exit
  --fasta filepath      Metagenomic assembly fasta file (default: None)
  --kmers filepath      K-mers frequency tab-delimited table (will skip if file exists) (default: None)
  --size int            k-mer size in bp (default: 5)
  --norm-output filepath
                        Path to normalized kmers table (will skip if file exists) (default: None)
  --norm-method {ilr,clr,am_clr}
                        Normalization method to transform kmer counts prior to PCA and embedding. ilr: isometric log-ratio transform (scikit-bio implementation). clr: center log-ratio transform (scikit-bio implementation). am_clr: center log-ratio transform (Autometa
                        implementation). (default: am_clr)
  --pca-dimensions int  Number of dimensions to reduce to PCA feature space after normalization and prior to embedding (NOTE: Setting to zero will skip PCA step) (default: 50)
  --embedding-output filepath
                        Path to write embedded kmers table (will skip if file exists) (default: None)
  --embedding-method {sksne,bhsne,umap,densmap,trimap}
                        embedding method [sk,bh]sne are corresponding implementations from scikit-learn and tsne, respectively. (default: bhsne)
  --embedding-dimensions int
                        Number of dimensions of which to reduce k-mer frequencies (default: 2)
  --force               Whether to overwrite existing annotations (default: False)
  --cpus int            num. processors to use. (default: 96)
  --seed int            Seed to set random state for dimension reduction determinism. (default: 42)
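
To confirm which scipy release an environment actually resolved to, a small check along these lines may help. It is a sketch under one assumption: the installed scikit-bio still imports the old warning name, so any scipy 1.9 or newer is treated as affected. The function names `major_minor` and `is_affected` are made up for this example.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def major_minor(version_string):
    """Extract (major, minor) as integers from a string such as '1.9.0'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(scipy_version):
    """True for scipy 1.9 or newer (assumed to trigger the ImportError above)."""
    return major_minor(scipy_version) >= (1, 9)


if __name__ == "__main__":
    try:
        installed = version("scipy")
    except PackageNotFoundError:
        print("scipy is not installed in this environment")
    else:
        status = "affected" if is_affected(installed) else "ok (pre-1.9)"
        print(f"scipy {installed}: {status}")
```

Comparing integer tuples rather than raw strings avoids the pitfall where `"1.10"` would sort before `"1.9"`.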

To fix, add `scipy==1.8.*` to autometa-env.yml. This pins scipy to version 1.8 until scikit-bio fixes its imports for 1.9.
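
A quick way to verify that an environment file carries the pin is to scan its dependency lines. The sketch below assumes dependencies are written one per line in the usual `- name==version` form; the function name `scipy_is_pinned_to_1_8` and the sample text in the usage comment are hypothetical and do not reproduce the real contents of autometa-env.yml.

```python
import re

# Matches dependency lines such as "- scipy==1.8.*" or "- scipy=1.8"
SCIPY_PIN = re.compile(r"^\s*-\s*scipy\s*={1,2}\s*1\.8(\.|$)")


def scipy_is_pinned_to_1_8(env_text):
    """True if some dependency line pins scipy to the 1.8 series."""
    for line in env_text.splitlines():
        # Drop trailing comments and whitespace before matching
        line = line.split("#", 1)[0].rstrip()
        if SCIPY_PIN.match(line):
            return True
    return False


# Example with made-up file contents:
# scipy_is_pinned_to_1_8("dependencies:\n  - scikit-bio\n  - scipy==1.8.*\n")
```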