marbl / parsnp

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be either draft assemblies or finished genomes, and output includes variant (SNP) calls, a core genome phylogeny, and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

Issues with conda installation: incompatibilities between python versions of fastani and pyspoa? #156

Open thierrygrange opened 4 months ago

thierrygrange commented 4 months ago

You wrote in the readme file that Parsnp also requires RaxML (or FastTree), Harvest-tools, and numpy. Some additional features require pySPOA, Mash, FastANI, and Phipack. All of these packages are available via Conda (many on the Bioconda channel).

Dear colleague

I am struggling to install all the tools in a single conda environment dedicated to parsnp. There are apparently incompatibilities between the Python versions needed by the various tools. Installing fastani via conda pulls in a deprecated version of Python, python-2.7.18, which may prevent the installation of pyspoa, which requires python >=3.10,<3.11, and may also affect the code.

When trying to install pyspoa after I installed fastani (which is not available anymore on the bioconda channel despite what is written on the conda web site, but can be found when using default channels), I receive the following errors:

LibMambaUnsatisfiableError: Encountered problems while solving:

Could not solve for environment specs
The following packages are incompatible
├─ pin-1 is installable and it requires
│  └─ python 2.7.* , which can be installed;
└─ pyspoa is not installable because there are no viable options
   ├─ pyspoa [0.0.10|0.0.3|0.0.9|0.2.1] would require
   │  └─ python >=3.10,<3.11.0a0 , which conflicts with any installable versions previously reported;
   ├─ pyspoa [0.0.10|0.0.3|0.0.9|0.2.1] would require
   │  └─ python >=3.8,<3.9.0a0 , which conflicts with any installable versions previously reported;
   ├─ pyspoa [0.0.10|0.0.3|0.0.9|0.2.1] would require
   │  └─ python >=3.9,<3.10.0a0 , which conflicts with any installable versions previously reported;
   ├─ pyspoa 0.0.3 would require
   │  └─ python >=3.6,<3.7.0a0 , which conflicts with any installable versions previously reported;
   ├─ pyspoa 0.0.3 would require
   │  └─ python >=3.7,<3.8.0a0 , which conflicts with any installable versions previously reported;
   ├─ pyspoa 0.2.1 would require
   │  └─ python >=3.11,<3.12.0a0 , which conflicts with any installable versions previously reported;
   └─ pyspoa 0.2.1 would require
      └─ python >=3.12,<3.13.0a0 , which conflicts with any installable versions previously reported.

Pins seem to be involved in the conflict. Currently pinned specs:

How can I solve the issues and have a fully functional conda environment compatible with most of the parsnp script? Can you recommend specific versions of the various tools to install them with conda specifying the versions?

In addition, which version of RAxML can be used: raxml or raxml-ng? It would be good to be able to use raxml-ng.

Finally, a fully functional singularity or docker environment would be ideal to circumvent problems stemming from constantly evolving dependencies creating new incompatibilities between the various tools used in your very appealing script.

Thanks for your help

Thierry Grange

thierrygrange commented 4 months ago

In addition, I have also tried installing parsnp from source in a CentOS 7 environment that is no longer compatible with conda install. I installed some of the dependencies with pip install instead of conda. I also encountered an incompatibility when trying to pip install pyspoa, in that case even though I had not yet tried to install fastani. pyspoa thus seems to cause problems in various configurations. When trying to run parsnp, I receive an error message saying that pyspoa is needed.

bkille commented 3 months ago

Hi @thierrygrange, sorry for the delayed response. I've made a fix that will make sure the pyspoa module is not imported if you run with --no-partition. It will be in the next release!
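For readers unfamiliar with the pattern, here is a minimal sketch of how an optional module can be skipped when a flag makes it unnecessary. This is not the actual parsnp change: the function name `load_partition_backend`, the parser setup, and the import name `spoa` for pyspoa are all assumptions made for illustration.

```python
import argparse
import importlib


def load_partition_backend(no_partition, module_name="spoa"):
    """Import the optional module only when partitioning is enabled.

    Hypothetical helper: `module_name="spoa"` assumes that is pyspoa's
    import name; it is not taken from the parsnp source.
    """
    if no_partition:
        # The optional dependency is never touched on this code path.
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise SystemExit(
            f"{module_name} is required unless --no-partition is given"
        ) from exc


def build_parser():
    """Hypothetical stand-in for the real command-line parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-partition", action="store_true")
    return parser


# With --no-partition, the module is not imported, so a missing pyspoa
# installation does not matter.
args = build_parser().parse_args(["--no-partition"])
backend = load_partition_backend(args.no_partition)
print("partition backend:", backend)
```

The point of deferring the import is that a missing optional package only becomes an error on the code path that actually needs it.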

In the meantime, have you tried creating a fresh environment just for parsnp? i.e.

conda create -n parsnp-env "parsnp>=2.0.5"

This sometimes works better than creating an environment first, then installing the desired package, as then conda can decide which python version to create the environment with based on the package you're requesting.
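After creating the environment, it can help to check which dependencies are actually visible from inside it. The following is a small sketch, run with the environment's own Python, that reports which executables are on PATH and which modules are importable. The helper name `check_dependencies` is hypothetical, and the executable names and the `spoa` import name for pyspoa are assumptions; adjust them to whatever your installation provides.

```python
import importlib.util
import shutil


def check_dependencies(executables, modules):
    """Report which executables are on PATH and which modules are importable.

    Only top-level module names should be passed in `modules`.
    """
    return {
        "executables": {exe: shutil.which(exe) is not None for exe in executables},
        "modules": {
            mod: importlib.util.find_spec(mod) is not None for mod in modules
        },
    }


# Assumed names, for illustration only; edit to match your setup.
report = check_dependencies(
    executables=["parsnp", "harvesttools", "mash", "fastANI"],
    modules=["numpy", "spoa"],
)
for kind, items in report.items():
    for name, found in items.items():
        print(f"{kind[:-1]:<10} {name:<14} {'found' if found else 'MISSING'}")
```

A missing entry here does not say why the solver skipped a package, but it narrows down which optional feature will fail before a long run starts.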