shandley / hecatomb

hecatomb is a virome analysis pipeline for analysis of Illumina sequence data
MIT License

fastp not building #76

Closed beardymcjohnface closed 1 year ago

beardymcjohnface commented 2 years ago

Running into an issue where fastp does not build with conda strict channel priority on NCI. The workaround is to unset strict, build fastp, and reset strict.
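
A minimal sketch of that workaround as a shell function. The names `run`, `build_env_relaxed` and `DRY_RUN` are made up for illustration and are not part of hecatomb or Snakemake. The sketch assumes the env file is stored as `<hash>.yaml` next to the env prefix `<hash>`, and that `flexible` is an acceptable non-strict setting. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: relax channel priority, build one env, then put strict back.
# DRY_RUN, run and build_env_relaxed are illustrative names.

run() {
    # Print the command in dry-run mode (the default); otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_env_relaxed() {
    local yaml="$1"               # path to the env file, e.g. .../conda/<hash>.yaml
    local prefix="${yaml%.yaml}"  # assumed env prefix: same path without .yaml

    run conda config --set channel_priority flexible
    run mamba env create --quiet --file "$yaml" --prefix "$prefix"
    # Put strict priority back so the pipeline's own check passes afterwards.
    run conda config --set channel_priority strict
}
```

Calling `build_env_relaxed /path/to/conda/<hash>.yaml` prints the three commands; setting `DRY_RUN=0` first would actually run them.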

alanorth commented 2 years ago

We are hitting this while running the hecatomb tests. So how do we do that? I tried this:

$ hecatomb run --test --threads 8
...
CreateCondaEnvironmentException:
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please configure strict priorities by executing 'conda config --set channel_priority strict'.  
$ conda config --set channel_priority strict
$ hecatomb run --test --threads 8
...
Output:
Encountered problems while solving:
  - package fastp-0.23.2-hd36eab0_1 requires libdeflate >=1.9,<1.10.0a0, but none of the providers can be installed
$ conda config --set channel_priority false
$ hecatomb run --test --threads 8
...
CreateCondaEnvironmentException:
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please configure strict priorities by executing 'conda config --set channel_priority strict'.  

beardymcjohnface commented 2 years ago

Hi Alan, find your conda installation folder with:

which hecatomb

It should come back with something like ~/miniconda3/envs/hecatomb/bin/hecatomb, so the hecatomb env would be ~/miniconda3/envs/hecatomb. The conda environments for the pipeline are in snakemake/workflow/conda/ inside that env directory (so ~/miniconda3/envs/hecatomb/snakemake/workflow/conda/ for me).
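
The path arithmetic above can be sketched as a small helper. `pipeline_conda_dir` is an illustrative name, not a hecatomb command, and the sketch assumes the executable lives in `<env>/bin/` as in the example path.

```shell
#!/usr/bin/env bash
# Sketch only: derive the pipeline's conda directory from the hecatomb executable path.
# pipeline_conda_dir is an illustrative name.

pipeline_conda_dir() {
    local bin_path="$1"   # e.g. ~/miniconda3/envs/hecatomb/bin/hecatomb
    local env_dir
    # Two dirname calls strip "/hecatomb" and then "/bin", leaving the env directory.
    env_dir="$(dirname "$(dirname "$bin_path")")"
    echo "$env_dir/snakemake/workflow/conda"
}
```

With hecatomb on the PATH, `pipeline_conda_dir "$(which hecatomb)"` should print the directory described above.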

Find the fastp env directory with:

grep fastp ~/miniconda3/envs/hecatomb/snakemake/workflow/conda/*.yaml

For me it was ~/miniconda3/envs/hecatomb/snakemake/workflow/conda/c9bd6a8dd6dec26c660f249cd191d4e3.yaml, so I would activate the fastp env like so:

conda activate ~/miniconda3/envs/hecatomb/snakemake/workflow/conda/c9bd6a8dd6dec26c660f249cd191d4e3/
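
The grep-then-activate step can be sketched as one helper that turns matching yaml files into env prefixes. `env_prefixes_for` is an illustrative name, and the sketch assumes each env prefix is simply the yaml path without its `.yaml` extension, as in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: list candidate env prefixes for a package.
# env_prefixes_for is an illustrative name, not part of hecatomb or Snakemake.

env_prefixes_for() {
    local pkg="$1"        # package name to look for, e.g. fastp
    local conda_dir="$2"  # e.g. ~/miniconda3/envs/hecatomb/snakemake/workflow/conda
    local yaml
    # grep -l prints only the names of the files that mention the package.
    grep -l -- "$pkg" "$conda_dir"/*.yaml 2>/dev/null | while IFS= read -r yaml; do
        echo "${yaml%.yaml}"   # assumed env prefix: yaml path without the extension
    done
}
```

The output could then be passed to `conda activate`, which only works if that directory was actually created.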

Then install fastp with conda. It might just be a build issue for that specific version of fastp, so you could try installing a specific build or an older version:

mamba install fastp
# or 
mamba install fastp=0.23.2=hb7a2d85_2
# or
mamba install fastp=0.23.1
# test it
fastp --help

Then deactivate, reactivate your hecatomb env if needed, and try rerunning the test dataset.

alanorth commented 2 years ago

@shandley thanks for that. I found the Conda env directory for fastp from the error output, but I can't activate it because the environment was never created:

$ hecatomb run --test --threads 8
...
CreateCondaEnvironmentException:
Could not create conda environment from /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/fastp.yaml:
Command:
mamba env create --quiet --file "/var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/23320e9effda05579c3e44e4a97b8016.yaml" --prefix "/var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/23320e9effda05579c3e44e4a97b8016"
Output:
Encountered problems while solving:
  - package fastp-0.23.1-h79da9fb_0 requires libdeflate >=1.7,<1.8.0a0, but none of the providers can be installed

$ conda activate /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/23320e9effda05579c3e44e4a97b8016
Not a conda environment: /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/23320e9effda05579c3e44e4a97b8016

The directory doesn't exist, but there is a yaml file there:

$ ls -ld /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/015e448182b8ee49cd9d13963b6521ee
ls: cannot access /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/015e448182b8ee49cd9d13963b6521ee: No such file or directory
$ ls -l /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/015e448182b8ee49cd9d13963b6521ee.yaml 
-rw-rw-r--. 1 aorth aorth 95 Jun  2 16:29 /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/015e448182b8ee49cd9d13963b6521ee.yaml

Is this a snakemake issue or a fastp issue? Let's fix it properly.
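
The situation shown above, a yaml file with no environment directory next to it, can be checked for every env at once. This is only a sketch: `unbuilt_envs` is an illustrative name, and it assumes a successfully built env always appears as a directory named after its yaml file.

```shell
#!/usr/bin/env bash
# Sketch only: report env files whose environment directory was never created.
# unbuilt_envs is an illustrative name.

unbuilt_envs() {
    local conda_dir="$1"   # e.g. .../envs/hecatomb/snakemake/workflow/conda
    local yaml
    for yaml in "$conda_dir"/*.yaml; do
        [ -e "$yaml" ] || continue        # no yaml files: the glob stays literal, skip it
        if [ ! -d "${yaml%.yaml}" ]; then
            echo "$yaml"                  # yaml exists but its env directory does not
        fi
    done
}
```

Any file it prints corresponds to an environment that would still need to be built.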

beardymcjohnface commented 2 years ago

After some digging, this looks like a weird conda feature. One of the fastp dependencies (libdeflate, judging by the solver errors above) is on both bioconda and conda-forge. With strict channel priority, conda will try to use the bioconda version, which is not recent enough for fastp. If you edit the yaml file to this it should work (just changing the order of bioconda and conda-forge):

name: fastp
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - fastp=0.23.2

I'll need to go through all the env yaml files and make sure they're compatible with strict channel priority, including the bioconda build file for hecatomb.
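
A rough sketch of how that audit could be started: flag every env file that lists bioconda above conda-forge (or lists bioconda without conda-forge). `bioconda_first` is an illustrative name, and the check assumes channels are written one per line as `- name`, as in the yaml above. It says nothing about whether an env actually solves under strict priority.

```shell
#!/usr/bin/env bash
# Sketch only: flag env files that list bioconda above conda-forge.
# bioconda_first is an illustrative name.

bioconda_first() {
    local yaml
    for yaml in "$@"; do
        # Record the first line number of each channel entry, then compare.
        awk '
            /^[ \t]*-[ \t]*bioconda[ \t]*$/    { if (!b) b = NR }
            /^[ \t]*-[ \t]*conda-forge[ \t]*$/ { if (!c) c = NR }
            END { exit !(b && (!c || b < c)) }
        ' "$yaml" && echo "$yaml"
    done
    return 0
}
```

It could be run over the pipeline's env files with a glob such as `bioconda_first snakemake/workflow/envs/*.yaml`, and each file it prints is a candidate for reordering.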

beardymcjohnface commented 2 years ago

I just tried this and I think this might be a much more robust way to go:

name: fastp
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - bioconda::fastp=0.23.2
  - conda-forge::_libgcc_mutex>=0.1
  - conda-forge::_openmp_mutex>=4.5
  - conda-forge::isa-l>=2.30.0
  - conda-forge::libdeflate>=1.10
  - conda-forge::libgcc-ng>=12.1.0
  - conda-forge::libgomp>=12.1.0
  - conda-forge::libstdcxx-ng>=12.1.0

linsalrob commented 2 years ago

I think we should post a bioconda issue to remove the out of date version of the dependencies. They should not be in both bioconda and conda-forge.

alanorth commented 2 years ago

Thanks for that analysis @beardymcjohnface. I tried the fastp.yaml file with qualified dependencies and got this during hecatomb test:

Building DAG of jobs...
Creating conda environment envs/hecatomb/snakemake/workflow/envs/krona.yaml...
Downloading and installing remote packages.
Environment for /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/krona.yaml created (location: envs/hecatomb/snakemake/workflow/conda/1605896324d479b446743f9b383cb72c)
Creating conda environment envs/hecatomb/snakemake/workflow/envs/seqkit.yaml...
Downloading and installing remote packages.
Environment for /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/seqkit.yaml created (location: envs/hecatomb/snakemake/workflow/conda/7f746edc96f0f94e852dc87893a848fd)
Creating conda environment envs/hecatomb/snakemake/workflow/envs/megahit.yaml...
Downloading and installing remote packages.
Environment for /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/megahit.yaml created (location: envs/hecatomb/snakemake/workflow/conda/11ce9fe7a9fd961aa0063d7280b2f5c1)
Creating conda environment envs/hecatomb/snakemake/workflow/envs/pysam.yaml...
Downloading and installing remote packages.
Environment for /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/pysam.yaml created (location: envs/hecatomb/snakemake/workflow/conda/1fa91359cdc86dbcd83d5ba26866a727)
Creating conda environment envs/hecatomb/snakemake/workflow/envs/fastp.yaml...
Downloading and installing remote packages.
CreateCondaEnvironmentException:
Could not create conda environment from /var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/rules/../envs/fastp.yaml:
Command:
mamba env create --quiet --file "/var/scratch/aorth/miniconda/envs/hecatomb/snakemake/workflow/conda/5ecabcbd4fedaf37fbcd62e51273c8bf.yaml" --prefix "/var/scratch/aorth/miniconda-bernice/envs/hecatomb/snakemake/workflow/conda/5ecabcbd4fedaf37fbcd62e51273c8bf"
Output:
Encountered problems while solving:
  - package libdeflate-1.10-h7f98852_0 is excluded by strict repo priority

But it does run with the other version of fastp.yaml, where the order of the channels is changed:

name: fastp
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - fastp=0.23.2

beardymcjohnface commented 1 year ago

I'm hoping this is now fixed, but please reopen or open a new issue if you're still having problems with this.