EBI-Metagenomics / emg-viral-pipeline

VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
Apache License 2.0

CheckV failed #96

Closed aquaiser closed 6 months ago

aquaiser commented 1 year ago

Hello, I am trying to run the VIRify pipeline with SLURM and Singularity on our cluster, but I get: annotate:checkV failed. At first, I ran a small number of contigs (10 contigs) with --cores 1 and --max_cores 1, and everything was fine. But when I run the same contigs with more cores (10 contigs, and more contigs in another analysis), the error occurs. I submit the following script with sbatch (varying the cores parameter). I observed that the Docker images are stored in a folder called "false". Is this OK? In addition, I created a "/singularity_cachdir" directory, which doesn't contain any data. Is it necessary or favorable to specify CPUs and memory when running the script, for example: sbatch --cpus-per-task=8 --mem=50G script_run_on_cluster_AQ_virify3.sh? I would like to get a stable installation before running a large number of contigs.

Thanks for your help, Achim

Here is my script:

#! /bin/bash
#$ -S /bin/bash
#$ -M achim.quaiser@univ-rennes1.fr
#$ -m bea

. /softs/local/env/envnextflow-22.10.4.sh
. /softs/local/env/envsingularity-3.8.5.sh

nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 --fasta 'Virsorter_vBig_10contigs.fasta' --cores 1 --max_cores 1 -profile slurm,singularity --databases /scratch/aquaiser/databases --singularity /scratch/aquaiser/singularity_cachdir --output /home/genouest/rbpe/aquaiser/virify_output

hoelzer commented 1 year ago

Hey @aquaiser thanks for your interest in the pipeline!

Please use the latest release version v1.0 (https://github.com/EBI-Metagenomics/emg-viral-pipeline/releases/tag/v1.0)

You run the pipeline with SLURM and Singularity, so the images are converted from Docker to Singularity and then stored. By default, the Singularity images should be stored in a folder called singularity. You can adjust that via

--singularity_cachedir defines the path where images (singularity) are cached [default: $params.singularity_cachedir]

Is it necessary or favorable to indicate cpus and mem when running the script, for example: sbatch --cpus-per-task=8 --mem=50G script_run_on_cluster_AQ_virify3.sh

That should not be necessary. Think of it like this: you start the Nextflow pipeline and, because you specify SLURM as the job scheduler, the pipeline automatically creates an sbatch shell script per process and submits the jobs to the cluster queue.

That said, problems may arise when the nextflow command itself is submitted to a compute node. Can you try to run your command

nextflow run EBI-Metagenomics/emg-viral-pipeline -r v1.0 --fasta 'Virsorter_vBig_10contigs.fasta' --cores 1 --max_cores 1 -profile slurm,singularity --databases /scratch/aquaiser/databases --singularity_cachedir /scratch/aquaiser/singularity_cachdir --output /home/genouest/rbpe/aquaiser/virify_output

directly from the login node and let Nextflow handle the rest?

Does this work and solve your issue?
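As a rough illustration of launching from the login node, the sketch below wraps the command above in a small shell function that can print the command instead of running it. The function name `launch_virify` and the `DRY_RUN` variable are hypothetical helpers for this example and are not part of the pipeline; the paths are the ones used in this thread.

```shell
# Sketch only: launch_virify and DRY_RUN are hypothetical names, not pipeline features.
launch_virify() {
    fasta="$1"
    # Build the command as positional parameters so quoting is preserved.
    set -- nextflow run EBI-Metagenomics/emg-viral-pipeline -r v1.0 \
        --fasta "$fasta" \
        --cores 1 --max_cores 1 \
        -profile slurm,singularity \
        --databases /scratch/aquaiser/databases \
        --singularity_cachedir /scratch/aquaiser/singularity_cachdir \
        --output /home/genouest/rbpe/aquaiser/virify_output
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be executed.
        echo "$*"
    else
        # Run on the login node; Nextflow submits each process to SLURM itself.
        "$@"
    fi
}
```

Calling `DRY_RUN=1 launch_virify Virsorter_vBig_10contigs.fasta` prints the command for inspection; without `DRY_RUN` the function starts Nextflow in the current shell, so no outer sbatch wrapper is involved.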

aquaiser commented 1 year ago

Hello, I deleted all directories (databases, work, singularity), as for a new installation. Then the first run worked. When I reran it, "annotate: ratio_evalue" failed. A third run failed at "annotate: checkV" with the error shown below.

Could the number of contigs in one file be important? Is it better to run several files with fewer contigs each? I have one file with about 15000 contigs with sizes ranging from 5 to 300 kb.

Thanks for your help. Best, Achim

Caused by: Process annotate:checkV (1) terminated with an error exit status (1)

Command executed:

checkv end_to_end high_confidence_viral_contigs_original.fasta -d checkv -t 24 high_confidence_viral_contigs
cp high_confidence_viral_contigs/quality_summary.tsv high_confidence_viral_contigs_quality_summary.tsv

Command exit status: 1

Command output: (empty)

Command error:
WARNING: DEPRECATED USAGE: Forwarding SINGULARITYENV_TMPDIR as environment variable will not be supported in the future, use APPTAINERENV_TMPDIR instead
Error: database file not found 'checkv/genome_db/checkv_reps.dmnd'

Work dir: /scratch/aquaiser/work/74/707dac81f87a8473329dfe4ee01d46

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
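The error above says that the file genome_db/checkv_reps.dmnd is missing from the CheckV database directory. A minimal sketch for checking this before a rerun follows; `checkv_db_ok` is a hypothetical helper name, and only the relative path genome_db/checkv_reps.dmnd is taken from the error message.

```shell
# Sketch only: checkv_db_ok is a hypothetical helper, not part of VIRify or CheckV.
# It reports whether the DIAMOND database named in the error exists and is non-empty.
checkv_db_ok() {
    db_dir="$1"
    dmnd="$db_dir/genome_db/checkv_reps.dmnd"
    if [ -s "$dmnd" ]; then
        echo "OK: $dmnd"
        return 0
    fi
    echo "MISSING or empty: $dmnd" >&2
    return 1
}
```

The function takes the CheckV database directory as its only argument. Where that directory sits under the --databases path is not stated in this thread, so the location has to be looked up on the cluster.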

hoelzer commented 1 year ago

Hi!

Could you please try to run the pipeline with the following version:

# get recent versions
nextflow pull EBI-Metagenomics/emg-viral-pipeline

# run a version that is a work in progress at the moment
nextflow run EBI-Metagenomics/emg-viral-pipeline -r bugfix/issue-92-prophages-and-gff-generation ...

I would be interested to see if that solves your problem. thx!

aquaiser commented 1 year ago

Hello, some feedback: I made several tests. The pipeline runs to the end only when I delete the database, Singularity cache, and work folders (but not always). If I don't delete them, checkV fails. This is independent of the bugfix release. When I run more than one file, checkV fails. Often the checkV error is reported but then ignored. Sometimes I get other errors that I can't reproduce.

Currently I am running a simple script (40 cores) with only one file containing 6000 contigs of 5-300 kb. It has been running for 20 h, but it seems to be stuck at the last step of hmmscan_viphogs (see below). Is this possible? It seems that hmmscan_viphogs needs a lot of resources, because at one point I saw that the execution was pending, waiting for resources.

Thanks for your help. Best, Achim

I can't attach files, so here is the output:

executor > slurm (35)
[f3/88abb7] process > download_pprmeta:pprmetaGet [100%] 1 of 1 ✔
[cf/af3462] process > download_virsorter_db:virso... [100%] 1 of 1 ✔
[90/255363] process > download_virfinder_db:virfi... [100%] 1 of 1 ✔
[95/6d78db] process > download_model_meta:metaGetDB [100%] 1 of 1 ✔
[88/f4f3cc] process > download_viphog_db:viphogGetDB [100%] 1 of 1 ✔
[28/ee3395] process > download_ncbi_db:ncbiGetDB [100%] 1 of 1 ✔
[29/6555a4] process > download_checkv_db:checkvGetDB [100%] 1 of 1 ✔
[bc/fac89a] process > preprocess:rename (1) [100%] 1 of 1 ✔
[2f/78596b] process > preprocess:length_filtering... [100%] 1 of 1 ✔
[d5/eabf0b] process > detect:virsorter (1) [100%] 1 of 1 ✔
[b7/9daf4b] process > detect:virfinder (1) [100%] 1 of 1 ✔
[ed/ad426b] process > detect:pprmeta (1) [100%] 1 of 1 ✔
[5b/3b0099] process > detect:parse (1) [100%] 1 of 1 ✔
[e2/6fc4ce] process > postprocess:restore (3) [100%] 3 of 3 ✔
[94/444121] process > annotate:prodigal (1) [100%] 3 of 3 ✔
[17/6a22da] process > annotate:hmmscan_viphogs (2) [ 66%] 2 of 3
[df/4f5909] process > annotate:hmm_postprocessing... [100%] 2 of 2
[55/407f07] process > annotate:ratio_evalue (2) [100%] 2 of 2
[7d/dd90cf] process > annotate:annotation (2) [100%] 2 of 2
[e7/f08eac] process > annotate:plot_contig_map (2) [100%] 2 of 2
[fd/31996d] process > annotate:assign (2) [100%] 2 of 2
[93/462254] process > annotate:checkV (3) [100%] 3 of 3 ✔
[- ] process > annotate:write_gff -
[- ] process > plot:generate_krona_table -
[- ] process > plot:krona -
[- ] process > plot:generate_sankey_table -
[- ] process > plot:sankey -