jiarong / VirSorter2

customizable pipeline to identify viral sequences from (meta)genomic data
GNU General Public License v2.0

Error in rule merge_split_hmmtbl_by_group_tmp: #177

Open ileleiwi opened 10 months ago

ileleiwi commented 10 months ago

Hello, I'm receiving this error

[2023-11-02 11:44 INFO] /usr/local/bin/virsorter run --working-dir /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs --seqfile /global/cfs/cdirs/m3264/GRE/contigs_renamed/failed/GRE.SIPMG.052b4f26.contigs.fna --keep-original-seq --include-groups dsDNAphage,NCLDV,ssDNA,lavidaviridae --rm-tmpdir --db-dir /global/cfs/cdirs/m3264/databases/db --min-length 5000 --min-score 0.5
[2023-11-02 11:44 INFO] Using /global/homes/l/leleiwi1/.virsorter/template-config.yaml as config template
[2023-11-02 11:44 INFO] conig file written to /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs/config.yaml

[2023-11-02 11:44 INFO] Executing: snakemake --snakefile /usr/local/lib/python3.10/site-packages/virsorter/Snakefile --directory /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs --jobs 256 --configfile /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs/config.yaml --latency-wait 600 --rerun-incomplete --nolock  --conda-frontend mamba --conda-prefix /global/cfs/cdirs/m3264/databases/db/conda_envs --use-conda    --quiet  all   
Job counts:
    count   jobs
    1   all
    1   check_point_for_reclassify
    1   classify
    4   classify_by_group
    4   classify_full_and_part_by_group
    1   extract_feature
    1   extract_provirus_seqs
    1   finalize
    4   hmm_features_by_group
    1   hmm_sort_to_best_hit_taxon
    4   hmm_sort_to_best_hit_taxon_by_group
    1   hmmsearch
    1   hmmsearch_by_group
    1   merge_classification
    1   merge_full_and_part_classification
    4   merge_hmm_gff_features_by_group
    4   merge_provirus_call_by_group_by_split
    1   merge_provirus_call_from_groups
    1   merge_split_hmmtbl
    4   merge_split_hmmtbl_by_group
    1   merge_split_hmmtbl_by_group_tmp
    1   pick_viral_fullseq
    4   split_gff_by_group
    47
cat: iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl: No such file or directory
[Thu Nov  2 11:45:02 2023]
Error in rule merge_split_hmmtbl_by_group_tmp:
    jobid: 158
    output: iter-0/NCLDV/all.pdg.Viruses.hmmtbl.tmp
    shell:

        Group_specific_hmmdb=/global/cfs/cdirs/m3264/databases/db/group/NCLDV/customized.hmm
        Rbs_pdg_db=/global/cfs/cdirs/m3264/databases/db/group/NCLDV/rbs-prodigal-train.db
        if [ -s $Rbs_pdg_db ] || [ -s $Group_specific_hmmdb ]; then
            printf "%s
" iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.1.split.Viruses.splithmmtbl iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl | xargs cat > iter-0/NCLDV/all.pdg.Viruses.hmmtbl.tmp
        else
            touch iter-0/NCLDV/all.pdg.Viruses.hmmtbl.tmp
        fi

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message

It seems a required file isn't being generated, which causes downstream steps to fail. I'm running in this Docker container (quay.io/biocontainers/virsorter:2.2.4--pyhdfd78af_1) on a Slurm-managed server with this command:

virsorter run --working-dir /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/"$base" --seqfile "$fasta" --keep-original-seq --include-groups dsDNAphage,NCLDV,ssDNA,lavidaviridae --rm-tmpdir --db-dir /global/cfs/cdirs/m3264/databases/db --min-length 5000 --min-score 0.5

I chose to run in Docker because of a prior error in the hmmsearch step, which you mentioned in another issue was due to HMMER memory allocation (or something like that) on certain systems.

I've had success running some of my contigs both in a conda environment on the server and in the Docker image I referenced. Only some of the contigs hit the above error. Do you know what's causing this, or can you advise how to fix it?

Thanks, Ikaia

jiarong commented 10 months ago

Hi, you ran with a container (quay.io/biocontainers/virsorter:2.2.4--pyhdfd78af_1) with --db-dir outside the container, right? Can you check if /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs/iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl exists after the run failed?
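A minimal sketch of that check, assuming bash and that it is run from the VirSorter2 working directory. The helper name `check_split_tables` is hypothetical and not part of VirSorter2. It only reports whether each path passed to it exists and is non-empty.

```shell
# Hypothetical helper (not part of VirSorter2): report whether each given
# file exists and is non-empty. Returns non-zero if any file is missing
# or empty.
check_split_tables() {
    local status=0 f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING: $f"
            status=1
        elif [ ! -s "$f" ]; then
            echo "EMPTY:   $f"
            status=1
        else
            echo "OK:      $f"
        fi
    done
    return "$status"
}

# Example, using the two paths from the failing shell command:
# check_split_tables \
#     iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.1.split.Viruses.splithmmtbl \
#     iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl
```

Passing the paths explicitly rather than a glob matters here, because a glob would silently skip a file that does not exist.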

ileleiwi commented 10 months ago

Thanks for the quick response! Yes, that file does exist after the failed run. And you're correct, I'm running in that container with the database directory outside the container.

Here is the head and tail output of all.pdg.faa.ss.0.split.Viruses.splithmmtbl:

head

==> all.pdg.faa.ss.0.split.Viruses.splithmmtbl <==
#                                                                       --- full sequence ---- --- best 1 domain ---- --- domain number estimation ----
# target name        accession  query name                   accession    E-value  score  bias   E-value  score  bias   exp reg clu  ov env dom rep inc description of target
#------------------- ----------         -------------------- ---------- --------- ------ ----- --------- ------ -----   --- --- --- --- --- --- --- --- ---------------------
GRE.SIPMG.052b4f26.contigs_scaffold_84_c2||rbs:NCLDV_3 -          Phage_cluster_10021.ali_faa -            1.8e-15   57.7  47.1   3.6e-15   56.6  47.1   1.4   1   0   0   1   1   1   1 # 1931 # 3247 # -1 # ID=59_3;partial=00;start_type=GTG;rbs_motif=None;rbs_spacer=None;gc_cont=0.670
GRE.SIPMG.052b4f26.contigs_scaffold_3816_c1||rbs:NCLDV_3 -          Phage_cluster_1005.ali_faa -            5.3e-11   43.4   0.0   1.3e-10   42.1   0.0   1.6   1   0   0   1   1   1   1 # 3752 # 5560 # -1 # ID=142_3;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.627
GRE.SIPMG.052b4f26.contigs_scaffold_905_c1||rbs:NCLDV_2 -          Phage_cluster_10085.ali_faa -            3.5e-16   60.1   0.0   5.8e-16   59.4   0.0   1.2   1   0   0   1   1   1   1 # 721 # 2772 # 1 # ID=1028_2;partial=00;start_type=ATG;rbs_motif=ATA;rbs_spacer=6bp;gc_cont=0.631
GRE.SIPMG.052b4f26.contigs_scaffold_663_c1||rbs:NCLDV_2 -          Phage_cluster_10106.ali_faa -            6.3e-57  193.0   0.1   1.4e-56  191.8   0.1   1.5   1   1   0   1   1   1   1 # 705 # 1304 # 1 # ID=495_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.622
GRE.SIPMG.052b4f26.contigs_scaffold_3816_c1||rbs:NCLDV_3 -          Phage_cluster_1012.ali_faa -              2e-15   57.7   0.0   2.1e-08   34.6   0.0   3.1   1   1   1   2   2   2   2 # 3752 # 5560 # -1 # ID=142_3;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.627
GRE.SIPMG.052b4f26.contigs_scaffold_3049_c1||rbs:NCLDV_4 -          Phage_cluster_1012.ali_faa -            3.7e-13   50.2   0.0   6.4e-10   39.6   0.0   3.0   1   1   0   1   1   1   1 # 4159 # 6141 # 1 # ID=2438_4;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.658
GRE.SIPMG.052b4f26.contigs_scaffold_81_c1||rbs:NCLDV_6   -          Phage_cluster_1012.ali_faa -            2.7e-12   47.4   0.0   8.3e-08   32.6   0.0   2.3   2   1   0   2   2   2   2 # 5796 # 7853 # -1 # ID=109_6;partial=00;start_type=ATG;rbs_motif=ATA;rbs_spacer=15bp;gc_cont=0.633

tail

#
# Program:         hmmsearch
# Version:         3.3.2 (Nov 2020)
# Pipeline mode:   SEARCH
# Query file:      /global/cfs/cdirs/m3264/databases/db/hmm/viral/combined.hmm
# Target file:     /tmp/vs2-GKpCtMChUI1W/all.pdg.faa.ss.0.split
# Option settings: hmmsearch --tblout iter-0/NCLDV/all.pdg.faa.splitdir/all.pdg.faa.ss.0.split.Viruses.splithmmtbl --noali -T 30 --cpu 2 /global/cfs/cdirs/m3264/databases/db/hmm/viral/combined.hmm /tmp/vs2-GKpCtMChUI1W/all.pdg.faa.ss.0.split 
# Current dir:     /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/GRE.SIPMG.052b4f26.contigs
# Date:            Thu Nov  2 14:13:12 2023
# [ok]

all.pdg.faa.ss.0.split.Viruses.splithmmtbl.log simply contains the following text: # HMMER 3.3.2 (Nov 2020); http://hmmer.org/
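The tail output above ends with a "# [ok]" line. A rough way to run the same check across many split tables is sketched below, assuming bash and assuming that a final "# [ok]" line indicates hmmsearch finished writing the table. The helper name `tblout_is_complete` is hypothetical.

```shell
# Hypothetical helper (not part of VirSorter2 or HMMER): succeed only if
# the file is non-empty and its last line is "# [ok]", as in the tail
# output shown above. Treating that line as a completion marker is an
# assumption.
tblout_is_complete() {
    [ -s "$1" ] && [ "$(tail -n 1 "$1")" = "# [ok]" ]
}

# Example (run from the VirSorter2 working directory):
# for f in iter-0/NCLDV/all.pdg.faa.splitdir/*.splithmmtbl; do
#     tblout_is_complete "$f" || echo "incomplete: $f"
# done
```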

jiarong commented 10 months ago

This error is probably caused by long latency on the HPC file system (i.e., a hardware issue). You can simply resubmit the same script; VirSorter2 will pick up where it failed and continue. Adding --latency-wait 600 (meaning wait up to 10 minutes for the input files of later steps if they are not found) at the end of the command (after all) might also help.
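The resubmission could also be automated. The following is a minimal sketch, assuming bash. The helper name `retry_run`, the attempt count, and the delay are hypothetical choices for illustration, not VirSorter2 features. It simply reruns a command until it exits with status 0, relying on the pipeline resuming from where it failed, as described above.

```shell
# Hypothetical helper (not part of VirSorter2): rerun a command until it
# succeeds or the attempt limit is reached.
# Usage: retry_run MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_run() {
    local max_attempts=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example, reusing the command from this thread ($base and $fasta as in the
# original script; --latency-wait placed after "all" as suggested above):
# retry_run 3 60 virsorter run \
#     --working-dir /global/cfs/cdirs/m3264/GRE/Kai_work/virsorter2/virsorter2_out/"$base" \
#     --seqfile "$fasta" --keep-original-seq \
#     --include-groups dsDNAphage,NCLDV,ssDNA,lavidaviridae --rm-tmpdir \
#     --db-dir /global/cfs/cdirs/m3264/databases/db \
#     --min-length 5000 --min-score 0.5 all --latency-wait 600
```

A retry loop only helps with transient failures such as file system latency. If the same step fails on every attempt, the cause is likely something else.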

ileleiwi commented 10 months ago

Thank you, that worked!