Arcadia-Science / prehgt

A pipeline for lightweight screening of Eukaryotic genomes and transcriptomes for recent HGT
MIT License

error exit status 2 #59

Open lagphase opened 1 month ago

lagphase commented 1 month ago

Hello there,

I encountered this when trying the workflow:

nextflow run Arcadia-Science/prehgt -r v1.0.0 -profile conda --outdir fructi --input fructilactobacillus.tsv --blast_db inputs/nr_rep_seq.fasta.gz --blast_db_tax inputs/nr_cluster_taxid_formatted_final.sqlite --ko_list inputs/kofamscandb/ko_list --ko_profiles inputs/kofamscandb/profiles.tar.gz --hmm_db inputs/hmms/all_hmms.hmm

N E X T F L O W ~ version 24.04.2

Launching https://github.com/Arcadia-Science/prehgt [amazing_archimedes] DSL2 - revision: 98f25a7c64 [v1.0.0]


Arcadia-Science/prehgt v1.0dev

Core Nextflow options
  revision    : v1.0.0
  runName     : amazing_archimedes
  launchDir   : /media/vdpham/Vi_HDD_1/preHGT
  workDir     : /media/vdpham/Vi_HDD_1/preHGT/work
  projectDir  : /home/vdpham/.nextflow/assets/Arcadia-Science/prehgt
  userName    : vdpham
  profile     : conda
  configFiles : /home/vdpham/.nextflow/assets/Arcadia-Science/prehgt/nextflow.config

Input/output options
  input       : fructilactobacillus.tsv
  outdir      : fructi
  blast_db    : inputs/nr_rep_seq.fasta.gz
  blast_db_tax: inputs/nr_cluster_taxid_formatted_final.sqlite
  ko_list     : inputs/kofamscandb/ko_list
  ko_profiles : inputs/kofamscandb/profiles.tar.gz
  hmm_db      : inputs/hmms/all_hmms.hmm

Max job request options
  max_cpus    : 144
  max_memory  : 384.GB

!! Only displaying parameters that differ from the pipeline defaults !!

If you use Arcadia-Science/prehgt for your analysis please cite:

  • The nf-core framework https://doi.org/10.1038/s41587-020-0439-x

  • Software dependencies https://github.com/Arcadia-Science/prehgt/blob/main/CITATIONS.md

executor > local (1)
[22/dcde21] process > ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Fructilactobacillus) [100%] 1 of 1, failed: 1 ✔
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_and_parse_gff_per_genus -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:build_genus_pangenome -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:translate_pangenome -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_against_clustered_nr -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_add_taxonomy_info -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_kingdom -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_subkingdom -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_pepstats -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_to_hgt_candidates -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_hgt_candidates -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:extract_hgt_candidates -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:hmmscan_hgt_candidates -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results_genus -
[-        ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results -
-[Arcadia-Science/prehgt] Pipeline completed successfully, but with errored process(es) -
[22/dcde21] NOTE: Process ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Fructilactobacillus) terminated with an error exit status (2) -- Error is ignored

Do you have any idea what might be causing this? Thanks.

taylorreiter commented 1 month ago

Hi @lagphase, this is happening because prehgt is not set up to run on bacteria by default.

The couple of lines of code that need to change to enable prehgt to run on bacteria are described here: https://github.com/Arcadia-Science/prehgt/issues/56

Please let me know if this doesn't give you a workable solution.

lagphase commented 1 month ago

Hello Taylor,

Thank you for the prompt response. I cloned the repo, modified download_reference_genomes.nf, changed into the cloned prehgt directory, and ran nextflow run, but I still encounter the same issue. Am I doing something wrong?

taylorreiter commented 1 month ago

Hmm, can you provide a longer log file for this step? That will help me diagnose what might be going on.

[22/dcde21] process > ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Fructilactobacillus) [100%] 1 of 1, failed: 1 ✔

lagphase commented 1 month ago

I copied the log below and also attached the trace files: execution_trace_2024-06-18_12-30-53.txt, execution_trace_2024-06-18_12-31-53.txt, execution_trace_2024-06-18_12-32-36.txt

[-        ] ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes           -
executor >  local (1)
[16/d9995c] ARC…E_PREHGT:PREHGT:download_reference_genomes (Fructilactobacillus) | 1 of 1, failed: 1 ✔
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:combine_and_parse_gff_per_genus         -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:build_genus_pangenome                   -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:translate_pangenome                     -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:blastp_against_clustered_nr             -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:blastp_add_taxonomy_info                -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_kingdom        -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_subkingdom     -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_pepstats            -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_to_hgt_candidates   -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:combine_hgt_candidates                  -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:extract_hgt_candidates                  -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates                -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:hmmscan_hgt_candidates                  -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:combine_results_genus                   -
[-        ] ARCADIASCIENCE_PREHGT:PREHGT:combine_results                         -
-[Arcadia-Science/prehgt] Pipeline completed successfully, but with errored process(es) -
[16/d9995c] NOTE: Process `ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Fructilactobacillus)` terminated with an error exit status (2) -- Error is ignored
Completed at: 18-Jun-2024 12:47:38
Duration    : 15m 2s
CPU hours   : (a few seconds)
Succeeded   : 0
Ignored     : 1
Failed      : 1

taylorreiter commented 1 month ago

Thanks for providing these. They don't quite provide all of the information I need. There should be a .nextflow.log file in the directory that you executed the command in.
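
For anyone hitting the same error, here is a minimal sketch of how the failing task's own output could be pulled out alongside .nextflow.log. It assumes the default Nextflow layout, where each task runs in a directory under the work directory whose path begins with the short hash shown in the console (for example 16/d9995c) and contains .command.sh, .command.err, and .exitcode. The function names find_task_dir and show_task_error are made up for this illustration and are not part of prehgt or Nextflow.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; the file names are the
# ones Nextflow normally writes into each task's work directory.

# find_task_dir WORK_DIR HASH_PREFIX
# Resolve a short task hash such as "16/d9995c" to the full task directory.
find_task_dir() {
    local work_dir="$1" prefix="$2" match
    for match in "$work_dir"/"$prefix"*/; do
        if [ -d "$match" ]; then
            printf '%s\n' "${match%/}"
            return 0
        fi
    done
    return 1
}

# show_task_error WORK_DIR HASH_PREFIX
# Print the exit code, the script that ran, and its stderr for one task.
show_task_error() {
    local task_dir
    task_dir="$(find_task_dir "$1" "$2")" || {
        echo "no task directory matches $2 under $1" >&2
        return 1
    }
    echo "task directory: $task_dir"
    echo "exit code: $(cat "$task_dir/.exitcode" 2>/dev/null || echo unknown)"
    echo "--- .command.sh ---"
    cat "$task_dir/.command.sh" 2>/dev/null
    echo "--- .command.err ---"
    cat "$task_dir/.command.err" 2>/dev/null
    return 0
}
```

After sourcing these functions and running `show_task_error work 16/d9995c` from the launch directory, the output can be attached to the issue together with .nextflow.log. If the run used a different work directory, pass that path instead of `work`.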

In the meantime, I was able to get the pipeline to run from this branch: https://github.com/Arcadia-Science/prehgt/tree/ter/bacteria. I had to add "bacteria" to two lines.

I've uploaded the results here. They're in a TSV file, but I appended ".txt" to the name so that I could attach it to this issue: all_results.tsv.txt

lagphase commented 1 month ago

Thank you for running the program for me. This will do for now.