Arcadia-Science / prehgt

A pipeline for lightweight screening of Eukaryotic genomes and transcriptomes for recent HGT
MIT License

kofamscan error #61

Open onbio opened 1 month ago

onbio commented 1 month ago

I am getting an error with kofamscan. I suspect it has something to do with the ko_list file, but I am not sure.

The command I used is:

nextflow run ~/Softwares/prehgt -with-timeline --max_cpus 4 --max_memory 50GB -profile conda --outdir test_out2 -work-dir work2 --input input_1.tsv --blast_db inputs/nr_rep_seq.fasta.gz --blast_db_tax inputs/nr_cluster_taxid_formatted_final.sqlite --ko_list inputs/kofamscandb/ko_list --ko_profiles inputs/kofamscandb/profiles --hmm_db inputs/hmms/all_hmms.hmm

executor > local (1)
[43/4d7bc0] process > ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[49/79ea57] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_and_parse_gff_per_genus (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[e8/2e28e9] process > ARCADIASCIENCE_PREHGT:PREHGT:build_genus_pangenome (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[cc/4f7fd9] process > ARCADIASCIENCE_PREHGT:PREHGT:translate_pangenome (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[b0/1e36cd] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_against_clustered_nr (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[b6/18afa1] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_add_taxonomy_info (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[cb/02807f] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_kingdom (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[dd/111f8d] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_subkingdom (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[ee/07b270] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_pepstats (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[21/d33431] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_to_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[58/d3b87b] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[76/30e30d] process > ARCADIASCIENCE_PREHGT:PREHGT:extract_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[d0/8c9b41] process > ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) [100%] 1 of 1, failed: 1 ✘
[90/0dac19] process > ARCADIASCIENCE_PREHGT:PREHGT:hmmscan_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[- ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results_genus -
[- ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results -
Execution cancelled -- Finishing pending tasks before exit
-[Arcadia-Science/prehgt] Pipeline completed with errors-
ERROR ~ Error executing process > 'ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella)'

Caused by: Process ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) terminated with an error exit status (1)

Command executed:

mkdir -p tmp
gunzip -c ko_list > ko_list
tar xf profiles
exec_annotation --format detail-tsv --ko-list ko_list --profile profiles --cpu 1 -o Bigelowiella_kofamscan.tsv Bigelowiella_aa.fasta

Command exit status: 1

Command output: (empty)

Command error:

gzip: ko_list: unexpected end of file
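
One possible reading of this message: in `gunzip -c ko_list > ko_list` the input and output names are identical, and the shell truncates the output file before gunzip opens it, so gunzip ends up reading an empty file. The sketch below reproduces that behaviour in a scratch directory. The file name `demo_list` and its contents are made up for illustration, and the exact error text may differ between gzip implementations.

```shell
#!/bin/sh
# Reproduce the self-redirect problem in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small uncompressed stand-in for ko_list (contents are invented).
printf 'knum\tthreshold\nK00001\t100.0\n' > demo_list
size_before=$(wc -c < demo_list | tr -d ' ')

# Same shape as the failing pipeline line: input and output share a name.
# The shell truncates demo_list before gunzip gets to read it.
gunzip -c demo_list > demo_list 2> gunzip_err.txt
gunzip_status=$?

size_after=$(wc -c < demo_list | tr -d ' ')
echo "gunzip exit status: $gunzip_status"
echo "size before: $size_before bytes, size after: $size_after bytes"
cat gunzip_err.txt
```

If the pipeline staged ko_list as a symlink (an assumption about the staging mode in use here), the redirect may also have emptied the original file under inputs/kofamscandb, so it is worth checking its size before reusing it.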

I also tried using ko_list.gz, since I suspected the code was trying to gunzip the uncompressed ko_list file, but I still get an error. Below is the log from the run with the .gz file.

executor > local (1)
[43/4d7bc0] process > ARCADIASCIENCE_PREHGT:PREHGT:download_reference_genomes (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[49/79ea57] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_and_parse_gff_per_genus (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[e8/2e28e9] process > ARCADIASCIENCE_PREHGT:PREHGT:build_genus_pangenome (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[cc/4f7fd9] process > ARCADIASCIENCE_PREHGT:PREHGT:translate_pangenome (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[b0/1e36cd] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_against_clustered_nr (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[b6/18afa1] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_add_taxonomy_info (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[cb/02807f] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_kingdom (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[dd/111f8d] process > ARCADIASCIENCE_PREHGT:PREHGT:blastp_to_hgt_candidates_subkingdom (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[ee/07b270] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_pepstats (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[21/d33431] process > ARCADIASCIENCE_PREHGT:PREHGT:compositional_scans_to_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[58/d3b87b] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[76/30e30d] process > ARCADIASCIENCE_PREHGT:PREHGT:extract_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[5d/6d8f16] process > ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) [100%] 1 of 1, failed: 1 ✘
[90/0dac19] process > ARCADIASCIENCE_PREHGT:PREHGT:hmmscan_hgt_candidates (Bigelowiella) [100%] 1 of 1, cached: 1 ✔
[- ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results_genus -
[- ] process > ARCADIASCIENCE_PREHGT:PREHGT:combine_results -
Execution cancelled -- Finishing pending tasks before exit
-[Arcadia-Science/prehgt] Pipeline completed with errors-
ERROR ~ Error executing process > 'ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella)'

Caused by: Process ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) terminated with an error exit status (2)

Command executed:

mkdir -p tmp
gunzip -c ko_list.gz > ko_list
tar xf profiles
exec_annotation --format detail-tsv --ko-list ko_list --profile profiles --cpu 1 -o Bigelowiella_kofamscan.tsv Bigelowiella_aa.fasta

Command exit status: 2

Command output: (empty)

Command error:

tar: profiles: Cannot read: Is a directory
tar: At beginning of tape, quitting now
tar: Error is not recoverable: exiting now

The error message in the nextflow.log file is:

Jul-25 13:19:42.663 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 13; name: ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella); status: COMPLETED; exit: 1; error: -; workDir: /work2/4e/c167f16e5f256a2c22df3af246dbe7]
Jul-25 13:19:42.680 [TaskFinalizer-1] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for task: name=ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella); work-dir=/work2/4e/c167f16e5f256a2c22df3af246dbe7 error [nextflow.exception.ProcessFailedException]: Process ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) terminated with an error exit status (1)
Jul-25 13:19:42.734 [TaskFinalizer-1] ERROR nextflow.processor.TaskProcessor - Error executing process > 'ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella)'

Caused by: Process ARCADIASCIENCE_PREHGT:PREHGT:kofamscan_hgt_candidates (Bigelowiella) terminated with an error exit status (1)

Please assist in finding and resolving the error.

taylorreiter commented 3 weeks ago

Hi @onbio, were you able to resolve this? Can you try using this file? ko_list.gz
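
Judging only from the commands shown in the logs above (`gunzip -c ... > ko_list` followed by `tar xf profiles`), the process appears to expect a gzip-compressed ko_list and a tar archive for the profiles, rather than a plain file and an unpacked directory. A minimal sketch for checking how each input is stored before rerunning is below. The function name `describe_input` and its output labels are hypothetical and not part of prehgt.

```shell
#!/bin/sh
# Hypothetical helper (not part of prehgt): print how an input is stored on disk.
describe_input() {
    path=$1
    if [ -d "$path" ]; then
        echo "directory"
    elif [ ! -s "$path" ]; then
        # Covers both a missing path and a zero-byte file.
        echo "missing-or-empty"
    elif gzip -t < "$path" 2>/dev/null; then
        # Valid gzip stream: either a plain .gz file or a gzipped tar archive.
        if gunzip -c "$path" | tar tf - >/dev/null 2>&1; then
            echo "gzipped-tar"
        else
            echo "gzip"
        fi
    elif tar tf "$path" >/dev/null 2>&1; then
        echo "tar"
    else
        echo "plain-file"
    fi
}
```

For example, `describe_input inputs/kofamscandb/ko_list` and `describe_input inputs/kofamscandb/profiles` report the state of the two paths from the original command. Under the assumption above, results of `plain-file`, `missing-or-empty`, or `directory` would be consistent with the two failures reported in this issue.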