AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

Many issues while running METABOLIC-C.pl #122

Closed by qikongwanli 1 year ago

qikongwanli commented 1 year ago

Building a SMALL index
readline() on closed filehandle __IN at METABOLIC-C.pl line 1965.
rm: cannot remove '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/.bam': No such file or directory
rm: cannot remove '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/.sorted.stat': No such file or directory
rm: cannot remove '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/*.bai': No such file or directory
[2023-02-02 16:18:00] Drawing element cycling diagrams finished
[2023-02-02 16:18:00] Drawing metabolic handoff diagrams...
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/newdir/Bar_plot/bar_plot_input_1.pdf': No such file or directory
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/newdir/Bar_plot/bar_plot_input_2.pdf': No such file or directory
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/Output_energy_flow/Energy_plot/network.plot.pdf': No such file or directory
[2023-02-02 16:56:46] Calculating MW-score ...
mkdir: cannot create directory ‘/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/MW-score_result’: File exists

ChaoLab commented 1 year ago

It seems that the gene files in the folder containing MAGs are not properly provided.

qikongwanli commented 1 year ago

> It seems that the gene files in the folder containing MAGs are not properly provided

Hi Zhou, I used the provided genome files to run METABOLIC-C and got the following issues. No figures were generated.

❯ perl METABOLIC-C.pl -t 10 -in-gn /mnt/f/METABOLIC/METABOLIC_test_files/Guaymas_Basin_genome_files -r /mnt/f/METABOLIC/METABOLIC_test_files/Reads_address.txt -o /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out
[2023-02-06 11:24:47] The Prodigal annotation is running...
[2023-02-06 11:24:55] The Prodigal annotation is finished
[2023-02-06 11:24:55] The hmmsearch is running with 10 cpu threads...
[2023-02-06 11:44:27] The hmmsearch is finished
[2023-02-06 11:45:31] Generating each hmm faa collection...
[2023-02-06 11:45:46] Each hmm faa collection has been made
[2023-02-06 11:45:46] The KEGG module result is calculating...
[2023-02-06 11:46:43] The KEGG identifier (KO id) result is calculating...
[2023-02-06 11:46:43] The KEGG identifier (KO id) seaching result is finished
[2023-02-06 11:46:43] Searching CAZymes by dbCAN2...
[2023-02-06 12:16:33] dbCAN2 searching is done
[2023-02-06 12:16:33] Searching MEROPS peptidase...
[2023-02-06 12:18:01] MEROPS peptidase searching is done
[2023-02-06 12:18:03] METABOLIC table has been generated
[2023-02-06 12:18:03] Drawing element cycling diagrams...
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.3.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.3.bt2
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.4.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.4.bt2
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.1.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.1.bt2
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.2.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.2.bt2
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.rev.1.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.rev.1.bt2
Renaming /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.rev.2.bt2.tmp to /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/All_gene_collections.gene.scaffold.rev.2.bt2
[2023-02-06 12:25:09] Drawing element cycling diagrams finished
[2023-02-06 12:25:09] Drawing metabolic handoff diagrams...
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/newdir/Bar_plot/bar_plot_input_1.pdf': No such file or directory
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/newdir/Bar_plot/bar_plot_input_2.pdf': No such file or directory
[2023-02-06 12:25:09] Drawing metabolic handoff diagrams finished
[2023-02-06 12:25:09] Drawing energy flow chart...
[2023-02-06 12:25:10] INFO: GTDB-Tk v2.1.0
[2023-02-06 12:25:10] INFO: gtdbtk classify_wf --cpus 10 -x fasta --genome_dir /mnt/f/METABOLIC/METABOLIC_test_files/Guaymas_Basin_genome_files --out_dir /mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/intermediate_files/gtdbtk_Genome_files
[2023-02-06 12:25:10] INFO: Using GTDB-Tk reference data version r207: /mnt/f/database/gtdbtk/release207_v2
[2023-02-06 12:25:10] INFO: Identifying markers in 2 genomes with 10 threads.
[2023-02-06 12:25:10] TASK: Running Prodigal V2.6.3 to identify genes.
[2023-02-06 12:25:15] INFO: Completed 2 genomes in 5.04 seconds (2.52 seconds/genome).
[2023-02-06 12:25:15] TASK: Identifying TIGRFAM protein families.
[2023-02-06 12:25:24] INFO: Completed 2 genomes in 9.11 seconds (4.56 seconds/genome).
[2023-02-06 12:25:24] TASK: Identifying Pfam protein families.
[2023-02-06 12:25:25] INFO: Completed 2 genomes in 1.07 seconds (1.86 genomes/second).
[2023-02-06 12:25:25] INFO: Annotations done using HMMER 3.1b2 (February 2015).
[2023-02-06 12:25:25] TASK: Summarising identified marker genes.
[2023-02-06 12:25:26] INFO: Completed 2 genomes in 0.23 seconds (8.77 genomes/second).
[2023-02-06 12:25:26] INFO: Done.
[2023-02-06 12:25:29] INFO: Aligning markers in 2 genomes with 10 CPUs.
[2023-02-06 12:25:30] INFO: Processing 2 genomes identified as bacterial.
[2023-02-06 12:26:48] INFO: Read concatenated alignment for 62,291 GTDB genomes.
[2023-02-06 12:26:48] TASK: Generating concatenated alignment for each marker.
[2023-02-06 12:26:48] INFO: Completed 2 genomes in 0.07 seconds (30.23 genomes/second).
[2023-02-06 12:26:49] TASK: Aligning 114 identified markers using hmmalign 3.1b2 (February 2015).
[2023-02-06 12:26:51] INFO: Completed 114 markers in 2.53 seconds (45.01 markers/second).
[2023-02-06 12:26:52] TASK: Masking columns of bacterial multiple sequence alignment using canonical mask.
[2023-02-06 12:28:20] INFO: Completed 62,293 sequences in 1.47 minutes (42,248.31 sequences/minute).
[2023-02-06 12:28:20] INFO: Masked bacterial alignment from 41,084 to 5,036 AAs.
[2023-02-06 12:28:20] INFO: 0 bacterial user genomes have amino acids in <10.0% of columns in filtered MSA.
[2023-02-06 12:28:20] INFO: Creating concatenated alignment for 62,293 bacterial GTDB and user genomes.
[2023-02-06 12:28:39] INFO: Creating concatenated alignment for 2 bacterial user genomes.
[2023-02-06 12:28:39] INFO: Done.
[2023-02-06 12:28:40] WARNING: pplacer requires ~50 GB of RAM to fully load the bacterial tree into memory. However, 25.65 GB was detected. This may affect pplacer performance, or fail if there is insufficient swap space.
[2023-02-06 12:28:41] TASK: Placing 2 bacterial genomes into backbone reference tree with pplacer using 10 CPUs (be patient).
[2023-02-06 12:28:41] INFO: pplacer version: v1.1.alpha19-0-g807f6f3
[2023-02-06 12:30:24] INFO: Calculating RED values based on reference tree.
[2023-02-06 12:30:25] INFO: 2 out of 2 have an class assignments. Those genomes will be reclassified.
[2023-02-06 12:30:25] TASK: Placing 2 bacterial genomes into class-level reference tree 1 (1/1) with pplacer using 10 CPUs (be patient).
[2023-02-06 12:58:19] INFO: Calculating RED values based on reference tree.
[2023-02-06 12:58:27] TASK: Traversing tree to determine classification method.
[2023-02-06 12:58:27] INFO: Completed 2 genomes in 0.00 seconds (6,978.88 genomes/second).
[2023-02-06 12:58:27] TASK: Calculating average nucleotide identity using FastANI (v1.32).
[2023-02-06 12:58:28] INFO: Completed 6 comparisons in 0.74 seconds (8.07 comparisons/second).
[2023-02-06 12:58:28] INFO: 2 genome(s) have been classified using FastANI and pplacer.
[2023-02-06 12:58:28] INFO: Note that Tk classification mode is insufficient for publication of new taxonomic designations. New designations should be based on one or more de novo trees, an example of which can be produced by Tk in de novo mode.
[2023-02-06 12:58:28] INFO: Done.
[2023-02-06 12:58:29] INFO: Removing intermediate files.
[2023-02-06 12:58:29] INFO: Intermediate files removed.
[2023-02-06 12:58:29] INFO: Done.
mv: cannot stat '/mnt/f/METABOLIC/METABOLIC_test_files/METABOLIC_out/Output_energy_flow/Energy_plot/network.plot.pdf': No such file or directory
[2023-02-06 12:58:31] Drawing energy flow chart finished
[2023-02-06 12:58:31] Calculating MW-score ...
[2023-02-06 12:58:32] Calculating MW-score is done
METABOLIC-C was done, the total running time: 01:33:46 (hh:mm:ss)
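The run above ends with "METABOLIC-C was done" even though several `mv` steps failed, so the missing figures are easy to overlook. A minimal sketch for pulling the missing paths out of a saved log follows. It assumes the run's output was captured to a file and that the messages use the GNU coreutils wording seen in this thread; the function name `missing_paths` is hypothetical and not part of METABOLIC.

```shell
#!/bin/sh
# List the unique paths that mv/rm reported as missing in a saved log.
# Assumes messages shaped like the ones in this thread:
#   mv: cannot stat '<path>': No such file or directory
#   rm: cannot remove '<path>': No such file or directory
missing_paths() {
    # -o prints only the matching part, so this also works when
    # several messages were pasted onto one line.
    grep -oE "(mv|rm): cannot (stat|remove) '[^']*'" "$1" \
        | sed -E "s/^[^']*'([^']*)'.*/\1/" \
        | LC_ALL=C sort -u
}
```

One possible way to use it is to capture the run with `perl METABOLIC-C.pl ... 2>&1 | tee run.log` (the file name `run.log` is only an example) and then call `missing_paths run.log`. Each path it prints is an output that an earlier step should have produced but did not.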

ChaoLab commented 1 year ago

It seems that you did not successfully obtain the metabolic handoff diagrams. Is there anything wrong with the R conda environment? You can find the corresponding one-liner in the METABOLIC-C.pl script that runs the Rscript, and test it within your R conda environment.
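The suggestion above can be sketched as a small helper that lists every line of the script mentioning `Rscript`, with line numbers, so the relevant one-liner can be copied and rerun by hand. This is only a sketch: the function name `find_rscript_calls` is hypothetical, and it assumes the plotting calls in METABOLIC-C.pl contain the literal word `Rscript`.

```shell
#!/bin/sh
# Print every line of a script that mentions Rscript, prefixed with
# its line number (grep -n), so each call can be inspected and rerun.
find_rscript_calls() {
    grep -n 'Rscript' "$1"
}
```

After activating the R conda environment, running `command -v Rscript` shows whether `Rscript` is on the PATH at all, and `find_rscript_calls METABOLIC-C.pl` shows the calls to test. Rerunning one of those commands directly should surface the underlying R error (for example a missing package), which the pipeline log above does not show.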

qikongwanli commented 1 year ago

> It seems that you did not successfully obtain the metabolic handoff diagrams. Is there anything wrong with the R conda environment? You can find the corresponding one-liner in the METABOLIC-C.pl script to run the Rscript, and test it within your R conda environment

Thanks Zhichao, I have solved these problems.