Open PaulaCat opened 3 years ago
Hi Paula,
Many thanks for your interest in our software. From your feedback, I am afraid that the "omic_reads_parameters.txt" file in your 2nd test was not written correctly. You can have a look at these instructions: https://github.com/AnantharamanLab/METABOLIC/wiki/METABOLIC-Usage#-metabolic-usage.
The correct "omic_reads_parameters.txt" file should be:
/cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_1.fastq,/cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_2.fastq
I hope this solves your problem. Meanwhile, we made some updates to the scripts and databases in the past week. I suggest re-installing METABOLIC if you can. We now have a conda environment recipe for all users, so installing METABOLIC on your server is no longer painful.
Best! Chao
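The format shown above (the two FASTQ paths of a read pair on one line, joined by a comma) can be checked before a job is submitted. The following is a minimal Python sketch and not part of METABOLIC: the names `check_reads_file` and `demo` are hypothetical, and skipping blank lines and lines that start with `#` is an assumption about how the file may be laid out.

```python
import os
import tempfile


def check_reads_file(path):
    """Return a list of problems found in a reads-parameter file."""
    problems = []
    with open(path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.strip()
            # Assumption: blank lines and '#' lines carry no read paths.
            if not line or line.startswith("#"):
                continue
            fields = [field.strip() for field in line.split(",")]
            # A valid line is exactly two non-empty paths joined by a comma.
            if len(fields) != 2 or not all(fields):
                problems.append(f"line {number}: expected 'R1.fastq,R2.fastq'")
                continue
            for field in fields:
                if not os.path.isfile(field):
                    problems.append(f"line {number}: not a file: {field}")
    return problems


def demo():
    """Build one comma-separated and one space-separated file and check both."""
    with tempfile.TemporaryDirectory() as tmp:
        r1 = os.path.join(tmp, "sample_1.fastq")  # hypothetical read files
        r2 = os.path.join(tmp, "sample_2.fastq")
        for read_path in (r1, r2):
            open(read_path, "w").close()
        good = os.path.join(tmp, "good.txt")
        bad = os.path.join(tmp, "bad.txt")
        with open(good, "w") as handle:
            handle.write(f"{r1},{r2}\n")
        with open(bad, "w") as handle:
            handle.write(f"{r1} {r2}\n")
        return check_reads_file(good), check_reads_file(bad)
```

Calling `check_reads_file("omic_reads_parameters.txt")` returns an empty list when every line holds a comma-separated pair of existing files. A glob pattern, a bare directory, or two paths separated by a space are each reported as a malformed line.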
Thank you for your help! I hope you are having a good day :)
I ran METABOLIC-C with the regular installation, using the Guaymas test dataset. First, I provided the paired-end reads and the path to a single MAG. Here is the command:
METABOLIC-C.pl -t 1 -in-gn /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files/Gamma/ -r omic_reads_parameters.txt -o Guaymas_real
The job ran successfully, but when I checked the "Metabolic_energy_flow.pdf" and "CommunityPlot.PDF" files, they were empty. I figured the reason was that I had only provided one MAG.
I then tried providing the path to the folder with ALL the MAGs from the Guaymas test dataset. Here is the command I ran:
METABOLIC-C.pl -t 4 -in-gn /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files -r omic_reads_parameters.txt -o Guaymas_real_real
As a result, I obtained the following errors:
Use of uninitialized value $cat in concatenation (.) or string at /cluster/apps/nss/metabolic/16082021/x86_64/METABOLIC-C.pl line 1513.
Use of uninitialized value within %Bin2Cat in concatenation (.) or string at /cluster/apps/nss/metabolic/16082021/x86_64/METABOLIC-C.pl line 1537.
Do you know a strategy to solve the error?
Thanks again!
Paula.
Hi Paula, Can you paste your "omic_reads_parameters.txt" here on GitHub? Did you install METABOLIC with conda or with the regular method? Is GTDB-Tk working well?
Hi Chao!
Here is the "omics_reads_parameters.txt" file:
/cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_1.fastq,/cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_2.fastq
I am trying the regular installation (I am also trying the conda installation in parallel, but I am still working on that).
Thanks again!
Paula.
Hi Paula, Your "omics_reads_parameters.txt" seems to be good. I realized that you might be using an old version of METABOLIC-C.pl from before I made several changes to the GitHub repository. I suggest that you follow the conda installation instructions to re-install METABOLIC and try again.
Okay, thanks, Chao! I will try the conda installation once I have it working and let you know.
Hi again, Chao! I tried running METABOLIC-C with the conda installation using the following command (the "omics_reads_parameters.txt" file is the same one):
/cluster/project/magna/software/METABOLIC/METABOLIC/METABOLIC-C.pl -t 8 -in-gn /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files -r omic_reads_parameters.txt -o Guaymas_c
And I obtained the following error:
Error: Failed to open sequence file /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files/total.faa for reading
Thanks for your help!
Paula.
Can you paste all the output messages? Did Prodigal run well?
Sure, here it is: The output (if any) follows:
[2021-09-14 14:36:19] The Prodigal annotation is running...
[2021-09-14 14:37:26] The Prodigal annotation is finished
[2021-09-14 14:37:27] The hmmsearch is running with 8 cpu threads...
Error: Failed to open sequence file /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files/total.faa for reading
I have run into this before. In my case, I had not completely killed all the previous METABOLIC-C runs. I suggest fully killing/terminating all METABOLIC runs and related software runs, and then starting a new run.
Hi Chao, Thanks! I tested your suggestion and here is the outcome:
[2021-09-14 19:21:45] The Prodigal annotation is running...
[2021-09-14 19:22:27] The Prodigal annotation is finished
[2021-09-14 19:22:28] The hmmsearch is running with 8 cpu threads...
[2021-09-14 19:48:02] The hmmsearch is finished
[2021-09-14 19:49:39] Generating each hmm faa collection...
[2021-09-14 19:49:55] Each hmm faa collection has been made
[2021-09-14 19:49:55] The KEGG module result is calculating...
[2021-09-14 19:53:27] The KEGG identifier (KO id) result is calculating...
[2021-09-14 19:53:28] The KEGG identifier (KO id) seaching result is finished
[2021-09-14 19:53:28] Searching CAZymes by dbCAN2...
[2021-09-14 19:56:13] dbCAN2 searching is done
[2021-09-14 19:56:13] Searching MEROPS peptidase...
[2021-09-14 19:57:34] MEROPS peptidase searching is done
[2021-09-14 19:57:35] METABOLIC table has been generated
[2021-09-14 19:57:35] Drawing element cycling diagrams...
Loading required package: shape
[2021-09-14 20:01:04] Drawing element cycling diagrams finished
[2021-09-14 20:01:04] Drawing metabolic handoff diagrams...
[2021-09-14 20:01:09] Drawing metabolic handoff diagrams finished
[2021-09-14 20:01:09] Drawing energy flow chart...
Use of uninitialized value $cat in concatenation (.) or string at /cluster/project/magna/software/METABOLIC/METABOLIC/METABOLIC-C.pl line 1464
Use of uninitialized value in concatenation (.) or string at /cluster/project/magna/software/METABOLIC/METABOLIC/METABOLIC-C.pl line 1487.
Did you change the shebang line of METABOLIC-C.pl? Did you run METABOLIC under the conda environment?
Hi! Yes, here is the shebang line:
###########################
Another guess of mine is that GTDB-Tk has some problems. Did you check whether you can call GTDB-Tk properly and whether the GTDB-Tk result (located in the "intermediate_results" folder within the output directory) is good?
Please see this: https://github.com/AnantharamanLab/METABOLIC/issues/41. If some MAGs don't have a GTDB classification, it might result in the error you are seeing with $cat.
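One way to act on this hint is to compare the list of input MAGs against the GTDB-Tk summary table and list the genomes that have no classification. The sketch below is not METABOLIC code. It assumes the summary is tab-separated with `user_genome` and `classification` columns, as in GTDB-Tk's summary files, and the function name `unclassified_genomes` and the genome names are hypothetical.

```python
import csv
import io


def unclassified_genomes(genome_names, summary_tsv_text):
    """Return the genome names with no classification in the summary table."""
    reader = csv.DictReader(io.StringIO(summary_tsv_text), delimiter="\t")
    classified = set()
    for row in reader:
        # A missing or empty classification field counts as unclassified.
        if (row.get("classification") or "").strip():
            classified.add(row["user_genome"])
    return sorted(set(genome_names) - classified)


# Hypothetical example: bin2 has an empty classification, bin3 is absent.
example_summary = (
    "user_genome\tclassification\n"
    "bin1\td__Bacteria;p__Proteobacteria\n"
    "bin2\t\n"
)
missing = unclassified_genomes(["bin1", "bin2", "bin3"], example_summary)
```

In practice the summary text would be read from the GTDB-Tk result file in the "intermediate_results" folder, and the genome names would be the MAG file names without their extension. Any genome that is returned is a candidate for the uninitialized `$cat` warning.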
Hi Chao!
I finally fixed the error I reported but now I am getting different errors.
Here is the first one:
[2021-10-22 18:02:11] Drawing energy flow chart...
==> Processed 37/40 genomes (92%) |█████████████▉ | [ 3.59genome/s, ETA 00:00]
FATAL: Sequence identifiers must be unique. Your fasta file contains two sequences with the same id (NODE_550_length_25751_cov_6.775685_1)
Use of uninitialized value $cat in concatenation (.) or string at /cluster/project/magna/software/METABOLIC/METABOLIC/METABOLIC-C.pl line 1463.
Here is the second one:
Use of uninitialized value in concatenation (.) or string at /cluster/project/magna/software/METABOLIC/METABOLIC/METABOLIC-C.pl line 1486.
Loading required package: ggplot2
Error: Must request at least one colour from a hue palette.
In addition: Warning message:
The parameter `infer.label` is deprecated.
Use `aes(label = after_stat(stratum))`.
Execution halted
Loading required package: ggplot2
Do you know how I can fix them?
Let me know if you need me to provide you with the submitted scripts again!
Paula.
So maybe first you need to solve the input fasta sequence id duplication issue.
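Duplicate identifiers can be located with a short scan over the FASTA headers. The following is a minimal sketch under assumptions: the thread does not say which file holds the duplicate, so the function accepts any list of FASTA files, for example all input MAGs, and the names `find_duplicate_ids` and `demo_duplicates` are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def find_duplicate_ids(fasta_paths):
    """Map each repeated sequence id to the files in which it occurs."""
    seen = defaultdict(list)
    for path in fasta_paths:
        with open(path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # The id is the header text up to the first whitespace.
                    parts = line[1:].split()
                    if parts:
                        seen[parts[0]].append(path)
    return {seq_id: paths for seq_id, paths in seen.items() if len(paths) > 1}


def demo_duplicates():
    """Two hypothetical genome files that share the contig id NODE_1."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "genome_a.fasta")
        second = os.path.join(tmp, "genome_b.fasta")
        with open(first, "w") as handle:
            handle.write(">NODE_1 length=8\nACGTACGT\n>NODE_2\nGGCC\n")
        with open(second, "w") as handle:
            handle.write(">NODE_1\nTTTT\n")
        duplicates = find_duplicate_ids([first, second])
        # Report base names only so the result does not depend on the temp dir.
        return {
            seq_id: [os.path.basename(p) for p in paths]
            for seq_id, paths in duplicates.items()
        }
```

An id that is listed with two different files is shared between genomes. An id that is listed with the same file twice is repeated inside that file. Either way, renaming the affected sequences so that every id is unique removes the condition the FATAL message complains about.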
To run METABOLIC-C I am using the flags -in-gn and -r. According to the instructions, I need to provide the path to the paired-end reads. I tested the command in two ways: 1) providing a path to the files, and 2) providing a text file with the paths to the files. For the text file, I tested three different options for the syntax of its content. Here are the scripts:
1) Providing a path to the files:
bsub -n 1 -R "rusage[mem=8000]" METABOLIC-G.pl -in-gn /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files/Gamma/ -r /cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/ -o Guaymas_3
2) Providing a text file with the paths to the files (content of the text file below):
bsub -n 1 -R "rusage[mem=8000]" METABOLIC-C.pl -t 1 -in-gn /cluster/work/magna/databases_metabolic/METABOLIC_test_files/Guaymas_Basin_genome_files/Gamma/ -r omic_reads_parameters.txt -o Guaymas_9
2.1) /cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/*.fastq
2.2) /cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/
2.3) /cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_1.fastq /cluster/work/magna/databases_metabolic/METABOLIC_test_files/METABOLIC_test_reads/SRR3577362_sub_2.fastq
When trying all of the options listed above, I get the following error:
Use of uninitialized value in concatenation (.) or string at /cluster/apps/nss/metabolic/16082021/x86_64/METABOLIC-C.pl line 1788, <__IN> line 1.
Use of uninitialized value in concatenation (.) or string at /cluster/apps/nss/metabolic/16082021/x86_64/METABOLIC-C.pl line 1804.
stat: Bad file descriptor
Warning: Could not open read file "-S" for reading; skipping...
stat: Bad file descriptor
Warning: Could not open read file "Guaymas_11/All_gene_collections_mapped.1.sam" for reading; skipping...
Error: No input read files were valid
(ERR): bowtie2-align exited with value 1
[E::hts_open_format] Failed to open file "Guaymas_11/All_gene_collections_mapped.1.sorted.bam" : No such file or directory
samtools index: failed to open "Guaymas_11/All_gene_collections_mapped.1.sorted.bam": No such file or directory
rm: cannot remove 'Guaymas_11/All_gene_collections_mapped.1.sam': No such file or directory
rm: cannot remove 'Guaymas_11/.bam': No such file or directory
rm: cannot remove 'Guaymas_11/.bai': No such file or directory
Therefore, I would like to ask how I can fix the issue and what the correct syntax for the command is.
Thank you very much!
Paula.