AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

Question: Can I run my own data using "test=true" option? #52

Closed Qicheng-Xu closed 3 years ago

Qicheng-Xu commented 3 years ago

Hi, I have two questions.

  1. Can I run my own data using the "test=true" option? When I run the test data, it seems to work smoothly. But when I run my own data with "perl METABOLIC-C.pl -t 40 -m-cutoff 0.75 -in-gn Genome_files -kofam-db full -r omic_reads_parameters.txt -o METABOLIC_out", I get many errors. For example: Traceback (most recent call last): File "/project/qcx/test/METABOLIC_running_folder/METABOLIC/Accessory_scripts/hmmscan-parser-dbCANmeta.py", line 36, in <module> with open('temp') as f: FileNotFoundError: [Errno 2] No such file or directory: 'temp'. I don't know why this happens. So, can I put my data into the test_files directory and run the program with the "test=true" option? Will the results differ from a normal run?
  2. Can I directly input several metagenomic datasets? In your test data, there is only one sample in the METABOLIC_test_reads directory (SRR3577362_sub_1.fastq and SRR3577362_sub_2.fastq). Can I input more samples (sample1_1.fastq; sample1_2.fastq; sample2_1.fastq; sample2_2.fastq and so on)? Will they be read automatically?

Thank you for your help in advance! Bests, Qicheng

Qicheng-Xu commented 3 years ago

Also, when I run my own data, I get these warnings: Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1464. Use of uninitialized value in concatenation (.) or string at METABOLIC-C.pl line 1487. Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1605. Use of uninitialized value $cat in concatenation (.) or string at METABOLIC-C.pl line 1632. GTDB-Tk works well, and the shebang has been changed:

#!/project/qcx/.conda/envs/METABOLIC_v4.0/bin/perl # This shebang should be changed to the perl in the METABOLIC_v4.0 conda env

When I run the test files, everything is OK.

zh8008 commented 3 years ago

I have similar problems!

ChaoLab commented 3 years ago

Hi Qicheng, the 'temp' problem occurs because the Python script ("hmmscan-parser-dbCANmeta.py") had not been updated to parse multiple MAGs in parallel. I have attached a new version to replace it, and I have also updated "Accessory_scripts.tgz" in the GitHub repository. I hope this solves the problem.

https://github.com/AnantharamanLab/METABOLIC/files/5605658/hmmscan-parser-dbCANmeta.py.txt
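For readers wondering how a missing 'temp' file can arise at all, here is a minimal sketch of the general failure mode and its usual remedy. It is not the code of hmmscan-parser-dbCANmeta.py: the function name `parse_with_private_temp` and the comment-filtering step are hypothetical, and the assumption is simply that several parsing jobs running at once should not share one fixed file name such as 'temp'.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def parse_with_private_temp(lines):
    """Drop comment lines, staging the result in a temp file owned by this call.

    Hypothetical stand-in for a parser step; not METABOLIC's actual code.
    """
    # mkstemp creates a uniquely named file, so parallel calls cannot collide
    # the way they could if every call wrote to and deleted a file named 'temp'.
    fd, path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as out:
            for line in lines:
                if not line.startswith("#"):
                    out.write(line + "\n")
        with open(path) as f:
            return [row.rstrip("\n") for row in f]
    finally:
        # Each call removes only its own file.
        os.remove(path)


if __name__ == "__main__":
    # Toy inputs standing in for per-MAG hmmscan tables (made-up content).
    inputs = [["# header", f"hit_{i}"] for i in range(8)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(parse_with_private_temp, inputs))
    print(results)
```

With a shared name, one job can delete the file while another is about to open it, which would produce exactly a FileNotFoundError; whether that is the precise mechanism in the old script is an assumption here.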

ChaoLab commented 3 years ago

Regarding the "$cat" warnings: it seems that your "$cat" is empty, which means that one or more of your input MAGs were not assigned a phylum-level taxonomy. I suggest first checking whether GTDB-Tk ran correctly, then checking whether your input file formats and option usage are correct.
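One way to look for MAGs without a phylum assignment is to scan the GTDB-Tk summary table. The sketch below assumes a tab-separated summary with `user_genome` and `classification` columns and rank prefixes such as `p__`; the function name `genomes_without_phylum` and the example rows are made up for illustration, and how METABOLIC itself fills "$cat" is not shown here.

```python
import csv
import io


def genomes_without_phylum(summary_tsv_text):
    """Return genome IDs whose classification string has no phylum name."""
    missing = []
    reader = csv.DictReader(io.StringIO(summary_tsv_text), delimiter="\t")
    for row in reader:
        ranks = (row.get("classification") or "").split(";")
        # Take the text after the 'p__' prefix; an empty string means no phylum.
        phylum = next((r[3:] for r in ranks if r.startswith("p__")), "")
        if not phylum:
            missing.append(row["user_genome"])
    return missing


if __name__ == "__main__":
    # Made-up rows in the assumed layout, not real GTDB-Tk output.
    example = (
        "user_genome\tclassification\n"
        "bin_1\td__Bacteria;p__Proteobacteria;c__Gammaproteobacteria\n"
        "bin_2\tUnclassified Bacteria\n"
        "bin_3\td__Archaea;p__;c__\n"
    )
    print(genomes_without_phylum(example))
```

Any genome this reports would be a candidate for the empty value behind the warnings, though that link is an inference from the reply above rather than something verified against METABOLIC-C.pl.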

Qicheng-Xu commented 3 years ago

Thank you for your reply. I will try to fix the problem as you suggested and will report back on whether it works. Bests.

zh8008 commented 3 years ago
  2. Can I directly input several metagenomic datasets? In your test data, there is only one sample in the METABOLIC_test_reads directory (SRR3577362_sub_1.fastq and SRR3577362_sub_2.fastq). Can I input more samples (sample1_1.fastq; sample1_2.fastq; sample2_1.fastq; sample2_2.fastq and so on)? Will they be read automatically?

Could you answer this question please? Thank you.

ChaoLab commented 3 years ago

Hi zh8008, the paired reads must be listed in the "omic_reads_parameters.txt" file. If you have multiple sets of reads, give the full path to each pair of reads, one pair per line. METABOLIC does not automatically pick up read files from the test reads directory; it uses only the reads specified via the "-r" option. The test option is only for checking whether METABOLIC is properly installed. For your own data, use the "-in-gn" and "-r" options to point to the locations of your genomes and reads (for details, please refer to https://github.com/AnantharamanLab/METABOLIC/wiki/METABOLIC-Usage#how-to-run-metabolic).

Best!
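For anyone with many samples, the one-pair-per-line entries described above could be generated rather than typed by hand. This is only a sketch: it assumes files are named `<sample>_1.fastq` and `<sample>_2.fastq` as in the question, and that the two full paths of a pair are separated by a comma. The separator and any header line the file may need are assumptions that should be checked against the template in the wiki; the function names are hypothetical.

```python
import os
import re


def pair_reads(fastq_paths):
    """Group files named <sample>_1.fastq / <sample>_2.fastq into pairs."""
    pairs = {}
    for path in fastq_paths:
        m = re.fullmatch(r"(.+)_([12])\.fastq", os.path.basename(path))
        if m:
            # Store the full (absolute) path under mate number "1" or "2".
            pairs.setdefault(m.group(1), {})[m.group(2)] = os.path.abspath(path)
    # Keep only samples for which both mates were found.
    return [(p["1"], p["2"]) for _, p in sorted(pairs.items()) if len(p) == 2]


def reads_parameter_lines(fastq_paths):
    """One pair of full paths per line; the comma separator is an assumption."""
    return [f"{r1},{r2}" for r1, r2 in pair_reads(fastq_paths)]


if __name__ == "__main__":
    # Hypothetical file locations for illustration only.
    files = [
        "/data/reads/sample1_1.fastq",
        "/data/reads/sample1_2.fastq",
        "/data/reads/sample2_1.fastq",
        "/data/reads/sample2_2.fastq",
    ]
    for line in reads_parameter_lines(files):
        print(line)
```

The printed lines would then be pasted into the file passed to "-r". Samples with only one mate present are skipped silently in this sketch, so it is worth comparing the number of lines against the number of samples.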

Qicheng-Xu commented 3 years ago

METABOLIC works well now. Thank you for your help! I'll close the issue. Bests, Qicheng.