AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

"Use of uninitialized value" error in METABOLIC-C.pl and METABOLIC-G.pl #111

Open Marlinski95 opened 1 year ago

Marlinski95 commented 1 year ago

Hi, I installed METABOLIC on our server and ran the test data, which worked without any issues (I eventually stopped it to run my own data, but everything seemed fine). Now I want to run it on my MAGs, but for some reason it is not working and gives me a "Use of uninitialized value" error for every line in the scripts.

Here is what I did:

Test run:

(METABOLIC) perl METABOLIC-C.pl -test true
[2022-10-18 13:38:13] The Prodigal annotation is running...

etc.

My own data:

perl METABOLIC-C.pl -in-gn /path/to/MAG/directory/ -r /path/to/unzipped/reads/CG-1_R1_CLEAN_TrimmedBoth.fastq.00.0_0.cor.fastq, /path/to/unzipped/reads/CG-1_NT/CG 1_R2_CLEAN_TrimmedBoth.fastq.00.0_0.cor.fastq

These are the errors I receive:

print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 311, <IN> line 2.
print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 311, <IN> line 2.
print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 305, <IN> line 2.
print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 311, <IN> line 2.
print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 311, <IN> line 2.
print() on closed filehandle OUT at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 305, <IN> line 2.

etc. and

Use of uninitialized value $gn_id in hash element at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 429, <INN> line 94.
Use of uninitialized value $hmm in hash element at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 429, <INN> line 94.
Use of uninitialized value $gn_id in hash element at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 431, <INN> line 94.
Use of uninitialized value $hmm in hash element at /tools/miniconda3/envs/METABOLIC/METABOLIC_running_folder/METABOLIC/METABOLIC-C.pl line 431, <INN> line 94.

My data is in fasta format and the headers look like this:

>CG-1_bin.100_k127_889664
AGGTAAGAGTTAGTTATAGTATTTAATCCACAAACAAACTGAATGGTATTTTTAGTATCT
GGATTAAGTTATAGGGTAAAATACGACAACTTACTATCTTCTTCCCACCCCTTTTTTCGC

I am not sure what is causing this issue and would truly appreciate some help with this! Best,

ChaoLab commented 1 year ago

Hi, I found a mistake in your input command: there should be no space between the two fastq paths given to -r.

It also seems that something is wrong with your genome IDs. Please see the requirements on the genome filenames: https://github.com/AnantharamanLab/METABOLIC/wiki/METABOLIC-Usage#all-required-and-optional-flags
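
For illustration, here is a minimal sketch of a pre-flight check for the reads list. The helper name `reads_arg_ok` and the paths are hypothetical placeholders and not part of METABOLIC; the only assumption taken from this thread is that the two fastq files are joined by a comma with no space.

```shell
# Hypothetical helper (not part of METABOLIC): succeed only if the
# comma-separated read list contains no space character.
reads_arg_ok() {
    case "$1" in
        *" "*) return 1 ;;   # unquoted, a space would split the list into two shell words
        *)     return 0 ;;
    esac
}

# Placeholder paths, for illustration only.
reads="/path/to/reads/sample_R1.fastq,/path/to/reads/sample_R2.fastq"

if reads_arg_ok "$reads"; then
    echo "reads list looks fine: $reads"
else
    echo "reads list contains a space: $reads" >&2
fi
```

If the check passes, the same string can be passed as a single quoted argument to `-r`.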

Marlinski95 commented 1 year ago

Hi! Thank you so much! I will look into this :)

Marlinski95 commented 1 year ago

Hey, I tested it and I think I figured it out. I removed the description from the fasta files since it was distracting and not needed in my other analyses, but it seems like the Perl script assumes a description in the fasta header? I ran it on one of my unedited files and it worked on that one.

Could that be? If so, it might be helpful to adjust that somehow to allow for fasta files that have "only" a sequence ID but no following description.
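
To make the distinction concrete, here is a rough sketch of how one could count headers that carry only an ID and no description. The function name `count_id_only_headers` and the demo records are made up for illustration; this only inspects the fasta file and says nothing about what the METABOLIC scripts actually expect.

```shell
# Hypothetical helper: count FASTA header lines that consist of an ID only
# (no whitespace-separated description after it).
count_id_only_headers() {
    awk '/^>/ && NF == 1 { n++ } END { print n + 0 }' "$1"
}

# Small self-contained demo on a temporary file with made-up records:
# one header without a description, one header with a description.
tmp=$(mktemp)
printf '>CG-1_bin.100_k127_889664\nAGGTAAGAG\n>contig_2 some description\nGGATTAAG\n' > "$tmp"
count_id_only_headers "$tmp"
rm -f "$tmp"
```

Comparing this count between an edited and an unedited file would show whether the headers really differ in the way I think they do.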

Cheers,

ChaoLab commented 1 year ago

What exactly do you mean by "sequence ID"? Do you mean allowing descriptions in the fasta file name, or in the headers within the fasta files? We currently only have restrictions on the fasta file name.