AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

-test true: Data test results #198

Open Ryan-OCEA opened 1 month ago

Ryan-OCEA commented 1 month ago

I've performed a couple of runs using METABOLIC-C, and I am not convinced that it ran properly. I am not certain that the program runs as intended when tested with "-test true", because a few key figures that I am interested in for my own analyses (such as the functional network diagrams) are blank. Is there a copy of the "METABOLIC_out" folder, with all the associated output files and diagrams from the -test true run, available to confirm that the program is running fully? Alternatively, since the -test true command completed successfully for me, is that alone enough to confirm that the software was installed and is running correctly?

Thank you for any help and/or guidance! Ryan

mgabriell1 commented 1 month ago

Hi, how does your METABOLIC_result.xlsx file look? I just ran the test as well and I get figures (not empty ones), but that file looks odd: all the hits are absent from the individual genome columns and appear only in the "total" columns of the "HMMHitNum", "FunctionHit" and "KEGGModuleHit" tables. For example: [image]

On the other hand, in the "dbCAN2Hit" and "MEROPSHit" tables I see hits for the genomes, but the "total" column is empty. For example: [image]
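For anyone comparing results, one way to check for this kind of pattern without scrolling through the workbook by hand is to list, per sheet, the columns that are entirely blank. The following is only a minimal sketch: it assumes pandas and openpyxl are installed, the helper names (`blank_columns`, `report_workbook`) are hypothetical and not part of METABOLIC, and the column names in the comments are made up for illustration.

```python
from typing import Dict, List, Mapping


def _is_blank(value: object) -> bool:
    """Treat None, NaN, and empty or whitespace-only strings as blank."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is never equal to itself
        return True
    return isinstance(value, str) and value.strip() == ""


def blank_columns(rows: List[Mapping[str, object]]) -> List[str]:
    """Return the names of columns in which every row is blank."""
    columns: List[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)  # keep first-seen column order
    return [c for c in columns if all(_is_blank(r.get(c)) for r in rows)]


def report_workbook(path: str) -> Dict[str, List[str]]:
    """Map each sheet name to its all-blank columns.

    Assumption: pandas (with openpyxl) is available; imported lazily so the
    pure-Python helpers above work without it.
    """
    import pandas as pd

    sheets = pd.read_excel(path, sheet_name=None)  # dict: sheet name -> DataFrame
    return {name: blank_columns(df.to_dict("records")) for name, df in sheets.items()}


# Example with hypothetical rows: the genome column is empty, the total is not.
# blank_columns([{"Gene": "K00001", "Genome1": "", "total": 3}])
```

Calling `report_workbook` on the path to your own METABOLIC_result.xlsx would then show whether the genome columns or the "total" columns are the empty ones in each sheet. A sheet with headers but no data rows reports no blank columns in this sketch, so a completely empty sheet needs a separate check.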

The installation was performed following the instructions on the wiki (creating a conda environment and then running the run_to_setup.sh script), and the log does not show any odd behaviour:

[2024-10-23 08:28:34] The Prodigal annotation is running...
[2024-10-23 08:29:12] The Prodigal annotation is finished
[2024-10-23 08:29:13] The hmmsearch is running with 5 cpu threads...
[2024-10-23 09:18:43] The hmmsearch is finished
[2024-10-23 09:36:03] The hmm hit result is calculating...
[2024-10-23 09:36:03] Generating each hmm faa collection...
[2024-10-23 09:36:08] Each hmm faa collection has been made
[2024-10-23 09:36:08] The KEGG module result is calculating...
[2024-10-23 09:37:59] The KEGG identifier (KO id) result is calculating...
[2024-10-23 09:37:59] The KEGG identifier (KO id) seaching result is finished
[2024-10-23 09:37:59] Searching CAZymes by dbCAN2...
[2024-10-23 09:41:37] dbCAN2 searching is done
[2024-10-23 09:41:37] Searching MEROPS peptidase...
[2024-10-23 09:47:28] MEROPS peptidase searching is done
[2024-10-23 09:47:48] METABOLIC table has been generated
[2024-10-23 09:47:48] Drawing element cycling diagrams...
Loading required package: shape
Loading required package: shape
[2024-10-23 09:47:50] Drawing element cycling diagrams finished
METABOLIC-G was done, the total running time: 01:19:17 (hh:mm:ss)
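
To see where the time goes in a log like this one, the gap between consecutive timestamped lines can be computed. This is just a sketch using the Python standard library; it assumes the `[YYYY-MM-DD HH:MM:SS] message` layout shown above, and the function name `step_durations` is hypothetical.

```python
import re
from datetime import datetime

# Matches "[2024-10-23 08:29:13] message"; the opening bracket is optional
# so that a line with a clipped "[" still parses.
STAMP = re.compile(r"^\[?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s*(.*)$")


def step_durations(log_text):
    """Return (message, seconds until the next timestamped line) pairs."""
    events = []
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if match:  # lines without a timestamp (e.g. R package messages) are skipped
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2)))
    return [
        (message, (next_stamp - stamp).total_seconds())
        for (stamp, message), (next_stamp, _) in zip(events, events[1:])
    ]


# Usage: print each step with the time elapsed before the next log entry.
# for message, seconds in step_durations(open("run.log").read()):
#     print(f"{seconds:8.0f} s  {message}")
```

Applied to the log above, the hmmsearch step (08:29:13 to 09:18:43) accounts for 2970 seconds, a little under 50 minutes of the 01:19:17 total.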

Marco

Ryan-OCEA commented 4 weeks ago

Hey Marco,

Both the "dbCAN2Hit" and "MEROPSHit" tables in my METABOLIC_result.xlsx file appear blank on my end. Your discrepancy is odd; however, I do not have a "total.Hits" column in any of the sheets in my METABOLIC_result.xlsx file anyway! I installed the program following the full installation steps; maybe I'll give it a shot with the conda environment.

Function Hit Sheet:

Screenshot 2024-10-24 at 10 53 15 AM

dbCAN2Hit Sheet:

Screenshot 2024-10-24 at 10 59 11 AM

I'll let you know if I can make any progress on this!

Until then, hitting my head against my computer screen in solidarity, Ryan