AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

Hmmsearch threshold #128

Closed Taojianchang closed 1 year ago

Taojianchang commented 1 year ago

Hello, I am using METABOLIC but am confused about the hmmsearch threshold. The TC and NC scores are not found in the HMM files in the current METABOLIC_hmm_db.tgz, such as amoA.hmm and PF02406.hmm. How is the threshold set? In addition, the hits for PF02406.hmm seemed different for the same ORFs when the whole Pfam database was used.

METABOLIC v.4

Detail command
Program: hmmsearch
Version: 3.1b2 (February 2015)
Pipeline mode: SEARCH
Option settings: hmmsearch --tblout drep_95_genomes_metabolic_out/intermediate_files/Hmmsearch_Outputs/PF02406.hmm.total.hmmsearch_result.txt -T 10 --cpu 1 /tiagor_home/jianchang/software/METABOLIC/METABOLIC4.0/METABOLIC/METABOLIC_hmm_db/PF02406.hmm drep_95_genomes/total.faa

PF02406.hmm
HMMER3/f [3.1b1 | May 2013]
NAME PF02406_full
LENG 85
ALPH amino
RF no
MM no
CONS yes
CS yes
MAP yes
DATE Mon Sep 4 18:19:16 2017
NC 10 10;
NSEQ 216
EFFN 4.123169
CKSUM 2285224032
STATS LOCAL MSV -9.1256 0.71848
STATS LOCAL VITERBI -9.4707 0.71848
STATS LOCAL FORWARD -3.9801 0.71848
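For reference, a minimal sketch of how one might check which cutoff lines (GA, TC, NC) an HMM header carries. It assumes the standard HMMER3 header layout, where each cutoff line holds a per-sequence and a per-domain value followed by a semicolon. The `read_cutoffs` helper is hypothetical and not part of METABOLIC; the header text is an abridged copy of the PF02406.hmm excerpt above.

```python
import re

# Abridged header copied from the PF02406.hmm excerpt in this thread.
HEADER = """HMMER3/f [3.1b1 | May 2013]
NAME  PF02406_full
LENG  85
ALPH  amino
DATE  Mon Sep 4 18:19:16 2017
NC    10 10;
NSEQ  216
EFFN  4.123169
"""

# Cutoff lines look like "GA 25.00 25.00;" (per-sequence, then per-domain).
CUTOFF_RE = re.compile(r"^(GA|TC|NC)\s+(\S+)\s+(\S+?);?\s*$")


def read_cutoffs(header_text):
    """Return {tag: (sequence_cutoff, domain_cutoff)} for GA/TC/NC lines found."""
    cutoffs = {}
    for line in header_text.splitlines():
        match = CUTOFF_RE.match(line)
        if match:
            cutoffs[match.group(1)] = (float(match.group(2)), float(match.group(3)))
    return cutoffs


cutoffs = read_cutoffs(HEADER)
print(cutoffs)
```

Tags missing from the returned dictionary are simply not present in the header that was scanned.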

ChaoLab commented 1 year ago

Hi, the threshold types and values for the HMM profiles are stored in METABOLIC_template_and_database/hmm_table_template.txt. We do not use the scores within the individual HMM files. We use the PF02406.hmm in the METABOLIC_hmm_db folder instead of the whole Pfam database (the profile may also have changed between version updates of the whole Pfam database).
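To illustrate the idea of applying an externally stored threshold rather than the cutoffs inside the HMM file, here is a minimal sketch that filters `hmmsearch --tblout` rows by bit score. The layout of hmm_table_template.txt is not shown in this thread, so the `THRESHOLDS` dictionary, the `filter_hits` helper, and the two sample rows are all hypothetical; only the tblout column order follows the standard HMMER format. The value 10 mirrors the `-T 10` in the command above.

```python
# Hypothetical lookup: profile file -> (score cutoff, "full" or "domain").
# This is an assumption, not the real content of hmm_table_template.txt.
THRESHOLDS = {"PF02406.hmm": (10.0, "full")}

# Made-up tblout rows in the standard hmmsearch --tblout column order.
TBLOUT = """\
# target name accession query name accession E-value score bias E-value score bias exp reg clu ov env dom rep inc description
orf_001 - PF02406_full - 1.2e-20 72.5 0.1 2.0e-20 71.8 0.1 1.0 1 0 0 1 1 1 1 made-up protein
orf_002 - PF02406_full - 0.5 8.3 0.0 0.7 7.9 0.0 1.0 1 0 0 1 1 1 0 made-up protein
"""


def filter_hits(tblout_text, cutoff, kind):
    """Keep (target, score) pairs whose bit score meets the cutoff.

    kind="full" uses the full-sequence score (column 6 of tblout);
    kind="domain" uses the best single-domain score (column 9).
    """
    column = 5 if kind == "full" else 8
    hits = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split(None, 18)  # the 19th field is the free-text description
        score = float(fields[column])
        if score >= cutoff:
            hits.append((fields[0], score))
    return hits


cutoff, kind = THRESHOLDS["PF02406.hmm"]
hits = filter_hits(TBLOUT, cutoff, kind)
print(hits)
```

Switching `kind` to "domain" compares the best single-domain score instead, which is the distinction a per-profile threshold type would need to capture.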