EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

Parameters for building HMMs #316

Open sanyalab opened 8 months ago

sanyalab commented 8 months ago

Hello,

I have several protein MSAs and I am wondering which parameters to use when building HMMs. I am aware of the default command, but I am curious whether any of the non-default options has an advantage or a specific use case. For example, do the resulting HMMs differ when --wgsc is used instead of --wpb? And is there a use case for --plaplace?

Thanks, Abhijit

cryptogenomicon commented 8 months ago

I recommend using the defaults. The options you mention are ones I benchmarked in the past while settling on the defaults, and I don't prefer them.
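One way a reader might still compare the options on their own alignments is sketched below. It only assembles `hmmbuild` command lines for the default build (which already uses --wpb weighting) and for the --wgsc and --plaplace variants; it does not run them. The helper name `hmmbuild_variants`, the output naming scheme, and the file name `family.sto` are assumptions made for illustration.

```python
import shlex


def hmmbuild_variants(msa_path, prefix):
    """Return hmmbuild argument lists for a few build variants.

    Hypothetical helper: it only builds the command lines and never
    executes them. The output names "<prefix>.<variant>.hmm" are an
    arbitrary choice for this sketch.
    """
    variants = {
        "default": [],               # default build; --wpb weighting is already the default
        "wgsc": ["--wgsc"],          # GSC tree-based sequence weights
        "plaplace": ["--plaplace"],  # Laplace +1 prior
    }
    return {
        name: ["hmmbuild", *flags, f"{prefix}.{name}.hmm", msa_path]
        for name, flags in variants.items()
    }


if __name__ == "__main__":
    # "family.sto" is a placeholder alignment file name.
    for argv in hmmbuild_variants("family.sto", "family").values():
        print(shlex.join(argv))
```

The resulting profiles could then be searched against the same database and the hit lists compared, which is a much smaller version of the benchmarking described above.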

sanyalab commented 8 months ago

Thank you, Sean. I have a follow-up question regarding MSAs.

Iterative trimming and realignment (sometimes done manually) is supposed to improve the accuracy of HMMs. But how does one decide when to stop trimming? I keep debating this, because too much trimming will change the context of the HMM and produce erroneous hits when I use it to search protein databases. Are there metrics or tools for checking that the general structure of the MSA is maintained after trimming?

Best Regards, Abhijit
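One simple metric of the kind asked about is sketched below, as an illustration rather than a method endorsed in this thread. It computes per-column occupancy (the fraction of non-gap characters) and reports how many well-occupied columns survive a round of trimming. The 0.5 threshold is an assumption chosen as a rough, unweighted analogue of how `hmmbuild` picks consensus columns by default; the function names and the toy alignment are hypothetical.

```python
GAP_CHARS = set("-.~")


def column_occupancy(rows):
    """Fraction of non-gap characters in each column of an aligned block."""
    if not rows:
        return []
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all aligned rows must have the same length")
    return [
        sum(r[i] not in GAP_CHARS for r in rows) / len(rows)
        for i in range(width)
    ]


def consensus_columns(rows, min_occupancy=0.5):
    """Count columns whose occupancy reaches the (assumed) threshold."""
    return sum(o >= min_occupancy for o in column_occupancy(rows))


def retained_fraction(original, trimmed, min_occupancy=0.5):
    """Share of the original well-occupied columns still present after trimming."""
    before = consensus_columns(original, min_occupancy)
    after = consensus_columns(trimmed, min_occupancy)
    return after / before if before else 0.0


if __name__ == "__main__":
    # Toy alignment, made up for illustration.
    original = ["MKV-LAG", "MKI-LSG", "MRVQL-G", "M-V-LAG"]
    gappy_removed = [r[:3] + r[4:] for r in original]  # drop only the gappy 4th column
    over_trimmed = [r[:4] for r in original]           # cut away the C-terminal half
    print(retained_fraction(original, gappy_removed))
    print(retained_fraction(original, over_trimmed))
```

Tracked across trimming rounds, a sharp drop in the retained fraction would suggest that well-supported columns, and not just gappy ones, are being removed. It says nothing about search accuracy, which still has to be checked against known members and non-members of the family.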