elizabethmcd / metabolisHMM

Tool for constructing phylogenies and summarizing metabolic characteristics based on curated and custom profile HMMs
GNU General Public License v3.0
17 stars 5 forks

Retrieving protein-coding sequence from a given function #54

Open Streptomyces1 opened 3 years ago

Streptomyces1 commented 3 years ago

Hey @elizabethmcd, good afternoon! I have a very newbie issue related to my results: I ran the summarize metabolism option on my isolated genomes, and now I'm interested in checking the protein sequence that matches a given function, carbon monoxide dehydrogenase, let's say. I see that the annotation results contain the protein sequences, and from the summary table I know I have at least one copy, so I'd like to know how to identify it among the protein sequences. Should I BLAST them one by one, or is there an easier way? Thank you!

elizabethmcd commented 3 years ago

I can't remember whether the program saves the sequences for each individual hit in the out folder, or just the HMM outputs. You could search those HMM output files, or run single-marker-phylogeny with the HMM for carbon monoxide dehydrogenase, which will give you the sequences of the hits in your genomes as both .faa and alignment files. You could just tell it to run with FastTree if you aren't interested in robust phylogenies and only want the sequences for that marker.
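
If you go the route of searching the HMM output files, something like the rough sketch below might help. It assumes the outputs are in HMMER's tabular `--tblout` format (target name in the first column, full-sequence E-value in the fifth), which I haven't confirmed for the files in the out folder. The function names and the E-value cutoff are made up for illustration and are not part of metabolisHMM.

```python
def parse_tblout_targets(tblout_text, max_evalue=1e-5):
    """Return IDs of target sequences whose E-value is at or below max_evalue.

    Assumes HMMER --tblout layout: whitespace-separated columns, with the
    target name in column 1 and the full-sequence E-value in column 5.
    The default cutoff is an arbitrary illustrative choice.
    """
    targets = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' header/footer lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue and target not in targets:
            targets.append(target)
    return targets


def parse_fasta(fasta_text):
    """Map each sequence ID (first word of the header) to its sequence."""
    records = {}
    current = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None and line.strip():
            records[current].append(line.strip())
    return {name: "".join(chunks) for name, chunks in records.items()}


def extract_hit_sequences(tblout_text, fasta_text, max_evalue=1e-5):
    """Return {sequence ID: protein sequence} for hits found in the FASTA."""
    sequences = parse_fasta(fasta_text)
    hits = parse_tblout_targets(tblout_text, max_evalue)
    return {name: sequences[name] for name in hits if name in sequences}
```

To use it, read the HMM output for the marker and the genome's protein .faa file into strings (for example with `pathlib.Path(...).read_text()`) and pass both to `extract_hit_sequences`. The IDs in the hit table have to match the first word of the FASTA headers for the lookup to work.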