SionBayliss / PIRATE

A toolbox for pangenome analysis and threshold evaluation.
GNU General Public License v3.0

Question: Which type of homology is used with a mix of feature types? #30

Closed cizydorczyk closed 4 years ago

cizydorczyk commented 4 years ago

Hi,

When running PIRATE with several feature types (e.g. CDS,rRNA,tRNA), some of which require nucleotide homology (i.e. rRNA,tRNA), does PIRATE automatically fall back to nucleotide sequence homology for all features? Or just for the ones requiring it?

That is, would CDS be analyzed using nucleotide or amino acid (a.a.) homology in this case? If nucleotide (which I suspect from an example on the main page), would it be difficult to implement an option to use a.a. for CDS but nucleotide for non-coding features?

Any response is greatly appreciated!! Thank you, Conrad

SionBayliss commented 4 years ago

Hi Conrad,

When using a mixture of feature types, PIRATE defaults to nucleotide identity for all features, including CDS, because all features are combined during clustering. I have considered some different approaches to this, but none are currently implemented in the software. I would recommend running PIRATE separately on the CDS features and on the non-coding features, then concatenating the outputs. I will reopen this issue if I make any progress on mixed aa/nuc runs.

All the best, Sion
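
For readers following the suggestion above, here is a minimal sketch of the "concatenate the outputs" step. It is not part of PIRATE. It assumes each run produced a tab-separated gene family table with an identical header (plausible when both runs use the same genomes, but worth checking), and it prefixes the family identifier in the first column so that IDs from the two runs cannot collide. The function name `concat_family_tables`, the prefixes, and the column names in the toy data are all hypothetical.

```python
import csv
import io


def concat_family_tables(tables, id_column=0):
    """Merge tab-separated tables that share the same header.

    tables: list of (prefix, tsv_text) pairs, one per PIRATE run.
    The prefix is prepended to the ID column so that family IDs
    from different runs stay unique (an assumption of this sketch,
    not something PIRATE does itself).
    """
    if not tables:
        raise ValueError("no tables supplied")

    merged_header = None
    merged_rows = []
    for prefix, text in tables:
        reader = csv.reader(io.StringIO(text), delimiter="\t")
        header = next(reader)
        if merged_header is None:
            merged_header = header
        elif header != merged_header:
            # Different columns usually means the runs used different genomes.
            raise ValueError(f"header mismatch in table '{prefix}'")
        for row in reader:
            if not row:
                continue  # skip blank lines
            row[id_column] = f"{prefix}_{row[id_column]}"
            merged_rows.append(row)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(merged_header)
    writer.writerows(merged_rows)
    return out.getvalue()


# Toy data with hypothetical columns; real tables contain more fields.
cds_table = "family\tgenomeA\tgenomeB\ng0001\tA_1\tB_1\n"
rna_table = "family\tgenomeA\tgenomeB\ng0001\tA_9\tB_9\n"

merged = concat_family_tables([("cds", cds_table), ("rna", rna_table)])
print(merged)
```

In practice the two strings would be read from the table written by the CDS run (amino acid homology) and the one written by the rRNA/tRNA run (nucleotide homology). The header check is deliberately strict, so a mismatch surfaces as an error rather than as silently misaligned columns.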