veg / hyphy-analyses

HyPhy standalone analyses
MIT License

Input sequence alignments #20

Open ritafonso opened 2 years ago

ritafonso commented 2 years ago

Hi,

I'm using MEME and FUBAR on a dataset of 6 species, but I'm mainly interested in one of them. For that species, should I include all of the individual sequences I have, or just a consensus? Also, should I use separate phased alleles or a single consensus sequence for each species?

Thanks in advance, Rita

spond commented 2 years ago

Dear @ritafonso,

MEME and FUBAR are designed to work with fixed differences, not within-population polymorphism. That said, people commonly analyze some types of data (e.g., viruses or bacteria) where each sequence is one individual. I'm not sure what you mean by separate phased alleles.

Also, be advised that neither MEME nor FUBAR will work particularly well on small datasets (6 sequences). Consider using a recent modification of FEL.

Best, Sergei

ritafonso commented 2 years ago

Dear @spond,

Thank you very much for your quick response!

By separate phased alleles I mean the two alleles of a gene in a diploid individual, but I guess that doesn't matter if the software works only with fixed differences.

How many sequences would you consider a dataset large enough to be worth analyzing with MEME or FUBAR?

Best regards,

Rita