davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

Discussion about adding new proteins to existing orthogroups? #691

Open jolespin opened 2 years ago

jolespin commented 2 years ago

I'm working on a project where I get data in series. For each round, I run OrthoFinder, which works great and produces extremely intuitive output, but it takes a long time and uses a lot of compute power.

Is it possible to avoid rerunning OrthoFinder for every round and instead use a post-hoc method, based on some metric, to append new proteins to existing orthogroups?

For example, what if I ran a Diamond alignment of a new protein against all proteins that already belong to orthogroups, and then added it to an orthogroup if its average percent identity or bitscore against that group's members was above a certain threshold? Or is it way less straightforward than that?
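The thresholding idea in the question can be sketched roughly as follows. This is only an illustration of the proposal, not something OrthoFinder provides: the function name `assign_by_mean_bitscore`, the cutoff values, and the toy hits are all hypothetical, and the hits are assumed to have already been parsed from a tabular Diamond run into `(query, subject, percent identity, bitscore)` tuples.

```python
from collections import defaultdict


def assign_by_mean_bitscore(hits, gene_to_og, min_mean_bitscore=100.0, min_hits=2):
    """Assign each query to the orthogroup with the highest mean bitscore.

    hits: iterable of (query, subject, pident, bitscore) tuples.
    gene_to_og: dict mapping known protein IDs to orthogroup IDs.
    The thresholds are placeholders and would need tuning on control data.
    Queries without any hit to a known protein do not appear in the result.
    """
    # query -> orthogroup -> list of bitscores against that orthogroup's members
    scores = defaultdict(lambda: defaultdict(list))
    for query, subject, _pident, bitscore in hits:
        og = gene_to_og.get(subject)
        if og is not None:
            scores[query][og].append(bitscore)

    assignments = {}
    for query, per_og in scores.items():
        best_og, best_mean = None, 0.0
        for og, values in per_og.items():
            if len(values) < min_hits:
                continue  # too few supporting hits to trust the average
            mean = sum(values) / len(values)
            if mean > best_mean:
                best_og, best_mean = og, mean
        # None means "no orthogroup passed the threshold"
        assignments[query] = best_og if best_mean >= min_mean_bitscore else None
    return assignments


# Toy data (made up for illustration only)
gene_to_og = {
    "spA_g1": "OG0000001", "spB_g1": "OG0000001",
    "spA_g2": "OG0000002", "spB_g2": "OG0000002",
}
hits = [
    ("new1", "spA_g1", 85.0, 310.0),
    ("new1", "spB_g1", 80.0, 290.0),
    ("new1", "spA_g2", 35.0, 60.0),
    ("new2", "spA_g2", 30.0, 45.0),
    ("new2", "spB_g2", 28.0, 40.0),
]
result = assign_by_mean_bitscore(hits, gene_to_og)
print(result)
```

A fixed similarity cutoff ignores the gene-tree information that OrthoFinder uses, so borderline proteins (for example, recent paralogs) may land in a different group than a full rerun would give them. Checking the thresholds against proteins whose orthogroup is already known seems advisable.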

pierrj commented 2 years ago

Were you able to find a solution for this? I am trying to do exactly the same thing but am having a hard time figuring out a good solution.

I know there is an option to add species to an existing OrthoFinder analysis and reuse the previously computed results, but I am not looking to add whole new species, just a single protein at a time.

pierrj commented 2 years ago

I ended up just running BLASTP with my protein against a database containing all of the proteins I used for my OrthoFinder run, and it seemed to work pretty well on a control dataset.

Please do let me know if there are any other solutions out there.
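For anyone who wants to try the BLASTP route, here is a minimal sketch of one way it could be wired up. It is not the exact procedure used above. It assumes the orthogroups come from an `Orthogroups.tsv`-style table (first column is the orthogroup ID, one column per species, genes within a cell separated by commas) and that BLASTP was run with the default 12-column tabular output (`-outfmt 6`). The function names, the e-value cutoff, and the toy data are hypothetical.

```python
import csv
import io


def load_gene_to_orthogroup(handle):
    """Build a gene -> orthogroup map from an Orthogroups.tsv-style table."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip header row: "Orthogroup" followed by species names
    gene_to_og = {}
    for row in reader:
        if not row:
            continue
        og = row[0]
        for cell in row[1:]:
            # Assumes gene IDs themselves contain no commas
            for gene in cell.split(","):
                gene = gene.strip()
                if gene:
                    gene_to_og[gene] = og
    return gene_to_og


def assign_by_best_hit(blast_lines, gene_to_og, max_evalue=1e-5):
    """Assign each query to the orthogroup of its highest-bitscore hit.

    blast_lines: lines of tabular BLASTP output with the default columns
    (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore). max_evalue is an illustrative cutoff, not a recommendation.
    """
    best = {}  # query -> (subject, bitscore)
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue or subject not in gene_to_og:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: gene_to_og[subject] for query, (subject, _) in best.items()}


# Toy inputs (made up for illustration only)
orthogroups_tsv = io.StringIO(
    "Orthogroup\tspA\tspB\n"
    "OG0000000\tspA_g1, spA_g3\tspB_g1\n"
    "OG0000001\tspA_g2\t\n"
)
blast_lines = [
    "new1\tspB_g1\t78.2\t200\t40\t2\t1\t200\t1\t198\t1e-90\t280",
    "new1\tspA_g2\t31.0\t150\t90\t5\t10\t160\t5\t150\t1e-8\t55.1",
    "new2\tspA_g2\t25.0\t80\t55\t3\t1\t80\t1\t78\t0.5\t28.0",
]
gene_to_og = load_gene_to_orthogroup(orthogroups_tsv)
assignments = assign_by_best_hit(blast_lines, gene_to_og)
print(assignments)
```

Queries with no hit passing the cutoff are simply left out of the result, which makes it easy to flag them for a closer look. As with the control dataset mentioned above, holding out proteins with known orthogroups is probably the simplest way to see how often the best hit agrees with OrthoFinder.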