JosephCrispell / homoplasyFinder

A tool to identify and annotate homoplasies on a phylogeny and sequence alignment
GNU General Public License v3.0

Generic gene presence/absence file #15

Closed by molleraj 3 years ago

molleraj commented 3 years ago

Hi! I would like to use homoplasyFinder with a generic gene presence/absence matrix as input, rather than only an indel presence/absence matrix, so that I can calculate consistency indices for gene presence/absence. Could you either add this feature or share the Java source so I can modify it accordingly?

Thanks! Jon

JosephCrispell commented 3 years ago

Hi Jon,

Thanks for using homoplasyFinder. You should be able to use it on your gene presence/absence matrix without any changes. The matrix will need to follow this format:

start,end,isolateA,isolateB,isolateC
34802,35208,0,1,0
39068,39069,0,0,1

I'd suggest using the start and end positions of the genes you're interested in. homoplasyFinder doesn't treat these columns as numbers, though, so you can put anything in them as long as the values uniquely identify each gene. Use 0 for absent and 1 for present.
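For a matrix keyed by gene name rather than by coordinates, the layout above could be produced with something like the following. This is only a minimal sketch: the function name `to_presence_absence_csv`, the gene IDs, and the choice to repeat the gene ID in both the start and end columns are illustrative assumptions, relying on the point above that those columns only need to identify each gene uniquely.

```python
import csv
import io


def to_presence_absence_csv(isolates, gene_presence):
    """Build CSV text in the start,end,<isolate...> layout shown above.

    isolates: list of isolate names, in the desired column order.
    gene_presence: dict mapping a unique gene ID to the set of isolates
        that carry the gene (hypothetical input structure).
    """
    if len(set(isolates)) != len(isolates):
        raise ValueError("isolate names must be unique")

    known = set(isolates)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start", "end", *isolates])

    for gene_id, carriers in gene_presence.items():
        unknown = set(carriers) - known
        if unknown:
            raise ValueError(f"{gene_id}: unknown isolates {sorted(unknown)}")
        # Assumption: the gene ID stands in for both coordinate columns,
        # since they are used as identifiers rather than as numbers.
        flags = [1 if isolate in carriers else 0 for isolate in isolates]
        writer.writerow([gene_id, gene_id, *flags])

    return buffer.getvalue()


example = to_presence_absence_csv(
    ["isolateA", "isolateB", "isolateC"],
    {"geneX": {"isolateB"}, "geneY": {"isolateC"}},
)
print(example)
```

Because the dict keys are unique by construction, each gene gets a distinct identifier row; real start and end positions can be substituted wherever they are known.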

Let me know if that will work for you.

The Java source code is available here - I'll add this to the documentation.

molleraj commented 3 years ago

Hi Joseph,

Thanks for all that! Yes, I will enter the start and end parameters for each gene. Thanks also for sharing the Java source code.

Regards, Jon