statonlab / tripal_ortholog

Implements orthologous groups and the chado group module

What input formats will we expect? #1

Open bradfordcondon opened 6 years ago

bradfordcondon commented 6 years ago

I imagine a two-column TSV: feature name, then orthogroup name.

We need to make a list of software outputs (e.g. OrthoMCL) and see if there are standard formats to support.
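A minimal sketch of reading such a file, assuming exactly two tab-separated columns (feature name, orthogroup name) and no header row. The function name `read_feature_orthogroup_tsv` and the handling of `#` comment lines are illustrative assumptions, not something the module defines.

```python
import csv
import io
from collections import defaultdict


def read_feature_orthogroup_tsv(text):
    """Map each orthogroup name to the list of feature names assigned to it."""
    groups = defaultdict(list)
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # Skip blank lines and (assumed) comment lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row!r}")
        feature_name, orthogroup_name = (cell.strip() for cell in row)
        groups[orthogroup_name].append(feature_name)
    return dict(groups)
```

Grouping by orthogroup up front would make it easy to create each group once and then attach its member features.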

bradfordcondon commented 6 years ago

OrthoMCL:

ORTHOMCL610(4 genes,4 taxa): NC01972(NC) PF02630(PF) TA00120(TA) TG07927(TG)

This cluster, number 610, has 4 genes from 4 taxa, one from each of the species used as input.

So long as the gene name matches the feature's uniquename, we'll have no problems, and that's all we'll need to import.
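As a rough sketch, a line in the OrthoMCL shape shown above could be split into a cluster name and its `(gene, taxon)` pairs like this. The regular expressions are inferred from that single example line, so other OrthoMCL versions may need adjustments; `parse_orthomcl_line` is a hypothetical helper name.

```python
import re

# Header such as "ORTHOMCL610(4 genes,4 taxa):" followed by the member list.
HEADER = re.compile(
    r"^(?P<group>[^\s(]+)\((?P<genes>\d+) genes,\s*(?P<taxa>\d+) taxa\):\s*(?P<members>.*)$"
)
# Member such as "NC01972(NC)": gene name, then taxon code in parentheses.
MEMBER = re.compile(r"(\S+?)\(([^()\s]+)\)")


def parse_orthomcl_line(line):
    """Return (cluster_name, [(gene_name, taxon_code), ...]) for one line."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"not an OrthoMCL cluster line: {line!r}")
    members = MEMBER.findall(match.group("members"))
    # Cross-check against the gene count declared in the header.
    if len(members) != int(match.group("genes")):
        raise ValueError(f"gene count mismatch in {match.group('group')}")
    return match.group("group"), members
```

Each returned gene name would then be looked up against the feature's uniquename before import.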

bradfordcondon commented 6 years ago

What about annotating the groups? For example, maybe we pick a representative gene from the group and use that feature to annotate the group as a whole.

bradfordcondon commented 6 years ago

OrthoFinder:

OG0000526 gi|284812254|gb|AAP57010.2| gi|3844631|gb|AAC71238.1|
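A minimal sketch for a line in the shape shown above, assuming the orthogroup ID comes first and the members follow, separated by whitespace. Stripping a trailing colon from the ID is an assumption to cover output variants that write `OG0000526:`; `parse_orthofinder_line` is a hypothetical helper name.

```python
def parse_orthofinder_line(line):
    """Return (orthogroup_id, [member_name, ...]) for one orthogroup line."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected an ID and at least one member: {line!r}")
    # Tolerate an optional trailing colon after the ID (assumed variant).
    group_id = fields[0].rstrip(":")
    return group_id, fields[1:]
```

Note that member names such as `gi|284812254|gb|AAP57010.2|` contain pipe characters, so they must be matched against feature names as whole strings rather than split further.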

Single copy orthogroups

SingleCopyOrthogroups.txt

This is simply a newline-separated list of orthogroups. These groups should be tagged as "single copy" in the group props.
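One way the tagging could look, as a sketch: read the IDs from the file contents into a set, then mark each imported group. The property key `single_copy` and both function names are placeholders; the actual group prop term has not been decided in this thread.

```python
def read_single_copy_ids(text):
    """Collect orthogroup IDs from newline-separated text, ignoring blanks."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def build_group_props(group_ids, single_copy_ids):
    """Map each group ID to a props dict with a boolean single-copy flag."""
    # "single_copy" is a hypothetical property name used for illustration.
    return {gid: {"single_copy": gid in single_copy_ids} for gid in group_ids}
```

For example, `build_group_props(["OG0000526", "OG0000527"], read_single_copy_ids("OG0000527\n"))` would flag only `OG0000527`.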