eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

Finding orthologs between species #306

Closed dkeitley closed 3 years ago

dkeitley commented 3 years ago

Hi,

Apologies if this has been documented/answered elsewhere.

I'm hoping to obtain a list of gene orthologs between the sea urchin (Strongylocentrotus purpuratus) and zebrafish (Danio rerio) and wanted to confirm the approach for doing this with eggNOG.

I've run the eggNOG mapper on the protein sequences for the genes of both species (using the default settings with the taxonomic scope limited to chordates) which gives me output files with a list of eggNOG OGs.

From exploring the different OGs returned, I get the impression that they are ordered from least to most phylogenetically restricted(?). So is the idea then to try and match these most restricted OG IDs (assuming they contain both species) between the outputs from the two species?

And then following on from that, how do I then identify whether the orthologs are one-to-one vs many-to-one etc? Do I count for each gene, the number of genes in the other species with a matching OG ID?
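
The counting idea in this question can be sketched as follows. This is only an illustration on made-up data: the gene and OG identifiers and the function name `classify_by_shared_og` are hypothetical, and the input is assumed to be one OG ID per gene, already extracted from the annotation output of each species. Sharing an OG only means that two genes belong to the same group at that taxonomic level, so the labels are a coarse summary and not a confirmed orthology call.

```python
from collections import defaultdict


def _size_word(genes):
    """Return 'one' for a single gene, 'many' otherwise."""
    return "one" if len(genes) == 1 else "many"


def classify_by_shared_og(og_a, og_b):
    """Group genes of two species by shared OG ID and label the relationship.

    og_a, og_b: dicts mapping gene ID -> OG ID (hypothetical input format).
    Returns {og_id: (genes_a, genes_b, label)} for OGs present in both species.
    """
    members_a = defaultdict(list)
    members_b = defaultdict(list)
    for gene, og in og_a.items():
        members_a[og].append(gene)
    for gene, og in og_b.items():
        members_b[og].append(gene)

    result = {}
    # Only OGs seen in both species can link genes across them.
    for og in sorted(members_a.keys() & members_b.keys()):
        genes_a = sorted(members_a[og])
        genes_b = sorted(members_b[og])
        label = f"{_size_word(genes_a)}-to-{_size_word(genes_b)}"
        result[og] = (genes_a, genes_b, label)
    return result


# Made-up identifiers, for illustration only.
urchin = {"sp_gene1": "OG_X", "sp_gene2": "OG_Y", "sp_gene3": "OG_Y", "sp_gene4": "OG_Z"}
zebrafish = {"dr_gene1": "OG_X", "dr_gene2": "OG_Y", "dr_gene3": "OG_W"}

groups = classify_by_shared_og(urchin, zebrafish)
for og, (genes_a, genes_b, label) in groups.items():
    print(og, label, genes_a, genes_b)
```

Genes whose OG appears in only one of the two inputs (here `sp_gene4` and `dr_gene3`) are left out of the result.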

Many thanks,

Dan

Cantalapiedra commented 3 years ago

Hi,

Sorry for the delay in answering.

> I'm hoping to obtain a list of gene orthologs between the sea urchin (Strongylocentrotus purpuratus) and zebrafish (Danio rerio) and wanted to confirm the approach for doing this with eggNOG.

In my opinion, eggNOG/eggNOG-mapper is not the right tool for this, although the emapper results could help confirm the orthology or functional relationship. I am sure there are methods better suited to identifying orthologs between two genomes (OrthoMCL, maybe?).

> I've run the eggNOG mapper on the protein sequences for the genes of both species (using the default settings with the taxonomic scope limited to chordates) which gives me output files with a list of eggNOG OGs.

> From exploring the different OGs returned, I get the impression that they are ordered from least to most phylogenetically restricted(?). So is the idea then to try and match these most restricted OG IDs (assuming they contain both species) between the outputs from the two species?

> And then following on from that, how do I then identify whether the orthologs are one-to-one vs many-to-one etc? Do I count for each gene, the number of genes in the other species with a matching OG ID?

In the most recent emapper versions you should be able to get this information from the ".emapper.orthologs" output file, which is produced when running with --report_orthologs (or by default in the web version).
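
As a rough sketch of how that file could be filtered for a single target species: the code below assumes a tab-separated layout with the columns query, orthology type, species, and a comma-separated list of orthologs, and with comment and header lines starting with "#". That layout is an assumption to check against the header of your own file. The function name `orthologs_for_species` and all sample identifiers are made up for illustration.

```python
import csv
from collections import defaultdict


def orthologs_for_species(lines, species_name):
    """Collect (orth_type, ortholog list) per query for one target species.

    ASSUMED layout (verify against your file's header line):
    query <TAB> orth_type <TAB> species <TAB> comma-separated orthologs,
    with lines starting with '#' treated as comments or headers.
    """
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    result = defaultdict(list)
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # skip rows that do not match the assumed layout
        query, orth_type, species, orthologs = row[:4]
        if species_name.lower() in species.lower():
            names = [name.strip() for name in orthologs.split(",") if name.strip()]
            result[query].append((orth_type, names))
    return dict(result)


# In-memory sample with made-up gene and protein IDs, standing in for a real file.
sample = [
    "#query\torth_type\tspecies\torthologs\n",
    "geneA\tone2one\tDanio rerio(7955)\tzf_prot_1\n",
    "geneA\tone2many\tHomo sapiens(9606)\ths_prot_1,hs_prot_2\n",
    "geneB\tmany2one\tDanio rerio(7955)\tzf_prot_2\n",
]

hits = orthologs_for_species(sample, "Danio rerio")
for query, entries in hits.items():
    for orth_type, names in entries:
        print(query, orth_type, ",".join(names))
```

For a real run, the list could be replaced by an open file handle, since the function only needs an iterable of lines.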

Best, Carlos

Cantalapiedra commented 3 years ago

Closing this. Please reopen it or open a new issue if needed.