merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Is it normal to get gene clusters with the same COG id? #1573

Closed Lily-WL closed 3 years ago

Lily-WL commented 3 years ago

Hi, Meren. Thanks for your program. Following the pangenomics tutorial, I got a summary *.txt. However, I found that several gene clusters often correspond to the same COG id, as in the following table. I also found that a gene cluster specific to one group of genomes can share its COG id with another gene cluster that is specific to a different group of genomes. As a result, although different groups of genomes have their own gene clusters, they all carry a gene with the same function and the same COG id. Is this normal? Are these kinds of genes "core genes" or not? How can we explain this? Thanks very much.

gene_cluster_id COG_FUNCTION_ACC
GC_00000358 COG0019
GC_00000402 COG0019
GC_00004248 COG0019
GC_00002268 COG0019
GC_00000611 COG0020
GC_00000027 COG0021
GC_00001874 COG0022
GC_00000801 COG0023
GC_00000821 COG0024
GC_00001354 COG0025
GC_00002259 COG0025
GC_00002491 COG0025
GC_00002756 COG0025
GC_00002969 COG0025
GC_00004746 COG0025|COG1226
GC_00004623 COG0026
GC_00000842 COG0027
GC_00000398 COG0028
GC_00006208 COG0028
GC_00004255 COG0028
GC_00006986 COG0028

meren commented 3 years ago

Hi Lily,

> Hi, Meren. Thanks for your program.

Just to make sure it is clear: anvi'o is a team effort, developed and maintained thanks to the contributions of many people :) Plus, it belongs to the public as per the General Public License, so anvi'o is as much yours as it is the anvi'o developers'.

Regarding your question: It is normal! Sequence space is much more diverse than functional space. Thus, two genes with distinct sequences can resolve to the same function (regardless of whether they are core or accessory among a set of genomes). That is pretty much evolution.
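If it helps to see how widespread this is in a given pangenome, here is a minimal sketch that groups gene clusters by COG accession. It assumes a tab-separated table with the two column names shown in the question (`gene_cluster_id` and `COG_FUNCTION_ACC`); a real summary file has more columns and the annotation column may be named differently depending on the COG version used. The function name `clusters_per_cog` is made up for this illustration and is not part of anvi'o, and the data is just an excerpt of the table above.

```python
import csv
import io
from collections import defaultdict

# Excerpt of the table from the question, written as tab-separated text.
EXCERPT = (
    "gene_cluster_id\tCOG_FUNCTION_ACC\n"
    "GC_00000358\tCOG0019\n"
    "GC_00000402\tCOG0019\n"
    "GC_00004248\tCOG0019\n"
    "GC_00002268\tCOG0019\n"
    "GC_00000611\tCOG0020\n"
    "GC_00001354\tCOG0025\n"
    "GC_00002259\tCOG0025\n"
    "GC_00002491\tCOG0025\n"
    "GC_00002756\tCOG0025\n"
    "GC_00002969\tCOG0025\n"
    "GC_00004746\tCOG0025|COG1226\n"
)


def clusters_per_cog(handle, cluster_col="gene_cluster_id",
                     cog_col="COG_FUNCTION_ACC"):
    """Map each COG accession to the set of gene clusters annotated with it.

    Hypothetical helper; the column names are assumptions taken from the
    table in the question.
    """
    groups = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        accessions = row.get(cog_col) or ""
        # A single entry can list several accessions separated by "|".
        for acc in accessions.split("|"):
            if acc:
                groups[acc].add(row[cluster_col])
    return groups


groups = clusters_per_cog(io.StringIO(EXCERPT))

# Keep only the accessions that are shared by more than one gene cluster.
shared = {acc: sorted(gcs) for acc, gcs in groups.items() if len(gcs) > 1}

for acc, gcs in sorted(shared.items()):
    print(acc, len(gcs), ", ".join(gcs))
```

To run it on an actual file, pass an open file handle instead of the `io.StringIO` wrapper. Accessions that show up with several gene clusters are the cases described above: distinct sequence groups that resolve to the same function.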