microbiomedata / nmdc-aggregator

Scripts that periodically aggregate data related to KEGG search

update generate_functional_agg.py to generate aggregation results for COG and Pfam #12

Closed: aclum closed this issue 2 months ago

aclum commented 3 months ago

Related to the FY25 Q1 milestone: https://github.com/microbiomedata/issues/issues/522

For each annotation GFF file, aggregated COG and Pfam results should be added to functional_annotation_agg. The CURIE prefixes to use, as defined in the schema, are COG and PFAM, respectively.

Example records:

{ "metagenome_annotation_id": "nmdc:wfmgan-11-ndgg7v31.1", "gene_function_id": "COG:COG0001", "count": 56 },
{ "metagenome_annotation_id": "nmdc:wfmgan-11-ndgg7v31.1", "gene_function_id": "PFAM:PF02171", "count": 56 }
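A minimal sketch of how the requested aggregation might look, not the actual change to generate_functional_agg.py. It assumes the annotations sit in the GFF attribute column under keys named `cog` and `pfam` (the real keys may differ), that multiple accessions are comma-separated, and that `count` means the number of occurrences across features. The function name `aggregate_functions` and the mapping `ATTRIBUTE_TO_PREFIX` are hypothetical.

```python
from collections import Counter

# Assumption: GFF attribute keys that carry the annotations, mapped to the
# CURIE prefixes named in this issue. The real attribute keys may differ.
ATTRIBUTE_TO_PREFIX = {"cog": "COG", "pfam": "PFAM"}


def aggregate_functions(gff_lines, annotation_id):
    """Count COG and Pfam accessions in GFF lines and return agg records."""
    counts = Counter()
    for line in gff_lines:
        line = line.strip()
        # Skip blank lines and GFF comments/directives.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) < 9:
            continue  # not a complete feature line
        # Column 9 holds semicolon-separated key=value attributes.
        for field in columns[8].split(";"):
            key, sep, value = field.partition("=")
            prefix = ATTRIBUTE_TO_PREFIX.get(key.strip().lower())
            if not sep or prefix is None:
                continue
            # Assumption: multiple accessions are comma-separated.
            for accession in value.split(","):
                accession = accession.strip()
                if accession:
                    counts[f"{prefix}:{accession}"] += 1
    return [
        {
            "metagenome_annotation_id": annotation_id,
            "gene_function_id": function_id,
            "count": count,
        }
        for function_id, count in sorted(counts.items())
    ]
```

Each returned dictionary has the same three fields as the example records and could be written to functional_annotation_agg by whatever insert logic the script already uses for KEGG terms.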