pantherdb / pango

BSD 3-Clause "New" or "Revised" License
Add genome coordinates to gene_info.json (for human genes only) #2

Status: Closed (dustine32 closed this 1 year ago)

dustine32 commented 1 year ago

Add genome coordinates to the gene_info.json file. Example:

    {
        "gene": "UniProtKB:P15036",
        "gene_symbol": "ETS2",
        "gene_name": "Protein C-ets-2",
        "coordinates": {
            "chr_num": "21",
            "start": "38805951",
            "end": "38824955",
            "strand": "1"
        }
    },

We can source this data from the Homo_sapiens.chromosomal_location file we produce in the PANTHER build:

    HUMAN|HGNC=3489|UniProtKB=P15036        21      38805951        38824955        1
    HUMAN|HGNC=3301|UniProtKB=Q9GZV4        3       170894292       170908644       -1
    HUMAN|HGNC=30323|UniProtKB=Q8TDB6       3       122564427       122575203       1
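
A rough sketch of how one of these lines could be mapped to the `coordinates` object above. It assumes the columns are whitespace-delimited and that the last `|`-separated part of the long ID is always the UniProtKB accession; `parse_location_line` is a hypothetical helper name, not something that exists in the repo.

```python
def parse_location_line(line):
    """Parse one chromosomal_location line into (gene_id, coordinates).

    Assumes five whitespace-delimited columns:
    long ID, chromosome, start, end, strand.
    """
    long_id, chr_num, start, end, strand = line.split()
    # Long ID looks like HUMAN|HGNC=3489|UniProtKB=P15036;
    # assume the last part carries the UniProtKB accession.
    db, accession = long_id.split("|")[-1].split("=", 1)
    gene = f"{db}:{accession}"
    # Values are kept as strings to match the JSON example in this issue.
    coordinates = {
        "chr_num": chr_num,
        "start": start,
        "end": end,
        "strand": strand,
    }
    return gene, coordinates


gene, coords = parse_location_line(
    "HUMAN|HGNC=3489|UniProtKB=P15036        21      38805951        38824955        1"
)
```

The returned `gene` value can then be used to look up the matching entry in gene_info.json and attach `coords` to it.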
dustine32 commented 1 year ago

I just added human genome coordinates to the JSON file. @tmushayahama, I forget exactly what this should be used for in the UI. We can ask @thomaspd and @huaiyumi.

dustine32 commented 1 year ago

As suggested by @tmushayahama, I'll flatten the coordinates fields so that they are not nested in the gene_info object. For example:

    {
        "gene": "UniProtKB:P15036",
        "gene_symbol": "ETS2",
        "gene_name": "Protein C-ets-2",
        "coordinates_chr_num": "21",
        "coordinates_start": "38805951",
        "coordinates_end": "38824955",
        "coordinates_strand": "1"
    },
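
For illustration, the flattening could look something like the sketch below. `flatten_coordinates` is a hypothetical name, and the sketch assumes genes without coordinates (non-human genes) simply have no `coordinates` key and are passed through unchanged.

```python
def flatten_coordinates(gene_info):
    """Return a copy of a gene_info entry with nested coordinates flattened.

    Each key under "coordinates" becomes a top-level "coordinates_<key>" field.
    Entries without a "coordinates" key are copied as-is.
    """
    flat = {k: v for k, v in gene_info.items() if k != "coordinates"}
    for key, value in gene_info.get("coordinates", {}).items():
        flat[f"coordinates_{key}"] = value
    return flat


nested = {
    "gene": "UniProtKB:P15036",
    "gene_symbol": "ETS2",
    "gene_name": "Protein C-ets-2",
    "coordinates": {
        "chr_num": "21",
        "start": "38805951",
        "end": "38824955",
        "strand": "1",
    },
}
flat = flatten_coordinates(nested)
```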