monarch-initiative / monarch-ingest

Data ingest application for Monarch Initiative knowledge graph using Koza
https://monarchinitiative.org

Capture FlyBase Gene to Phenotype associations in a UPHENO connected way #418

Open kevinschaper opened 1 year ago

kevinschaper commented 1 year ago

We won't be able to consider our initial new graph complete without capturing gene to phenotype associations from the major model organisms.

Pulling from the Alliance phenotype submission file (https://fms.alliancegenome.org/download/PHENOTYPE_FB.json.gz), we find post-composed phenotype entries using FBbt and FBcv terms, but we need our g2p associations to connect to UPHENO.

Here is an example entry:

    {
      "dateAssigned": "2022-11-15T10:03:53-05:00",
      "evidence": {
        "crossReference": {
          "id": "FB:FBrf0210385",
          "pages": [
            "reference"
          ]
        },
        "publicationId": "PMID:20226662"
      },
      "objectId": "FB:FBgn0011648",
      "phenotypeStatement": "histoblast | somatic clone",
      "phenotypeTermIdentifiers": [
        {
          "termId": "FBbt:00001789",
          "termOrder": 1
        },
        {
          "termId": "FBcv:0000336",
          "termOrder": 2
        }
      ],
      "primaryGeneticEntityIDs": [
        "FB:FBal0044913"
      ]
    },
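
To make the post-composed structure concrete, here is a minimal Python sketch that pulls the gene ID, the publication, and the ordered FBbt/FBcv term IDs out of a record shaped like the one above. The helper name `ordered_term_ids` is hypothetical, and the record is trimmed to the fields used; this is not the ingest code itself.

```python
import json

# The example record from above, trimmed to the fields this sketch reads.
entry = json.loads("""
{
  "evidence": {"publicationId": "PMID:20226662"},
  "objectId": "FB:FBgn0011648",
  "phenotypeStatement": "histoblast | somatic clone",
  "phenotypeTermIdentifiers": [
    {"termId": "FBbt:00001789", "termOrder": 1},
    {"termId": "FBcv:0000336", "termOrder": 2}
  ]
}
""")


def ordered_term_ids(record):
    """Return the record's phenotype term IDs sorted by termOrder.

    Hypothetical helper for illustration only.
    """
    terms = sorted(
        record.get("phenotypeTermIdentifiers", []),
        key=lambda term: term["termOrder"],
    )
    return [term["termId"] for term in terms]


gene_id = entry["objectId"]
publication_id = entry["evidence"]["publicationId"]
term_ids = ordered_term_ids(entry)

print(gene_id, publication_id, term_ids)
```

Each record carries a list of terms (here an FBbt anatomy term plus an FBcv term) rather than a single phenotype ID, which is why it can't be linked to UPHENO directly.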

We don't have to use this Alliance file. In the best case, we would get single phenotype ontology terms from the Alliance file and bring them in with the same code that captures MGI, RGD, etc. For ZFIN g2p, we pull from a ZFIN-specific file to map more easily to ZP terms, and we could definitely do the same here.
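
As a rough way to see how far the file is from that best case, the sketch below separates records that already carry a single phenotype term from post-composed ones. The function name `split_by_term_count` and the single-term record with `EXAMPLE:` identifiers are made up for illustration; only the second record mirrors the entry quoted above.

```python
def split_by_term_count(records):
    """Split records into (single_term, post_composed) lists.

    Records with no phenotype terms at all are skipped.
    Hypothetical helper, not part of monarch-ingest.
    """
    single_term, post_composed = [], []
    for record in records:
        count = len(record.get("phenotypeTermIdentifiers", []))
        if count == 1:
            single_term.append(record)
        elif count > 1:
            post_composed.append(record)
    return single_term, post_composed


records = [
    # Made-up single-term record with placeholder identifiers.
    {
        "objectId": "EXAMPLE:gene1",
        "phenotypeTermIdentifiers": [
            {"termId": "EXAMPLE:0000001", "termOrder": 1},
        ],
    },
    # Post-composed record shaped like the FlyBase entry quoted above.
    {
        "objectId": "FB:FBgn0011648",
        "phenotypeTermIdentifiers": [
            {"termId": "FBbt:00001789", "termOrder": 1},
            {"termId": "FBcv:0000336", "termOrder": 2},
        ],
    },
]

single_term, post_composed = split_by_term_count(records)
print(len(single_term), "single-term;", len(post_composed), "post-composed")
```

Single-term records could go through the existing shared code path, while the post-composed remainder is what would need a mapping to UPHENO-connected terms.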

matentzn commented 7 months ago

@kevinschaper and I are slowly working on this in https://github.com/monarch-initiative/uphenotizer