ClinGen / gene-and-variant-curation-tools

ClinGen's gene and variant curation interfaces (GCI & VCI). Developed by Stanford ClinGen team.
https://curation.clinicalgenome.org/
MIT License

dbNSFP list order is not maintained #370

Closed ibosdet closed 1 month ago

ibosdet commented 1 month ago

For a given variant, the dbNSFP annotation set may have values for all known transcripts or for only a subset of them. Is there a way to limit retrieval of annotations to a single transcript, or to predict the order of the lists?

For example, for the variant NM_000546.6(TP53):c.833C>G I would like the REVEL and AlphaMissense annotations. From the dbNSFP data set we can see that each of these annotations is available for only a subset of the transcripts, and the two subsets differ (see the attached text file dbNSFP_query.txt).

I can use this query to retrieve the values: https://myvariant.info/v1/variant/chr17:g.7673787G%3EC?assembly=hg38&fields=dbnsfp.revel,dbnsfp.alphamissense,dbnsfp.ensembl.transcriptid
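For anyone reproducing this, here is a minimal sketch of building the same request from Python using only the standard library. The helper names `build_query_url` and `fetch_annotations` are made up for illustration; the endpoint, assembly, and field names are copied from the URL above. The fetch helper needs network access and is not called here.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Endpoint taken from the query URL in this issue.
BASE_URL = "https://myvariant.info/v1/variant"


def build_query_url(hgvs_id, fields, assembly="hg38"):
    """Build the variant URL; '>' in the HGVS id becomes %3E."""
    path = quote(hgvs_id, safe=":")
    # Keep commas literal so the fields list reads as in the issue.
    query = urlencode({"assembly": assembly, "fields": ",".join(fields)}, safe=",")
    return f"{BASE_URL}/{path}?{query}"


def fetch_annotations(url):
    """Fetch and decode the JSON document (hypothetical helper, needs network)."""
    with urlopen(url) as response:
        return json.load(response)


url = build_query_url(
    "chr17:g.7673787G>C",
    ["dbnsfp.revel", "dbnsfp.alphamissense", "dbnsfp.ensembl.transcriptid"],
)
print(url)
```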

But the output lists have 19 values (transcripts), 16 values (alphamissense.score) and 6 values (revel.score), without an obvious order:

{
  "_id": "chr17:g.7673787G\u003EC",
  "_version": 2,
  "dbnsfp": {
    "_license": "http://bit.ly/2VLnQBz",
    "alphamissense": {
      "score": [0.9936, 0.9899, 0.9873, 0.9932, 0.9907, 0.9848, 0.9852, 0.9966, 0.9951, 0.9944, 0.9947, 0.9952, 0.9968, 0.9941, 0.9921, 0.9911]
    },
    "ensembl": {
      "transcriptid": [
        "ENST00000359597",
        "ENST00000504290",
        "ENST00000510385",
        "ENST00000504937",
        "ENST00000619186",
        "ENST00000618944",
        "ENST00000610623",
        "ENST00000610292",
        "ENST00000269305",
        "ENST00000620739",
        "ENST00000617185",
        "ENST00000455263",
        "ENST00000420246",
        "ENST00000622645",
        "ENST00000610538",
        "ENST00000445888",
        "ENST00000619485",
        "ENST00000615910",
        "ENST00000509690"
      ]
    },
    "revel": {
      "score": [0.951, 0.951, 0.951, 0.951, 0.951, 0.951]
    }
  }
}
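
To make the mismatch concrete, here is a small sketch that counts the entries in each list of the response above. The values are copied from that output; the helper names `list_lengths` and `aligned_by_index` are hypothetical. Because the three lengths differ, a score cannot be matched to a transcript by position alone.

```python
# Values copied from the response shown above.
dbnsfp = {
    "alphamissense": {
        "score": [0.9936, 0.9899, 0.9873, 0.9932, 0.9907, 0.9848, 0.9852, 0.9966,
                  0.9951, 0.9944, 0.9947, 0.9952, 0.9968, 0.9941, 0.9921, 0.9911]
    },
    "ensembl": {
        "transcriptid": [
            "ENST00000359597", "ENST00000504290", "ENST00000510385",
            "ENST00000504937", "ENST00000619186", "ENST00000618944",
            "ENST00000610623", "ENST00000610292", "ENST00000269305",
            "ENST00000620739", "ENST00000617185", "ENST00000455263",
            "ENST00000420246", "ENST00000622645", "ENST00000610538",
            "ENST00000445888", "ENST00000619485", "ENST00000615910",
            "ENST00000509690",
        ]
    },
    "revel": {"score": [0.951, 0.951, 0.951, 0.951, 0.951, 0.951]},
}


def list_lengths(dbnsfp):
    """Return the number of entries in each annotation list."""
    return {
        "transcriptid": len(dbnsfp["ensembl"]["transcriptid"]),
        "alphamissense.score": len(dbnsfp["alphamissense"]["score"]),
        "revel.score": len(dbnsfp["revel"]["score"]),
    }


def aligned_by_index(dbnsfp):
    """True only if every list has the same length as the transcript list."""
    return len(set(list_lengths(dbnsfp).values())) == 1


lengths = list_lengths(dbnsfp)
print(lengths)
print(aligned_by_index(dbnsfp))
```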

If I want values only for ENST00000269305 (the 9th item in the list), is there a way to limit the search to only that transcript? Alternatively (and ideally), could the list order be maintained with empty fields, e.g.:

"score": [0.951,0.951,0.951,0.951,0.951,.,.,.,0.951,.,.,.,.,.,.,.,.,.,]
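
As a sketch of what the padded form would allow on the client side: if every score list had one slot per transcript, a lookup by transcript ID would be a simple index. The positions that carry a value below are copied from my illustrative list above, not looked up in dbNSFP, and `pad_scores` is a hypothetical helper, not something MyVariant.info provides.

```python
# Transcript order copied from the response above.
TRANSCRIPTS = [
    "ENST00000359597", "ENST00000504290", "ENST00000510385",
    "ENST00000504937", "ENST00000619186", "ENST00000618944",
    "ENST00000610623", "ENST00000610292", "ENST00000269305",
    "ENST00000620739", "ENST00000617185", "ENST00000455263",
    "ENST00000420246", "ENST00000622645", "ENST00000610538",
    "ENST00000445888", "ENST00000619485", "ENST00000615910",
    "ENST00000509690",
]

# ASSUMPTION: 0-based positions that hold a value in the illustrative
# list above; these are not verified against dbNSFP.
SCORED_POSITIONS = [0, 1, 2, 3, 4, 8]


def pad_scores(transcripts, score_by_transcript):
    """Return one slot per transcript, with None where no score exists."""
    return [score_by_transcript.get(t) for t in transcripts]


score_by_transcript = {TRANSCRIPTS[i]: 0.951 for i in SCORED_POSITIONS}
padded = pad_scores(TRANSCRIPTS, score_by_transcript)

# With aligned lists, the score for one transcript is a positional lookup.
revel_for_target = padded[TRANSCRIPTS.index("ENST00000269305")]
print(padded)
print(revel_for_target)
```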