opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

API does not return `sumStatQCValues` #3630

Open d0choa opened 1 day ago

d0choa commented 1 day ago

The API currently returns null for `sumStatQCValues` (a map-type column), even for records that have data in the OpenSearch (OS) layer.

query sumstatsQC {
  gwasStudy(studyId: "GCST005921") {
    studyId
    hasSumstats
    sumStatQCValues{
      QCCheckName
      QCCheckValue
    }
  }
}

This is the response:

{
  "data": {
    "gwasStudy": [
      {
        "studyId": "GCST005921",
        "hasSumstats": true,
        "sumStatQCValues": null
      }
    ]
  }
}
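
For anyone who wants to reproduce this outside the GraphQL playground, here is a minimal Python sketch. It makes some assumptions that are not stated in this issue: the endpoint URL is the public Platform GraphQL API (the deployment actually queried may differ), and the helper names `build_payload`, `fetch`, and `studies_missing_qc` are hypothetical. The check flags studies that report `hasSumstats: true` but come back with `sumStatQCValues: null`.

```python
import json
import urllib.request

# Assumption: public Platform GraphQL endpoint; the issue does not say
# which deployment was queried, so adjust as needed.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

QUERY_TEMPLATE = """
query sumstatsQC {
  gwasStudy(studyId: %s) {
    studyId
    hasSumstats
    sumStatQCValues {
      QCCheckName
      QCCheckValue
    }
  }
}
"""


def build_payload(study_id):
    """Build the JSON request body for the query shown in this issue."""
    # json.dumps yields a quoted string literal, e.g. "GCST005921".
    query = QUERY_TEMPLATE % json.dumps(study_id)
    return json.dumps({"query": query}).encode("utf-8")


def fetch(study_id, url=API_URL):
    """POST the query and return the decoded JSON response (needs network)."""
    request = urllib.request.Request(
        url,
        data=build_payload(study_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


def studies_missing_qc(response):
    """Return studyIds that have sumstats but a null sumStatQCValues."""
    studies = (response.get("data") or {}).get("gwasStudy") or []
    return [
        study["studyId"]
        for study in studies
        if study.get("hasSumstats") and study.get("sumStatQCValues") is None
    ]
```

Calling `studies_missing_qc(fetch("GCST005921"))` should return an empty list once the field is populated.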

This is the schema of the data in freeze6:

In [19]: df.filter(f.col("studyId") == "GCST005921").select("studyId", "hasSumstats", "sumStatQCValues").printSchema()
root
 |-- studyId: string (nullable = true)
 |-- hasSumstats: boolean (nullable = true)
 |-- sumStatQCValues: map (nullable = true)
 |    |-- key: string
 |    |-- value: float (valueContainsNull = true)

This is the data:

In [20]: df.filter(f.col("studyId") == "GCST005921").select("studyId", "hasSumstats", "sumStatQCValues").show(1, vertical = True, truncate = False)
-RECORD 0------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 studyId         | GCST005921
 hasSumstats     | true
 sumStatQCValues | {mean_beta -> -2.0339141E-6, mean_diff_pz -> 3.2054406E-5, se_diff_pz -> 0.003330459, gc_lambda -> 1.04871, n_variants -> 7746640.0, n_variants_sig -> 567.0}
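
To make the expected result concrete, here is a small sketch of what the API could return for this record. It assumes that each map entry becomes one `QCCheckName`/`QCCheckValue` pair, as the query fields suggest. The helper `map_to_qc_entries` is hypothetical and is not the API's actual implementation; the values are copied from the record above.

```python
# Map contents copied from the freeze6 record shown above.
qc_map = {
    "mean_beta": -2.0339141e-6,
    "mean_diff_pz": 3.2054406e-5,
    "se_diff_pz": 0.003330459,
    "gc_lambda": 1.04871,
    "n_variants": 7746640.0,
    "n_variants_sig": 567.0,
}


def map_to_qc_entries(values):
    """Turn a {name: value} map into a list of name/value objects.

    Assumption: this is the shape the GraphQL field is meant to expose.
    """
    return [
        {"QCCheckName": name, "QCCheckValue": float(value)}
        for name, value in values.items()
    ]


expected_sumstat_qc_values = map_to_qc_entries(qc_map)
```

Under that assumption, `sumStatQCValues` for GCST005921 would hold six entries instead of null.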

@jdhayhurst confirmed the data is available in the OS layer. Assigning to @jdhayhurst, but feel free to share the task.