Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service
GNU General Public License v3.0

Validate match prediction using the published consensus dataset #805

Closed by zabeen 2 years ago

zabeen commented 2 years ago

External partners including NMDP and WMDA have requested that this published consensus dataset (specifically, MVS3) be run through Atlas to ensure that its match predictions are in line with expectations.

Some changes to the codebase are required to allow efficient processing of the dataset, which contains 10 million patient-donor pairs (PDPs).

zabeen commented 2 years ago

HLD (high-level design)

After discussion with @benbelow and @luken-an, we have decided to add a new Service Bus-triggered function to the Match Prediction project. It will bulk-process messages, each containing a single PDP for which match probabilities are to be calculated. Results will be stored in a sub-folder of the match-prediction-results container.

Atlas.MatchPrediction.Functions

Manual Testing

zabeen commented 2 years ago

Testing of Match Prediction Function changes

i.e., ability to run a match prediction request outside of search.

Notes

Feature testing ✅

Happy path ✅

HLA run for all requests, both patient and donor:

    "A": {
      "Position1": "*02:XX",
      "Position2": "*02:XX"
    },
    "B": {
      "Position1": "*40:XX",
      "Position2": "*40:XX"
    },
    "C": {
      "Position1": "*03:XX",
      "Position2": "*03:XX"
    },
    "Dpb1": {
      "Position1": "*02:XX",
      "Position2": "*01:XX"
    },
    "Dqb1": {
      "Position1": "*03:XX",
      "Position2": "*03:XX"
    },
    "Drb1": {
      "Position1": "*04:XX",
      "Position2": "*04:XX"
    }

Exception testing ✅

Regression testing ✅

Search

Haplotype Frequency Set import ✅

TESTING PASSED

zabeen commented 2 years ago

Development notes

The message batch size for Service Bus-triggered functions is set by the host.json property maxMessageBatchSize, so at present changing the value requires a code change and a release.
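For reference, a minimal sketch of where this property sits in host.json, assuming the layout used by version 5.x of the Azure Functions Service Bus extension. The value 1000 is only a placeholder, not the value used in Atlas:

```json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "maxMessageBatchSize": 1000
    }
  }
}
```

Because host.json is deployed with the function app code, any edit to this file ships as part of a release.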

This may be a way to set the value via Terraform: https://stackoverflow.com/questions/71935339/set-functiontimeout-using-terraform
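The linked approach relies on Azure Functions allowing host.json values to be overridden by application settings named with the AzureFunctionsJobHost__ prefix, with nested keys joined by double underscores. An untested sketch of what that setting might look like, shown here as a local.settings.json fragment; the same key/value pair would go in the function app's application settings, which is what Terraform would manage. The value is a placeholder, and whether this override works for maxMessageBatchSize in Atlas is an assumption:

```json
{
  "Values": {
    "AzureFunctionsJobHost__extensions__serviceBus__maxMessageBatchSize": "1000"
  }
}
```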

I don't have time to test this out now, so will raise a new issue to cover this tech debt.

Update: raised a new issue, https://github.com/Anthony-Nolan/Atlas/issues/820

zabeen commented 2 years ago

Validation has been completed and has passed.