Closed zabeen closed 2 years ago
After discussion with @benbelow and @luken-an, we have decided to add a new service-bus function to the Match Prediction project that will bulk process messages that each contain a single PDP for which match probabilities will be calculated; results will be stored to a sub-folder of the `match-prediction-results` container.
New http-triggered function that accepts a single PDP request and returns a unique request ID (i.e., search request pattern).
`match-prediction-requests`.
New function that bulk downloads service bus messages from `match-prediction-requests`, and iterates over each request (i.e., donor update pattern)
`match-prediction-results` container.
`match-prediction-only` or `match-prediction-without-search`.
`<match-prediction-request-id>.json`.
`Manual Testing` to run the validation exercise (i.e., following match prediction verification pattern).
`Atlas.MatchPrediction.Test.Validation`
i.e., ability to run a match prediction request outside of search.
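The request-then-batch flow described above can be sketched in a few lines. This is a minimal in-memory sketch, not the Atlas implementation: a deque stands in for the `match-prediction-requests` topic, a dict stands in for the `match-prediction-results` container, and `submit_request`, `process_batch`, `calculate_match_probabilities` and the choice of `match-prediction-only` as the sub-folder are all hypothetical, used only for illustration.

```python
import json
import uuid
from collections import deque

# In-memory stand-ins (assumptions): a deque for the service bus topic
# and a dict for the blob container.
request_queue = deque()
results_container = {}

RESULTS_FOLDER = "match-prediction-only"  # assumed sub-folder name


def submit_request(pdp: dict) -> str:
    """Queue a single PDP request and return its unique request ID."""
    request_id = str(uuid.uuid4())
    request_queue.append({"id": request_id, "pdp": pdp})
    return request_id


def calculate_match_probabilities(pdp: dict) -> dict:
    """Placeholder for the real match prediction calculation."""
    return {"probabilities": None}


def process_batch(max_batch_size: int) -> int:
    """Pull up to max_batch_size messages and handle each one in turn."""
    batch_size = min(max_batch_size, len(request_queue))
    batch = [request_queue.popleft() for _ in range(batch_size)]
    for message in batch:
        result = calculate_match_probabilities(message["pdp"])
        # One result file per request, named after the request ID.
        blob_name = f"{RESULTS_FOLDER}/{message['id']}.json"
        results_container[blob_name] = json.dumps(result)
    return len(batch)
```

The caller keeps the ID returned by `submit_request` and later looks up `<sub-folder>/<request-id>.json` for the result, which mirrors the search request pattern mentioned above.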
`verify` environment (build `20220923.1`)
`null` for patient and donor metadata to force use of the global HF set that already exists on that env
`3330`
HLA run for all requests, both patient and donor:

```json
"A": {
  "Position1": "*02:XX",
  "Position2": "*02:XX"
},
"B": {
  "Position1": "*40:XX",
  "Position2": "*40:XX"
},
"C": {
  "Position1": "*03:XX",
  "Position2": "*03:XX"
},
"Dpb1": {
  "Position1": "*02:XX",
  "Position2": "*01:XX"
},
"Dqb1": {
  "Position1": "*03:XX",
  "Position2": "*03:XX"
},
"Drb1": {
  "Position1": "*04:XX",
  "Position2": "*04:XX"
}
```
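
For anyone repeating the test, a request body with this HLA and null metadata could be assembled roughly as follows. This is a sketch only: the field names `PatientHla`, `DonorHla`, `PatientMetadata` and `DonorMetadata`, and the helper `build_request`, are hypothetical and not taken from the Atlas request model; only the HLA values and the use of `null` metadata come from the notes above.

```python
import copy
import json

# HLA typing used for both patient and donor in the test run.
HLA = {
    "A": {"Position1": "*02:XX", "Position2": "*02:XX"},
    "B": {"Position1": "*40:XX", "Position2": "*40:XX"},
    "C": {"Position1": "*03:XX", "Position2": "*03:XX"},
    "Dpb1": {"Position1": "*02:XX", "Position2": "*01:XX"},
    "Dqb1": {"Position1": "*03:XX", "Position2": "*03:XX"},
    "Drb1": {"Position1": "*04:XX", "Position2": "*04:XX"},
}


def build_request() -> dict:
    """Build one request; field names are assumptions for illustration."""
    return {
        "PatientHla": copy.deepcopy(HLA),
        "DonorHla": copy.deepcopy(HLA),
        # None serializes to JSON null, as used to force the global HF set.
        "PatientMetadata": None,
        "DonorMetadata": None,
    }


request_body = json.dumps(build_request(), indent=2)
```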
1 match prediction request ✅
multiple match prediction requests - submitted 500 identical requests ✅
Logs and diagnostics showed that requests were processed in batches, but the function app did not scale out, possibly because the service bus connection string does not have `Manage` permissions and so cannot assess the message count on the subscription.
Error when processing match prediction request. Login failed for user 'match_prediction'. ✔
`C*01:02:01:38` in patient which is present in `v3490` but absent from `v3330`: Conversion of compressed phenotype to target HLA category ✔
TESTING PASSED ✅
Message batch size on service bus triggered functions is determined via host.json property: `maxMessageBatchSize`, which means at present a code change and release would be required to change the value.
This may be a way to set the value via terraform: https://stackoverflow.com/questions/71935339/set-functiontimeout-using-terraform
I don't have time to test this out now, so will raise a new issue to cover this tech debt.
Update, raised new issue: https://github.com/Anthony-Nolan/Atlas/issues/820
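As a rough sketch of the idea in the linked answer: Azure Functions allows a host.json value to be overridden by an application setting whose name is the JSON path prefixed with `AzureFunctionsJobHost` and joined with double underscores, which is what would let the value be set from terraform without a code change. The helper below only builds that setting name. It assumes `maxMessageBatchSize` sits under `extensions.serviceBus` in host.json, which has not been checked against this project, and `host_json_override_name` is a hypothetical helper, not part of any library.

```python
def host_json_override_name(*path: str) -> str:
    """Build the app setting name that overrides a host.json value."""
    if not path:
        raise ValueError("at least one path segment is required")
    return "__".join(("AzureFunctionsJobHost", *path))


# Assumed location of the property inside host.json.
setting = host_json_override_name(
    "extensions", "serviceBus", "maxMessageBatchSize"
)
print(setting)
# prints "AzureFunctionsJobHost__extensions__serviceBus__maxMessageBatchSize"
```

The same helper gives `AzureFunctionsJobHost__functionTimeout` for the `functionTimeout` case discussed in the linked question.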
Validation has been completed & passed
External partners including NMDP and WMDA have requested that this published consensus dataset (specifically, MVS3) be run through Atlas to ensure that its match predictions are in line with expectations.
Some changes are required to the codebase to allow efficient processing of the 10 million patient-donor-pair (PDP) dataset.