Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service
GNU General Public License v3.0

Investigate several failed searches in AN WMDA UAT environment #1031

Closed daria-sorokina-da closed 1 year ago

daria-sorokina-da commented 1 year ago

Several searches failed in the AN WMDA UAT environment, according to the Teams channel.

Need to investigate what happened.

Failures on Monday 7 Aug
Search ID: 441e824d-58fc-40a6-890a-8c45663f4dc1
Search ID: 5d7ea224-3a3d-4d5a-b945-eb7044aa06da
Search ID: 3d0e3c65-4553-422c-85b2-875fecf7e529
Search ID: 72b5c3ee-49f5-468a-8838-171ff382b1c4
Search ID: ad629ac1-4ae3-4e8e-80f1-a3ca72a5c61a

Failures on Tuesday 8 Aug
Search ID: e88331ba-b9a8-4ed2-995c-4dc2fcf36ffe

All failures have the following details:
Repeat ID (if relevant):
Stage Reached: Matching Algorithm
ValidationError (if relevant):

zabeen commented 1 year ago

Investigations

Failures caused by *NEW allele in the patient

441e824d-58fc-40a6-890a-8c45663f4dc1
e88331ba-b9a8-4ed2-995c-4dc2fcf36ffe
db1bb854-59f5-46c1-a68d-d3cfed62e0a9
38dfe9cf-073e-4192-b5ef-2f631261a5fb

The validation error was reported in the search notification, but the logic app wasn't grabbing the value to display it in the Teams message. That has now been fixed.
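To illustrate the kind of fix described above, here is a minimal sketch of building the alert text from a notification so that the validation error is carried through to the message. It is not the logic app's actual implementation: the function name and the notification keys (`search_id`, `repeat_id`, `stage_reached`, `validation_error`) are hypothetical, and only the message labels are taken from the failure details quoted earlier in this issue.

```python
from typing import Any, Mapping


def format_failure_message(notification: Mapping[str, Any]) -> str:
    """Build the text of a failure alert from a search notification.

    The keys read here are hypothetical placeholders, not the real
    Atlas notification schema.
    """

    def field(key: str) -> str:
        # Missing or null values render as an empty field rather than "None".
        value = notification.get(key)
        return "" if value is None else str(value)

    lines = [
        f"Search ID: {field('search_id')}",
        f"Repeat ID (if relevant): {field('repeat_id')}",
        f"Stage Reached: {field('stage_reached')}",
        # The field that was previously dropped from the message.
        f"ValidationError (if relevant): {field('validation_error')}",
    ]
    # Strip the trailing space left behind by empty fields.
    return "\n".join(line.rstrip() for line in lines)
```

Rendering absent fields as empty keeps the message layout stable whether or not a validation error is present, which matches the "(if relevant)" labels in the template.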

Logs show only matching traces, but no exceptions

3d0e3c65-4553-422c-85b2-875fecf7e529 - 07/08/2023 09:30
5d7ea224-3a3d-4d5a-b945-eb7044aa06da - 07/08/2023 09:30
72b5c3ee-49f5-468a-8838-171ff382b1c4 - 07/08/2023 09:37
ad629ac1-4ae3-4e8e-80f1-a3ca72a5c61a - 07/08/2023 09:40
5bad2ac4-ea2d-4872-aef5-14c28010393c - 09/08/2023 14:18
d6a71944-a4e2-4243-b731-cd6db2711bfd - 11/08/2023 07:00
9432ab74-4d94-4001-8e98-69ae45ef02a2 - 11/08/2023 07:02
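As a quick way to see how these failures cluster in time, the sketch below counts them per day. It uses only the IDs and timestamps listed above and assumes the timestamps are day/month/year, which is consistent with the "Monday 7 Aug" failures reported at the top of this issue. The function name is made up for illustration.

```python
from collections import Counter
from datetime import datetime

# (search ID, timestamp) pairs copied from the list above.
FAILURES = [
    ("3d0e3c65-4553-422c-85b2-875fecf7e529", "07/08/2023 09:30"),
    ("5d7ea224-3a3d-4d5a-b945-eb7044aa06da", "07/08/2023 09:30"),
    ("72b5c3ee-49f5-468a-8838-171ff382b1c4", "07/08/2023 09:37"),
    ("ad629ac1-4ae3-4e8e-80f1-a3ca72a5c61a", "07/08/2023 09:40"),
    ("5bad2ac4-ea2d-4872-aef5-14c28010393c", "09/08/2023 14:18"),
    ("d6a71944-a4e2-4243-b731-cd6db2711bfd", "11/08/2023 07:00"),
    ("9432ab74-4d94-4001-8e98-69ae45ef02a2", "11/08/2023 07:02"),
]


def failures_per_day(failures):
    """Return {ISO date: number of failed searches}, ordered by date."""
    counts = Counter()
    for _search_id, stamp in failures:
        # Assumption: timestamps are day-first (dd/mm/yyyy HH:MM).
        day = datetime.strptime(stamp, "%d/%m/%Y %H:%M").date()
        counts[day.isoformat()] += 1
    return dict(sorted(counts.items()))


print(failures_per_day(FAILURES))
# {'2023-08-07': 4, '2023-08-09': 1, '2023-08-11': 2}
```

Four of the seven failures fall within a ten-minute window on 7 Aug, and the two on 11 Aug are two minutes apart.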

zabeen commented 1 year ago

Investigations - Part 2

Attempt 1

zabeen commented 1 year ago

Dev

Testing

zabeen commented 1 year ago

@seanmobrien increased the number of vCores on the active matching db from 8 to 20. That has reduced the number of searches failing at the matching stage but has not eliminated them completely. The same issue has not yet been observed in live-wmda-atlas, so there is still a question of whether any action needs to be taken beyond tweaking resource configurations and reducing search parallelisation on UAT as much as possible.

zabeen commented 1 year ago

Searches are still occasionally failing on UAT, but thankfully not on Live.

I am closing this ticket for now, as it is not a live issue. It can be reopened, or a new ticket raised, if required.