Anthony-Nolan / Atlas

A free & open-source Donor Search Algorithm Service
GNU General Public License v3.0

Failed searches resulting in dead-letter message #917

Closed seanmobrien closed 1 year ago

seanmobrien commented 1 year ago

Describe the bug: Failed searches result in a dead-letter message being added to the search-results-ready/atlas-searchfinished-subscriber subscription.

To Reproduce: Steps to reproduce the behavior:

  1. Run a search that encounters an error during the MatchingAlgorithm stage
  2. Using Service Bus Explorer, open the search-results-ready/atlas-searchfinished-subscriber dead-letter queue and observe the new message. Note that BlobStorageContainerName and ResultsFileName are null and FailureInfo is non-null.

Expected behaviour: search-results-ready/atlas-searchfinished-subscriber messages are successfully delivered, regardless of whether the search was successful.

Inputs/Outputs: Sample dead-lettered message: {"FailureMessage":"Search failed at stage: Matching Algorithm. See Application Insights for failure details.","FailureInfo":{"StageReached":"Matching Algorithm","MatchingAlgorithmFailureInfo":{"ValidationError":null,"AttemptNumber":11,"RemainingRetriesCount":0},"WillRetry":false},"MatchingAlgorithmTime":"00:00:00","MatchPredictionTime":"00:00:00","OverallSearchTime":"00:00:00","SearchRequestId":"xxxxxxxxx","WasSuccessful":false,"RepeatSearchRequestId":null,"BlobStorageContainerName":null,"ResultsFileName":null,"NumberOfResults":null,"MatchingAlgorithmHlaNomenclatureVersion":null}
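For anyone triaging similar messages, here is a minimal Python sketch that parses the sample body above and pulls out the fields relevant to this report. The function name `summarise_search_result` and the shape of the summary are illustrative assumptions, not part of Atlas; only the JSON field names come from the sample message.

```python
import json

# Sample dead-lettered message body, copied from the report above.
SAMPLE_BODY = (
    '{"FailureMessage":"Search failed at stage: Matching Algorithm. '
    'See Application Insights for failure details.",'
    '"FailureInfo":{"StageReached":"Matching Algorithm",'
    '"MatchingAlgorithmFailureInfo":{"ValidationError":null,'
    '"AttemptNumber":11,"RemainingRetriesCount":0},"WillRetry":false},'
    '"MatchingAlgorithmTime":"00:00:00","MatchPredictionTime":"00:00:00",'
    '"OverallSearchTime":"00:00:00","SearchRequestId":"xxxxxxxxx",'
    '"WasSuccessful":false,"RepeatSearchRequestId":null,'
    '"BlobStorageContainerName":null,"ResultsFileName":null,'
    '"NumberOfResults":null,"MatchingAlgorithmHlaNomenclatureVersion":null}'
)


def summarise_search_result(body: str) -> dict:
    """Hypothetical helper: reduce a results-ready message to a few triage fields."""
    message = json.loads(body)
    # FailureInfo is expected to be null for successful searches, so default to {}.
    failure_info = message.get("FailureInfo") or {}
    return {
        "search_request_id": message.get("SearchRequestId"),
        "was_successful": message.get("WasSuccessful"),
        "stage_reached": failure_info.get("StageReached"),
        "will_retry": failure_info.get("WillRetry"),
        # A results location exists only if both blob fields are populated.
        "has_results_location": (
            message.get("BlobStorageContainerName") is not None
            and message.get("ResultsFileName") is not None
        ),
    }


if __name__ == "__main__":
    print(summarise_search_result(SAMPLE_BODY))
```

A consumer that assumes every message carries a results location would need to check something like `has_results_location` (or `WasSuccessful`) before trying to download results.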

Atlas Build & Runtime Info (please complete the following information):

zabeen commented 1 year ago

@seanmobrien I don't think this is an Atlas backlog issue, as Atlas doesn't consume messages on that subscription. This is likely down to the WMDA function that reads the messages: I can see the delivery count is 11 for the last message and, further, that the message was for a successfully completed search. It's possibly due to messages being raised for searches that no longer exist on the WMDA side and that were generated during testing. Please could you copy this ticket over to the WMDA integration backlog?

seanmobrien commented 1 year ago

Thanks @zabeen -

I had copy/pasted the wrong topic/subscriber name. I've recreated this as bug #918 with the correct topic.

One thing I observed: it looks like larger successful search requests are also causing a dead-lettered message on the repeat-search subscription. One instance of this that occurred today is message id f51e302a67ce4a0594a6c9d0b54de824 / search request 9bc76728-c794-47d0-b563-f1a5f5168cba (68621 matches). It's possible this is already covered under the OutOfMemoryException ticket, but I wanted to get it on your radar.