Closed ktuite closed 1 month ago
Looking at the timestamps on the audit log, I think two update submissions were processed by different worker threads at the same time. The first update started, and then the second update started, but the first hadn't finished so the second couldn't find the appropriate base version yet.
I'm studying Sadiq's earlier PR about locking entities for update to improve how/where/when I'm locking entities for update (and how to write some tests for this). https://github.com/getodk/central-backend/pull/1033
QA noticed this scenario with an offline create + update + update, where the 3rd update wasn't applied. Even though the submissions came in in the right order, I found the 3rd submission in the backlog
I'm able to reproduce this scenario in testing. I'm also able to reproduce another, related scenario. If two workers concurrently process the first two submissions (the offline create + the first offline update), then the second submission can be sent to the backlog, where it will not be processed until it is force-processed. My first attempt at a solution addressed the scenario described above, but not the related scenario. However, my PR getodk/central-backend#1195 should address both scenarios.
I'm going to go ahead and merge that PR. I'm still working on writing a test that doesn't feel like its own race condition, but I should be able to finish that tomorrow.
I'm still working on writing a test
There's now a test here: getodk/central-backend#1202. I'm going to go ahead and close the issue.
QA noticed this scenario with an offline create + update + update, where the 3rd update wasn't applied. Even though the submissions came in in the right order, I found the 3rd submission in the backlog:
Entity https://staging.getodk.cloud/#/projects/101/entity-lists/movies/entities/7f362b7f-3780-44f0-bbb3-97d9770abcdf
Submission 1: (applied) row 18 in other form - 5:08:09 subId 3618 https://staging.getodk.cloud/v1/projects/101/forms/movie%20entities/submissions/uuid%3Aed096cd0-202c-431a-8cea-624a8ceb19b0.xml
<entity dataset="movies" id="7f362b7f-3780-44f0-bbb3-97d9770abcdf" create="1" trunkVersion="" branchId="" baseVersion="">
Submission 2: (applied) row 35 - 5:08:09 - subId 3620 https://staging.getodk.cloud/v1/projects/101/forms/movies_update/submissions/uuid%3Abb8bbdb8-2f4d-4b30-9c3b-3a62c4e93a2f.xml
<entity dataset="movies" id="7f362b7f-3780-44f0-bbb3-97d9770abcdf" update="1" baseVersion="1" trunkVersion="" branchId="10138667-8a81-4721-aad5-f3316de1bc26">
Submission 3: (not applied) row 37 - 5:08:10 - subId 3622 https://staging.getodk.cloud/v1/projects/101/forms/movies_update/submissions/uuid%3Ae0b9644e-0905-4ea0-98e9-0388b5a4ecbb.xml
<entity dataset="movies" id="7f362b7f-3780-44f0-bbb3-97d9770abcdf" update="1" baseVersion="2" trunkVersion="" branchId="10138667-8a81-4721-aad5-f3316de1bc26">
Submission 3 is in the backlog. It likely got processed after the other two submissions, and it should have been able to find the right base version, so why did it get held?