bihealth / sodar-server

SODAR: System for Omics Data Access and Retrieval
https://github.com/bihealth/sodar-server
MIT License

Project lock failure can update status of finished landing zone with rapid request spamming #1909

Closed mikkonie closed 4 months ago

mikkonie commented 5 months ago

I've witnessed the following on our production server:

Looking at the timeline for that zone, it seems we have received multiple requests for moving the zone after it has already finished.

It seems that when we set the failed status due to a project lock failure, we don't check the current zone status first?

No taskflow should be attempted on a finished zone to begin with, and the UI/API view should catch that before the lock is even checked. But maybe there were so many near-simultaneous requests that, by the time the lock check failed, the zone status had not yet been updated. This may be a tricky one. Maybe a queue solution like the one proposed in #1910 could help...
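To illustrate the missing check, here is a minimal sketch of a guard that refuses to overwrite a finished zone's status when the lock check fails. This is not the actual SODAR code: the `Zone` class, the `FINISHED_STATUSES` set, the status strings and `report_lock_failure()` are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical set of "finished" statuses, assumed for this sketch only
FINISHED_STATUSES = {"MOVED", "DELETED"}


@dataclass
class Zone:
    """Stand-in for a landing zone record (not the real SODAR model)."""

    status: str = "ACTIVE"
    history: list = field(default_factory=list)

    def set_status(self, status, info=""):
        self.status = status
        self.history.append((status, info))


def report_lock_failure(zone):
    """Record a project lock failure unless the zone has already finished.

    Returns True if the status was updated, False if it was left alone.
    """
    if zone.status in FINISHED_STATUSES:
        # A late request must not overwrite the final status
        return False
    zone.set_status("FAILED", "Unable to acquire project lock")
    return True
```

In a real Django view the zone would presumably also need to be re-read from the database right before this check, since the object held by the request may be stale.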

Current plan to tackle this:

mikkonie commented 4 months ago

This should be fixed, at least for the most part. If a taskflow is still running and near-simultaneous requests are spammed, the zone status can still temporarily show a lock failure. But even then, the ongoing task will eventually submit the correct status.
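The remaining case can be replayed as a small, single-threaded sketch. It only models the order of events; the status strings and `replay_remaining_case()` are hypothetical, and a plain `threading.Lock` stands in for the project lock.

```python
import threading


def replay_remaining_case():
    """Replay the event order of the remaining edge case.

    Returns the sequence of statuses the zone goes through.
    """
    project_lock = threading.Lock()  # stand-in for the project lock
    statuses = ["MOVING"]

    # The ongoing taskflow holds the project lock
    project_lock.acquire()

    # A spammed request arrives and fails the non-blocking lock check,
    # so the zone temporarily shows a failure
    if not project_lock.acquire(blocking=False):
        statuses.append("FAILED")

    # The ongoing taskflow finishes and submits the correct final status
    statuses.append("MOVED")
    project_lock.release()
    return statuses
```

The last entry of the returned list is the status that sticks, which is why the temporary failure is tolerable here.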

That seems good enough for now. If we identify cases where this remains a problem, I'll reopen the ticket.