io7m closed this 1 month ago
The problem with this is that, because the tasks can't write to the database, they also can't record the fact that an error occurred.
This seems to be caused by multiple tasks all starting at exactly the same time and trying to acquire a database lock. They also retry in lockstep, so they collide again on every attempt.
This could be mitigated by staggering task startup times, and also retrying after random pauses.
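A minimal sketch of the "retry after random pauses" idea, assuming the failing database operation can be wrapped in a `Callable`. The class and method names (`JitteredRetry`, `retry`, `randomPauseMillis`) are hypothetical and are not taken from the application's code. The point is that each task sleeps for a different random interval between attempts, so tasks that collided once are unlikely to collide again.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: retries an operation, pausing for a random
// interval between attempts so that concurrent tasks fall out of step.
public final class JitteredRetry
{
  private JitteredRetry()
  {
  }

  // Pick a random pause in the inclusive range [minMillis, maxMillis].
  public static long randomPauseMillis(
    final long minMillis,
    final long maxMillis)
  {
    if (minMillis < 0L || maxMillis < minMillis) {
      throw new IllegalArgumentException("Invalid pause range");
    }
    return ThreadLocalRandom.current().nextLong(minMillis, maxMillis + 1L);
  }

  // Run the operation up to maxAttempts times. The last failure is
  // rethrown if every attempt fails.
  public static <T> T retry(
    final Callable<T> operation,
    final int maxAttempts,
    final long minMillis,
    final long maxMillis)
    throws Exception
  {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }

    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
      try {
        return operation.call();
      } catch (final Exception e) {
        last = e;
        // Only pause if another attempt will follow.
        if (attempt < maxAttempts) {
          Thread.sleep(randomPauseMillis(minMillis, maxMillis));
        }
      }
    }
    throw last;
  }
}
```

This sketch retries on any exception for brevity. A real implementation would probably only retry on the specific exception that indicates a locked database.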
This is still a problem.
The upload method needs to take a parameter that represents a delay to wait before starting. Manually started uploads always have a delay of zero, but time scheduled uploads should be staggered fairly heavily (a random value between a second and a minute).
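A rough sketch of how the delay could be chosen, under the assumption that the caller knows whether an upload was started manually or by the scheduler. The `UploadStartDelay` class and the `Trigger` enum are hypothetical names used only for illustration. The returned value would be passed to the upload method as its delay parameter.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper that picks the delay to wait before an upload starts.
public final class UploadStartDelay
{
  // How the upload was started (an assumed distinction, named here
  // for illustration only).
  public enum Trigger
  {
    MANUAL,
    SCHEDULED
  }

  private UploadStartDelay()
  {
  }

  public static Duration startDelayFor(final Trigger trigger)
  {
    switch (trigger) {
      case MANUAL:
        // Manually started uploads begin immediately.
        return Duration.ZERO;
      case SCHEDULED:
        // Scheduled uploads wait a random time between one second and
        // one minute (inclusive) so that they do not all start together.
        return Duration.ofMillis(
          ThreadLocalRandom.current().nextLong(1_000L, 60_001L));
      default:
        throw new IllegalStateException("Unknown trigger: " + trigger);
    }
  }
}
```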
For what it's worth, this problem only happens once. When a set of uploads runs, a lot of them fail immediately due to the above issue. This means that their "last completed" time is very different from that of the uploads that did run to completion. This effectively puts all of the upload tasks out of sync and means that the problem is averted the next time they run.
Still needs fixing though.
This seems to be done.
On a Pixel 8a running Graphene, uploads triggered on a time schedule seem to be failing. It seems to be an SQLite issue: