What happened?
When executing archiving in a more concurrent way, we see a significant number of deadlocks compared to running it without concurrency. This happens even at a very low level of concurrency, and it forces us to prevent multiple websites from being archived simultaneously.
In an effort to improve concurrency we allowed multiple websites to be processed at the same time; however, this produced so many deadlock errors in our logs that it became untenable to continue. The following stack traces were common:

no. 1: the updateArchiveAsInvalidated error (~300 occurrences):
no. 2: the insertNumericRecord error (~180 occurrences)
no. 3: the updateRangeArchiveAsInvalidated error (~1,000 occurrences)
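As a diagnostic aid (this sketch is mine, not from Matomo; connection details are placeholders): MySQL keeps the most recent deadlock in SHOW ENGINE INNODB STATUS, which shows the two colliding transactions and the locks they were waiting on.

```php
<?php
// Hedged diagnostic sketch: extract the "LATEST DETECTED DEADLOCK" section
// from SHOW ENGINE INNODB STATUS. DSN and credentials are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=matomo', 'matomo', 'secret');
$row = $pdo->query('SHOW ENGINE INNODB STATUS')->fetch(PDO::FETCH_ASSOC);

// The deadlock details sit between the section header and the next divider.
if (preg_match('/LATEST DETECTED DEADLOCK\s*\n-+\n(.+?)\n-{5,}/s', $row['Status'], $m)) {
    echo $m[1], PHP_EOL;
} else {
    echo 'No deadlock recorded since the last server restart.', PHP_EOL;
}
```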
I believe this is related to https://github.com/matomo-org/matomo/issues/21749; that issue seems to head in a different direction, but the commenters there report similar symptoms and causes.
What should happen?
When multiple processes attempt to update the same table, there should be enough resilience that the information is still written under high concurrency, for example by retrying the deadlocked transaction. While deadlocks are probably a natural part of high concurrency, the system does not seem to handle such conditions very well.
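A minimal sketch of the kind of handling I would expect (the wrapper name is an assumption, not a Matomo API; 1213 is MySQL's ER_LOCK_DEADLOCK code):

```php
<?php
// Sketch: retry a write a few times when MySQL reports a deadlock instead of
// surfacing the error. executeWithDeadlockRetry() and $pdo are illustrative.
function executeWithDeadlockRetry(PDO $pdo, callable $write, int $maxAttempts = 3): void
{
    for ($attempt = 1; ; $attempt++) {
        try {
            $pdo->beginTransaction();
            $write($pdo);
            $pdo->commit();
            return;
        } catch (PDOException $e) {
            if ($pdo->inTransaction()) {
                $pdo->rollBack();
            }
            // errorInfo[1] is the driver-specific code; 1213 means deadlock.
            if (($e->errorInfo[1] ?? null) !== 1213 || $attempt >= $maxAttempts) {
                throw $e;
            }
            // Randomized backoff so the competing transaction can finish.
            usleep(random_int(10, 50) * 1000 * $attempt);
        }
    }
}
```

Serializing the conflicting updates (for example, batching invalidations per archive table) would be the other obvious direction.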
How can this be reproduced?
To reproduce this I would try to invoke high-concurrency conditions by:
- having a lot of websites to archive; the problem becomes clearer when there are maybe hundreds of sites to archive
- starting the processing of each website simultaneously, with an individual core:archive process per site using the --force-idsites argument (obviously the worst case for reproduction; we rate-limit it normally); see the sketch after this list
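A minimal repro harness sketch under those assumptions (the console path and site IDs are placeholders; --force-idsites is core:archive's option for restricting a run to specific sites):

```php
<?php
// Launch one core:archive process per site, all at once, to maximize
// contention on the archive tables. Paths and site IDs are placeholders.
$console = '/var/www/matomo/console';   // adjust to your installation
$idSites = range(1, 200);               // "hundreds of websites to archive"

$spec = [
    0 => ['file', '/dev/null', 'r'],    // no input needed
    1 => ['file', '/dev/null', 'w'],    // discard normal archiver output
    2 => STDERR,                        // surface deadlock errors in the terminal
];

$procs = [];
foreach ($idSites as $idSite) {
    $cmd = [PHP_BINARY, $console, 'core:archive', '--force-idsites=' . $idSite];
    // proc_open() returns immediately, so every archiver runs concurrently.
    $procs[] = proc_open($cmd, $spec, $pipes);
}

// Wait for all archivers to exit before checking the logs for deadlocks.
foreach ($procs as $proc) {
    proc_close($proc);
}
```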
Matomo version
5.2
PHP version
8.1
Server operating system
Linux
What browsers are you seeing the problem on?
Not applicable (e.g. an API call etc.)
Computer operating system
N/A
Relevant log output
No response