matomo-org / matomo


Make log_action deletion require no table locking #13872

Open diosmosis opened 5 years ago

diosmosis commented 5 years ago

Currently, when purging dangling references from the log_action table, we lock the table to make sure we don't accidentally delete a row that ends up being referenced during tracking. This is problematic because purging log_action can take a long time, and tracking is blocked for as long as the purge runs.
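To make the race concrete, here is a minimal sketch of what could go wrong if the purge ran without any lock. It is an illustration only, not Matomo's actual purge code: it uses Python with an in-memory SQLite database, the two-table schema is heavily simplified, and the function name `purge_race_demo` is hypothetical. The three steps run sequentially to mimic one possible interleaving of a purge and a tracking request.

```python
import sqlite3


def purge_race_demo():
    """Simulate an unlocked purge interleaved with a tracking write."""
    db = sqlite3.connect(":memory:")
    # Simplified, assumed schema: the real Matomo tables have many more columns.
    db.execute("CREATE TABLE log_action (idaction INTEGER PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE log_link_visit_action (idaction_url INTEGER)")
    db.execute("INSERT INTO log_action VALUES (1, '/old-page')")

    # Purge step 1: collect the actions that no visit row references.
    unused = [row[0] for row in db.execute(
        "SELECT idaction FROM log_action WHERE idaction NOT IN "
        "(SELECT idaction_url FROM log_link_visit_action)")]

    # A tracking request lands between the two purge steps and reuses action 1.
    db.execute("INSERT INTO log_link_visit_action VALUES (1)")

    # Purge step 2: delete using the list collected earlier, which is now stale.
    db.executemany("DELETE FROM log_action WHERE idaction = ?",
                   [(idaction,) for idaction in unused])

    # Count visit rows whose action no longer exists.
    dangling = db.execute(
        "SELECT COUNT(*) FROM log_link_visit_action v "
        "LEFT JOIN log_action a ON a.idaction = v.idaction_url "
        "WHERE a.idaction IS NULL").fetchone()[0]
    db.close()
    return unused, dangling
```

In this interleaving the purge deletes action 1 even though a visit row now points at it, which is the kind of outcome the table lock currently prevents.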

The proposed solution is to remove the need for locking by:

Concurrency issues must be reviewed carefully during implementation. E.g., we must ensure that situations like the following do not occur:

or:

EreMaijala commented 5 years ago

Which issue is this a duplicate of?

tsteur commented 5 years ago

@EreMaijala unfortunately I can't find the issue right now.

@diosmosis do you maybe remember?

diosmosis commented 5 years ago

@tsteur No, it was mentioned in Slack, but we don't have a paid account so we can't look that far back.

EreMaijala commented 5 years ago

@diosmosis, @tsteur Well, may I suggest reopening this one until the actual duplicate issue is found?

tsteur commented 5 years ago

Sure 👍

mattab commented 4 years ago

People regularly report issues with this locking mechanism. Matomo would be that bit more stable if log_action deletion required no table locking. The current workaround is to run the process less often, for example once a year with delete_logs_unused_actions_schedule_lowest_interval = 365, or once a month. But that only postpones the problem or makes it happen less often. (Also, deleting unused actions less often means the table holds more data, so inserting into it and updating its indexes takes longer, which makes tracking a little slower.)