matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.91k stars 2.65k forks source link

Archiving task fails when tracking failures contain large number of URL params (notify tracking failures) #20422

Open tagleeson opened 1 year ago

tagleeson commented 1 year ago

Expected Behavior

Archive task should complete successfully

Current Behavior

Archive task fails due to the following: parse_str(): Input variables exceeded 1000

Possible Solution

https://github.com/matomo-org/matomo/blob/f29b9903b318eb6128cd935ff2dac29d8ba0f197/plugins/CoreAdminHome/Tasks.php#L216 <- this code could short-circuit to check the configuration before attempting to find the failures.

Steps to Reproduce (for Bugs)

  1. Create a tracking failure on a URL with > 1000 parameters
  2. Attempt to run archiving task

Context

Sometimes there are malicious attempts to access URLs with many parameters, which causes the archiving task to fail when it comes to report the tracking failures. Note that we do not want to increase the parameter limit as suggested, since this is a malicious attempt.

We currently have the configuration setting for notifying tracking failures turned off, but this doesn't fix the issue, as noted above, the code attempts to get the failures before checking the config. It would be nice as a stopgap to avoid the issue by having the configuration switched off.

Your Environment

sgiehl commented 1 year ago

@tagleeson You could try if adding an error suppression (@) on this line would solve the issue: https://github.com/matomo-org/matomo/blob/c973567705a0065fdd7d7c7b11b80f1f0f1be350/core/Tracker/Failures.php#L165

The problem is that parse_str uses the php ini config max_input_vars and if the number of variables in the parsed string exeeds this number a warning is triggered and additional variables will be truncated. As this doesn't matter in this case, simply ignoring the warning might be fine.