matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.87k stars 2.65k forks

Performance: Faster algorithm to count unique visitors #3120

Open mattab opened 12 years ago

mattab commented 12 years ago

Suggested in forum post

Currently we count the number of distinct idvisitor values to get unique visitors per week, month, etc.

Instead, each time a visitor is new (they didn't have the idcookie or weren't matched to a recent visit), we could set a flag, then count the number of times the flag was set. This will be slightly less accurate but much faster than running a new SELECT COUNT(DISTINCT) on very large datasets.
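
To make the trade-off concrete, here is a minimal sketch of both counting strategies. It is not Matomo's actual schema or code: the `visits` table, the `is_new_visitor` column, and the helper names are all hypothetical, and SQLite stands in for MySQL only so the example runs on its own.

```python
import sqlite3

# Hypothetical, simplified schema. `is_new_visitor` is an assumed name
# for the proposed flag; the real visit log has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (idvisit INTEGER PRIMARY KEY, idvisitor TEXT, "
    "visit_date TEXT, is_new_visitor INTEGER)"
)
rows = [
    ("a1", "2012-05-01", 1),  # first visit from a1: flag set
    ("a1", "2012-05-02", 0),  # a1 recognised again: flag not set
    ("b2", "2012-05-02", 1),
    ("c3", "2012-05-03", 1),
    ("b2", "2012-05-09", 0),  # b2 returns the following week
]
conn.executemany(
    "INSERT INTO visits (idvisitor, visit_date, is_new_visitor) VALUES (?, ?, ?)",
    rows,
)


def exact_uniques(start, end):
    """Current approach: de-duplicate visitor IDs at query time."""
    sql = ("SELECT COUNT(DISTINCT idvisitor) FROM visits "
           "WHERE visit_date BETWEEN ? AND ?")
    return conn.execute(sql, (start, end)).fetchone()[0]


def flagged_uniques(start, end):
    """Proposed approach: add up the flags, no de-duplication needed."""
    sql = ("SELECT COALESCE(SUM(is_new_visitor), 0) FROM visits "
           "WHERE visit_date BETWEEN ? AND ?")
    return conn.execute(sql, (start, end)).fetchone()[0]


week1 = ("2012-05-01", "2012-05-07")
week2 = ("2012-05-08", "2012-05-14")
print("week 1:", exact_uniques(*week1), flagged_uniques(*week1))
print("week 2:", exact_uniques(*week2), flagged_uniques(*week2))
```

In this toy data the two counts agree for the first week but diverge for the second: a visitor flagged as new in an earlier period is not flagged again when they return, so the flag sum misses them. That is one source of the inaccuracy, and how much it matters would depend on how the flag is defined.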

anonymous-matomo-user commented 11 years ago

I started to look into this bug and found the following: http://bugs.mysql.com/bug.php?id=21849. It looks like it might have been fixed in MySQL 5.5.0. In addition or instead, we could replace the query with something like "select count(*) from (select distinct somefield from sometable group by somefield) as somelabel;", which performs much better on MySQL versions before 5.5.0.
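
As a quick sanity check that the rewritten query returns the same number as COUNT(DISTINCT), here is a small sketch using the placeholder names from the query above (`sometable`, `somefield`, `somelabel`). It runs against an in-memory SQLite database purely for convenience, so it says nothing about MySQL performance before or after 5.5.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (somefield TEXT)")
conn.executemany(
    "INSERT INTO sometable (somefield) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",)],
)

# Direct form: the database de-duplicates inside the aggregate.
direct = conn.execute(
    "SELECT COUNT(DISTINCT somefield) FROM sometable"
).fetchone()[0]

# Rewritten form from the comment: de-duplicate in a derived table,
# then count its rows.
rewritten = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT somefield FROM sometable GROUP BY somefield) AS somelabel"
).fetchone()[0]

print(direct, rewritten)
```

Two caveats: DISTINCT is redundant next to GROUP BY on the same column, so either one alone would do in the subquery. Also, the two forms are only equivalent when the column has no NULLs, because COUNT(DISTINCT) ignores NULL while the derived table keeps NULL as one row that COUNT(*) then counts.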