matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

Visitors overview is slow when requesting a big date range #9532

Closed quba closed 7 months ago

quba commented 8 years ago

Visitors overview is most likely the slowest report to load in Piwik. That's because this report loads only after all of its data is there (the main chart and the sparklines for all the other metrics).

My idea is to load each chart separately. That way only a few reports might time out, and the user experience would be much better.

gaumondp commented 8 years ago

Having sparkline-free widgets would cut the processing in half, right?

I proposed "Text-only" Visits Overview in #9433.

tsteur commented 8 years ago

> My idea is to load each chart separately.

I'm not really sure what you mean here? Are you suggesting we should load all the other widgets first and once others are loaded the overview?

Also I presume we're talking here about a big date range that is not pre-archived right?

quba commented 8 years ago

> Also I presume we're talking here about a big date range that is not pre-archived right?

Yes. The thing is that the whole dashboard loads within a few seconds. Only Visitors->Overview report is slow. There are too many sparklines or too much data to load or something is just broken.

> I'm not really sure what you mean here? Are you suggesting we should load all the other widgets first and once others are loaded the overview?

I don't know what the issue is. It was only a suggestion. I presume that something is wrong with the logic there. Maybe there are some locks when numeric data for too many charts is read from the DB at the same time?

tsteur commented 8 years ago

> Yes. The thing is that the whole dashboard loads within a few seconds. Only Visitors->Overview report is slow. There are too many sparklines or too much data to load or something is just broken.

If range archives are persisted it should still be fast but it might be related to https://github.com/piwik/piwik/issues/8444

> I don't know what's the issue. It was only my suggestion.

Sweet. I thought you were suggesting a solution with "My idea is to load each chart separately." and I didn't get what you meant there :) So it's just about investigating why it is slow

mattab commented 8 years ago

> Visitors overview is most likely the slowest to load report in Piwik. It's because this report loads after all data is there (main chart and sparklines for all others).

> There are too many sparklines or too much data to load or something is just broken.

FYI: sparklines shouldn't make the UI slower to load, because they are loaded in separate HTTP requests that each load their own data (sparkline images are loaded only after the main report page has loaded in the user's browser).

@quba Could you double check that there is not a performance issue somewhere else?

quba commented 8 years ago

@mattab just try to compare these 2 URLs:

quba commented 8 years ago

At least the main visits overview chart should load almost instantly, since the same date range was already archived when the dashboard and its visits overview widget were loaded.

tsteur commented 8 years ago

Range dates can still be slow in general, depending on which date range is selected and which periods we can aggregate. E.g. 2008-12-30,2015-01-03 is faster than 2008-03-03,2015-11-13 because the first one can mainly use year archives. We could also create aggregated archives for 4 or 6 months, which would sometimes make it faster, but there's a chance they would never be used. We made a lot of improvements there, and maybe we can have another look at some point, but most likely there's not much we can do in general.
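
To make the difference between the two ranges concrete, here is a minimal sketch of how a range could be covered by whole periods. It is an illustration only, not Piwik's archiver: the function name `split_range` is hypothetical, and it assumes a greedy split into years, months and days while ignoring week archives.

```python
from datetime import date, timedelta


def split_range(start: date, end: date) -> list[tuple[str, date]]:
    """Greedily cover [start, end] with whole years, then whole months, then days.

    Assumption: simplified model, week archives are not considered.
    """
    periods = []
    day = start
    while day <= end:
        year_end = date(day.year, 12, 31)
        # First day of the following month; one day earlier is this month's end.
        next_month = date(day.year + (day.month == 12), day.month % 12 + 1, 1)
        month_end = next_month - timedelta(days=1)
        if day.month == 1 and day.day == 1 and year_end <= end:
            periods.append(("year", day))
            day = year_end + timedelta(days=1)
        elif day.day == 1 and month_end <= end:
            periods.append(("month", day))
            day = next_month
        else:
            periods.append(("day", day))
            day += timedelta(days=1)
    return periods


fast = split_range(date(2008, 12, 30), date(2015, 1, 3))
slow = split_range(date(2008, 3, 3), date(2015, 11, 13))
print(len(fast), len(slow))  # 11 67
```

Under these assumptions the first range needs 11 archives (6 of them year archives), while the second needs 67, most of them single days and months.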

Re visits overview: It does load last and we can investigate what is happening there. From the looks of it, all the other widgets work on a single archive for a fixed period, which is much faster in general, while the visits overview is the only one that loads each archive individually to render a history. It's rendering thousands of points there, and for such a big date range the widget could probably do this more efficiently by looking at far fewer archives / periods.
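
One way to look at far fewer periods would be to pick a coarser period when the range is long. The sketch below is hypothetical and not how the widget works today: `pick_period`, the 200-point cap and the approximate period lengths are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical cap on how many points the history graph should draw.
MAX_POINTS = 200

# Approximate period lengths in days, from finest to coarsest.
PERIOD_DAYS = [("day", 1), ("week", 7), ("month", 30), ("year", 365)]


def pick_period(start: date, end: date, max_points: int = MAX_POINTS) -> str:
    """Return the finest period whose point count stays within max_points."""
    total_days = (end - start).days + 1
    for name, length in PERIOD_DAYS:
        # Ceiling division: a partial period still needs one point.
        if -(-total_days // length) <= max_points:
            return name
    return "year"


print(pick_period(date(2008, 1, 15), date(2015, 12, 23)))  # month
print(pick_period(date(2015, 3, 5), date(2015, 6, 5)))     # day
```

With such a cap, the eight-year range from this issue would be drawn from monthly points instead of thousands of daily ones.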

quba commented 8 years ago

@tsteur but visits overview is quite fast (I mean the widget Visits Over Time) with the same date range.

tsteur commented 8 years ago

Now I get what you mean. This must be caused by the metrics below. They should load fast, but that doesn't seem to be the case. The metrics below call several APIs, e.g. VisitsSummary.get and Actions.get. Quite likely it's Actions.get that makes it slow.

BTW: This might already be a bit faster in the Piwik 3.0 branch. I hope we can have a version of Piwik 3.0 deployed to a server in a few weeks so we can easily compare.
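
To check which of the two calls is the slow one, each method can be timed on its own through the HTTP Reporting API. This is a rough sketch: the helper names, the host and the token are placeholders, and passing `token_auth` as a query parameter is assumed to be acceptable for the instance being tested.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

METHODS = ("VisitsSummary.get", "Actions.get")


def api_url(base_url: str, method: str, id_site: int, date_range: str, token: str) -> str:
    """Build a Reporting API URL for one method over a custom date range."""
    query = urlencode({
        "module": "API",
        "method": method,
        "idSite": id_site,
        "period": "range",
        "date": date_range,
        "format": "JSON",
        "token_auth": token,
    })
    return f"{base_url.rstrip('/')}/index.php?{query}"


def http_get(url: str) -> bytes:
    """Fetch a URL with a generous timeout, since range requests can be slow."""
    with urlopen(url, timeout=600) as response:
        return response.read()


def time_call(url: str, fetch=http_get) -> float:
    """Return the wall-clock seconds taken to fetch url."""
    started = time.perf_counter()
    fetch(url)
    return time.perf_counter() - started


def compare_methods(base_url, id_site, date_range, token, fetch=http_get) -> dict[str, float]:
    """Time each API method separately and return seconds per method."""
    return {
        method: time_call(api_url(base_url, method, id_site, date_range, token), fetch)
        for method in METHODS
    }
```

Calling `compare_methods("https://stats.example.com", 1, "2008-01-15,2015-12-23", "YOUR_TOKEN")` against your own instance returns one timing per method. Run it on a range that is not archived yet, otherwise both calls only measure reading the stored archive.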

gaumondp commented 8 years ago

> Sparklines are loaded in separate HTTP requests that will load their own data.

Understood, but it is still additional queries to the DB and some crunching to produce each of them (10 on Visitors Overview). Or is the image built from cached data?

In fact, I just almost killed my MySQL server with a segment and a date-range (deadly mix) trying to get some numbers... (dedicated Apache and MySQL servers each with 4 CPUs and 8 GB RAM)

tsteur commented 8 years ago

I'm curious now. As a test, can you replace this line https://github.com/piwik/piwik/blob/2.15.0/plugins/VisitsSummary/Controller.php#L182 with if (0) { and see if it is still very slow?

gaumondp commented 8 years ago

Not better with my IP-heavy segment (with "All visits" it's under 20 sec.).

2 min 43 s for this (I saw the 4 MySQL CPUs melting at 95% each):

http://stats.mysite.com/index.php?module=CoreHome&action=index&idSite=5&period=range&date=2015-03-05,2015-06-05&segment=visitIp%3C215.136.3.0%2CvisitIp%3E215.136.3.255%3BvisitIp!%3D214.37.177.178%3BvisitIp!%3D64.181.161.166#?module=VisitsSummary&action=index&idSite=5&period=range&date=2015-03-05,2015-06-05&segment=visitIp%3C215.136.3.0,visitIp%3E215.136.3.255%3BvisitIp!%3D214.37.177.178%3BvisitIp!%3D64.181.161.166
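
For readers who want to see what the segment in that URL actually filters on, here is a small decoding sketch. It relies on the segment syntax where `;` means AND and `,` means OR, with OR binding tighter. The helper `parse_segment` is hypothetical and does a naive split that would not handle escaped separators inside values.

```python
from urllib.parse import unquote

# Segment value copied from the URL above (still URL-encoded).
SEGMENT = ("visitIp%3C215.136.3.0%2CvisitIp%3E215.136.3.255"
           "%3BvisitIp!%3D214.37.177.178%3BvisitIp!%3D64.181.161.166")


def parse_segment(encoded: str) -> list[list[str]]:
    """Decode a segment into AND-ed groups of OR-ed conditions."""
    decoded = unquote(encoded)
    # ';' joins groups with AND, ',' joins conditions inside a group with OR.
    return [group.split(",") for group in decoded.split(";")]


for group in parse_segment(SEGMENT):
    print(" OR ".join(group))
```

Decoded, the segment reads as (visitIp<215.136.3.0 OR visitIp>215.136.3.255) AND visitIp!=214.37.177.178 AND visitIp!=64.181.161.166, i.e. it excludes one /24 block and two single addresses.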


mattab commented 8 years ago

@tsteur maybe this fix https://github.com/piwik/piwik/pull/9992 also improves the performance described in this issue?

tsteur commented 8 years ago

Yes, it could make this faster as well. If there are many sites on the system, it will likely get faster too.

tsteur commented 8 years ago

@gaumondp @quba maybe you can check after 2.16.1 release whether it became faster

quba commented 8 years ago

Nope, check this one: http://demo.piwik.org/index.php?module=CoreHome&action=index&idSite=1&period=range&date=2008-01-15,2015-12-23#/?module=VisitsSummary&action=index&idSite=1&period=range&date=2008-01-15,2015-12-23

tsteur commented 8 years ago

It possibly got a bit faster, but since it still takes so long, it's not very noticeable.

tsteur commented 8 years ago

We could maybe look at an xhprof profile again at some point to see whether anything changed there and whether we can do something about it.

quba commented 8 years ago

@mattab I don't think it's fixed. It's one of the most important reports in Piwik and its performance is poor.

mattab commented 8 years ago

Good point, it may be greatly improved in the future :+1:

EugenVau commented 6 years ago

Is there any progress on this issue? The performance is still very poor with the "deadly mix" of segments and custom date ranges. I had to set fastcgi_read_timeout on nginx to 5 minutes to stop getting timeout errors.

Regards

mattab commented 7 months ago

This issue covers a lot of ground, and it's a bit too broad for us to tackle effectively as is. So, we're going to close it for now.

-> our Matomo Performance guide is at https://matomo.org/subcategory/improve-performance/