matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.66k stars 2.62k forks source link

Estimate how long it will take to run core:update console command #9048

Open mattab opened 8 years ago

mattab commented 8 years ago

When running core:update (or when updating via the user interface), Piwik admins do not know what to expect: maybe the update will take 1 second or maybe it will take 40 hours, depending on the Upgrades as well as the size of the DB.

The goal of this issue is to display relevant information to user regarding how long the update may take:

when running core:update we could display:

SQL ALTER TABLE statements were found and affect the following database tables:

 * log_visit: 4,030,104, 4.5Gb
 * log_link_visit_action: 11,560,055 records, 15.5Gb

Warning: some of these tables contain a lot of data, the update process will likely take a long time.

Also: wondering when to show such WARNING (eg. if one of the table is > 1 Gb? or > 4 Gb? what does our experience tell us or what would you suggest?)

gaumondp commented 8 years ago

We know we can't be precise (CPU, Memory, shared hosting limits) but on the other hand "long time" is not really helping. I don't know if we can get some numbers from Piwik Cloud and other users to write something like :

Warning: some of these tables contain a lot of data, the update process will likely take a long time. (about XX minutes for each GB of data on a dedicated server, more on a shared server)

Our past numbers on a dedicated MySQL server (2 CPU, 8 GB) : 11 GB = 15 minutes (2.4 => 2.7) 14 GB = 36 minutes (2.7 => 2.10) 34 GB = 22 minutes (2.10 => Piwik 2.13.b2) (Suffering from #7181 Report archives have tripled in size)

bartek85 commented 8 years ago

On 4CPU and 80GB RAM with good mysql tuning it took more than 12 hours for one update query (add column in piwik_log_visit). The table was 40+GB. Switched to TokuDB to solve this, have no idea if it's even possible to speed it up without huge amounts of RAM. There were an ideas thread with solutions like creating few tables for tables piwik_log_visit andpiwik_log_visit_action. Making them for one year like archive tables (piwik_log_visit_2014, piwik_log_visit_2015 etc) would be huge improvement for large deployments and real maintanance help. Backuping, restoring, moving such big tables it's real PITA.

gaumondp commented 8 years ago

@bartek85 Did you already switched to TokuDB ? I got complains with my webmasters about Piwik speed (I still got all my data since 2013) and DB is 34 GB...

If I can't get Piwik Interface with Segments running fast on TokuDB I may be forced to go back to something else like Sawmill.net ...

quba commented 8 years ago

I personally recommend switching to MariaDB + TokuDB (of course with enabled compression).

gaumondp commented 8 years ago

Not possible anymore to stay with MariaDB...

TokuDB is now requiring Percona Server. :(

From an email with a Percona account manager: "TokuTek storage engine is not available separately for MySQL Community Edition - you will be required to move to Percona Server in order to take advantage of this storage engine. There is a MariaDB build that is also available with an older version of Toku but there are some issues with that product."

2015-10-22 9:21 GMT-04:00 Kuba Bomba notifications@github.com:

I personally recommend switching to MariaDB + TokuDB (of course with enabled compression).

— Reply to this email directly or view it on GitHub https://github.com/piwik/piwik/issues/9048#issuecomment-150221773.

quba commented 8 years ago

We are using MariaDB+TokuDB and don't have issues with that.

bartek85 commented 8 years ago

@gaumondp we migrated to percona and all went quite good so no worries here. What performance issues do you have? (we have even servers with 1TB database so I know a bit of a pain with piwik performance)

gaumondp commented 8 years ago

Since Github is not the forum, maybe we should continue at Knowledge Base Piwik Forum. I'm pretty sure many people would like to have information on TokuDB and Piwik.

I'm starting a generic thread over there : http://forum.piwik.org/read.php?6,130045