matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.36k stars 2.59k forks source link

Google Cloud SQL compatibility: do not call 'SHOW COLUMNS' often #5992

Open sbesnaud opened 9 years ago

sbesnaud commented 9 years ago

Hello,

I'm working with Google Cloud, 3 Virtual Machines as trackers (Google Compute Engine), working with Google Cloud SQL as database.

I'm encountering an issue which prevent me completely to use Piwik : my trackers are calling the requests "SHOW COLUMNS FROM piwik_log_visit / piwik_log_link_visit_action [...]" I absolutely don't know why this request is called, but this query seems to be very slow and is threaded to a thousand of threads. By threading it with the thread_handling parameter in SQL (set to one_thread_per_connection) grows to the max_connections instantly => Every next requests encounter the "Too many connection" error from MySQL.. Usually, it doesn't last more than 30 minutes (which is HUGE amount of data lost), but I've been encountering this issue for more than 4 hours sometimes, and some other times it seems not to finish at all. My database contains about 40G of data, and I'm tracking more than 600k visits per day, so maybe that's why it takes so long, but it doesn't explain why this request is called..

I'm losing a lot of data each time it occurs so I'm really looking for a solution to this. The last thing I've been trying is disabling the "DBStats" plugin. It did not seem to solve the problem. I'm using Piwik 2.3.0 atm.

Do you know why this request is called?

Thanks a lot for your help.

mattab commented 9 years ago

Thanks for the report!

Likely this is called in order to know how many Custom Variables are currently setup on your piwik. Because more custom variables can be added, we need to dynamically fetch how many there are. Likely this cannot be fixed on our end. Is there any chance this could be worked around on your end?

sbesnaud commented 9 years ago

Yes there is of course. If I just need to modify some code in order to make this call not necessary (meaning I will never change the custom variables's number's value), I can. However, I will need to care everytime I make a Piwik update right?

sbesnaud commented 9 years ago

Do you know how can I fix it on my end?

sbesnaud commented 9 years ago

I think I misunderstood a bit, I thought you were going to tell me how to fix it on my end. Do I have to search for the file doing this work or do you know it? Thanks.

mattab commented 9 years ago

@sbesnaud what you did was create an issue in the issue tracker. As you can see there are 630 other issues already opened. The way we work is pick the most important issues and work on them in the next release. So far you're only user to report a problem with SHOW COLUMNS so investigating this is not yet high priority. If you have more info feel free to post here! to learn more see: http://developer.piwik.org/guides/core-team-workflow

sbesnaud commented 9 years ago

Just to keep you posted, I found where the request was called (core/Db.php) and I wrote in plain text my columns names in a table in order to return them directly when the function is called instead of querying the database. It seems to fix my issue for now. I did it for 4 tables : log_link_visit_action, log_visit, log_action, log_conversion