aidanm1999 opened this issue 9 months ago
Temporary improvement
If you are using Vercel on a Pro/Enterprise account, you can allow up to 5 minutes (300 seconds) of serverless function execution. Add the following to a vercel.json file and push it to your repo:
```json
{
  "functions": {
    "src/pages/api/**/*": {
      "maxDuration": 300
    }
  }
}
```
This seems to be an issue with our self-hosted DB options (PostgreSQL/MySQL) and higher-traffic websites. We probably need to revisit the indexing to see whether there can be improvements there. For rolling data up into hours, days, etc., we looked into TimescaleDB as a PostgreSQL extension, but there should also be a way to implement materialized views with our current options. We are also looking to add ClickHouse support to our self-hosted options; we use the same database for our cloud product, and it has no problem querying tables with billions of records.
> the requests fail
I get over 100K visits every day, and it takes tens of seconds to open the weekly report (on a 4-core Ryzen 7950X server), with errors like yours. I had to stop using Umami because it writes millions of records to the database and does not regularly clean them up or summarize them.
@franciscao633 is this triaged or planned for development? It would be good to know a release schedule for this bug fix, thanks!
PlanetScale released a new feature today for database insights; see below.
Interestingly, it flags a number of redundant indexes. It is worth investigating whether these are also impacting database memory and performance. @franciscao633 please let me know if you want me to share more details of each recommendation.
Interesting. In terms of app performance, I'm mostly interested in indexing on session / website_event. Any noticeable improvements on your end from the changes? I agree with @yuanfanzz that we won't see huge performance leaps until we start aggregating the data and running queries on those aggregates instead of the entire table. We're also finally getting around to supporting ClickHouse for those who want to self-host high-traffic websites.
I haven't had a chance yet to make a dump of the database, apply the changes, and compare, but hopefully later in the week my schedule will clear and I can give you a breakdown of before and after.
Data aggregation is a good idea. There are pros and cons, and lots of open questions to consider around the cost of background processes, increased storage size, data loss on query, etc.
Managed to get around to some light tests. TL;DR: the recommendations are better, but not definitively. I made two branch dumps on PlanetScale and applied the recommendations listed above to one of them, built and served the site from my machine, and queried the page-view endpoint six times against the applied recommendations and six times against the original (current) database schema.
| Current migration | Recommendations applied |
|---|---|
| 13.89s | 12.89s |
| 18.18s | 12.79s |
| 12.37s | 12.87s |
| 19.22s | 12.91s |
| 9.57s | 12.84s |
| 13.3s | 13.22s |
On average, the current migration took 14.42s; with the recommendations applied, the average query took 12.92s.
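As a quick sanity check, the two averages quoted above can be reproduced from the six timings in the table:

```typescript
// Timings from the table above, in seconds.
const current = [13.89, 18.18, 12.37, 19.22, 9.57, 13.3];
const recommended = [12.89, 12.79, 12.87, 12.91, 12.84, 13.22];

const mean = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length;

console.log(mean(current).toFixed(2));     // → "14.42"
console.log(mean(recommended).toFixed(2)); // → "12.92"
```

Worth noting that the per-run variance of the current schema (9.57s to 19.22s) is much larger than that of the recommended one, so six samples is a small basis for the comparison.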
I did, however, test the difference between using a region-specific AWS endpoint on PlanetScale (eu-west.connect.psdb.cloud) vs the auto-resolver (aws.connect.psdb.cloud), and it reduced query times from ~22s to the figures shown above. Not important for most users, but it could yield performance improvements for PlanetScale users.
This issue is stale because it has been open for 60 days with no activity.
Not stale
Aggregated data for quick queries would be excellent!
@franciscao633 In the meantime: is there a timeout set for queries in the Umami frontend? I couldn't find anything like that in the source code, but I can run a query via the API for 130 seconds without any problems, while in the frontend the connection is terminated after only 10 seconds with a 500 error (unfortunately without any further information about why the 500 is returned).
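One way to rule out a client-side limit when debugging this is to wrap the request in an explicit timeout and see whether the 500 arrives before the timeout fires (if it does, the cutoff is server- or platform-side). A minimal sketch; `withTimeout` is a hypothetical helper, not part of Umami:

```typescript
// Wrap any promise with an explicit timeout, so a slow query can be
// distinguished from a platform-imposed cutoff: if the request rejects
// with a 500 long before `timeoutMs`, the client is not the bottleneck.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Example usage against a stats endpoint (URL is a placeholder):
// await withTimeout(fetch("/api/websites/<id>/stats?..."), 130_000);
```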
Describe the Bug
When you request a period of a week, month, or year on some websites, the requests fail. See the screenshots below for one of our websites, comparing a query for "All time" with one for "24th of Jan 2024".
Failing request
Successful request
This is a fairly persistent issue: we can either request only very small data samples, or we have to query the database directly instead of going through the API.
The reason this happens is that Vercel and Netlify both have short timeouts for cloud functions (around 20 seconds, I believe; see the Vercel source and Netlify source), which means the request cannot complete. The screenshot below shows the slowest queries over the past 24 hours in PlanetScale:
As you can see, some of the queries take over six minutes, leaving the cloud function no time to finish before it times out. The first step we took to mitigate this issue was to move the Netlify and Vercel cloud functions to the same AWS datacentre as the PlanetScale database, which slightly improved performance.
A suggestion might be to move the web app and database to a cloud server with no timeout constraints; however, it would be better if our clients received results in under 20 seconds regardless.
So this leads to how the Umami cloud function requests and parses data. It is worth investigating how to optimise these queries, with differing strategies. One potential solution is a rollup table, where a function periodically aggregates data from the previous hour or day, greatly reducing the quantity of data retrieved. Another potential solution is improving the indexing in the database, as @pottiepierre suggests in #2412.
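To illustrate the rollup idea: raw events get folded into one row per website per hour, so dashboard queries scan the small rollup table instead of the full event table. The types and field names below are illustrative, not Umami's actual schema:

```typescript
// Illustrative sketch of an hourly rollup. A scheduled job would run this
// over the previous hour's raw events and upsert the results into a small
// rollup table; period queries then sum a handful of rows per day instead
// of scanning millions of raw events.
interface PageViewEvent {
  websiteId: string;
  url: string;
  createdAt: Date;
}

interface HourlyRollup {
  websiteId: string;
  hourStart: string; // ISO timestamp truncated to the hour (UTC)
  views: number;
}

function rollupByHour(events: PageViewEvent[]): HourlyRollup[] {
  const buckets = new Map<string, HourlyRollup>();
  for (const e of events) {
    const hour = new Date(e.createdAt);
    hour.setUTCMinutes(0, 0, 0); // truncate to the hour
    const hourStart = hour.toISOString();
    const key = `${e.websiteId}|${hourStart}`;
    const bucket = buckets.get(key);
    if (bucket) {
      bucket.views += 1;
    } else {
      buckets.set(key, { websiteId: e.websiteId, hourStart, views: 1 });
    }
  }
  return [...buckets.values()];
}
```

The trade-off, as noted earlier in the thread, is a background process plus extra storage, and any dimension not kept in the rollup (e.g. per-URL detail) is lost for pre-aggregated periods.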
Database
MySQL
Relevant log output
No response
Which browser are you using? (if relevant)
N/A
How are you deploying your application? (if relevant)
Vercel & Netlify with PlanetScale