Closed unnamedfeeling closed 3 years ago
Hi @unnamedfeeling,
If you're already using the Pageviews Cache and the Data Caching option + page/database/object caching then there's nothing else the plugin can do to help with performance.
Also there are many 5xx errors on the endpoint.
What errors specifically?
502 and 504 errors are present.
Yeah, those error codes are usually server-related (e.g. a timeout), which makes sense considering that you're running stress tests on it.
We have double-checked - this endpoint will not hold up even if we give it more MySQL and CPU power.
I have checked that endpoint code here - https://github.com/cabrerahector/wordpress-popular-posts/blob/master/src/Rest/ViewLoggerEndpoint.php#L36-L201
Here are some propositions:
Some additions: If I am not mistaken, there are actually:
Yes, if there's an object cache available then the plugin will first write views data to memory (Redis in your case) and will not write to the database until at least two minutes have passed since the last batch update.
Once two (or more) minutes have passed, WPP will fetch the view data from memory (Redis), do a batch insert into the database (to _popularpostsdata and _popularpostssummary), then reset its memory cache to start caching new views data for the next 2 minutes or so.
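A rough sketch of that flow, assuming an object cache is available (this is not the plugin's actual code; the function names, cache keys, and `flush_views_to_db()` helper are illustrative only):

```php
<?php
// Illustrative sketch of WPP's batching behavior, not the actual plugin code.
// Key names and flush_views_to_db() are assumptions for illustration.
function log_view( $post_id ) {
    // Accumulate the view in the object cache (Redis here) instead of MySQL.
    $views = wp_cache_get( 'wpp_pending_views', 'wpp' ) ?: [];
    $views[ $post_id ] = ( $views[ $post_id ] ?? 0 ) + 1;
    wp_cache_set( 'wpp_pending_views', $views, 'wpp' );

    // Flush to MySQL only if 2+ minutes have passed since the last batch.
    $last_flush = (int) wp_cache_get( 'wpp_last_flush', 'wpp' );
    if ( time() - $last_flush >= 2 * MINUTE_IN_SECONDS ) {
        // Batch insert into _popularpostsdata and _popularpostssummary.
        flush_views_to_db( $views );
        wp_cache_set( 'wpp_pending_views', [], 'wpp' );
        wp_cache_set( 'wpp_last_flush', time(), 'wpp' );
    }
}
```

The point is that individual page views only touch memory; the database sees one batch write per interval.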
Not sure what your point is though :P
Oh, I missed the comment above. Sorry.
change that one from POST to GET and give it a variable TTL (say 10-20 seconds, or a user-defined value) - in our case it would give a true sampled rating, and for everyone else it would avoid counting double views
Not going to happen. GET requests should be used to retrieve data from the server, not to create it.
maybe rewrite this method to make a single write to MySQL, because even with caching it makes 2 × (number of views per 2 minutes) writes to the database - in my test case that's about 300-500 writes to the DB, which locks up PHP for a significant amount of time
The code executes two queries, one to update the _popularpostsdata table and another to update the _popularpostssummary table (as explained above).
maybe defer writes to wp-cron, say every 15 minutes
This I could consider. What worries me though is that the longer the wait between updates the longer the batch insert queries will be, and this can cause issues on some server setups (keep in mind that not everyone is using dedicated servers like you) because there's a limit in the length of the query that the server can handle (see Packet Too Large). The current two-minutes approach exists for this very reason.
I could add a filter hook to modify the wait time from 2 minutes to, for example, 15, but then you'd need to watch out for Packet Too Large errors.
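If such a filter were added, using it would look something like this (the hook name below is hypothetical - it does not exist in the plugin at the time of writing):

```php
<?php
// Hypothetical: assumes WPP adds a filter for the batch-update interval.
// The hook name 'wpp_views_update_interval' is made up for illustration.
add_filter( 'wpp_views_update_interval', function ( $seconds ) {
    // Wait 15 minutes between batch inserts instead of the default 2.
    // Watch out for "Packet Too Large" errors with longer intervals.
    return 15 * MINUTE_IN_SECONDS;
} );
```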
Okay. Maybe then allow developers to override this method entirely (that's what I am considering now). Here's the plan:
Personally, I plan to hook into the wpp_pre_update_views action and send the JSON response early from there (but this is a dirty hack).
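That dirty hack might look roughly like this - a sketch only, assuming php-fpm (so `fastcgi_finish_request()` is available); the callback signature and the response payload shape are assumptions, not taken from the plugin:

```php
<?php
// Sketch of the early-response hack: reply to the client as soon as the
// view is logged, then let the database work finish in the background.
// Assumes php-fpm; the hook arguments and payload here are assumptions.
add_action( 'wpp_pre_update_views', function ( $post_id ) {
    echo wp_json_encode( [ 'success' => true, 'post_id' => $post_id ] );

    if ( function_exists( 'fastcgi_finish_request' ) ) {
        // Flush the response to the client; PHP keeps running afterwards,
        // so the batch insert still happens, just off the request's TTFB.
        fastcgi_finish_request();
    }
} );
```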
I'm a bit confused now. How would all that help? How do you plan to use it?
My plan is to override the default method, write all data to the cache only, and then update the plugin's data from cron. WP-Cron is disabled for the user-accessible part of my project, so I use plain system cron to run WP-Cron and recurring tasks via wp-cli commands. I will test and come back with results. Hopefully it will resolve my problem.
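For reference, running WP-Cron from system cron via WP-CLI looks something like this (the path is illustrative; `DISABLE_WP_CRON` must be set to `true` in wp-config.php):

```shell
# Run due WordPress cron events every minute via WP-CLI,
# instead of triggering them from user requests.
* * * * * cd /var/www/html && wp cron event run --due-now >/dev/null 2>&1
```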
Hmm, alright. I'll take these suggestions into consideration.
Hello there! Hope that we fixed the problem. There are 2 fixes, actually:
Wrapping up: WP with plugins takes up too many resources. On my local machine TTFB dropped from 300-500 ms to 20-60 ms. In the end, I have a proposition to implement such a custom override - I've had the best experience this way.
In our case the scenario is as follows:
The problem: POST /wp-json/wordpress-popular-posts/v1/popular-posts holds up until ~35-40 requests per second, then fails to work correctly. As the load continues, the TTFB grows until I start getting 5xx errors.
Description: When I try to load-test the site, the mentioned endpoint is very laggy. TTFB without any traffic is 600-900 ms; under relatively heavy load it bumps up to 60000 ms. I have read the FAQ, especially the performance section, and implemented everything I could find there (see screenshots). I can confirm that the Redis cache works - there are GET and SET requests sent to the cache backend. When I start testing, my containers peak at the maximum of 20 and stay there, and TTFB starts growing. Also, there are many 5xx errors on the endpoint.
Stack description: The infra is based on AWS and consists of scalable Docker containers (2-20, based on php:7.3-fpm-alpine), RDS as the MySQL provider, and ElastiCache as the Redis cache provider. All of this lives behind a load balancer and CloudFront. I have installed the W3 Total Cache plugin for static page, database, and object caching.
Question: What am I doing wrong? What can I fix to get more performance?