Thoughts on this method? I'm seeing multiple db queries on an initial population of the jobs, then a single query for most page views thereafter.
I'd say it makes more sense to be querying this data within MySQL; I'm not sure why doing this in PHP would be preferable.
Would love to see some benchmarks, as well as peak memory usage with large jobs tables.
@rmccue this is how I interpreted your comment here:
https://github.com/humanmade/Cavalcade/issues/117#issuecomment-1237335606
Fetching / populating the full array of jobs once, then querying from the non-persistent cache.
I'll try and figure out a way to benchmark this.
> Fetching / populating the full array of jobs once, then querying from the non-persistent cache.
Yes, but I wouldn't necessarily implement that within this function since this is a generic querying function.
I've updated it to just do the generic query on the `pre_get_scheduled_event` hook instead.
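For reference, a minimal sketch of that shape; the table name, column names and cache group here are assumptions for illustration, not Cavalcade's actual implementation:

```php
<?php
/**
 * Sketch: fetch all waiting jobs for the site once, keep them in the
 * (non-persistent) object cache for the request, and answer scheduled
 * event lookups from that array via pre_get_scheduled_event.
 */
add_filter( 'pre_get_scheduled_event', function ( $pre, $hook, $args, $timestamp ) {
	global $wpdb;

	$jobs = wp_cache_get( 'jobs', 'cavalcade-jobs' );

	if ( false === $jobs ) {
		// Single query per request; only status and site ID vary.
		$jobs = $wpdb->get_results( $wpdb->prepare(
			"SELECT hook, args, nextrun, `interval`, schedule
			   FROM {$wpdb->base_prefix}cavalcade_jobs
			  WHERE site = %d AND status = 'waiting'",
			get_current_blog_id()
		) );
		wp_cache_set( 'jobs', $jobs, 'cavalcade-jobs' );
	}

	// O(n) scan in PHP instead of a query per wp_next_scheduled() call.
	// Assumes args are stored serialized, as Cavalcade does.
	foreach ( $jobs as $job ) {
		if ( $job->hook === $hook && $job->args === serialize( $args ) ) {
			return (object) [
				'hook'      => $hook,
				'timestamp' => strtotime( $job->nextrun ),
				'schedule'  => $job->schedule ?: false,
				'args'      => $args,
			];
		}
	}

	// Treat a miss as "not scheduled"; this keeps repeat lookups cheap,
	// but see the edge cases discussed further down the thread.
	return false;
}, 10, 4 );
```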
It's not really a solid benchmark, but using `ab` locally it seems the change (applied before the 2nd run) shows some improvement:
```
# ~/projects/altis-dev on git:master x [14:33:09] C:22
$ ab -n 100 -c 3 https://altis-dev.altis.dev/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking altis-dev.altis.dev (be patient).....done

Server Software:        nginx
Server Hostname:        altis-dev.altis.dev
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-CHACHA20-POLY1305,2048,256
Server Temp Key:        ECDH X25519 253 bits
TLS Server Name:        altis-dev.altis.dev

Document Path:          /
Document Length:        44002 bytes

Concurrency Level:      3
Time taken for tests:   89.420 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      4434100 bytes
HTML transferred:       4400200 bytes
Requests per second:    1.12 [#/sec] (mean)
Time per request:       2682.591 [ms] (mean)
Time per request:       894.197 [ms] (mean, across all concurrent requests)
Transfer rate:          48.43 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        5   11   5.8     10      56
Processing:  1177 2556 1161.0   2809    5732
Waiting:     1174 2554 1160.6   2807    5730
Total:       1186 2567 1163.9   2820    5788

Percentage of the requests served within a certain time (ms)
  50%   2820
  66%   2867
  75%   2876
  80%   2893
  90%   4205
  95%   5385
  98%   5422
  99%   5788
 100%   5788 (longest request)
```
```
# ~/projects/altis-dev on git:master x [14:35:37]
$ ab -n 100 -c 3 https://altis-dev.altis.dev/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking altis-dev.altis.dev (be patient).....done

Server Software:        nginx
Server Hostname:        altis-dev.altis.dev
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-CHACHA20-POLY1305,2048,256
Server Temp Key:        ECDH X25519 253 bits
TLS Server Name:        altis-dev.altis.dev

Document Path:          /
Document Length:        44002 bytes

Concurrency Level:      3
Time taken for tests:   80.575 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      4434100 bytes
HTML transferred:       4400200 bytes
Requests per second:    1.24 [#/sec] (mean)
Time per request:       2417.259 [ms] (mean)
Time per request:       805.753 [ms] (mean, across all concurrent requests)
Transfer rate:          53.74 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        6   10   2.8     10      30
Processing:  1169 2344 838.4   2788    3878
Waiting:     1169 2343 838.5   2788    3878
Total:       1178 2354 838.9   2798    3892

Percentage of the requests served within a certain time (ms)
  50%   2798
  66%   2818
  75%   2831
  80%   2839
  90%   2881
  95%   3873
  98%   3886
  99%   3892
 100%   3892 (longest request)
```
Thanks for looking into this. Would be curious if there's a way to profile the change before and after, but that sounds positive!
@kadamwhite I've done a basic benchmark locally, but I'm not completely sure of the best way to do this. It would be good to figure out a test with a baseline performance fixture to compare changes against. That's something I need to look into; benchmarking isn't something I see done that often in WP plugin land.
Keen to get back to this one, especially while we're looking for every performance improvement we can get.
@svandragt it was the same number of requests for each run; the first one is indeed from before making the change, so with this change it completed 100 requests about 9 seconds faster.
I'll have a look into how we can put some benchmarks into the CI tests.
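One low-tech option for a baseline fixture (a sketch, not a CI setup; the hook names are made up) would be a WP-CLI eval-file script that seeds a large set of events and times the hot path directly, run before and after a change:

```php
<?php
// Run with: wp eval-file benchmark-scheduled.php
// Seed 1,000 distinct events so the jobs table is non-trivial.
$hooks = [];
for ( $i = 0; $i < 1000; $i++ ) {
	$hook = "bench_event_{$i}";
	if ( ! wp_next_scheduled( $hook ) ) {
		wp_schedule_event( time() + DAY_IN_SECONDS, 'daily', $hook );
	}
	$hooks[] = $hook;
}

// Time the path that runs on normal page views.
$start = microtime( true );
foreach ( $hooks as $hook ) {
	wp_next_scheduled( $hook );
}
$elapsed_ms = ( microtime( true ) - $start ) * 1000;

WP_CLI::log( sprintf( '1000 wp_next_scheduled() calls: %.2f ms', $elapsed_ms ) );
```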
Had a chat through with @kovshenin who ran some deeper analysis of the db queries and edge cases we could expect.
There's a balance to be had between the indexes used, the number of queries run, and the amount of data a query may return over the network.
Front loading all the results in the way this PR does could hit problems in the following scenarios:

- More jobs (than the `LIMIT` we set) for a given hook with different args results in `wp_next_scheduled()` checks returning false negatives
- `args` is large, so a lot of data comes back over the network
- There's no index covering `nextrun` and we `ORDER BY nextrun`, so MySQL has to scan way more rows than are needed and apply the `LIMIT` afterwards, using `WHERE` and filesort, which is less efficient

On balance the current approach is the best we'll get. Given the primary problem we're aiming to have an impact on is TTFB for end user page requests, there is an alternative approach we could take, which would be to use the `pre_get_scheduled_event` filter to just return true on any non-admin requests, provided they are in the `init` hook, as `init` will run in the admin context.
Ah, nuts, had been hoping this would scale.
> Given the primary problem we're aiming to have an impact on is TTFB for end user page requests, there is an alternative approach we could take, which would be to use the `pre_get_scheduled_event` filter to just return true on any non-admin requests, provided they are in the `init` hook, as `init` will run in the admin context.
That makes sense, and feels like a good compromise.
It is a good compromise, though I'd be careful returning `true` for all events. Instead maybe have an explicit list of events that you expect to be scheduled at all times, and let the checks for those only run within wp-admin.
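Something along these lines, perhaps; a rough sketch of that compromise, where the allow-listed hook names are hypothetical placeholders:

```php
<?php
/**
 * Sketch: on front-end requests, short-circuit scheduled-event lookups
 * for an explicit allow-list of hooks, so wp_next_scheduled() callers
 * don't trigger a query or a reschedule. The real checks still run in
 * wp-admin and cron contexts.
 */
add_filter( 'pre_get_scheduled_event', function ( $pre, $hook, $args, $timestamp ) {
	// Only short-circuit front-end requests; admin keeps real lookups.
	if ( is_admin() || wp_doing_cron() ) {
		return $pre;
	}

	// Hypothetical allow-list of events expected to always be scheduled.
	$always_scheduled = [ 'my_plugin_cleanup', 'my_plugin_digest' ];

	if ( ! in_array( $hook, $always_scheduled, true ) ) {
		return $pre;
	}

	// The filter expects an event object (or false), so return a stub
	// rather than a bare `true`, keeping wp_next_scheduled() truthy.
	return (object) [
		'hook'      => $hook,
		'timestamp' => time() + HOUR_IN_SECONDS,
		'schedule'  => false,
		'args'      => $args,
	];
}, 10, 4 );
```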
I'll note this on the Altis Cloud issue I raised here: https://github.com/humanmade/altis-cloud/issues/702
Fixes #117
The DB query fetches pretty much everything, varying only on statuses and site ID, which won't change unless someone needs to query the jobs directly.

The resulting array is then filtered in PHP instead, so the trade-off is an O(n) process in PHP rather than an additional request to the db each time `wp_next_scheduled()` is called.