afragen / git-updater

This WP plugin will update GitHub, Bitbucket, GitLab, and Gitea hosted plugins and themes
https://git-updater.com
MIT License
3.17k stars 462 forks source link

Slow AJAX performance for admin users #723

Closed figureone closed 5 years ago

figureone commented 5 years ago

We are noticing GitHub Updater causing serious performance impacts on the front-end, on the order of 1-10 seconds per request, including AJAX requests. We have traced it to get_remote_repo_meta() being called on every request, in the following code path:

https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Init.php#L80

https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Base.php#L130-L172

https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Base.php#L174-L181

https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Plugin.php#L180-L224

https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Plugin.php#L198

This seems very wrong; get_remote_repo_meta() seems like it should only be called in wp_cron scheduled events, correct?

afragen commented 5 years ago

If you look here in the load() function, https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Base.php#L139-L141

You see the check to can_update() which doesn’t run in the frontend. https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Init.php#L109-L144

It could only run on the frontend if one of the options in the Remote Management screen are checked and then only for and admin user.

Also, these should only mostly be done in the background unless you have the override filter in place.

How are you confirming that this is happening? Cause it shouldn’t work this way.

figureone commented 5 years ago

Confirmed this only happens for admin users (lower-privilege user and anonymous users are okay), thanks! None of the Remote Management options are checked.

I think this is specifically an issue with AJAX requests--in can_update(), admin-ajax.php is in the list of admin pages. admin-ajax.php is the handler for all WordPress AJAX requests coming from Javascript in client browsers, regardless of whether those are fired from the frontend or backend.

This is often seen as a gotcha in the WordPress is_admin() function (see the 2nd note): https://codex.wordpress.org/Function_Reference/is_admin

I'd suggest removing admin-ajax.php from the $admin_pages array in can_update() but let me know your thoughts!

afragen commented 5 years ago

I really can't remove admin-ajax.php or the background updating won't work.

The interesting part is that if a cached value is present and hasn't expired (less than 12 hours old), then hitting any of those functions should exit early without doing much of anything. Every API call which has a wp_remote_get() checks to see if a cached value is present and valid before doing that expensive call.

Can you verify that this takes a long time after all the background updating is completed. There will be the admin notice present when background updating is not completed. There may also be a warning icon next to the plugin/theme in the GitHub Updater Settings sub tabs.

figureone commented 5 years ago

Caching definitely isn't working, I get the expensive call on every AJAX request, and every wp-admin page load. (The "GitHub Updater Information Please be patient while WP-Cron finishes making API calls" notification is always present in the dashboard.)

I'll do some investigating and see if I can figure out what's causing that.

afragen commented 5 years ago

That notice is present when not all the API calls have completed. I’ve written a GitHub Updater Support plugin. Upon activation it will install a couple of other plugins that can help figure out the problem. It also turns off the background updating to assist in troubleshooting.

figureone commented 5 years ago

Thanks for building the support plugin! FYI, sorry I forgot to mention it before, but we're running a Pressbooks stack, which is built on top of WordPress multisite for distributing open educational resources. Pressbooks relies on GitHub Updater to update their theme and plugin dependencies, since they don't publish to wordpress.org.

Our textbook generates a fair amount of AJAX requests as students interact with it, which is where we have been noticing performance problems. However we did confirm that only admins see the slow performance during AJAX requests, so the impact is limited.

For debugging, we also used some tooling by one of the core devs: https://gist.githubusercontent.com/Viper007Bond/5192117/raw/c7d68b0905bef808b94c7f77451b484fcae95089/whatissoslow.php This times each filter/action in WordPress, and it's how I initially zeroed in on GitHub Updater hooks in init being the chief source of the slow AJAX requests.

Using your support plugin, I tracked down a few failing GitHub calls:

With those issues out of the way, the admin notice and any warning icons go away.

Back to the performance issues, here are a few observations. Note that these metrics are from a local vagrant VM running on a laptop, so they will be somewhat slower than a production server:

  1. The github_updater_add_settings hook takes ~2 seconds to finish on every admin page (with 11 github repos installed): https://github.com/afragen/github-updater/blob/025e171d85bea04313c15e0899a386a35a07c90b/src/GitHub_Updater/Settings.php#L327-L334 Maybe this only needs to fire on the github-updater settings page?

  2. AJAX requests for admins take 5-10 seconds to complete (with 11 github repos installed). As explained in the original post, 95% of this is consumed in get_meta_plugins() and get_meta_themes() in load(). https://github.com/afragen/github-updater/blob/d9f07d76b1140ee408746316f9c61ba4a2d58785/src/GitHub_Updater/Base.php#L168-L169 Drilling down, most of that is consumed in get_remote_repo_meta(), and I don't see any caching wrapped around those calls. With GitHub repos as our examples, each plugin/theme takes between 0.25 - 1 second in my testing, so those quickly stack up. And again, this is on each AJAX request. A frontend page on our textbook typically fires 3 AJAX requests when loaded, and then another for each action someone performs. So we see web server resources consumed quickly when an admin is interacting with the textbook. Back to the original suggestion, why would removing admin-ajax.php from the list of admin pages prevent background updating? If it's the Heartbeat API, maybe we can restrict to just heartbeat AJAX requests?

afragen commented 5 years ago

Interesting. Let me digest this information. But for reference get_remote_repo_meta() calls all individual API requests or these requests return cached data without making the API call.

FWIW, I'm using Local by Flywheel for local dev and with 2-3x the number of plugins using GHU when the caches are being hit my load time is always < 1 sec.

figureone commented 5 years ago

Interesting! We're seeing the same behavior on the production server (Ubuntu 16.04, 16 cpu / 32GB RAM), so I don't think it's the server setup. Maybe something odd with multisite? If you can point me to where the transients get cached for get_remote_repo_meta(), I can peek at the database and see if the lookup is failing.

afragen commented 5 years ago

The data is saved in the options table with a timeout.

https://github.com/afragen/github-updater/blob/9e5c0ef73aa566c837fcdd4026a831c4e5f3d07c/src/GitHub_Updater/Traits/GHU_Trait.php#L74-L86

Everything in the database options table is prefixed with ghu-.

afragen commented 5 years ago

Just checked my local multisite runs 2.5 sec with cached data.

afragen commented 5 years ago

My test server, multisite on Siteground, loads < 1 sec when cached.

afragen commented 5 years ago

FYI, github_updater_add_settings hook only runs on settings 3 pages.

https://github.com/afragen/github-updater/blob/025e171d85bea04313c15e0899a386a35a07c90b/src/GitHub_Updater/Settings.php#L100-L103

figureone commented 5 years ago

Thanks for clarifying the admin page loads, I think all that looks good and acceptable for expectations when viewing specific dashboard pages.

Focusing back on the AJAX performance, here's some high-level data from the local vagrant test:

Front-end AJAX request (page loaded in ~4 seconds, then fires an AJAX request to record that the user viewed the page; AJAX request takes ~8 seconds): image And this is the corresponding debug logs showing where the majority of the time was spent:

[15-Oct-2018 19:27:33 UTC] ### Slow hook: init (6.90 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:27:33 UTC]   # 6.724 seconds: Fragen\GitHub_Updater\Init (load)

Second, here is a backend AJAX example (loading a custom Dashboard page at wp-admin/admin.php?page=visualizations that shows instructors data about their classroom; the page loaded in ~4 seconds, and then you can see Heartbeat API pings every minute and take 6 - 12 seconds each to complete): image And here is the corresponding debug logs showing where the majority of the time was spent:

[15-Oct-2018 19:08:31 UTC] ### Slow hook: init (6.69 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:08:31 UTC]   # 6.508 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:09:30 UTC] ### Slow hook: init (6.29 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:09:30 UTC]   # 6.112 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:10:30 UTC] ### Slow hook: init (6.09 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:10:30 UTC]   # 5.904 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:11:30 UTC] ### Slow hook: init (6.25 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:11:30 UTC]   # 6.082 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:12:35 UTC] ### Slow hook: init (10.90 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:12:35 UTC]   # 10.711 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:12:35 UTC] ### Slow hook: requests-request.progress (0.45 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:13:31 UTC] ### Slow hook: init (6.52 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:13:30 UTC]   # 6.336 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:13:31 UTC] ### Slow hook: http_request_args (0.11 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:14:31 UTC] ### Slow hook: init (6.91 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:14:31 UTC]   # 6.720 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:15:31 UTC] ### Slow hook: init (6.72 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:15:31 UTC]   # 6.530 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:16:31 UTC] ### Slow hook: init (6.46 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:16:30 UTC]   # 6.281 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:17:31 UTC] ### Slow hook: init (6.63 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:17:31 UTC]   # 6.436 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:18:36 UTC] ### Slow hook: init (11.31 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:18:35 UTC]   # 11.127 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:18:36 UTC] ### Slow hook: requests-request.progress (0.42 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:19:31 UTC] ### Slow hook: init (6.55 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:19:31 UTC]   # 6.366 seconds: Fragen\GitHub_Updater\Init (load)
[15-Oct-2018 19:20:31 UTC] ### Slow hook: init (7.09 seconds) (/wp-admin/admin-ajax.php).
[15-Oct-2018 19:20:31 UTC]   # 6.897 seconds: Fragen\GitHub_Updater\Init (load)

More testing on the production server shows frontend AJAX requests complete in the 0.5 - 2 second range when GitHub Updater is enabled, and 100 - 150 ms when the plugin is disabled. That's much better than the local vagrant, which makes me think disk I/O is responsible for a lot of that, and not remote calls (so I think I was barking up the wrong tree w.r.t. caching and remote calls).

Just to clarify, I'm not really concerned about performance during Heartbeat AJAX calls. It's when AJAX is used on the frontend, and I'd like to see if we can restrict the amount of processing that GitHub Updater does during those calls.

afragen commented 5 years ago

I really strive to make this plugin as performant as possible.

Can you try something for me?

Change the following line to the code below. Let me know if you see any difference.

https://github.com/afragen/github-updater/blob/23d6db7a23c5c5b9a0991c7606b7c4b29f06a10d/src/GitHub_Updater/Init.php#L67

if ( ! static::is_heartbeat() ) { $this->load_hooks(); }

figureone commented 5 years ago

With the code change, heartbeat AJAX calls on the Dashboard (screenshot below is from /wp-admin/ on a site in the network) dropped from ~6-14 seconds per request to ~1 second per request (the first 4 admin-ajax.php entries in the list are before the change, and the rest are after): image A slow hook (any hook taking more than 0.5 seconds) only triggered once, and the GitHub Updater callback didn't trigger at all.

[Mon Oct 15 14:31:43.005482 2018] [:error] [pid 22150] ### Slow hook: init (0.67 seconds) (/wp-admin/admin-ajax.php).

However, frontend AJAX calls are unchanged (which is expected since they are not heartbeats): image

[Mon Oct 15 14:32:28.084799 2018] [:error] [pid 14414] ### Slow hook: init (10.90 seconds) (/wp-admin/admin-ajax.php).
[Mon Oct 15 14:32:27.832397 2018] [:error] [pid 14414]   # 10.746 seconds: Fragen\\GitHub_Updater\\Init (load)

Do you know what specific AJAX calls need load_hooks() to be run? Maybe we could restrict to just those, to keep it off all custom AJAX hooks that plugins/themes might register.

afragen commented 5 years ago

So here's the interesting part. That code will totally prevent GHU from running during a heartbeat.

Out of curiosity, what does the profile look like when GHU is deactivated?

I'm going to install whatisslow and see what it shows on my end.

figureone commented 5 years ago

Not sure if it was unclear, but WordPress has an extensible AJAX API that's used for more than just Heartbeat API calls; any theme or plugin can add their own AJAX actions to do any communication between server and client browser (thus avoiding a full page load): https://developer.wordpress.org/plugins/javascript/ajax/#why-use-ajax It's specifically any frontend or user-created AJAX calls that I'd like to free up from any GHU processing.

With GHU deactivated, Heartbeat API AJAX requests on the Dashboard take around 0.5 - 1 second to complete (682ms - 733ms in this example): image

Similar for other AJAX calls on the frontend (723ms in this example): image

Finally, re: whatisslow, I made a few mods to make it more descriptive. Here's the modified gist, with installation instructions in a comment at the top (for getting it to report on individual hooks): https://gist.github.com/figureone/f83108f6c6fc4ad61bc9f5d3d88e56c3

afragen commented 5 years ago

I've installed your modified version, whatisslow-extended, and I don't see anything in the logs pointing to GitHub Updater. I installed it as a mu-plugin and I'm not really going to modify core for this.

[16-Oct-2018 01:50:57 UTC] ### Slow hook: init (1.19 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:51:59 UTC] ### Slow hook: init (0.80 seconds) (/wp-admin/admin-ajax.php).
[16-Oct-2018 01:52:08 UTC] ### Slow hook: init (1.13 seconds) (/wp-admin/options.php).
[16-Oct-2018 01:52:11 UTC] ### Slow hook: init (1.46 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:52:34 UTC] ### Slow hook: init (1.86 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:52:53 UTC] ### Slow hook: init (3.55 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:52:58 UTC] ### Slow hook: init (3.33 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:53:15 UTC] ### Slow hook: ghu_get_remote_plugin (65.84 seconds) (/wp-cron.php).
[16-Oct-2018 01:53:15 UTC] ### Slow hook: http_request_args (1.77 seconds) (/wp-cron.php).
[16-Oct-2018 01:53:43 UTC] ### Slow hook: init (31.29 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:53:50 UTC] ### Slow hook: ghu_get_remote_theme (39.75 seconds) (/wp-cron.php).
[16-Oct-2018 01:54:15 UTC] ### Slow hook: init (5.08 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:54:49 UTC] ### Slow hook: ghu_get_remote_theme (7.27 seconds) (/wp-cron.php).
[16-Oct-2018 01:54:51 UTC] ### Slow hook: init (8.40 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:55:03 UTC] ### Slow hook: init (1.70 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:55:13 UTC] ### Slow hook: init (1.89 seconds) (/wp-admin/options-general.php).
[16-Oct-2018 01:55:29 UTC] ### Slow hook: init (1.22 seconds) (/wp-admin/options-general.php).

The ghu_get_remote_plugin and ghu_get_remote_theme are wp-cron jobs that ran after I hit refresh cache and in wp-cron.php not in admin-ajax.php. They did not affect page load to a significant degree.

I don't see anything pointing to a specific GHU function.

If you click on one of these admin-ajax.php calls in Dev Tools, what do the specifics say about the call? Specifically, Headers > Form Data

afragen commented 5 years ago

GHU absolutely requires the admin-ajax calls that it makes as updating from the plugin and theme pages happens via AJAX.

figureone commented 5 years ago

In Dev Tools, Form Data is just particular to the AJAX call. So, for my frontend example above, the action is xapi_event, with another statement param containing the JSON blob for the xAPI statement. That specific one is registered by another plugin here: https://github.com/tunapanda/wp-h5p-xapi/blob/master/wp-h5p-xapi.php#L104-L105 Of course, we have a bunch of other registered AJAX actions that run on the frontend, and there's nothing in particular that makes that one different than the others w.r.t. GHU.

On the backend, the example above shows Heartbeat API AJAX calls. Their form data:

interval: 60
_nonce: 2459d1ab7b
action: heartbeat
screen_id: dashboard
has_focus: false

Again, nothing out of the ordinary, and I'm not trying to say that specifically Heartbeat API AJAX calls are slow, just the GHU runs its I/O-heavy init workload on every AJAX request fired by an admin user.

When you say:

GHU absolutely requires the admin-ajax calls

do you mean that GHU needs to initialize on every registered AJAX call, or just that GHU registers its own AJAX hooks and needs to initialize during those? Seems like the latter to me, and if that's the case, you can restrict GHU to loading during DOING_AJAX only when the GHU-specific AJAX actions are running (similar to your is_heartbeat() function).

Regardless, if you're still not convinced there are performance issues, I'm happy to just leave the plugin deactivated and only enable it during scheduled maintenance windows. That's my current solution to restore our frontend AJAX performance for admin users. Feel free to close out this issue if you don't want to press it any further.

afragen commented 5 years ago

If this is an issue, I really want to fix it.

GHU only registers 2 AJAX calls and they are used for RESTful updates.

What I’m having trouble understanding is what in GHU is being called during what AJAX call? How is an AJAX call initializing GHU? Is there some code I can run locally to cause this? Also, why does your installation not seem to retrieve the cached GHU data?

GHU shouldn’t run on the fronted at all unless you have one of the Remote Management boxes checked.

afragen commented 5 years ago

You can use the filter below to remove admin-ajax.php from the list of pages where GHU runs and see what happens.

https://github.com/afragen/github-updater/blob/6c4b3748e2d3d1cba152ac5a1e964d514421b640/src/GitHub_Updater/Init.php#L143

It seems the bigger issue is, why your site doesn’t seem to be utilizing cached data? If it was then there would be no API calls for 12 hours.

Can you check the database to see if the options table contains anything starting with ghu-?

afragen commented 5 years ago

OK, I think I've solved this. It seems I only need to run on admin-ajax.php if attempting sequential shiny updates.

Please update to the latest develop branch or use the branch switcher to switch to the latest develop branch. (You will need to switch back to the master branch to receive updates from that branch, otherwise you will get updates from the develop branch.)

Please let me know.

figureone commented 5 years ago

Sorry, this thread has gotten a bit long and some details may have been buried. I've confirmed that remote calls are being cached, and they are not the cause of the slow performance; I suspect disk I/O as GHU iterates through all of the installed plugins and themes and gathers data.

I set up a fresh vagrant install with vanilla WordPress, and installed GHU. With just GHU installed, load() took 0.790 seconds to finish during an AJAX call.

[16-Oct-2018 23:58:28 UTC]   # 0.790 seconds: Fragen\GitHub_Updater\Init (load)

I then searched through github.com for WordPress plugins that use GHU functionality, and installed (did not activate) about 25 of them: https://github.com/search?utf8=%E2%9C%93&q=%22GitHub+Plugin+URI%3A+%22+extension%3Aphp+path%3A%2F&type=Code&ref=advsearch&l=&l= With all of these present, load() took 12.758 seconds to finish during an AJAX call.

[17-Oct-2018 00:03:40 UTC]   # 12.758 seconds: Fragen\GitHub_Updater\Init (load)

So it looks like around 400ms per plugin/theme present that gets added to the init action by GHU. This might be why you're not seeing the same type of performance impact that I am (our production Pressbooks server has 14 GHU-aware plugins/themes, all from Pressbooks itself).

Stepping back, a few clarifications about AJAX in WordPress:

https://developer.wordpress.org/plugins/javascript/ajax/

Now, it's important to note that the /wp-admin/admin-ajax.php endpoint is shared by all AJAX requests; it's just the action param that differentiates them. Since GHU hooks into init, which fires on every AJAX request, and since can_update() allows the load() function to call get_meta_plugins() and get_meta_themes() for admin users visiting admin-ajax.php, we see the slow performance for admin users for every single AJAX call they create (which includes all AJAX calls from the frontend).

figureone commented 5 years ago

Will check out the develop branch and let you know. Thanks!

afragen commented 5 years ago
  • GHU's remote management features could be rewritten to use the actual REST API instead of AJAX, but I'm not sure there's a good reason to do that.

I know it's not technically REST, and I simply haven't gotten a handle on the REST API yet to make this work without admin-ajax, yet.

BTW, I test with 27 plugins/themes that hook into GHU and use 5 different git servers. 😉

figureone commented 5 years ago

On develop, I confirmed that the admin-ajax.php requests finish in ~0.8 seconds now instead of ~12 seconds in the environment above (25 GHU-aware plugins/themes present on vanilla WordPress in a local vagrant).

Also confirmed in the Pressbooks environment, frontend and backend AJAX calls dropped from ~7 seconds each to ~800ms each (700ms with GHU deactivated). Sounds like a success to me!

I also tested a shiny update (clicking "update now" on the Plugins page for a GHU-aware plugin with pending updates), and it worked fine.

Re: REST, if it were me I wouldn't bother trying to create a real REST endpoint...the admin-ajax.php one you built works just fine, and I don't think you can get away from the AJAX dependency because of the shiny update system in WordPress core.

Thanks for your time on this one!

afragen commented 5 years ago

🙌🏼 I'm glad we were able to find a solution to this. Thanks for reporting it and for all the work and testing you've done.

BTW, it's the second shiny update without a page load in between that requires the run on admin-ajax.php 😉