apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0
696 stars 158 forks source link

High TTFB with PageSpeed #2089

Open AnonDev1312 opened 2 years ago

AnonDev1312 commented 2 years ago

Hi,

TTFB is much higher when PageSpeed is enabled and I don't understand why

264ms TTFB with PageSpeed=off https://speedvitals.com/report/www.keinegotterkeinemeister.de/eEuqEG/

1100ms TTFB with PageSpeed ON https://speedvitals.com/report/www.keinegotterkeinemeister.de/tg1zwa/

Pagespeed config: https://pastebin.com/fLqVcP1M

jmarantz commented 2 years ago

Thanks for the data including the config; it's very helpful.

It does look like the performance is pretty good even from the US, despite the TTFB. I assume the server is in Germany with no US CDN.

Of course TTFB will never get faster with PageSpeed, only slower. The question is whether the optimizations it makes during that time provide a better user experience.

I visited view-source:https://www.keinegotterkeinemeister.de/tshirt-sabotage-black-cat-D01004174288P0210/?PageSpeedFilters=+debug to see what timing info PageSpeed injected at the end, and found this:

#NumFlushes            0
#EndDocument after     5906us
#Total Parse duration  5685us
#Total Render duration 411563us
#Total Idle duration   221us
No critical images detected.
The following filters were disabled for this request:
CriticalSelectorFilter: No critical selector info in cache
DelayImages

So it looks like your server spent 5.7 seconds in PageSpeed parsing your HTML -- that's mostly CPU bound. Then it spent 452ms rendering it -- that may be a combination of image optimization, network operations, waiting for asynchronous optimizations and URL fetches (depending on the cache setup) That leaves 1100-270= 378ms unaccounted for.

I'm guessing that most of this comes from the file cache. It's possible your disk is I/O bound. My main suggestion is to try deploying memcached or redis as an asynchronous cache. That should be faster than the file-cache.

AnonDev1312 commented 2 years ago

Hi, thanks for the reply

The server is a dedicated server in OVH Datacenter in Beauharnois, Quebec, Canada Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz (16 core(s)) - 2499 MHz 64 GB Ram NVME SSD 1GBPS network

CPU load is always under 25% https://prnt.sc/Xte_RHbEGSgR

We also integrated a benchmark into the website to calculate the page load speed of the different PHP modules and calculate SQL queries execution time. You can see it on the bottom of the page when adding ?benchmark=1 to the URL https://www.keinegotterkeinemeister.de/tshirt-sabotage-black-cat-D01004174288P0210/?benchmark=1

Currently, it takes 0.2110 seconds to generate the page with PHP on the first page load. But out of this 0.21 seconds, 0.1865 is due to calling an external geolocation API. This is just because Cloudflare is currently disabled for our testing phase. When Cloudflare is enabled, we get the geolocation from $_SERVER["HTTP_CF_IPCOUNTRY"] so we avoid calling the external API and page load time goes down to around 0.01s

TTFB is not the only problem, we also have high LCP and it throws warnings in Google Search Console: https://prnt.sc/2PjB74ekUgdR Our search ranking dropped a lot and we lost 50% of our search traffic since June 2021 and we suspect it could be partly due to that

How can I confirm whether the disk is I/O bound or not? I followed Plesk guide and here are the results: https://support.plesk.com/hc/en-us/articles/360006437794-Plesk-for-Linux-server-is-running-slow-too-much-IO-interactivity

5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 13.634 s, 188 kB/s
[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=512 count=5000 oflag=direct
5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 14.1703 s, 181 kB/s
[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=512 count=5000 oflag=direct
5000+0 records in
5000+0 records out
2560000 bytes (2.6 MB, 2.4 MiB) copied, 14.0026 s, 183 kB/s

This seems pretty slow. Could it be because we run pagespeed from too many domains?

Latency seems fine:

[root@ns508279 ~]# dd if=/dev/zero of=/root/testfile bs=1024M count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.05548 s, 522 MB/s

however iotop looks perfectly fine. Monitored it for around 10 minutes and IO never went above 1% https://prnt.sc/F65mP1MkTeBC

also no high %WA with top: https://prnt.sc/Oe1hhgcqrNDC

jmarantz commented 2 years ago

I was suspicious there may be spinning-disk latency but as you have an SSD I'm not sure. Nevertheless it might be worth trying redis/memcached, as we never really tuned the file-cache for SSD.

AnonDev1312 commented 2 years ago

There could be an IO issue because according to the Plesk guide, the first test I ran should get a speed of 142 MB/s but I got only 183 kB/s

Memcached is now installed but I'm not sure how to check if it's correctly configured https://www.keinegotterkeinemeister.de/phpinfo.php I enabled statistics and I can see some data related to Memcached so it seems to be working: https://www.keinegotterkeinemeister.de/mod_pagespeed_statistics?memcached

ModPagespeedMemcachedServers localhost:11211
ModPagespeedCreateSharedMemoryMetadataCache "/var/cache/mod_pagespeed/" 51200

Is 51200 enough? We have plenty of memory on the server and the websites are big with thousands of images

It seems to have lowered TTFB, great! Now it is almost the same as with Pagespeed disabled. Thanks!

Anything suggestions to improve LCP?

jmarantz commented 2 years ago

The shared-memory metadata cache does not get populated with images. It just gets populated with metadata mapping the original image name and context into which it is to be rendered into an optimized resource name. The images are stored in memcached in your configuration. You can see the hierarchy pictured here:

https://www.keinegotterkeinemeister.de/pagespeed_admin/cache#cache_struct

It look like you've got Prioritize Critical CSS enabled. I haven't thought about this space for quite some time but I don't think we have much else in PageSpeed I'd recommend. I"d try disabling defer_javascript -- it's possible it's doing more harm than good to that metric.

Good luck!

On Mon, Apr 11, 2022 at 11:52 PM Ungov77 @.***> wrote:

There could be an IO issue because according to the Plesk guide, the first test I ran should get a speed of 142 MB/s but I got only 183 kB/s

Memcached is now installed but I'm not sure how to check if it's correctly configured https://www.keinegotterkeinemeister.de/phpinfo.php I enabled statistics and I can see some data related to Memcached so it seems to be working: https://www.keinegotterkeinemeister.de/mod_pagespeed_statistics?memcached

ModPagespeedMemcachedServers localhost:11211 ModPagespeedCreateSharedMemoryMetadataCache "/var/cache/mod_pagespeed/" 51200

Is 51200 enough? We have plenty of memory on the server and the websites are big with thousands of images

It seems to have lowered TTFB, great! Now it is almost the same as with Pagespeed disabled. Thanks!

Anything suggestions to improve LCP?

— Reply to this email directly, view it on GitHub https://github.com/apache/incubator-pagespeed-mod/issues/2089#issuecomment-1095951580, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAO2IPIHW2BALPOWN7TII53VETXQJANCNFSM5TEGVEXQ . You are receiving this because you commented.Message ID: @.***>

jmarantz commented 2 years ago

One more thing -- you probably should control access to /pagespeed_admin and /mod_pagespeed_statistics with allow/deny directives per https://www.modpagespeed.com/doc/admin#handlers

AnonDev1312 commented 2 years ago

Thanks, we disabled defer_javascript and some other filters

However today we noticed that images are not getting converted to webp anymore after switching to Memcached When commenting out Memcached in the config, the images are converted again to webp

jmarantz commented 2 years ago

My first suspicion is that they just haven't been (re)converted into webp yet. By switching to memcached you effectively are flushing your cache. As you may have noticed, PageSpeed optimizes resources in the background when it learns they need to be optimized, but may serve the first few requests unoptimized. Is it possible that if you refresh your page a few times the images will start coming in as webp?

AnonDev1312 commented 2 years ago

I tried CTRL+F5 many many times but I didn't see any webp images in the code. As soon as I turn off Memcached, I see webp images after 2 refreshes. We let Memcached be enabled for 18 hours and didn't see any webp images on the homepage.

Memcached disabled, many webp images in the code: view-source:https://www.keinegotterkeinemeister.de/

Memcached enabled for 18 hours with zero webp in the code: view-source:https://www.ni-dieu-ni-maitre.com/

AnonDev1312 commented 2 years ago

I see this error in /mod_pagespeed_statistics?cache#physical_cache Error getting memcached server status for localhost:11211

I temporarily re-enabled public statistics on both domains if you want to check

[root@ns508279 ~]# service memcached status
Redirecting to /bin/systemctl status memcached.service
● memcached.service - memcached daemon
   Loaded: loaded (/usr/lib/systemd/system/memcached.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

edit: got it enabled now, but still confused about the last line, inactive/dead

[root@ns508279 ~]# systemctl enable memcached
Created symlink /etc/systemd/system/multi-user.target.wants/memcached.service → /usr/lib/systemd/system/mem                                           cached.service.
[root@ns508279 ~]# systemctl start memcached.service
[root@ns508279 ~]# service memcached status
Redirecting to /bin/systemctl status memcached.service
● memcached.service - memcached daemon
   Loaded: loaded (/usr/lib/systemd/system/memcached.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-04-12 22:39:25 UTC; 3s ago
 Main PID: 805728 (memcached)
    Tasks: 10 (limit: 409644)
   Memory: 3.6M
   CGroup: /system.slice/memcached.service
           └─805728 /usr/bin/memcached -p 11211 -u memcached -m 64 -c 1024 -l 127.0.0.1,::1

Apr 12 22:39:25 ns508279.ip-167-114-64.net systemd[1]: Started memcached daemon.

Now the error is gone in /mod_pagespeed_statistics?cache#physical_cache

edit: it's fixed, webp images are working now

edit2: TTFB is now back above 1200ms 😭😭😭 https://speedvitals.com/report/www.keinegotterkeinemeister.de/B4rUYX/

jmarantz commented 2 years ago

It's possible the reason that TTFB was good for a while is that the memcached code is good at failing-fast if it looks like memcached is not working.

I can think of three reasons generally that PageSpeed would make TTFB noticeably worse:

AnonDev1312 commented 2 years ago

Hi,

Makes sense. I didn't understand how it was faster with Memcached failing lol

I don't think the rewrite deadline is being overwritten. I have not modified any conf file, the entire config is in Plesk GUI except for ModPagespeedCreateSharedMemoryMetadataCache

I have looked at the histograms but I'm not sure how to interpret the data image

You can check the admin section, I put it public temporarily again https://www.keinegotterkeinemeister.de/pagespeed_admin/

Lofesa commented 2 years ago

Hi @Ungov77 @jmarantz

Take a look on this:

https://www.keinegotterkeinemeister.de/pagespeed_admin/cache#physical_cache

See evictions and current objects stored:

curr_items: 53457 evictions: 25654735

I think is a big amount of evictions.

You are using only a 64Mb memcache database

bytes: 63936177

In the config file for memcached (normaly /etc7memcached.conf or /etc/system/memcached.conf) you can increase the size of RAM memcached uses:

change m 64 whit the size you will use.

Think you are using memcached to store those object stored in disk before.

AnonDev1312 commented 2 years ago

Hi,

The config file aas in /etc/sysconfig/memcached

What size should we increase it to? We have 64gb of ram on this server I increased it from 64 to 6400 for now

edit: 1 hour later, Evictions are still at zero. Seems to be fixed now. Just tested TTFB again, ran 3 tests 2.4s https://speedvitals.com/report/www.keinegotterkeinemeister.de/CmxbQ4/ 670ms https://speedvitals.com/report/www.keinegotterkeinemeister.de/jXjxJs/ 660ms https://speedvitals.com/report/www.keinegotterkeinemeister.de/FB2FrO/

Lofesa commented 2 years ago

hi @Ungov77 Take account that your filesystem cache was near of 40GB, if I remember +30 (Yes, I´m Longinos in the goolge group email list)