statamic / ssg

The official Statamic Static Site Generator
236 stars 24 forks source link

Multisite SSG is taking an exponential amount of time to generate #116

Closed tao closed 2 years ago

tao commented 2 years ago

I'm almost done working on an update to my website with more languages and I noticed SSG started taking an exponential amount of time to generate the site. At the moment I can get to 14 or 15 pages before it freezes. I also tried this on my linux desktop and had the same problem.

Screenshot 2022-10-13 at 11 56 16

I'm trying to figure out what's going on, the update itself didn't include much change except adding more translated content, using the language files more and the {{ trans }} tag, and using Statamic navigation... however I tried caching the nav tags and that didn't seem to help.

If I only generate English files it seems fine:

Screenshot 2022-10-13 at 12 08 03

I thought it might be an antlers bug by using too many partials but even if I use an empty layout it still slows down.

<!DOCTYPE html>
<html lang="{{ locale }}" dir="{{ site:direction ?? 'ltr' }}">
<head>
    <meta charset="UTF-8">
</head>
<body>
    <!-- empty body -->
</body>
</html>

And here's the result, which is the same... 😕

Screenshot 2022-10-13 at 13 46 11

jasonvarga commented 2 years ago

What version of Statamic are you on? Are you using S3?

tao commented 2 years ago
Environment
Laravel Version: 9.34.0
PHP Version: 8.0.21
Composer Version: 2.0.11
Environment: local
Debug Mode: ENABLED
URL: localhost
Maintenance Mode: OFF

Cache
Config: NOT CACHED
Events: NOT CACHED
Routes: NOT CACHED
Views: CACHED

Drivers
Broadcasting: pusher
Cache: statamic
Database: sqlite
Logs: stack / single
Mail: smtp
Queue: sync
Scout: elasticsearch
Session: file

Statamic
Addons: 3
Antlers: runtime
Version: 3.3.43 PRO

Statamic Addons
statamic/collaboration: 0.4.0
statamic/ssg: 1.2.0

I'm trying to debug it more now and see if I can pin down where the delay is coming from.

tao commented 2 years ago

We're using S3 for Assets & Glide but these specific pages didn't include images. These routes mentioned before /fr/channeling/xxx are custom routes using Statamic::route so I'm trying to see now if that might be causing the delays.

tao commented 2 years ago

I removed all the custom routes and ran into the same problem, it just took a little longer and I managed to generate ~115 pages this time.

$ php please ssg:generate --url="/fr/"
You may be able to speed up site generation significantly by installing spatie/fork and using multiple workers (requires PHP 8+).

[✔] Gathered content to be generated
Generating 229 content files...

[1] Generate /fr/books/secrets-of-the-ufo    (746ms)
[2] Generate /fr/books/the-law-of-one-book-1    (107ms)
[3] Generate /fr/books/the-law-of-one-book-2    (110ms)
[4] Generate /fr/books/the-law-of-one-book-3    (99ms)
[5] Generate /fr/books/the-law-of-one-book-4    (122ms)
...
[40] Generate /fr/channeling/ra-contact/66    (4718ms)
[41] Generate /fr/channeling/ra-contact/67    (4232ms)
[42] Generate /fr/channeling/ra-contact/68    (4481ms)
[43] Generate /fr/channeling/ra-contact/69    (4301ms)
[44] Generate /fr/channeling/ra-contact/70    (4470ms)
[45] Generate /fr/channeling/ra-contact/71    (4397ms)
...
[80] Generate /fr/channeling/2007/0218    (15943ms)
[81] Generate /fr/channeling/2007/0318    (25443ms)
[82] Generate /fr/channeling/2007/0325    (17699ms)
[83] Generate /fr/channeling/2007/0408    (19645ms)
[84] Generate /fr/channeling/2007/0414    (17545ms)
[85] Generate /fr/channeling/2007/0422    (13041ms)
...
[110] Generate /fr/channeling/2009/0127    (25282ms)
[111] Generate /fr/channeling/2009/0203    (28178ms)
[112] Generate /fr/channeling/2009/0214    (21935ms)
[113] Generate /fr/channeling/2009/0314    (26261ms)
[114] Generate /fr/channeling/2009/0328    (26559ms)
jasonvarga commented 2 years ago

Try upgrading to 3.3.44 and see if there's any difference.

tao commented 2 years ago

I did try updating but had no luck. It might not be a multisite issue. I've done a couple hours of debugging and I think it's these Route::statamic paths that are causing the issue.

I have some routes like this for pagination:

Route::statamic('{locale}/channeling/{year}/{current_page}', 'channeling.years.show');

And even if the layout file is totally blank it causes a slow down... If I switch that same function to a normal route it works without any delay.

Route::get('{locale}/channeling/{year}/{current_page?}', function() {
    return 'test';
});

But this Route::statamic route of mine uses the view channeling.years.show which extends a blank layout, so I can't see why anything being rendered would cause a slow down as it's just outputs a blank page like this:

<!DOCTYPE html>
<html lang="fr" dir="ltr">
<head>
    <meta charset="UTF-8">
</head>

<body>
    <!-- body -->
</body>
</html>
tao commented 2 years ago

I'm going to work on this a bit more this weekend and see if I can replicate it in a new site or see if it's something strange in my own code that's causing the issue.

tao commented 2 years ago

@jasonvarga I spent the last two days debugging this. It's related to the Statamic update from 3.3.31 to 3.3.32 and the pull request #6539. The changes made to the AddViewPaths middleware specifically is causing the problem on my Statamic::routes.

        foreach ($finder->getHints() as $namespace => $paths) {
            foreach ($paths as $path) {
                $finder->prependNamespace($namespace, $path.'/'.$site);
            }
        }

@edalzell Do you have any ideas on how we could fix this loop in your pull request? 🤷🏻‍♂️

You can see it goes from around 3s at line [12], to 10s to 65s to 309s ... I'm using the exact same pagination system as you've shown in this example so there's nothing strange going on with these routes themselves.

Screenshot 2022-10-16 at 21 17 18

When I uncomment this new code it's fine.

Screenshot 2022-10-16 at 21 22 21

If I update to Statamic 3.3.45 and uncomment this section of code then it's also fine.

When I added some logging it started doing this kind of stuff:

        foreach ($finder->getHints() as $namespace => $paths) {
            \Log::info('namespace: ' . $namespace);
            foreach ($paths as $path) {
                \Log::info('    path: ' . $path);
                $finder->prependNamespace($namespace, $path.'/'.$site);
            }
        }

It started like this:

namespace: graphql
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views
namespace: statamic
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/extend/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/en
    ...

And ended up doing this after the second loop:

namespace: graphql
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en
    path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views
namespace: statamic
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/extend/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/en/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/extend/en/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/extend/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/en
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views
    path: /Users/daniel/Code/llresearch/cms/vendor/statamic/cms/src/Providers/../../resources/views/extend

And eventually started getting out of control:

path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en  
path: /Users/daniel/Code/llresearch/cms/vendor/rebing/graphql-laravel/src/../resources/views/en/en/en/en/en/en  
...

I also tracked the number of lines growing as it generated pages, you can see it's doubling on each new page and grows quickly.

[✔] Gathered content to be generated    =>   11

[1] Generate /podcasts/archive    (5090ms).   =>    29 lines
[2] Generate /podcasts/archive/1    (2419ms).   =>    56 lines
[3] Generate /podcasts/archive/2    (2495ms).   =>    103 lines
[4] Generate /podcasts/archive/3    (2239ms).   =>    190 lines
[5] Generate /podcasts/archive/4    (2572ms).   =>    357 lines
[6] Generate /podcasts/archive/5    (2735ms).   =>    684 lines
[7] Generate /podcasts/archive/6    (2501ms).   =>    1,331 lines
[8] Generate /podcasts/archive/7    (2398ms).   =>    2,617 lines
[9] Generate /podcasts/archive/8    (2749ms).   =>    4,976 lines
[10] Generate /podcasts/archive/9    (2586ms).   =>    6,484 lines
[11] Generate /podcasts/archive/10    (3723ms).   =>    12,278 lines
[12] Generate /podcasts/law-of-one    (6078ms).   =>    21,620 lines
[13] Generate /podcasts/law-of-one/1    (14217ms).   =>    43,934 lines
[14] Generate /podcasts/law-of-one/2    (60728ms).   =>    82,384 lines
jasonvarga commented 2 years ago

Good find! We can figure something out.

Maybe we can reset the paths to the original state in the middleware after the request ends.

tao commented 2 years ago

It seems like this middleware is the largest issue but it looks like there's something else causing a slow down recently too. It seems like it takes longer after each new page.

The first few pages take less than 1s to generate... Around page 1000 it takes about 3s to generate each one. Around page 3000 it takes about 7s And by page 5000 it takes about 20s

So I'm still struggling to generate my ~6000 page site. Besides this exponential time increase with the AddViewPaths middleware is there anything else you can think of that would be slowing the site down from recent updates?

globalexport commented 2 years ago

Hi @tao !

Did you try to reproduce performance problems without the multi-site feature?

I recently locally upgraded two single-site instances (one with approx. 40 pages, one with 267 pages). While the former is working nicely after the updates, the latter shows a similar regression in performance coming to a full stop after approx. half of the pages (tried 1 and 4 workers, does not seem to be related to the concurrency mode).

The following is a page per minute (ppm) summary during the process:

This was not happening before the updates.

php please support:details (output before update):

Statamic 3.3.26 Pro
Laravel 8.83.25
PHP 8.0.23
doublethreedigital/duplicator 2.3.1
optimoapps/statamic-bard-text-align 1.0.2
spatie/statamic-responsive-images 2.13.0
statamic/ssg 1.1.0
withcandour/aardvark-seo 2.0.28

php please support:details (output after updates):

Environment
Application Name: Statamic
Laravel Version: 9.37.0
PHP Version: 8.0.23
Composer Version: 2.3.10
Environment: local
Debug Mode: ENABLED
URL: my.local.host/
Maintenance Mode: OFF

Cache
Config: NOT CACHED
Events: NOT CACHED
Routes: NOT CACHED
Views: NOT CACHED

Drivers
Broadcasting: log
Cache: statamic
Database: mysql
Logs: stack / single
Mail: smtp
Queue: sync
Session: file

Statamic
Addons: 8
Antlers: regex
Version: 3.3.49 PRO

Statamic Addons
doublethreedigital/duplicator: 2.3.2
optimoapps/statamic-bard-text-align: 1.0.2
spatie/statamic-responsive-images: 2.14.2
withcandour/aardvark-seo: 2.0.30
statamic/ssg: 1.2.0

Edit: The missing addons are my own (not public).

globalexport commented 2 years ago

@jasonvarga I spent the last two days debugging this. It's related to the Statamic update from 3.3.31 to 3.3.32 and the pull request #6539.

I can confirm this with my instance.

jasonvarga commented 2 years ago

@globalexport If you use Statamic 3.3.31, it's back to normal?

globalexport commented 2 years ago

@jasonvarga Yes, I just downgraded to 3.3.31 and the process completed. I tested 3.3.32 afterwards and it got stuck again.

jasonvarga commented 2 years ago

Would you be able to give me a copy of the site? Either zip it and send it to support@statamic.com or grant access to the github repo?

globalexport commented 2 years ago

This is not possible for me, as the sites are customer projects. I am sorry.

jasonvarga commented 2 years ago

Can you show your custom routes and ssg.php config file, at least?

globalexport commented 2 years ago

Hi @jasonvarga!

Did you mean this?

ssg.php:

<?php

return [

  /*
    |--------------------------------------------------------------------------
    | Base URL
    |--------------------------------------------------------------------------
    |
    | This informs the generator where the static site will eventually be hosted.
    | For instance, if you are relying on absolute URLs in your app, this one
    | will be used. It should be an absolute URL, eg. "http://my-app.com"
    |
    */

  'base_url' => config('app.url'),

  /*
    |--------------------------------------------------------------------------
    | Destination Directory
    |--------------------------------------------------------------------------
    |
    | This option defines where the static files will be saved.
    |
    */

  'destination' => storage_path('app/static'),

  /*
    |--------------------------------------------------------------------------
    | Files and Symlinks
    |--------------------------------------------------------------------------
    |
    | You are free to define a set of directories to be copied along with the
    | generated HTML files. For example, you may want to link your CSS,
    | JavaScript, static images, and perhaps any uploaded assets.
    | You may choose to symlink rather than copy.
    |
    */

  'copy' => [
    public_path('css') => 'css',
    public_path('js') => 'js',
    public_path('img') => 'img',
    public_path('fonts') => 'fonts',
    public_path('assets') => 'assets'
  ],

  'symlinks' => [
    // public_path('css') => 'css',
    // public_path('js') => 'js',
  ],

  /*
    |--------------------------------------------------------------------------
    | Additional URLs
    |--------------------------------------------------------------------------
    |
    | Here you may define a list of additional URLs to be generated,
    | such as manually created routes.
    |
    */

  'urls' => [],

  /*
    |--------------------------------------------------------------------------
    | Exclude URLs
    |--------------------------------------------------------------------------
    |
    | Here you may define a list of URLs that should not be generated.
    |
    */

  'exclude' => [
    //
  ],

  /*
    |--------------------------------------------------------------------------
    | Glide
    |--------------------------------------------------------------------------
    |
    | Glide images are dynamically resized server-side when requesting a URL.
    | On a static site, you would just be serving HTML files without PHP.
    | Glide images will be pre-generated into the given directory.
    |
    */

  'glide' => [
    'override' => false,
    'directory' => 'img',
  ],

  'failures' => false, // 'errors' or 'warnings'

];

Routes:

web.php:

<?php
Route::statamic('blog', 'with-side-bar', ['load' => '3fab1586-951b-40e6-904d-e8b3ede6b6e4']);
Route::statamic('blog/page/{page}', 'with-side-bar', ['load' => '3fab1586-951b-40e6-904d-e8b3ede6b6e4']);
Route::statamic('press', 'without-teaser-with-sidebar', ['load' => '8bdfe06a-1a06-46e1-94d2-591c8f5faece']);
Route::statamic('press/page/{page}', 'without-teaser-with-sidebar', ['load' => '8bdfe06a-1a06-46e1-94d2-591c8f5faece']);
Route::statamic('author/{author_slug}', 'author-info');
Route::statamic('author/{author_slug}/page/{page}', 'author-info');
tao commented 2 years ago

@jasonvarga Here's a demo based on the Cool Runnings starter-kit without multiple sites and ~5000 articles. https://github.com/tao/ssg-slow-mo

Just do a composer install then replace the SSG Generator class with the one attached, so you can see the time each page takes.

Screenshot 2022-11-01 at 10 15 35

Generator.php.zip

tao commented 2 years ago

What's the best way to try approach fixing this? I need to get my website updating again... Is this a pull request needed on the cms or ssg code to fix things? Is there a strategy that might work that I could try?

tao commented 2 years ago

It looks like this was an issue from long ago but this method to reset the view paths isn't working correctly anymore? https://github.com/statamic/ssg/commit/34255f0376885f2c88d539f84fe89f6f661fe71d

tao commented 2 years ago

It looks like it's similar but we need to reset the hints / namespace after each generated page...

globalexport commented 2 years ago

@jasonvarga Thank you very much! It is working and I am back to 40 ppm. :)