dai-shi / waku

⛩️ The minimal React framework
https://waku.gg
MIT License
3.91k stars 102 forks source link

SSG: high volume performance #668

Closed pmelab closed 2 weeks ago

pmelab commented 3 weeks ago

The problem

HTML payloads of static pages grow linearly with the number of them. This is already evident in the 03_demo example. Each pokemon has an entry in __WAKU_ROUTER_PREFETCH__ that points to the identical component. The /SHOULD_SKIP entry also grows. With a larger number of pages, the build breaks due to memory issues or node simply not being able to JSON.stringify the build config in: https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/lib/builder/build.ts#L692

Also, when generating the map for __WAKU_ROUTER_PREFETCH__, pages are rendered non-concurrently, which means blocking operations can't run in parallel.

Solutions

  1. I switched router prefetching to a pattern instead of path-based model. That way each static page needs only one entry. Not one for each static path.
  2. The shouldSkipObj grows with each page entry. To be completely honest, I don't know enough about the rendering process to judge if each page would need skip lists for each other page component. But that growth during rendering felt weird, so i just moved the object one scope level down to reset it for each page. It's a complete shot in the dark, but it did hit something. https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/router/define-router.ts#L96
  3. Collection of modules for __WAKU_ROUTER_PREFETCH__ has been changed to run in parallel. This is also necessary to enable in-component optimizations, like database request batching or connection pooling.

Notes

vercel[bot] commented 3 weeks ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
waku ✅ Ready (Inspect) Visit Preview May 2, 2024 11:43am
codesandbox-ci[bot] commented 3 weeks ago

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

dai-shi commented 3 weeks ago

The problem

Nice catch. I didn't know the exact problem, but I felt like the approach had some limits. I haven't looked into the code, I will later. You know how CIs are failing? Having a test is great, because eventually we would like to revisit shouldSkip mechanism.

pmelab commented 3 weeks ago

You know how CIs are failing?

Playwright browsers fail to install there:

E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy InRelease' is no longer signed.

Having a test is great, because eventually we would like to revisit shouldSkip mechanism.

Absolutely. The change was more based on gut feeling rather than confident understanding.

I traced down the use of /SHOULD_SKIP into getSkipList ...

https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/router/client.ts#L213-L242

... which filters components of the current route path during fetching and pre-fetching of RSC payloads ...

https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/router/client.ts#L281-L297

https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/router/client.ts#L325-L332

I did not see how components of other routes could have an effect here, but I might be missing something. If you can help me to understand what it is actually supposed to do, I would be able to implement a test.

dai-shi commented 3 weeks ago

If you can help me to understand what it is actually supposed to do, I would be able to implement a test.

The current design is the client side doesn't (shouldn't) know about the server-side route configs. What the client knows is if it should skip certain route (or maybe tell what it has in its cache to the server, and the server decides what to respond).

But, e2e tests shouldn't depend on implementation details and what should be tested is route: 'static' and route: 'dynamic' behavior.

I really wish to refactor the internal mechanism. Hopefully some time soonish.

pmelab commented 3 weeks ago

But, e2e tests shouldn't depend on implementation details and what should be tested is route: 'static' and route: 'dynamic' behavior.

You mean render: 'static' and render: 'dynamic'? This is covered by a lot of integration tests already, no?

dai-shi commented 3 weeks ago

You mean render: 'static' and render: 'dynamic'?

You are right. My bad.

This is covered by a lot of integration tests already, no?

I don't remember if we have such tests that they ensure static pages are really "static". I don't think so.

pmelab commented 3 weeks ago

Ok! Now i get it. I will think of something.

pmelab commented 3 weeks ago

I found another issue: While generating the build config, all routes are rendered for collecting modules for prefetching: https://github.com/dai-shi/waku/blob/6550a0d26e7830527c767fcda8eb906aeb0f1266/packages/waku/src/router/define-router.ts#L155-L163

This runs sequentially and currently blocks parallel rendering of pages. Thats probably not the best solution, since rendering all static instances of the same route to collect modules feels wrong, but fixing that would be a bigger endeavour. Parallelizing at least makes things work for now. Data requests should implement good caching though, since they run 3 times per page.

pmelab commented 3 weeks ago

@dai-shi : Is that what you had in mind? https://github.com/dai-shi/waku/pull/668/commits/a4d88782b02712f42ca1901d1da65f27fc4478a2

dai-shi commented 3 weeks ago

@dai-shi : Is that what you had in mind? a4d8878

Thanks. Hm, it looks good for the start, but what I meant with really "static" is that the static page has something like Date.now(), and if you reload the page, the date won't change.

To be honest, it's not fully implemented:

dai-shi commented 3 weeks ago

Re: https://github.com/dai-shi/waku/pull/668#issuecomment-2077905809

See also #543. We would like to change unstable_collectClientModue so that it statically analyzes the code, instead of executing it. https://github.com/dai-shi/waku/pull/573#discussion_r1514091960

pmelab commented 3 weeks ago

See also https://github.com/dai-shi/waku/issues/543. We would like to change unstable_collectClientModue so that it statically analyzes the code, instead of executing it. https://github.com/dai-shi/waku/pull/573#discussion_r1514091960

That would have been my next question. Thanks!

Thanks. Hm, it looks good for the start, but what I meant with really "static" is that the static page has something like Date.now(), and if you reload the page, the date won't change.

So we want to test when components are executed. I changed the testcase to cover that (I hope 🤞 ).

Unfortunately I have no clue where the test fails come from. It only uses tsconfig.e2e.json for typechecks?

dai-shi commented 3 weeks ago

It only uses tsconfig.e2e.json for typechecks?

You need to add new tests in "references".

pmelab commented 3 weeks ago

Ok, apparently using createPages in a fixture also leads to type errors. I switched it to file based routing instead.

dai-shi commented 3 weeks ago

Ok, apparently using createPages in a fixture also leads to type errors.

That sounds somewhat unexpected. Can you elaborate? (Sorry, I haven't had time to look into it yet.)

pmelab commented 3 weeks ago

After adding the test package tsconfig.json files to references the type errors reduced, but some remained.

https://github.com/dai-shi/waku/actions/runs/8847605068/job/24295760378

Declaration files for react-server-dom-webpack could not be found. I noticed that no other fixtures use createPages and I wanted to know if there is a connection. The history is in my last 3 commits.

pmelab commented 2 weeks ago

I feel a little dumb now 😬 Thank you for fixing it!

dai-shi commented 2 weeks ago

I feel a little dumb now 😬 Thank you for fixing it!

tbh, I don't know how it fixes it the problem. import { createPages } from 'waku' should work in the app code. There must be something different in e2e/fixtures setup. (cc @himself65 who should be more familiar with this.)

dai-shi commented 2 weeks ago

Please check ci error.

By the way, would you be able to join my discord server? We have #waku channel there and we can communicate about the development. https://discord.gg/MrQdmzd

pmelab commented 2 weeks ago

tbh, I don't know how it fixes it the problem. import { createPages } from 'waku' should work in the app code. There must be something different in e2e/fixtures setup. (cc @himself65 who should be more familiar with this.)

I think the issues is related to this: https://github.com/dai-shi/waku/blob/c0eb2c9260a931d20d4087948ffa3ef9cd70d87e/tsconfig.json#L21-L22

The waku import only includes main.ts and the missing types are in types.d.ts. waku/* contains everything. In areal application this is not an issue because waku itself is not typechecked in this case.

dai-shi commented 2 weeks ago

I think the issues is related to this:

https://github.com/dai-shi/waku/blob/c0eb2c9260a931d20d4087948ffa3ef9cd70d87e/tsconfig.json#L21-L22

The waku import only includes main.ts and the missing types are in types.d.ts. waku/* contains everything. In areal application this is not an issue because waku itself is not typechecked in this case.

Hm, isn't that types only? But, yeah something like that causes not resolving to main.react-server.ts.

Please feel free to tackle it separately (if anyone wants). I don't think it's super important for tests. But, we actually missed this bug as we didn't have tests, so it's nice to have.

pmelab commented 2 weeks ago

Hm, isn't that types only? But, yeah something like that causes not resolving to main.react-server.ts.

The problem is just types. I appears as soon as a fixture imports something from waku, and not waku/[whatever]. I will open another PR, along with another typechecking issue that I've seen.