iFixit / react-commerce

A work in progress prototype for iFixit e-commerce functionalities.
https://react-commerce.vercel.app
2 stars 0 forks source link

Adapt product template to app router #2221

Closed dhmacs closed 7 months ago

dhmacs commented 7 months ago

closes https://github.com/iFixit/ifixit/issues/51165

This PR ships the app router product page, effectively replacing the "pages" product page.

deploy_block 🔢 on verifying that CDN-Cache-Control header is forwarded by Cloudfront.

QA

Perform a smoke test of the product page to verify that it behaves the same as production.

vercel[bot] commented 7 months ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
react-commerce ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 29, 2024 3:40pm
react-commerce-prod ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 29, 2024 3:40pm
ghost commented 7 months ago
👇 Click on the image for a new way to code review #### [![Review these changes using an interactive CodeSee Map](https://s3.us-east-2.amazonaws.com/maps.codesee.io/images/github/iFixit/react-commerce/2221/2653bcb6/1f9c69de4a01190f0b0fb6461274cfa76a1cb7bb.svg)](https://app.codesee.io/r/reviews?pr=2221&src=https%3A%2F%2Fgithub.com%2FiFixit%2Freact-commerce) #### Legend CodeSee Map legend
github-actions[bot] commented 7 months ago

📦 Next.js Bundle Analysis for @ifixit/commerce-frontend

This analysis was generated by the Next.js Bundle Analysis action. 🤖

🎉 Global Bundle Size Decreased

Page Size (compressed)
global 336.36 KB (🟢 -6.25 KB)
Details

The global bundle is the javascript bundle that loads alongside every page. It is in its own category because its impact is much higher - an increase to its size means that every page on your website loads slower, and a decrease means every page loads faster.

Any third party scripts you have added directly to your app using the <script> tag are not accounted for in this analysis

If you want further insight into what is behind the changes, give @next/bundle-analyzer a try!

Six Pages Changed Size

The following pages changed size from the code in this PR compared to its base branch:

Page Size (compressed) First Load
/Parts 95.16 KB (🟡 +535 B) 431.52 KB
/Parts/[...deviceHandleItemType] 95.18 KB (🟡 +537 B) 431.54 KB
/Shop/[handle] 95.16 KB (🟡 +536 B) 431.52 KB
/Tools 95.16 KB (🟡 +536 B) 431.52 KB
/Tools/[handle] 95.16 KB (🟡 +535 B) 431.52 KB
/Troubleshooting/[device]/[problem]/[wikiid] 32.29 KB (🟡 +1.13 KB) 368.65 KB
Details

Only the gzipped size is provided here based on an expert tip.

First Load is the size of the global bundle plus the bundle for the individual page. If a user were to show up to your website and land on a given page, the first load size represents the amount of javascript that user would need to download. If next/link is used, subsequent page loads would only need to download that page's bundle (the number in the "Size" column), since the global bundle has already been downloaded.

Any third party scripts you have added directly to your app using the <script> tag are not accounted for in this analysis

Next to the size is how much the size has increased or decreased compared with the base branch of this PR. If this percentage has increased by 20% or more, there will be a red status indicator applied, indicating that special attention should be given to this.

masonmcelvain commented 7 months ago

Cross posting https://github.com/iFixit/react-commerce/pull/2208#issuecomment-1910816174

erinemay commented 7 months ago

Viewing and using the following on the product page looked and behaved the same as live. Checked on desktop and mobile.

I checked various products with different sets of features and I couldn't tell the difference between the prod preview and ifixit.com. I tested functionality that doesn't work in the prod preview (like adding to cart and checkout) in the non-prod preview. Everything looked and behaved the same as production with no new errors. qa 🍰

sterlinghirsh commented 7 months ago

CR :zap: though the deploy block remains until we make sure the headers are set up properly. Also I don't want to merge this at the end of the day just in case.

masonmcelvain commented 7 months ago

deploy_block 👍🏻 Pulldasher does not pick up deploy blocks in the PR comment apparently.

danielbeardsley commented 7 months ago

I'm gonna spend some time evaluating how this acts with regards to caching / headers.

danielbeardsley commented 7 months ago

Research and eval of Cache Headers

verifying that CDN-Cache-Control header is forwarded by Cloudfront.

I'm not sure if it's forwarded (I doubt it), but it sure isn't payed attention to. I've scoured Google and have found no evidence that Cloudfront will listen to the Cdn-Cache-Control header, only the normal Cache-Control header.

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html#expiration-individual-objects

Cdn-Cache-Control sure seems like a Cloudflare thing: https://developers.cloudflare.com/cache/concepts/cdn-cache-control/ and it seems they will give it higher priority:

[an origin can] Return the CDN-Cache-Control response header which Cloudflare evaluates to make caching decisions. Cache-Control, if also returned by the origin, is proxied as is and does not affect caching decisions made by Cloudflare. Additionally, CDN-Cache-Control is proxied downstream in case there are other CDNs between Cloudflare and the browser.

That said, we don't expect cloudfront to cache next.js pages right now, we're just using it as a proxy.

Here's a comparison of this branch vs prod: image

Given that Cloudfront ignores cdn-cache-control I presume this header change won't have any effect on the CDN caching performance. It does however seem to send cache-control: private now meaning browsers won't cache the html page at all.

Response Shape

This pull means that we are now using streaming. The last ~20% of the response bytes are pieces that look like: <script>self.__next_f.push([1,"3533,\"chunks\":[\"272:st...

The page looks identical when rendered without javascript, so that's good. The post-gzip size is almost the same, 77kb vs 74kb, though when decompressed, it's 10% bigger (521kb vs 474kb). I'm not exactly sure what to make of that.

Server Response performance

Updated After about 200 requests from around the globe, this is our performance (with caching disabled). App Router: image

Pages Router: image

jarstelfox commented 7 months ago

this is our performance (with caching disabled)

My takeaway is the App router is moderately slower. This was my gut when thinking about these changes. I do not think the degradation is enough for us to be too concerned and I would be willing to ship it.

Does any one else have some thoughts? @danielbeardsley?

danielbeardsley commented 7 months ago

My takeaway is the App router is moderately slower.

I'm not 100% sure why, but it appears about 50-100ms slower over all (after updating the graphs with ~200 requests). This could be because of the size difference (I'm not sure if Alerta asks for the gzip encoded version).

danielbeardsley commented 7 months ago

Interesting that the app router seems to have broken Vercel's logging such that on this branch vercel interprets each line of output as a separate request: image

It ends up this way when passed to Datadog as well, making the datadog integration rather useless. I vote for not merging for the moment on account of the broken logs and the small performance regression. At the moment, I don't see any Pros, yet there are a few Cons.

jarstelfox commented 7 months ago

At the moment, I don't see any Pros

One pro: App router is maintained while Pages router is unclear

I suspect Pages router will be deprecated at some point (who knows when)

danielbeardsley commented 7 months ago

I've reported the bug to vercel, we'll see what we get: https://github.com/vercel/next.js/issues/61240

dhmacs commented 7 months ago

After about 200 requests from around the globe, this is our performance

@danielbeardsley what tool are you using to measure this?

(with caching disabled)

CDN cache or all caches?

danielbeardsley commented 7 months ago

what tool are you using to measure this?

alertra.com. We use this to monitor our up-time and conveniently they hit your configured urls from ~20 different (non amazon) places around the globe.

CDN cache or all caches?

All caches.

Now, looking at the last 24 hours (1 req / minute = 1440 requests) the server-side performance numbers are nearly identical between the two routers (only looking at delivering the total response bytes, not the browser performance).

danielbeardsley commented 7 months ago

Notably, when looking at the bundle javascript (ignoring piwik and other external JS) delivered to a product page, the total bytes sent by the app router are a fair amount lower.

App Router: 1,486kb (475kb gzipped): :+1: Pages Router: 1,720kb (530kb gzipped)

danielbeardsley commented 7 months ago

It seems the new app router always returns cache-control: private, no-store, no-cache, max-age=0, must-revalidate to browsers.

Is this expected? It doesn't seem desired. Hard to say how much this will affect clients, but certainly navigating back to a product page will be slower with this cache-control value set.

I thought for a moment it was caused by the fact that one is a preview deployment, but I've found no evidence anywhere that vercel does this and other preview urls still respond with public, max-age=300:

image

Now I'm more confused because it seems that vercel is saying the behavior we are seeing is impossible:

For server responses to be successfully cached with Vercel's Edge Network, the following criteria must be met: ... Response does not contain the private, no-cache or no-store directives in the Cache-Control header.

The app router definitely responds with those values in the cache-control header and yet we've seen x-vercel-cache: HIT in the response too.

sterlinghirsh commented 7 months ago

This could be because we're using a "dynamic function" by accessing headers / get params. I think that may force the private no-cache behavior we're seeing. @dhmacs had previously looked into whether we could avoid making these routes dynamic so they could be more cacheable, but it would come with some tradeoffs. It's also possible that we could use a middleware to convert the relevant header and get params into path segments to prevent the lookups from making those pages dynamic.

dhmacs commented 7 months ago

It seems the new app router always returns cache-control: private, no-store, no-cache, max-age=0, must-revalidate to browsers.

Is this expected?

Yes, in app router it is by design. As @sterlinghirsh commented, whenever we use dynamic functions on a page, the page becomes dynamic, which means that Next.js wants to serve the page fresh each time. The new api takes over the cache control header, it's set automatically based upon whether the page is static or dynamic.

The app router definitely responds with those values in the cache-control header and yet we've seen x-vercel-cache: HIT in the response too.

The reason for this behaviour is what I've introduced in https://github.com/iFixit/react-commerce/pull/2216, namely I found a way to overcome this and still use the CDN by using the CDN-Cache-Control header

danielbeardsley commented 7 months ago

Dynamic functions like cookies and headers, and the searchParams prop in Pages depend on runtime incoming request information. Using them will opt a route out of the Full Route Cache, in other words, the route will be dynamically rendered.

Does this mean that by setting the Cdn-Cache-Control header we are somehow breaking a convention or functionality? If next.js is saying "you can't cache this page" and yet we are circumventing that rule by sending a different header (caching it on the CND but not the browser), are we breaking some important assumption?

depend on runtime incoming request information

This seems like a ridiculous statement to me, "incoming request information" is everything about the request. The url is part of "incoming request information" too, so it seems an incorrect way to describe what excludes a request from caching. Also, the query params are part of the url as well.

Turns out when you try to re-invent the wheel, people will want your new wheel to be as capable as the old wheel and will be frustrated when it underperforms.

sterlinghirsh commented 7 months ago

@dhmacs didn't say "incoming request information", he said "runtime incoming request information" and he's correct so don't tell him it's a ridiculous statement. I agree it's a weird design decision but it seems like they're trying to make it so you can cache by path only rather than letting you pick a set of headers or get params that would be part of the cache. So on the one hand, it's annoying that you can't check the headers or get params without opting out of caching. On the other hand, it eliminates a class of caching bug from making your page depend on values that are outside the cache key. As I mentioned, if we want to cache based on a subset of get params or headers, we should be able to use a middleware to convert them to path elements, which would mean the route is no longer "dynamic".

dhmacs commented 7 months ago

Does this mean that by setting the Cdn-Cache-Control header we are somehow breaking a convention or functionality?

Not really, we're just replicating the same behaviour that we had before. The point is that the app router wants to rely heavily on server caches to take advantage of RSC to mix up static and dynamic parts of the same page. I think their goal is something they call "Partial Prerendering". Unfortunately given that we're relying on headers() and searchParams to support dev features, using CDN-Cache-Control seems to be the only way.

Turns out when you try to re-invent the wheel, people will want your new wheel to be as capable as the old wheel and will be frustrated when it underperforms.

I'm with you, I think we don't need all these clever (not so clever) caching strategies. We stand to gain from adopting RSC mainly because of the massive reduction of client JS that we can achieve once we unlock the full potential of RSC. As we've seen in the product list demo that we've made last september, we can dramatically reduce time to interactive, even for slower devices.

By the way, I suspect we won't get answers any time soon on your bug report https://github.com/vercel/next.js/issues/61240 so we need to think how we want to move forward. I think it's desirable to finish this work now and migrate to the app router as quickly as possible in order to minimize the migration cost. Soon we probably want to add features such as translations, and having pages still around will mean that we'll have to implement a system that support both pages and the app router. The alternative is to keep implementing on pages, which doesn't seem like a wise choice long term. With that in mind, do you think there's no way to make logs usable as they are right now? I've tried to search for this issue, but I haven't find anything so far

danielbeardsley commented 7 months ago

he's correct so don't tell him it's a ridiculous statement

Sorry to be unclear!, I was quoting the docs, not dhmacs, I should have specified the source. I said it's ridiculous because they are inventing a new concept (caching based on just part of the url) and aren't explaining that anywhere other than co-opting and overloading other generic terms "static" and "dynamic".

HTTP caches have been around for decades and have generally always treated the whole url as the cache key; sometimes with the option of adding more parts of the request to the cache key.

With that in mind, do you think there's no way to make logs usable as they are right now? I've tried to search for this issue, but I haven't find anything so far

Not as they are currently on this branch, it's broken enough to make them unusable. Perhaps there's something about the app router that means we need to be using some other logging mechanism? Maybe we're supposed to be using some next.js built-in logging library instead of console.log? Might need some experimentation.

danielbeardsley commented 7 months ago

Regarding app router vs pages router support, the app router only became stable in May 2023. It seems likely that there will be more bugs and frustrations in using something relatively new than there will be with continuing to use something that's been around for several years.

i.e. The argument of "We need to migrate because the pages router might eventually stop being maintained" seems spurious. Yes, it may eventually go un-maintained, but I think it'll be a while before the pain of using the pages router is greater than using something new that's still quite in flux / has rough edges. I'm not trying to say we shouldn't ditch the pages router, but that it's potential lack of future support isn't a valid reason to.

masonmcelvain commented 7 months ago

Not as they are currently on this branch, it's broken enough to make them unusable.

To be clear, it's only console.log()'s (and other console methods) that are detached from the main request, one entry per log, making them useless in DataDog. I see that the request for the page (and all other network requests) look fine in DataDog.

Presently we're using console.log() to log timings and cache hits on this page, AFAIK we don't use it for anything else. I haven't ever used DataDog to look for the result of console.warn or console.error since we have Sentry.

2 thoughts:

dhmacs commented 7 months ago

i.e. The argument of "We need to migrate because the pages router might eventually stop being maintained" seems spurious. Yes, it may eventually go un-maintained, but I think it'll be a while before the pain of using the pages router is greater than using something new that's still quite in flux / has rough edges. I'm not trying to say we shouldn't ditch the pages router, but that it's potential lack of future support isn't a valid reason to.

I understand your concerns, and I wish the migration was easier/smoother than it turned out to be. Most of the rough edges / bugs are in the migration path. For example when we tried to upgrade to Next.js 14 we found out that pages 404 breaks. When we developed the product list demo from scratch with app router I did not find any particular issue other than the inflexible caching approach (which is by design, not a bug), and the gains in terms of DX and performance were obvious.

As @masonmcelvain commented, the additional cost of waiting is high, especially after we introduce important features such as localization, which is handled differently between pages and the app router.

sterlinghirsh commented 7 months ago

I'm not inclined to wait on the overall migration long-term, but if we can find a good way to keep the logging working today then I'm interested in implementing it. Maybe there's a way to have all the logs go onto a single line or something. If we really can't get anywhere with logging then maybe we can live with it being weird at least for a bit. But we've definitely used the logs to diagnose bugs and performance issues in the past, and especially with the app router's many unknowns and surprises, it feels like a bad time to destroy our logging ability.

danielbeardsley commented 7 months ago

Agreed, let's not let the logs thing prevent our progress. I'd probably vote to disable logging for the time being:

https://vercel.com/ifixit/~/support/cases/00193756

sterlinghirsh commented 7 months ago

I'd probably keep the logs on since getting info from multiple lines is still probably better than not having it at all.

sterlinghirsh commented 7 months ago

un_deploy_block :zap: