remix-run / remix

Build Better Websites. Create modern, resilient user experiences with web fundamentals.
https://remix.run
MIT License
29.29k stars 2.46k forks source link

v2 - HMR does not work, probably race condition (error loading dynamically imported module) #7466

Closed akomm closed 2 weeks ago

akomm commented 1 year ago

What version of Remix are you using?

2.0.0 update: 2.0.1 tested, still present

Are all your remix dependencies & dev-dependencies using the same version?

Steps to Reproduce

Just install this template as in README.md:

https://github.com/remix-run/remix/tree/main/templates/remix

Try change root.tsx in dev mode

Expected Behavior

content is reloaded

Actual Behavior

Uncaught (in promise) TypeError: error loading dynamically imported module: http://localhost:3000/build/root-JFW4ZMF3.js?t=1695047812275

The file hash and timestamp obviously vary.

Tested with: Firefox 106 Chromium 116

akomm commented 1 year ago

note: hashes obviously vary as file changes.

Running dynamic import("http://localhost:3000/build/root-JFW4ZMF3.js?t=1695047812275") in browser console did result in the same error.

I've verified that /public/build/root-JFW4ZMF3.js is actually present during the above import attempt in the console.

After a hard-reload the page, the above dynamic import with exactly the same URL in the console starts to work.

Removing this line fixes the issue and HMR starts to work.

I also tried to delay it via setTimeout, it fixed the issue but something is weird, because the (async) broadcast call seem to never resolve or reject, so its probably same effect as removing the line. It feels like there is some race condition going on.

Though its not because import is attempted on the file that was deleted (previous hash), because it does exist as requested. The file name match with content in public/build, but it does fail to dynamically load it (fetch works) until hard-reload.

When I use custom server (express) as in the official docs, the HRM works too.

pcattori commented 1 year ago

Cannot reproduce locally on MacOS 13.4.1, Node 18, Chrome 117.

@akomm does this always happen when editing root.tsx? What specific edit are you making? Ideally, a reproduction on StackBlitz or CodeSandbox would be great, but any other details you can provide would also be useful.

lucasferreira commented 1 year ago

Same case happened here after upgrade to v2, but not happens every time, I got this for a couple of hours than restart my Macbook and the import error disappeared until now.

akomm commented 1 year ago

@pcattori It happens always.

I've tried this in root, but also sub routes. Simply changing text is enough. Also tried simply changing text in root.tsx without Outlet - all tests, same result.

Though Uncaught (in promise) TypeError: Failed to fetch dynamically imported module seem to be paired with a ERR_CONNECTION_REFUSED in network on both chromium and firefox.

I've disabled ublock on localhost and I've temporarily disabled any "privacy" and other browser policies with no change in the above error. I've tried also without CSP and with the most lax setting, no change.

Consider that using custom server with express or removing the mentioned line from remix-serve fix the issue and leave HMR in a functioning state. That leaves me assuming remix-serve / remix-dev does something that causes the browser to block the dynamic module import. After hard reload the dynamic module import on same URL is not blocked. It reminds me a bit the activation logic for auto-play and similar.

Running on both default and custom server on same ports - server 3000 and dev 3001. Nothing else blocks port. Just verified, even though it should still work with some fallback random alternative port or not start the server at all.

btw. ERR_CONNECTION_REFUSED is actually from the remote side, confused it with BLOCKED or similar...

brophdawg11 commented 1 year ago

@pcattori I think I've seen something similar to this on a project of mine. Let me know if you want to try and reproduce together to debug.

pcattori commented 12 months ago

@brophdawg11 and I took a look at this. Seems like the issue happens when the browser is trying to process HMR updates for build version X, but the app server is already on a newer version Y of the app.

  1. Build/rebuild happens, producing build version X
  2. App server picks up server changes for X
  3. HMR message for X is sent to the browser
  4. Before the browser processes those HMR messages, a new rebuild happens for version Y (wiping out assets for version X)
  5. Now the browser decides to ask for updated modules for X

Normally, if (5) happens before (4), everything works.

Unfortunately, there's no way to guarantee that the browser won't ask for outdated versions. While we can catch the TypeError and handle it gracefully, you'd still get some GET 4xx errors for the old assets in the console logs.

One option would be to keep around old assets for a bit during dev, but this is trickier than it seems. Still thinking about the best way to solve it.

That said, this error doesn't break HMR so a refresh or the subsequent HMR for version Y would put the app back into working order. So at a minimum we can catch the TypeError and show a warning along the lines of "HMR update failed: references outdated assets. Refresh the page or make another change to restore HMR updates".

akomm commented 12 months ago

@pcattori I'm not sure if that is the problem. Do the following points not contradict the above scenario?

update 4. done before 5. would then depend on how fast the machine and how large the codebase is. So in certain scenarios you would never see the problem - like slower PC or large codebase? And if you are somewhere in between, then you might sometimes see the error, but not always. That mirrors the experience from both of you and me... This maybe hint to reproduction?

akomm commented 12 months ago

I've tested it on another machine yesterday (windows 10 WSL) and it worked there first. Then I removed --manual from the template and it stopped working. Also cleared .cache build and public/build before doing so, just in case.

What's interesting: https://remix.run/docs/en/main/guides/manual-mode

You need to broadcast yourself. There is the broadcast line I've commented out and which fixed the HMR as well. So I guess adding --manual has indirectly same effect. I just wonder what is even the point? I see the ws messages being sent to client and client fetching the file. What is the point of the broadcast? It does not seem to affect anything but in my case somehow breaking the hmr.

Is it evtl. a redundant thing after switching to the experimental_ dev server under the hood?

Judging by this: https://youtu.be/zTrjaUt9hLo?feature=shared&t=411 Here we see that the browser only get notified by dev server when the app server notifies it is ready. But with the broad cast line removed as explained above, the dev server still notifies the browser of an update, because I do get HMR in the browser. That means, the broadcast there is probably wrong / redundant?

Either the broadcast already happen somewhere else or the dev server notifies regardles of the ready state of the app server.

samadadi commented 12 months ago

I have the same issue. When I start dev server, it is ok but when i change the component file every time, I am getting this error in browser console. https://github.com/samadadi/remix-v2-hmr Screenshot from 2023-09-21 21-58-43

akomm commented 12 months ago

As per documentation it does not make any sense, but adding --manual to the dev command fixes the issue for me. Probably, as explained above, because the broadcast that somehow causes the issue is not sent, or its not sent twice if that is the problem. HMR still works though.

heivo commented 12 months ago

I have the same issue but only after upgrading to v2.0.1.

v2.0.0 works for me.

samadadi commented 12 months ago

Still have the same issue event after upgrading to v2.0.1

akomm commented 12 months ago

Do you experience the same "fix" with dev --manual?

samadadi commented 12 months ago

Do you experience the same "fix" with dev --manual?

Yes I do but without using @zag-js componen library. But when I use a component from @zag-js, It gets worse. I uploaded a simple project to github. You can test it. https://github.com/samadadi/remix-v2-hmr

JasonColeyNZ commented 11 months ago

I too have the same issue, and the dev --manual has fixed it for me. Most changes I would make would throw the error, --manual works for me.

gitaugakwa commented 11 months ago

I also had the same problem switched my dev command to remix dev --manual and now it works. I'll watch the issue for when it's fixed.

davehowson commented 11 months ago

Same here. Happened after upgrading to 2.0.1 and remix dev --manual makes it work.

greg-hoarau commented 11 months ago

+1 after upgrading to 2.1.0.

I've reproduced the "bug" on 3 different computers (ubuntu, windows, macOs,...) with both our development project and a standard 'Remix' template.

Works with remix dev --manual.

heivo commented 11 months ago

Upgrading to 2.1.0 fixed the issue for me, only happened with 2.0.1

nmackey commented 11 months ago

Just upgraded from 1.19.3 to 2.1.0 and ran into this issue.

using --manual also fixes it

It does seem like this is caused by a race condition. Without using the --manual HMR would sometimes work.

When it did work the order of console messages in my terminal was

app:dev: [info] rebuilding... (~ app/routes/someroute.tsx)
app:dev: [info] rebuilt (946ms)
app:dev: [remix-serve] http://localhost:3000 (http://192.168.0.141:3000)
app:dev: [info] app server ready (817ms)
app:dev: [info] hmr

When it didn't work the order of the console messages in my terminal was

app:dev: [info] rebuilding... (~ app/routes/someroute.tsx)
app:dev: [info] rebuilt (993ms)
app:dev: [info] app server ready (127ms)
app:dev: [info] hmr
app:dev: 
app:dev: [remix-serve] http://localhost:3000 (http://192.168.0.141:3000)

notice the order of [remix-serve] and app server ready/hmr messages

Seems like the notification is going to browser before the app server is ready with the changes

matt-hernandez commented 11 months ago

Can confirm that the same thing is happening to me and my team at my job as well. We are using 2.1.0. Only by running remix dev --manual were we able to get back to our expected workflow.

I have a screen recording below that shows the HMR failing with a browser error in the console. The v2 dev server has not restarted, but the browser has started requesting modules, leading to an error.

I also manually refreshed the page during a second HMR attempt to show that the server was in fact down because the browser went to a "This site can’t be reached" page in Chrome. A second manual refresh a little bit later brought the website back.

Definitely a race condition where the app is starting to request updated modules before the v2 dev server is ready.

https://github.com/remix-run/remix/assets/7357877/26d169a6-ba1b-4c9c-9118-cdd02a3dd2d6

akomm commented 11 months ago

Some thoughts.

As mentioned previously, this line causes the issue. Disabling it fixes the issue like --manual does.

Normally, without the notification HMR should not work. The fact that the HMR works means there is some broadcast happening elsewhere already.

I think that based on how fast the other process finished and broadcasted the broadcast at the line linked above will either cause or not cause a problem (the race condition).

This might depend on how large the project, but also how fast the machine is. So some won't be able to reproduce it.

pcattori commented 10 months ago

Race conditions like this are hard to reproduce and fix, but we're aware of the issue and looking into it. This sort of issue is exactly why we are planning to switch to Vite as that will avoid needing to coordinate processes via filesystem reads/writes that are the root cause of this race condition. Thanks for your patience!

jaschaio commented 10 months ago

Happens to me as well just after running:

npx create-remix@latest --template remix-run/remix/templates/cloudflare-workers
ajaybc commented 10 months ago

I am using the cloudflare workers template and encountered the same problem. --manual mode is ON by default and still no luck. In the end I ended up hacking together a service worker that would retry any request to fetch the chunks for live reload. Definitely not the best solution, but it is working for me


async function retryFetch(request, maxRetries) {
  let response;
  let numRetries = 0;

  while (numRetries < maxRetries) {
    try {
      console.log('sw.js: Trying to fetch ', request.url);
      response = await fetch(request);
      if (response.ok) {
        return response;
      }
    } catch (error) {
      // Network error, retry the request
      await wait(1000);
    }

    numRetries++;
  }

  return response;
}

self.addEventListener('fetch', (event) => {
  let url = new URL(event.request.url);
  let method = event.request.method;

  // any non GET request is ignored
  if (method.toLowerCase() !== 'get') return;

  if (url.searchParams.get('t')) {
    // This is the dev mode chunk request that could fail because the server is not ready
    // we will retry the request 10 times
    event.respondWith(retryFetch(event.request, 10));
    return;
  }

  return;
});
markhker commented 10 months ago

Same issue using v2.2.0 the --manual mode hack does not work for me, using the cloudflare pages template

akomm commented 10 months ago

Shouldn't dev be the same regardless of your (prod) deployment environment? I did not use cloudflare so I don't know, just asking. But my guess is if that is the case, your --manual might not work for other reason?

daguitosama commented 10 months ago

Any news on this ?

Just upgraded from v1.x to v2.2.0 and hit this problem as soon as I made the upgrade. I'm experiencing the issue on the cloudflare/pages template. The --manual flag did not help this time 🥲

I have prepared a little reproduction here the remix project is on the /site folder. As soon as you change anything on a route you see the

https://github.com/remix-run/remix/assets/34744883/3c34b27f-e67b-47cf-9afb-22c53b98dfec

jabranr commented 10 months ago

I am facing the same issue after upgrading Remix to v2 with CF pages. Neither --manual seems to help here nor does using cross-end NODE_ENV=development (as suggested by Michael Carter in another issue)

Jackardios commented 10 months ago

I also encountered this issue after updating wrangler from 3.2.0 to 3.15.0. I tried different versions of wrangler and only wrangler <= 3.8.0 works without errors. So I advise everyone who encounters this problem to use version wrangler@3.8.0 for now

Shakahs commented 10 months ago

So the "Quick Start" instructions for new users are broken by this, have been broken for 2 months, are apparently going to stay broken until the Vite transition is complete, and there is no warning in the docs.

AbePlays commented 10 months ago

Can confirm @Jackardios suggestion works by simply moving to wrangler@3.8.0

BoilingSoup commented 9 months ago

Why is there no warning in the official tutorial?

akomm commented 9 months ago

Why is there no warning in the official tutorial?

Why there is no warning in tutorial that bugs exist?

Aside note: in the unstable vite version HMR seem to work without manual. But its not yet production recommended as the unstable suggests.

BoilingSoup commented 9 months ago

Why there is no warning in tutorial that bugs exist?

There could be a simple message to try using --manual since it's known HMR isn't working for everyone....

akomm commented 9 months ago

So you want docs to reflect github issues? Or just for those issues that affect you?

BoilingSoup commented 9 months ago

So you want docs to reflect github issues? Or just for those issues that affect you?

Docs should reflect known issues people run into when following the Quick Start tutorial, if they can't even get started because of it.

What's with the tone?

akomm commented 9 months ago

Never heard of such a thing. Only issues by design or limitations, not simply bugs. Maybe you can go forward with this new Idea and synchronize issues that people might get when they get started in the docs.

BoilingSoup commented 9 months ago

Never heard of such a thing. Only issues by design or limitations, not simply bugs. Maybe you can go forward with this new Idea and synchronize issues that people might get when they get started in the docs.

So you expect new remix users to run into this bug, look up why HMR isn't working as the docs describe, find this discussion, and hope they troubleshoot it themselves?

I think I'm in the minority here to stick around.

akomm commented 9 months ago

Kind of to expect that people look up a problem if they experience it. That is a developers job to be able to do it.

Its quite interesting how you will figure out regularly which of the open issues affects the getting started - !!as it changes!! - including all the possible ways with different templates. And how you track to remove if the warning becomes obsolete.

Apparently you expect that the OSS devs do all the above, while you can not imagine a developer to go to github and search for the issue, or hell, find it via google/other search engine.

BoilingSoup commented 9 months ago

Kind of to expect that people look up a problem if they experience it. That is a developers job to be able to do it.

Its quite interesting how you will figure out regularly which of the open issues affects the getting started - !!as it changes!! - including all the possible ways with different templates. And how you track to remove if the warning becomes obsolete.

Apparently you expect that the OSS devs do all the above, while you can not imagine a developer to go to github and search for the issue, or hell, find it via google/other search engine.

The point of using a framework is to offload boilerplate. If I can't even get HMR working with the official docs, and this is the response I get for a suggestion to make it easier for new users. I have no interest in Remix.

jaschaio commented 9 months ago

No need to be so salty :) But as it's a known issue literally effecting every single use of the cloudflare template putting a note into the docs seems quite sensible to save people time with an issue they are guaranteed to run into 🙄

Same as release notes sometimes contain notes about known limitations.

BoilingSoup commented 9 months ago

No need to be so salty :) But as it's a known issue literally effecting every single use of the cloudflare template putting a note into the docs seems quite sensible to save people time with an issue they are guaranteed to run into 🙄

Same as release notes sometimes contain notes about known limitations.

In the words of @akomm

"So you want docs to reflect github issues? Or just for those issues that affect you?"

akomm commented 9 months ago

It does not affect just cloudflare users. It affects them differently in terms of the --manual workaround.

It is also unclear if it affects every single use of cloudflare.

The only thing we know is that the issue does not affect everyone always, some never and that for everyone who has confirmed this issue here and uses cloudflare can not workaround it via --manual.

I'm out of this silly conversation. As we can see in the above example the person is not out for anything constructive, else the person would have figured out why there maybe no hint there. Its just some passive-aggressive "I had to look up 3 minutes on github, why did you not safe me the time by implementing an overcomplicated issue-docs tracking mechanism!!!"

/out

BoilingSoup commented 9 months ago

It does not affect just cloudflare users. It affects them differently in terms of the --manual workaround.

It is also unclear if it affects every single use of cloudflare.

The only thing we know is that the issue does not affect everyone always, some never and that for everyone who has this issue and uses cloudflare can not solve it via --manual.

I'm out of this silly conversation. As we can see in the above example the person is not out for anything constructive, else the person would have figured out why there maybe no hint there. Its just some passive-aggressive "I had to look up 3 minutes on github, why did you not safe me the time by implementing an overcomplicated issue-docs tracking mechanism!!!"

/out

Stop putting words in my mouth. I said nothing about implementing an overcomplicated issue tracking system.

I'm not the only one to suggest a small note in the Quick Start for beginners to remix. Is that really too much to ask? A small note for a known issue that prevents new users from getting started?

Why have a tutorial if it doesn't work? https://remix.run/docs/en/main/start/tutorial

Apparently it's been like this for months, I ran into it about a month ago and wasn't able to start since. Anyway I have lost all interest in Remix with this interaction. Have a good holiday. I'm out too.

matt-hernandez commented 9 months ago

@brookslybrand Can you or someone else from the team step in here? The latest conversation on this has stepped outside of conduct in my opinion.

Does this bug happen for everyone? No. Does it happen to a lot of people? Yes, and not just those on cloudflare like my team. Is suggesting a note in the docs a bad suggestion? No, in my opinion. It doesn’t need to happen, but putting it on the table is fine. Remix likely has growth and adoption goals and watching the impact of this bug is what we’re doing here. I’m sure if they deem it worthy of a note, they will.

brookslybrand commented 9 months ago

The latest from the team is @pcattori's last comment

Race conditions like this are hard to reproduce and fix, but we're aware of the issue and looking into it. This sort of issue is exactly why we are planning to switch to Vite as that will avoid needing to coordinate processes via filesystem reads/writes that are the root cause of this race condition. Thanks for your patience!

If you have a suggestion for how we can improve the docs please feel free to put up a PR. Right now the team is primarily enjoying the US holiday. When I'm back online Monday I'm happy to read through this issue again and work to determine if we should put up a note in our documentation. Seems like if it's affecting enough users, a small callout is probably worth it in the meantime, though for the longterm as has been mentioned we anticipate this problem being solved entirely.

In the future I would definitely appreciate avoiding the comments of a GitHub issue to fight about things like this. Sharing a suggestion is perfectly fine, there's no need to shoot down people's suggestions, nor is there reason to read comments in the most uncharitable way. Next to sharing a suggestion on an issue, a PR is the best way to encourage change.

We understand why people feel strongly here, and I know it's frustrating to see an issue seemingly not be worked on for a couple of months. I would also some benefit of the doubt on our team's side. I appreciate you all keeping the conversation going and engaging us if you feel we haven't taken adequate steps, but fighting in the comments doesn't help us, yourselves, or future users.

akomm commented 9 months ago

I think the problem here is simply that its not reproducible for everyone and --manual won't fix it for everyone. So the next person file an issue that --manual does not fix the issue. Notice, that the fact --manual does not help everyone could only be assumed later and who knows what new insights we get as time passes by? Adding "maybe solutions" to docs, beside the tracking I've mentioned needed if you do it consistently, might generate even more work with unnecessary additional issue posts. Though: Troubleshooting info is nice if you intend to stick to the current solution and its unfixable.

brookslybrand commented 9 months ago

Thanks @akomm, I definitely appreciate this perspective.

The eventual solution we're betting on is Vite getting stabilized and then switching the templates/tutorials to use that.

If there's a temporary solution that can get most new users through the next few cycles while the team is finishing up that work, that may be worth it. I'll get Pedro next week and get his thoughts on the matter

brookslybrand commented 9 months ago

I also encountered this issue after updating wrangler from 3.2.0 to 3.15.0. I tried different versions of wrangler and only wrangler <= 3.8.0 works without errors. So I advise everyone who encounters this problem to use version wrangler@3.8.0 for now

@Jackardios thanks for pointing out this solution. I'm not sure why later versions of wrangler are causing problems, but in the meantime I put up a PR to set the version of wrangler to 3.8.0 in our templates

Thanks everyone else for the different opinions and expressing frustrating about the initial experience users might have when starting with the tutorial. I've opted to not update any documentation at the moment and instead add the --manual flag to the tutorial (and also move the tutorial into this repo, which was something we meant to do already).

While this doesn't solve the underlying issue, I'll leave that to @pcattori to weigh the tradeoff of shifting to focus on this vs getting Vite over the finish line since he is much more knowledgeable and focused on both issues

My hope is that both these updates help create a smoother experience for folks in the meantime. Please feel free to open a discussion, PR, or reach out on Discord if you have any other thoughts