apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
61.87k stars 13.55k forks source link

Superset app loads infinitely or throws unexpected error occasionally. Ver. 3.1.2 #28259

Open Sabutobi opened 4 months ago

Sabutobi commented 4 months ago

Bug description

This is the 3.1.2 iteration of the this issue While using the superset app, I sometimes face two issues which might be related:

How to reproduce the bug

Host the latest version of the superset app deployed using docker, in an EC2 instance. Connect the superset with a database and create multiple dashboards. Use it continuously for a while. In our case this issue can be raised after the redeployment process.

Screenshots/recordings

SQL Lab example: chunkerr1 Dashboards tab example: chunkerr2

Superset version

3.1.2

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

rusackas commented 4 months ago

It's not able to load a chunk of the javascript built by webpack. I'm not sure how you're deploying, but is npm run build part of the process, if you're making any frontend changes? That compiles all the JS/TS into bundles as a static asset for Superset to serve. I suspect somehow your deployment is only getting part of the JS updated somehow.

Sabutobi commented 4 months ago

It's not able to load a chunk of the javascript built by webpack. I'm not sure how you're deploying, but is npm run build part of the process, if you're making any frontend changes? That compiles all the JS/TS into bundles as a static asset for Superset to serve. I suspect somehow your deployment is only getting part of the JS updated somehow.

Yep, @rusackas npm run build is there. I've mentioned deployment as one of the factors, but these chunk errors are met without deploy as well.

mistercrunch commented 4 months ago

This is pointing towards a backend/frontend being out of sync. The backend is made aware of a manifest at /static/assets/manifest.json , and there many reasons why this can/could get out of sync (though it's not common AFAIK). But normally if you're on a clean npm run build where manifest is defined and published, and a fresh backend that read the latest manifest, you should be fine.

I'm unclear what happens if the client (browser is out of sync) as it may ask for an old/expired asset while hot-loading (?). I'm guessing at Preset we may persist the older build chunks on the CDN or have some other way to tell the client to refresh its manifest

In your case does a client force-refresh fix the issue? Could you have zombie pods serving an older version of the app?

Sabutobi commented 4 months ago

Hey @mistercrunch and @rusackas . I think I've found the solution for my case. I've added this code snippet into the superset-frontend/src/setup/setupApp.ts file.

window.addEventListener('error', e => {
      // prompt user to confirm refresh
      if (/Loading chunk [\d]+ failed/.test(e.message)) {
        window.location.reload();
      }
    });

Can be done as PR by myself if needed.

mistercrunch commented 4 months ago

Looks like a band-aid. It makes me wonder how Superset manages a new manifest and hot loading of obsolete assets (?) If we don't do this already it seems we should generally look for a manifest version and force a refresh of the manifest/assets when the client is behind. I'm wondering if there's anything like that in the code base (?)

@Sabutobi do you only see this on new deploys?

rusackas commented 4 months ago

If it is the client caching pointers to chunks that no longer exist on the server, it does seem like we might have a few options

This feels like there should be a best practice, and we're reinventing a wheel here. I'll try to do some digging.

mistercrunch commented 4 months ago

Health check seems would be best and could handle a variety of things:

Sabutobi commented 4 months ago

Health check seems would be best and could handle a variety of things:

  • check auth hasn't expired and potentialy try to refresh the token if it's implemented, otherwise send to login screen
  • check manifest version (frontend/backend synchronicity) and either refresh the manifest client-side of tells the user to refresh the page or do it on their behalf

Thanks for the fresh ideas. I'll try to implement that somehow...

mistercrunch commented 4 months ago

Are we sure this is a hot-loading issue (where js bundles are loaded dynamically)? If so it could just be we need better error handling on hot-load and that's it. Like if we get a 404 on hot-load, then you could check the manifest version and pop something to the user "The frontend has gone out of sync from the backend, refresh? Refresh / Cancel"

IamJeffG commented 4 months ago

Just chiming into say that we are getting reports and screenshots of this exact same error (we are on 3.0.2). I have nothing constructive to add here, other than validation that @Sabutobi is not the only one!

Possibly relevant to the questions posed above:

Sabutobi commented 4 months ago

Hey y'all. I know that's not the solution,but:

const origin = console.error;
    console.error = error => {
      if (/Loading chunk [\d]+ failed/.test(error.message)) {
        alert(
          'A new version released. Need to relaod the page to apply changes.',
        );
        window.location.reload();
      } else {
        origin(error);
      }
    };

I've added this code snippet into the superset-frontend/src/setup/setupApp.ts file additionally and I do have this now

reload_chuck_stuff

rusackas commented 4 months ago

It's better than nothin'! I guess if it doesn't resolve the problem (like the chunk is really not there) then you're stuck in a loop... but that's an edge case to an edge case, and it's kinda busted either way then.

etadelta222 commented 4 months ago

If it is the client caching pointers to chunks that no longer exist on the server, it does seem like we might have a few options

  • tell the browser to refresh itself on error (the above suggestion) so it becomes aware and loads chunks... but that does seem glitch-prone in its own way, and does feel like a bandaid.
  • Keep older chunks around in a directory, with a certain TTL on them that's longer than the client cache. This seems like the safest/smoothest option I can think of.
  • Have the browser load from cache or not based on data in the JWT. When the site is redeployed, the JWT could effectively invalidate the cache. Probably not straightforward with a SPA... it'd have to do a check on route change or something.
  • Set up a health check endpoint (I think we might already have one, actually) that responds with a build SHA/UUID. We could check that as a polling operation and/or on SPA route change, and force a reload when it changes... but this also feels like a weird bandaid.

This feels like there should be a best practice, and we're reinventing a wheel here. I'll try to do some digging.

Hi @rusackas , @mistercrunch will the solution be part of a future release?

mistercrunch commented 4 months ago

Yup open a PR and we can see what it'd take to push this through. Some things that would be nice:

Sabutobi commented 4 months ago

Hey @rusackas and @mistercrunch . FYI: The PR has been opened

philicious commented 1 month ago

Since upgrading to 4.x, the JS Chunk load errors have become so severe we had to downgrade to latest 3.x again. (We have several dozen users and use auto-refreshing dashboards alot)

mistercrunch commented 1 month ago

the JS Chunk load errors have become so severe

I thought the JS Chunk load errors would be corollated to [how many users have an active browser open updates] [how often you deploy]. Any other theory ln why they would be more frequent in your context? Partly defective or overwhelmed CDN/server number of chunks to serve?