dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.44k stars 10.02k forks source link

Blazor WebAssembly: Ensure correctness even without cache-control headers #19199

Closed SteveSandersonMS closed 4 years ago

SteveSandersonMS commented 4 years ago

This issue is to track our thinking around this subject. I don't know yet whether there's work for us to do on it.

Question: to what extent should Blazor take responsibility for ensuring correct behavior even when the server doesn't return the desired cache-control: no-cache header on framework resources?

Who this affects

It doesn't affect the "hosted in ASP.NET Core" scenario, since that does put cache-control: no-cache on _framework/** responses. But if someone deploys their Blazor WebAssembly application to a different server environment, possibly something like a static files host, then they might not think about configuring cache control headers, or might not even have the option to do so. For example, on GitHub pages, you can't control this.

What can go wrong

Without cache-control: no-cache, browsers do whatever they like when it comes to using cached content. As such, there's no guarantee that the versions you load of the following files:

... are mutually consistent.

We do ensure that everything we fetch using fetch has cache: 'no-cache' meaning that we at least revalidate cached data with the server (it can return a 304 if there's no change). This ensures that blazor.boot.json, *.wasm, *.dll, *.pdb all match the current versions on the server.

However, things introduced using <script> tags - blazor.webassembly.js and dotnet.js - have no such capability so may be arbitrarily old cached copies. This can lead to undefined behavior, typically errors.

Possible solution

If we want to take responsibility for this, we could do two things.

1. Ensure consistency between blazor.webassembly.js and blazor.boot.json by stamping blazor.webassembly.js's version number into blazor.boot.json when we generate it on build.

Code inside blazor.webassembly.js can verify that a version: '...' value in blazor.boot.json matches what we expect, and if it doesn't we would throw and stop trying to load. This guarantees we don't have undefined behavior.

So what is the developer meant to do to fix it and make sure their application loads? They must do one of:

We might even enforce an equivalent to a cache-busting parameter, by changing the name of blazor.webassembly.js to blazor.webassembly.3.2.0.js (etc) so developers have no choice but to update this when they upgrade.

2. Ensure consistency between blazor.boot.json and dotnet.js by including dotnet.js in the manifest and auto-adding a cache-busting querystring param using its content hash

We can then even auto-add subresource integrity check attributes to the dynamically added <script> tag for dotnet.js if we want, ensuring the page won't load if somehow the the content is bad, even though it shouldn't be bad because we auto-added a cache-busting parameter to the URL.

This shouldn't go wrong and require developers to fix anything. The only way it could go wrong is if the server is actually serving inconsistent versions of files, independently of browser caches.

Caveats

The main problems with the solution proposed above are:

Pinging @javiercn @pranavkm @rynowak for any opinions on whether this is a problem we should want to solve.

SteveSandersonMS commented 4 years ago

BTW another possible solution would be to tell developers not to have <script src="_framework/blazor.webassembly.js"> in their index.html, but instead tell them to have some kind of "stub" loader JS file that acquires the <script> contents using fetch with cache: 'no-cache'.

I'm not really in favour of this solution because:

  1. It's an extra HTTP request before loading really starts
  2. It's weird to be generating the contents of a script tag dynamically (or using eval) and clashes with people's goals to use CSP.
javiercn commented 4 years ago

@SteveSandersonMS one though here is.

Is it our problem to solve in a general way? When the app is being hosted inside asp.net core I agree we should do whatever it takes to ensure a good experience, but anything else seems that it should be the developer's responsibility to correctly configure their deployment environment.

I'm in favor of adding sub-resource-integrity to everything, as I think that's goodness independent of the configuration problem and that will likely make the application blow-up if the environment is badly configured.

I'm also a bit skeptical about how many people will be hit by this, my reasoning is as follows:

After that, there's fewer places where you get hit by this I think, and should it be our job to solve that? (Other than using SRI)

There's also the question of, will we break people early or only when needed. Currently if the bits you have happen to be compatible, there might not be any issue at all. Something like SRI and so on will break that.

I think overall the value of having strong consistency over "upgradeability" is a trade-off I would be happy with, but it's something to think about.

SteveSandersonMS commented 4 years ago

I agree with everything you've said there. One other data point to bear in mind however is that some hosts won't let you configure cache-control at all, e.g., GitHub pages: https://webapps.stackexchange.com/a/119294

SteveSandersonMS commented 4 years ago

it should be the developer's responsibility to correctly configure their deployment environment

Thinking through this more, especially in the context of GitHub hosting where you simply can't add cache-control: no-cache headers, it now seems to me there there is in fact a necessary feature here.

If we want a mininum-viable-complexity solution, then I don't think we need to do anything special for loading blazor.webassembly.js, because it's no different from any other static resources you might be referencing (e.g., other css, js, etc.). The developer can already choose to add a cache-busting querystring if they want to. They can be as permissive or as aggressive as they want. Having the framework perform an extra version check to ensure consistency with blazor.boot.json might be nice, but is not strictly required. This is consistent with @javiercn's point that "it should be the developer's responsibility".

However, when it comes to loading dotnet.js, that happens by the framework dynamically injecting a script tag. This is different, then, because it's outside the control of the developer. They can't add cache-busting however much might they might want to. They are at the mercy of the server sending back the desired caching headers, which some hosts like GitHub simply won't. So I think it's the framework's responsibility to auto-add a cache-busting parameter to the URL for dotnet.js, which we can do by including it in the blazor.boot.json manifest so we know its content hash. Unless we do this, then it just isn't safe to deploy to GitHub no matter how much work the developer does to ensure they have cache-busting URLs for dependencies.

@javiercn @pranavkm What do you think? It seems like a pretty small and cheap feature to add, and is necessary if we want to be able to say it's legit to deploy to GitHub pages and hosts like that.

SteveSandersonMS commented 4 years ago

Resolved sufficiently with https://github.com/dotnet/aspnetcore/pull/19235

The developer is still responsible for ensuring freshness of any resources they reference explicitly from index.html via <script>, <link> etc., which includes blazor.webassembly.js. But for all the things the framework loads on your behalf (e.g., the boot manifest, dotnet.js/wasm, assemblies/pdbs), we now take responsibility for ensuring freshness regardless of server cache headers.

RemiBou commented 4 years ago

About the script tag, did you think about doing like angular and generating this tag at build time in index.html ? This would help developer and would open the door for things on the maintainer side :

I think this is done here on angular

https://github.com/angular/angular-cli/blob/d0ede149187fee45646aad5a7e55e1d268ac5bfe/packages/angular_devkit/build_angular/src/angular-cli-files/utilities/index-file/augment-index-html.ts

SteveSandersonMS commented 4 years ago

@RemiBou Yes, we used to do that. But we stopped doing it because it's better to make things simpler and more obvious.

Developers already understand what a <script> tag is, so simply having a normal one in the static index.html file makes things far more understandable. For example, you can edit it to change the URL or customize attributes (like defer), you can reorder it among other tags, and perhaps more importantly, it's obvious how you can generate it at runtime if you're doing your own server-side HTML rendering.

RemiBou commented 4 years ago

Ok I didn't remember, thanks for reminding me and for the answer.