whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Removing plugins? #6003

Open domenic opened 4 years ago

domenic commented 4 years ago

With Flash deprecation, it might be the case that browsers no longer implement plugins, and as such we could remove some of the spec complexity around them.

Note, however, that we do need to keep enough to allow rendering of PDFs, and to allow rendering of HTML pages (and other supported content types) in embed and object. So it's not like we can completely gut the embed and object sections.

And, it looks like at the time of https://github.com/whatwg/html/issues/3958, Chrome treated PDFs as plugins. So maybe we can't remove much.

Anyway, some candidates for removal:

The param simplification likely holds even if PDF ends up staying around as a plugin. (And in turn we can close #387.)

annevk commented 4 years ago

If we can remove https://html.spec.whatwg.org/#concept-embed-type that would be nice. Hopefully browsers do not look for .pdf.

annevk commented 3 years ago

The way Chrome and Firefox support PDF is as the result of a navigation, so plugin documents continue to exist, but I'm not sure they would be different from inline content that doesn't have a DOM as the result is basically a document with an opaque origin.

A question that came up in https://github.com/annevk/orb/issues/5 is whether we can make these always create nested browsing contexts. I suspect that won't work due to images, but it might be worth investigating more. (See also https://github.com/w3c/webappsec-fetch-metadata/issues/37 and https://github.com/w3c/webappsec-fetch-metadata/issues/45.) https://github.com/whatwg/fetch/pull/948 suggests this might be possible now plugins are removed, but we may need to keep some quirks. Seems very much worth investigating.

cc @anforowicz @arturjanc @mikewest

annevk commented 3 years ago

For navigator.plugins and navigator.mimeTypes I filed https://bugs.chromium.org/p/chromium/issues/detail?id=1164635 and https://bugs.webkit.org/show_bug.cgi?id=220495.

domenic commented 3 years ago

It looks like @mfreed7 is interested in having Chrome follow Firefox and return empty for navigator.plugins and navigator.mimeTypes, hooray!

@annevk, can you give more details on what Firefox does? For example, I could imagine:

  1. Totally removing PluginArray, MimeTypeArray, Plugin, and MimeType, and having navigator.plugins and navigator.mimeTypes return (frozen?) normal arrays.
  2. Not removing any interfaces, and just having the lists always be zero-length, and navigator.plugins.refresh() always a no-op

... or something in between

I suspect (2) is going to be easiest, but is that what Firefox did? A spec PR would be great, where we can hash out the details.

(Also, I guess navigator.javaEnabled() always returns false now?)

mfreed7 commented 3 years ago

😄

For at least a while (weeks? a milestone or two?) my plan is to just do #2. That also appears to be what Firefox has done, at least in effect. And in Chromium, FYI, javaEnabled() has been false since M45.

I'd be happy to do more (i.e. #1), but I suspect that would cause major compat issues. Navigator.plugins is used on 60% of pageloads today. And I would guess those numbers will stay high even after the APIs start returning [].

domenic commented 3 years ago

On the compat front, I suspect that removing PluginArray and MimeTypeArray would be hard, because of the extra methods (refresh(), item(), and namedItem()) which wouldn't be present on normal arrays.

But removing Plugin and MimeType might be easier, since if navigator.plugins and navigator.mimeTypes return [], then the only way to detect their existence is by checking for window.Plugin and window.MimeType, which is pretty pointless (and so, I would suspect, unlikely).

Going with just (2) for now makes a lot of sense to me; you're already taking on a nice chunk of work!
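
To illustrate why option (2) is the lower-risk route, here is a minimal sketch of an always-empty list that keeps the extra methods a plain array lacks. `EmptyPluginArray` and `PluginLike` are hypothetical names made up for this example; they are not the spec's IDL or any browser's implementation.

```typescript
// Hypothetical stand-in for a Plugin entry; only used for typing here.
interface PluginLike {
  readonly name: string;
}

// Hypothetical model of option (2): the interface keeps its shape,
// but the list is permanently empty.
class EmptyPluginArray {
  readonly length: number = 0;

  // item() and namedItem() are methods a plain [] would not have.
  item(_index: number): PluginLike | null {
    return null;
  }

  namedItem(_name: string): PluginLike | null {
    return null;
  }

  // refresh() stays callable but does nothing.
  refresh(): void {}
}

const emptyPlugins = new EmptyPluginArray();
const plainArray: PluginLike[] = [];

// Legacy code that calls these methods keeps working on the empty object...
const viaItem = emptyPlugins.item(0);
const viaNamedItem = emptyPlugins.namedItem("Shockwave Flash");
emptyPlugins.refresh();

// ...whereas a plain array has no refresh() at all, so the same call would throw.
const plainHasRefresh = "refresh" in plainArray;
```

Under these assumptions, a page that calls `refresh()` or `namedItem()` unconditionally would keep running with option (2) but could break with option (1).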

mfreed7 commented 3 years ago

Good point about Plugin and MimeType. Perhaps while I'm emptying the arrays, I'll also add use counters for the Plugin and MimeType constructors. Then we can see how often people construct those directly.

domenic commented 3 years ago

They don't even have constructors, so literally the only thing people could be doing is accessing the properties, like if (window.Plugin) or something.

I posted a PR at https://github.com/whatwg/html/pull/6296. As you can see there, because the arrays are now always empty and thus no Plugin or MimeType instances can ever exist, we can simplify their IDL and algorithms dramatically.

davidp3 commented 3 years ago

I suspect (2) is going to be easiest, but is that what Firefox did?

Firefox does (2) but not as a special case -- the arrays are just naturally empty. In an upcoming cleanup patch, they will be replaced with hard-coded empty arrays but still type (2) -- we do still have the PluginArray and MimeTypeArray IDL types. But this is the browser's only use of those types -- finding a way to get rid of them would be nice.

I guess we also still have the Plugin and MimeType types but they are useless as mentioned.

foolip commented 3 years ago

I guess we also still have the Plugin and MimeType types but they are useless as mentioned.

I looked into this a bit in https://github.com/mdn/browser-compat-data/pull/7863 last week. On Safari iOS, the Plugin and MimeType interfaces are not exposed, while PluginArray and MimeTypeArray are. This makes some sense since the "arrays" are always empty, and there can be no instances of Plugin or MimeType.

This behavior seems to go back to iOS 8, but I haven't confirmed in detail whether all aspects of it changed then. In any case it's not new.

I think this gives some confidence that not having Plugin or MimeType is web compatible, and it seems worth trying to match Safari iOS by removing them.

tomayac commented 3 years ago

According to https://github.com/brave/brave-browser/issues/1549, Brave has been returning static values for plugins and mimeTypes for a while now.

mfreed7 commented 3 years ago

Thanks for the comments here. Please see the blink-dev discussion for more context, but I'm hoping to make this change in Chromium Canary soon, to at least see what the compat impact is. I like the idea of removing Plugin and MimeType also, given the iOS research above.

@rniwa, @hober any thoughts on this one?

mfreed7 commented 3 years ago

Ok, the change has landed in Chromium ToT. If anyone notices a site issue, please file a bug and cc me.

I've also filed a bug to track the potential removal of window.Plugin and window.MimeType entirely.

rniwa commented 3 years ago

Okay, I've asked internally, and I got a reply that we had to list the PDF plugin or else multiple websites would ask the user to download Adobe Acrobat Reader so there is some compatibility risk for this change.

mfreed7 commented 3 years ago

Okay, I've asked internally, and I got a reply that we had to list the PDF plugin or else multiple websites would ask the user to download Adobe Acrobat Reader so there is some compatibility risk for this change.

Great, thanks for that information. I'll watch for similar bug reports against Chromium. Do you happen to know when that change was attempted/reverted? I'm wondering if things may have (hopefully!) changed since then.

mfreed7 commented 3 years ago

So based on several bugs filed against Chromium for M90 (in which navigator.plugins/mimeTypes were emptied), I've had to revert that change. As expected (see comments above) there are PDF compatibility issues with sites that check navigator.plugins for the names of several well known PDF viewers. If not present, they default to downloading instead of viewing the PDF.
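
The breakage pattern described above can be sketched roughly as follows. This is an illustration only, not code from any real site: `legacyPdfAction` and the structural types are hypothetical, and the name list simply reuses viewer names quoted elsewhere in this thread.

```typescript
// Minimal structural types so the sketch runs outside a browser.
interface NamedPlugin {
  readonly name: string;
}
interface NamedPluginList {
  readonly length: number;
  readonly [index: number]: NamedPlugin;
}

// Viewer names quoted in this thread; real sites may check other strings.
const KNOWN_PDF_VIEWER_NAMES: string[] = [
  "Chrome PDF Viewer",
  "Chromium PDF Viewer",
  "Microsoft Edge PDF Viewer",
];

type PdfAction = "view-inline" | "download";

// Hypothetical site logic: view inline only if a known viewer name is listed.
function legacyPdfAction(plugins: NamedPluginList): PdfAction {
  for (let i = 0; i < plugins.length; i++) {
    const plugin = plugins[i];
    if (plugin && KNOWN_PDF_VIEWER_NAMES.indexOf(plugin.name) !== -1) {
      return "view-inline";
    }
  }
  // An empty list lands here, even if the browser can render PDFs.
  return "download";
}
```

With an emptied list, a site written this way falls through to the download branch regardless of what the browser can actually display, which matches the reports described above.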

It would therefore seem that this change is not trivially implementable. My guess would be that we'd have to do some sort of longer term deprecation/removal effort, with an "Origin Trial" like opt-back-in for sites that break as a result. Having done something similar recently for Web Components v0, I can say that this isn't quick or easy.

Can anyone comment on the value of this change? I don't think the work above is worth it for just a clean up. But there is value in reducing the fingerprinting surface presented by these two APIs, right? Can anyone quantify that? It seems like there must not be too many bits of entropy in these lists, because there aren't many plugins available, but perhaps I'm wrong about that. I'm willing to do the work to deprecate/remove these lists, if there's real fingerprinting (removal) value to be had here.

rniwa commented 3 years ago

Why don't we just hard-code the list of PDF plugins to be listed instead? Web compatibility requires it now, and it likely will for years or decades to come.

annevk commented 3 years ago

While unfortunate, standardizing what to return seems like a reasonable alternative indeed and should have identical fingerprinting properties.

domenic commented 3 years ago

I'm happy to work on updating the spec for this by partially reverting https://github.com/whatwg/html/pull/6296. But we need a bit more precision. I think we have three alternatives:

  1. Always return zero plugins.

  2. Always return one plugin, a specific PDF viewer. (Which one?) Do so regardless as to whether or not the browser has inline PDF-viewing capabilities.

  3. Return either zero or one plugins, depending on whether the browser has inline PDF viewing capabilities. E.g. Chrome desktop would return one plugin, Chrome Android mobile would return zero plugins.

In the modern landscape, (3) probably doesn't give significant additional bits over (1) or (2), since I'm not aware of any browsers these days which allow installing software that adds or removes inline PDF viewing capabilities. That is, the "has inline PDF viewing capabilities" bit gets subsumed into the general user agent bits which are already exposed through navigator.userAgent.

It sounds like we're leaning away from (1), despite Firefox getting away with it for some time. Between (2) and (3), which is preferred?
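
The three alternatives can be compared with a small sketch. `PluginPolicy` and `exposedPluginCount` are hypothetical names for this example; the point is only that under (3) the count depends on a capability, while under (1) and (2) it is constant.

```typescript
// Hypothetical labels for alternatives (1), (2) and (3) above.
type PluginPolicy = "always-zero" | "always-one" | "match-capability";

// Returns how many plugins navigator.plugins would expose under each alternative.
function exposedPluginCount(
  policy: PluginPolicy,
  hasInlinePdfViewer: boolean
): number {
  switch (policy) {
    case "always-zero":
      return 0;
    case "always-one":
      // Same answer whether or not the browser can render PDFs inline.
      return 1;
    case "match-capability":
      // Only this alternative reveals the capability bit.
      return hasInlinePdfViewer ? 1 : 0;
  }
}
```

If the capability is fixed per browser and platform, the `"match-capability"` result adds nothing beyond what the user agent string already implies; it only becomes an extra bit if the capability can vary per user.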

And, what plugins would we use? Chrome on Windows and ChromeOS currently has two plugins for PDF viewing:

(I don't have a Safari to test with.)

mfreed7 commented 3 years ago

Technically, on Chrome, there's a user setting that allows the user to disable the internal viewer. With that setting enabled, PDFs get downloaded instead of viewed. I imagine when that setting is enabled, it has high entropy, if we leave it exposed in navigator.plugins. So if it turned out to be web compatible (it might not), I'd vote for option (2). As you said, though, it will be a bit odd for the spec to say this attribute must always return "Chrome PDF Plugin".

annevk commented 3 years ago

Firefox has an equivalent feature. As per https://github.com/web-platform-tests/wpt/pull/27129 it might be observable regardless of what this returns, e.g., depending on whether embed/object show fallback or not.

andreubotella commented 3 years ago

Note that Safari has an inline PDF viewer, but WebKit doesn't. And the WebKit-based GNOME Web supports inline PDF viewing by routing PDF responses through pdf.js but has an empty window.navigator.plugins.

rniwa commented 3 years ago

Note that Safari on iOS has empty plugins and mimeTypes.

Here's what Safari on macOS currently returns:

plugins

rniwa commented 3 years ago

I suspect the name and the filename don't really matter since Safari clearly gets away with an empty string. I suspect suffixes and type are what matter. Hopefully we can do away with postscript?

domenic commented 3 years ago

My proposal then would be:

And maybe we go for (3), i.e. either 0 plugins or the plugin above, except we say that a browser should not (must not?) vary between 0 and 1 plugins during its lifetime or according to a user setting?

I'm not sure exactly how to phrase that, but the idea is that navigator.plugins.length should give no more information than navigator.userAgent. That is:

domenic commented 3 years ago

I just re-read @annevk's comment from 6 hours ago https://github.com/whatwg/html/issues/6003#issuecomment-805908496 and realized that maybe it's premature to try to lock down the variance between 0 and 1. Maybe it's good enough for now for the spec to allow either 0 or the 1 plugin above.

I'll try to work on a spec PR for that tomorrow just so we can get a concrete sense of what it looks like...

mfreed7 commented 3 years ago

I'm starting to think this is an interop story, along with a privacy/fingerprinting story. I don't think the suggestion of just using name: "PDF" will be compatible. I've seen some of the broken site code here, and it explicitly looks for perfect string match for various known PDF plugin names (e.g. 'Chromium PDF Viewer'). Clearly these sites are already broken or at least different on different browsers, seemingly without good reason.

Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like

interface Navigator {
  readonly attribute boolean pdfViewerSupported;
}

And then we'll have a straightforward migration story, which can be used to properly deprecate the plugins arrays. That'll take a while, but it might (?) end up with a cleaner end-state.
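
A migration along these lines might look like the sketch below. It assumes the boolean proposed above under the name `pdfViewerSupported`, which is only a proposal at this point in the thread; `canViewPdfInline`, `NavigatorWithProposal`, and the name-based fallback are hypothetical and purely illustrative.

```typescript
// Structural type for the parts of navigator this sketch touches.
interface NavigatorWithProposal {
  // The boolean proposed in this thread; optional because it may not exist.
  readonly pdfViewerSupported?: boolean;
  readonly plugins: {
    readonly length: number;
    readonly [index: number]: { readonly name: string };
  };
}

// Prefer the explicit boolean; fall back to the legacy plugin list otherwise.
function canViewPdfInline(nav: NavigatorWithProposal): boolean {
  if (typeof nav.pdfViewerSupported === "boolean") {
    return nav.pdfViewerSupported;
  }
  // Illustrative fallback heuristic: any listed plugin whose name mentions "PDF".
  for (let i = 0; i < nav.plugins.length; i++) {
    const plugin = nav.plugins[i];
    if (plugin && plugin.name.indexOf("PDF") !== -1) {
      return true;
    }
  }
  return false;
}
```

In a page, the argument would be the global `navigator` object (with a cast, since the property is not part of the standard typings assumed here). Sites that never adopt the new check would still depend on the plugin list.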

rniwa commented 3 years ago

I'm starting to think this is an interop story, along with a privacy/fingerprinting story. I don't think the suggestion of just using name: "PDF" will be compatible. I've seen some of the broken site code here, and it explicitly looks for perfect string match for various known PDF plugin names (e.g. 'Chromium PDF Viewer'). Clearly these sites are already broken or at least different on different browsers, seemingly without good reason.

Maybe we can just declare that both WebKit and Chrome PDF viewers are available.

Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like

interface Navigator {
  readonly attribute boolean pdfViewerSupported;
}

And then we'll have a straightforward migration story, which can be used to properly deprecate the plugins arrays. That'll take a while, but it might (?) end up with a cleaner end-state.

I mean we could add such a thing but I don't think we can get away with getting rid of navigator.plugins anyway. We're certainly not willing to make that kind of change anytime soon (as in, not in the next ~5 years).

domenic commented 3 years ago

Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like

This avenue makes a lot of sense to me, and I think would make web developers happy. The cost, of course, is the migration and corresponding outreach. But yeah, there are definite interop benefits, e.g. maybe some pages would start working better in Firefox compared to today where they assume Firefox's empty navigator.plugins means it can't display inline PDFs.

If you'd be willing to go that route in Chrome, and Firefox would be interested in adding such a boolean, then we could proceed in that fashion. That would leave nonempty navigator.plugins as a single-browser feature that doesn't need to be (re-)standardized.

Note also that this was proposed by @travisleithead back in 2018: https://github.com/whatwg/html/issues/3462 :)

annevk commented 3 years ago

I like @rniwa's idea of listing both Chromium and WebKit. A new API doesn't really solve the interop issue for sites that are not maintained.

mfreed7 commented 3 years ago

I suppose I'm ok specifying the list of PDF viewers, though it seems very strange. So the idea would be that mimeTypes would just return ['application/pdf'] (what about 'text/pdf'?), and plugins would return this?

[
   'Chrome PDF Plugin',
   'Chrome PDF Viewer',
   'Microsoft Edge PDF Viewer',
   'WebKit built-in PDF',
]

At what point do we start jockeying for the order of this list? 😄

I'm also happy to go the navigator.pdfViewerSupported route instead. It sounds like there isn't appetite for that from Mozilla, though.

rniwa commented 3 years ago

I'm also happy to go the navigator.pdfViewerSupported route instead. It sounds like there isn't appetite for that from Mozilla, though.

The issue here is that there isn't any guarantee that all the websites in the world will get updated to support this new property, and even if they were, it would probably take many years if not decades to get there.

domenic commented 3 years ago

I worked on a pull request to get us back to something implementable. You can see the results, and a summary, in https://github.com/whatwg/html/pull/6738. TLDR it's very similar to Safari, but in a few places where Safari uses the empty string, I ported over some values from Chrome for maximum compat.

However, that PR has one bit of implementation-defined behavior, the name of the Plugin object. Could we clean that up?

We have examples of code in the wild that UA-sniffs Chrome and then depends on specific names (one of 'Chrome PDF Viewer', 'Microsoft Edge PDF Viewer', 'Chromium PDF Viewer'). So we (as in the Chrome team) need to preserve the ability to have at least one plugin with one of these names in the navigator.plugins array.

Given that constraint, I think the best we could do is instead of having 1 plugin, have N plugins. Perhaps 5? They would all be the same as the one specified in #6738, except they would have the following names (in order): PDF Viewer, Chrome PDF Viewer, Chromium PDF Viewer, Microsoft Edge PDF Viewer, WebKit built-in PDF. They would all point to those two MimeTypes, and those two MimeTypes would point to the 0th plugin (the PDF Viewer one). And everyone would implement the same list of 5 (or 0), and there'd be no way to use navigator.plugins as a sneaky user agent detector.
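
The shape of that fixed list can be sketched as plain data. The five names and their order come from the paragraph above; the two MIME type strings (`application/pdf` and `text/pdf`) are an assumption based on the types mentioned earlier in this thread, and `buildFixedLists` and the interfaces are hypothetical names for illustration.

```typescript
// Names and order as listed above.
const FIXED_PLUGIN_NAMES = [
  "PDF Viewer",
  "Chrome PDF Viewer",
  "Chromium PDF Viewer",
  "Microsoft Edge PDF Viewer",
  "WebKit built-in PDF",
] as const;

// Assumption: the two MimeTypes are these; the comment above does not spell them out.
const FIXED_MIME_TYPES = ["application/pdf", "text/pdf"] as const;

interface FixedPlugin {
  name: string;
  mimeTypes: FixedMimeType[];
}
interface FixedMimeType {
  type: string;
  enabledPlugin: FixedPlugin;
}

// Returns either all 5 plugins and 2 MimeTypes, or nothing at all.
function buildFixedLists(supportsInlinePdf: boolean): {
  plugins: FixedPlugin[];
  mimeTypes: FixedMimeType[];
} {
  const mimeTypes: FixedMimeType[] = [];
  if (!supportsInlinePdf) {
    return { plugins: [], mimeTypes };
  }
  const plugins: FixedPlugin[] = FIXED_PLUGIN_NAMES.map((name) => ({
    name,
    mimeTypes,
  }));
  // Every MimeType points back at the 0th plugin ("PDF Viewer").
  const primary = plugins[0];
  if (primary !== undefined) {
    FIXED_MIME_TYPES.forEach((type) => {
      mimeTypes.push({ type, enabledPlugin: primary });
    });
  }
  return { plugins, mimeTypes };
}
```

Because every browser would return the same five entries (or none), the list content itself would no longer distinguish one engine from another.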

Notes about such an approach:

What do people think? Should we try to go this extra mile to totally eliminate the entropy in navigator.plugins?

mfreed7 commented 3 years ago

Thanks for working on this! I am strongly in favor of the "alternate" approach you laid out, with a fixed, unified list of N plugins and 2 MimeTypes. This eliminates the fingerprinting/UA sniffing vector, should retain nearly-full web compat, and hopefully improves interop.

domenic commented 3 years ago

Thanks! Any thoughts from @annevk @smaug---- @rniwa on the 1 Plugin/2 MimeTypes approach currently specified in #6738, vs. the N Plugins/2 MimeTypes approach described above?

rniwa commented 3 years ago

We should do whichever is more web compatible, although it would be nice if all UAs could use the same set of plugins & MIME types. Maybe worth attempting a single fixed list approach & see if we find any compatibility issues.

domenic commented 3 years ago

https://github.com/whatwg/html/pull/6738 has been updated with the 5-Plugin/2-MimeType approach

domenic commented 3 years ago

I found https://bugzilla.mozilla.org/show_bug.cgi?id=840439 which indicates people are at least interested in this for Gecko, and might be worth looking at in more detail to see some of the impacts.

domenic commented 3 years ago

The way Chrome and Firefox support PDF is as the result of a navigation, so plugin documents continue to exist, but I'm not sure they would be different from inline content that doesn't have a DOM as the result is basically a document with an opaque origin.

They are different for same-origin PDFs. In Firefox for a same-origin PDF you can introspect the resulting Document and see the <embed> element, etc. In Chrome (probably also Safari) you cannot.

I will spec this for now as allowing either the plugin document section (Firefox behavior) or the inline content that doesn't have a DOM section (Chrome behavior). We should tighten this up over time though.

domenic commented 3 years ago

I think we've finished the saga of navigator.plugins and navigator.mimeTypes.

At the next triage meeting I'd like to discuss if anyone is willing to spend time on the other issues in the OP.

mfreed7 commented 3 years ago

Quick check in on this issue. Chromium implemented and shipped the "Fixed" set of plugins and mimetypes in M94, which has been in stable release for about a week at this point. It is still too early to declare a full success, but I can at least report that I do not know of any reported breakage bugs due to this change. Which at least means that this change doesn't break the world. I'll update here if something does come up that looks bad, but otherwise we can perhaps conclude that the fixed plugin lists are web compatible.

mfreed7 commented 3 years ago

It has now been about a month and a half, and Chromium is almost 2 versions past M94 where these changes were made. I am unaware of any reported issues. I think it might now be safe to declare victory in removing this fingerprinting vector.

For reference, here are the related implementation bugs:

domenic commented 1 year ago

A status update:

Much of what the OP lists has been completed. What remains is:

It's tempting to entangle these tasks. E.g. do browsers really check for file extensions for PDFs? Do mutations of the classid attribute really cause PDFs to reload? Is the type attribute still relevant when PDFs are the only option? Etc.

Some sub-issues on this subject: