Open domenic opened 4 years ago
If we can remove https://html.spec.whatwg.org/#concept-embed-type that would be nice. Hopefully browsers do not look for `.pdf`.
The way Chrome and Firefox support PDF is as the result of a navigation, so plugin documents continue to exist, but I'm not sure they would be different from inline content that doesn't have a DOM as the result is basically a document with an opaque origin.
A question that came up in https://github.com/annevk/orb/issues/5 is whether we can make these always create nested browsing contexts. I suspect that won't work due to images, but it might be worth investigating more. (See also https://github.com/w3c/webappsec-fetch-metadata/issues/37 and https://github.com/w3c/webappsec-fetch-metadata/issues/45.) https://github.com/whatwg/fetch/pull/948 suggests this might be possible now plugins are removed, but we may need to keep some quirks. Seems very much worth investigating.
cc @anforowicz @arturjanc @mikewest
For `navigator.plugins` and `navigator.mimeTypes` I filed https://bugs.chromium.org/p/chromium/issues/detail?id=1164635 and https://bugs.webkit.org/show_bug.cgi?id=220495.
It looks like @mfreed7 is interested in having Chrome follow Firefox and return empty for `navigator.plugins` and `navigator.mimeTypes`, hooray!
@annevk, can you give more details on what Firefox does? For example, I could imagine:
1. Removing `PluginArray`, `MimeTypeArray`, `Plugin`, and `MimeType`, and having `navigator.plugins` and `navigator.mimeTypes` return (frozen?) normal arrays.
2. Keeping those interfaces, with `navigator.plugins` and `navigator.mimeTypes` always empty and `navigator.plugins.refresh()` always a no-op.
3. ... or something in between
I suspect (2) is going to be easiest, but is that what Firefox did? A spec PR would be great, where we can hash out the details.
(Also, I guess `navigator.javaEnabled()` always returns false now?)
😄
For at least a while (weeks? a milestone or two?) my plan is to just do #2. That also appears to be what Firefox has done, at least in effect. And in Chromium, FYI, javaEnabled() has been false since M45.
I'd be happy to do more (i.e. #1), but I suspect that would cause major compat issues. Navigator.plugins is used on 60% of pageloads today. And I would guess those numbers will stay high even after the APIs start returning [].
On the compat front, I suspect that removing `PluginArray` and `MimeTypeArray` would be hard, because of the extra methods (`refresh()`, `item()`, and `namedItem()`) which wouldn't be present on normal arrays.
But removing `Plugin` and `MimeType` might be easier, since if `navigator.plugins` and `navigator.mimeTypes` return `[]`, then the only way to detect their existence is by checking for `window.Plugin` and `window.MimeType`, which is pretty pointless (and so, I would suspect, unlikely).
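To make the compat concern concrete, here is a minimal sketch of how page code could tell an empty `PluginArray`-style object apart from a plain empty array. The `PluginArrayLike` interface and the `hasLegacyPluginArrayMethods` helper are hypothetical names introduced only for this illustration; they are not part of any spec or browser.

```typescript
// Structural stand-in for the legacy PluginArray interface (illustrative only).
interface PluginArrayLike {
  readonly length: number;
  item(index: number): unknown;
  namedItem(name: string): unknown;
  refresh(): void;
}

// Hypothetical helper: does `value` expose the PluginArray-only methods
// that a plain array would lack?
function hasLegacyPluginArrayMethods(value: unknown): value is PluginArrayLike {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return ["item", "namedItem", "refresh"].every(
    (method) => typeof candidate[method] === "function"
  );
}

// An empty object that still has the legacy methods (interfaces kept, arrays emptied).
const emptyPluginArray: PluginArrayLike = {
  length: 0,
  item: () => null,
  namedItem: () => null,
  refresh: () => {},
};

// A plain frozen empty array (interfaces removed entirely).
const plainEmptyArray: readonly unknown[] = Object.freeze([]);
```

Code that calls `navigator.plugins.refresh()` or `namedItem()` unconditionally would throw against the plain array, which is the reason the array interfaces look harder to remove than `Plugin` and `MimeType`.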
Going with just (2) for now makes a lot of sense to me; you're already taking on a nice chunk of work!
Good point about `Plugin` and `MimeType`. Perhaps while I'm emptying the arrays, I'll also add use counters for the `Plugin` and `MimeType` constructors. Then we can see how often people construct those directly.
They don't even have constructors, so literally the only thing people could be doing is accessing the properties, like `if (window.Plugin)` or something.
I posted a PR at https://github.com/whatwg/html/pull/6296. As you can see there, because the arrays are now always empty and thus no `Plugin` or `MimeType` instances can ever exist, we can simplify their IDL and algorithms dramatically.
> I suspect (2) is going to be easiest, but is that what Firefox did?
Firefox does (2) but not as a special case -- the arrays are just naturally empty. In an upcoming cleanup patch, they will be replaced with hard-coded empty arrays but still type (2) -- we do still have the `PluginArray` and `MimeTypeArray` IDL types. But this is the browser's only use of those types -- finding a way to get rid of them would be nice.
I guess we also still have the `Plugin` and `MimeType` types but they are useless as mentioned.
> I guess we also still have the `Plugin` and `MimeType` types but they are useless as mentioned.
I looked into this a bit in https://github.com/mdn/browser-compat-data/pull/7863 last week. On Safari iOS, the `Plugin` and `MimeType` interfaces are not exposed, while `PluginArray` and `MimeTypeArray` are. This makes some sense since the "arrays" are always empty, and there can be no instances of `Plugin` or `MimeType`.
This behavior seems to go back to iOS 8, but I haven't confirmed in detail whether all aspects of it changed then. In any case it's not new.
I think this gives some confidence that not having `Plugin` or `MimeType` is web compatible, and it seems worth trying to match Safari iOS by removing them.
According to https://github.com/brave/brave-browser/issues/1549, Brave has been returning static values for `plugins` and `mimeTypes` for a while now.
Thanks for the comments here. Please see the blink-dev discussion for more context, but I'm hoping to make this change in Chromium Canary soon, to at least see what the compat impact is. I like the idea of removing Plugin and MimeType also, given the iOS research above.
@rniwa, @hober any thoughts on this one?
Ok, the change has landed in Chromium ToT. If anyone notices a site issue, please file a bug and cc me.
I've also filed a bug to track the potential removal of `window.Plugin` and `window.MimeType` entirely.
Okay, I've asked internally, and I got a reply that we had to list the PDF plugin or else multiple websites would ask the user to download Adobe Acrobat Reader so there is some compatibility risk for this change.
> Okay, I've asked internally, and I got a reply that we had to list the PDF plugin or else multiple websites would ask the user to download Adobe Acrobat Reader so there is some compatibility risk for this change.
Great, thanks for that information. I'll watch for similar bug reports against Chromium. Do you happen to know when that change was attempted/reverted? I'm wondering if things may have (hopefully!) changed since then.
So based on several bugs filed against Chromium for M90 (in which `navigator.plugins`/`mimeTypes` were emptied), I've had to revert that change. As expected (see comments above) there are PDF compatibility issues with sites that check `navigator.plugins` for the names of several well known PDF viewers. If not present, they default to downloading instead of viewing the PDF.
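The breakage pattern described here can be sketched roughly as follows. This is an illustration of the kind of site logic involved, not code taken from any affected site: the `choosePdfAction` helper and the exact contents of `KNOWN_PDF_VIEWER_NAMES` are assumptions, with the names borrowed from ones mentioned later in this thread.

```typescript
// Minimal structural stand-in for an entry of navigator.plugins.
interface PluginEntry {
  readonly name: string;
}

// Viewer names a site might look for. This particular list is an assumption
// made for illustration.
const KNOWN_PDF_VIEWER_NAMES: readonly string[] = [
  "Chrome PDF Viewer",
  "Chromium PDF Viewer",
  "Microsoft Edge PDF Viewer",
  "WebKit built-in PDF",
];

type PdfAction = "view-inline" | "download";

// Hypothetical site logic: view inline only when some plugin name matches
// exactly; otherwise fall back to downloading the PDF.
function choosePdfAction(plugins: ArrayLike<PluginEntry>): PdfAction {
  for (let i = 0; i < plugins.length; i++) {
    const plugin = plugins[i];
    if (plugin && KNOWN_PDF_VIEWER_NAMES.indexOf(plugin.name) !== -1) {
      return "view-inline";
    }
  }
  return "download";
}
```

With an emptied `navigator.plugins`, logic of this shape always takes the download branch, even when the browser can render PDFs inline.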
It would therefore seem that this change is not trivially implementable. My guess would be that we'd have to do some sort of longer term deprecation/removal effort, with an "Origin Trial" like opt-back-in for sites that break as a result. Having done something similar recently for Web Components v0, I can say that this isn't quick or easy.
Can anyone comment on the value of this change? I don't think the work above is worth it for just a clean up. But there is value in reducing the fingerprinting surface presented by these two APIs, right? Can anyone quantify that? It seems like there must not be too many bits of entropy in these lists, because there aren't many plugins available, but perhaps I'm wrong about that. I'm willing to do the work to deprecate/remove these lists, if there's real fingerprinting (removal) value to be had here.
Why don't we just hard-code the list of PDF plugins to be listed instead? Web compatibility requires it now, and it likely will for years or decades to come.
While unfortunate, standardizing what to return seems like a reasonable alternative indeed and should have identical fingerprinting properties.
I'm happy to work on updating the spec for this by partially reverting https://github.com/whatwg/html/pull/6296. But we need a bit more precision. I think we have three alternatives:
1. Always return zero plugins.
2. Always return one plugin, a specific PDF viewer. (Which one?) Do so regardless of whether the browser has inline PDF-viewing capabilities.
3. Return either zero or one plugins, depending on whether the browser has inline PDF viewing capabilities. E.g. Chrome desktop would return one plugin, Chrome Android mobile would return zero plugins.
In the modern landscape, (3) probably doesn't give significant additional bits over (1) or (2), since I'm not aware of any browsers these days which allow installing software that adds or removes inline PDF viewing capabilities. That is, the "has inline PDF viewing capabilities" bit gets subsumed into the general user agent bits which are already exposed through `navigator.userAgent`.
It sounds like we're leaning away from (1), despite Firefox getting away with it for some time. Between (2) and (3), which is preferred?
And, what plugins would we use? Chrome on Windows and ChromeOS currently has two plugins for PDF viewing:
- `mimeType`: `type` = `application/x-google-chrome-pdf`, `suffixes` = `pdf`; `name`: Chrome PDF Plugin; `filename`: internal-pdf-viewer; `description`: Portable Document Format
- `mimeType`: `type` = `application/pdf`, `suffixes` = `pdf`; `name`: Chrome PDF Plugin; `filename`: mhjfbmdgcfjbbpaeojofohoefgiehjai; `description`: Portable Document Format
(I don't have a Safari to test with.)
Technically, on Chrome, there's a user setting that allows the user to disable the internal viewer. With that setting enabled, PDFs get downloaded instead of viewed. I imagine when that setting is enabled, it has high entropy, if we leave it exposed in `navigator.plugins`. So if it turned out to be web compatible (it might not), I'd vote for option (2). As you said, though, it will be a bit odd for the spec to say this attribute must always return "Chrome PDF Plugin".
Firefox has an equivalent feature. As per https://github.com/web-platform-tests/wpt/pull/27129 it might be observable regardless of what this returns, e.g., depending on whether `embed`/`object` show fallback or not.
Note that Safari on iOS has empty plugins and mimeTypes.
Here's what Safari on macOS currently returns:
I suspect the name and the filename don't really matter since Safari clearly gets away with an empty string. I suspect suffixes and type are what matter. Hopefully we can do away with postscript?
My proposal then would be:
- `mimeType`: `type` = `application/pdf`, `suffixes` = `pdf`, `description` = "Portable Document Format"
- `name`: "PDF" (in case people are looking for the string "PDF" which both "Chrome PDF Plugin" and "WebKit built-in PDF" have)
- `filename`: ""
- `description`: ""
And maybe we go for (3), i.e. either 0 plugins or the plugin above, except we say that a browser should not (must not?) vary between 0 and 1 plugins during its lifetime or according to a user setting?
I'm not sure exactly how to phrase that, but the idea is that `navigator.plugins.length` should give no more information than `navigator.userAgent`. That is:
I just re-read @annevk's comment from 6 hours ago https://github.com/whatwg/html/issues/6003#issuecomment-805908496 and realized that maybe it's premature to try to lock down the variance between 0 and 1. Maybe it's good enough for now for the spec to allow either 0 or the 1 plugin above.
I'll try to work on a spec PR for that tomorrow just so we can get a concrete sense of what it looks like...
I'm starting to think this is an interop story, along with a privacy/fingerprinting story. I don't think the suggestion of just using `name: "PDF"` will be compatible. I've seen some of the broken site code here, and it explicitly looks for perfect string match for various known PDF plugin names (e.g. 'Chromium PDF Viewer'). Clearly these sites are already broken or at least different on different browsers, seemingly without good reason.
Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like
interface Navigator {
readonly attribute boolean pdfViewerSupported;
}
And then we'll have a straightforward migration story, which can be used to properly deprecate the `plugins` arrays. That'll take a while, but it might (?) end up with a cleaner end-state.
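As a rough sketch of what that migration could look like for page authors, assuming the proposed boolean existed under the name `pdfViewerSupported` as written above: the `NavigatorPdfHints` shape, the `canViewPdfInline` helper, and the substring-based legacy fallback are all hypothetical and only illustrate the idea.

```typescript
// Shape of the inputs the check needs. `pdfViewerSupported` is only the
// attribute proposed in this comment; it is an assumption, not a shipped API.
interface NavigatorPdfHints {
  readonly pdfViewerSupported?: boolean;
  readonly plugins?: ArrayLike<{ readonly name: string }>;
}

// Hypothetical helper: prefer the explicit boolean when present, and only
// fall back to inspecting plugin names when it is missing.
function canViewPdfInline(nav: NavigatorPdfHints): boolean {
  if (typeof nav.pdfViewerSupported === "boolean") {
    return nav.pdfViewerSupported;
  }
  const plugins = nav.plugins;
  if (!plugins) return false;
  // Assumed legacy heuristic: any plugin whose name mentions "PDF".
  for (let i = 0; i < plugins.length; i++) {
    const plugin = plugins[i];
    if (plugin && plugin.name.indexOf("PDF") !== -1) return true;
  }
  return false;
}
```

A page would pass its own `navigator` object to such a helper. The fallback branch is the part that could eventually be deleted once the plugin arrays are deprecated.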
> I'm starting to think this is an interop story, along with a privacy/fingerprinting story. I don't think the suggestion of just using `name: "PDF"` will be compatible. I've seen some of the broken site code here, and it explicitly looks for perfect string match for various known PDF plugin names (e.g. 'Chromium PDF Viewer'). Clearly these sites are already broken or at least different on different browsers, seemingly without good reason.
Maybe we can just declare that both the WebKit and Chrome PDF viewers are available.
> Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like
> interface Navigator { readonly attribute boolean pdfViewerSupported; }
> And then we'll have a straightforward migration story, which can be used to properly deprecate the `plugins` arrays. That'll take a while, but it might (?) end up with a cleaner end-state.
I mean we could add such a thing but I don't think we can get away with getting rid of `navigator.plugins` anyway. We're certainly not willing to make that kind of change anytime soon (as in, not in the next ~5 years).
> Is there perhaps a different avenue we could take? Since it seems like the broken use case is just PDF viewer support, could we instead add an API like
This avenue makes a lot of sense to me, and I think would make web developers happy. The cost, of course, is the migration and corresponding outreach. But yeah, there are definite interop benefits, e.g. maybe some pages would start working better in Firefox compared to today where they assume Firefox's empty `navigator.plugins` means it can't display inline PDFs.
If you'd be willing to go that route in Chrome, and Firefox would be interested in adding such a boolean, then we could proceed in that fashion. That would leave nonempty `navigator.plugins` as a single-browser feature that doesn't need to be (re-)standardized.
Note also that this was proposed by @travisleithead back in 2018: https://github.com/whatwg/html/issues/3462 :)
I like @rniwa's idea of listing both Chromium and WebKit. A new API doesn't really solve the interop issue for sites that are not maintained.
I suppose I'm ok specifying the list of PDF viewers, though it seems very strange. So the idea would be that `mimeTypes` would just return `['application/pdf']` (what about `'text/pdf'`?), and `plugins` would return this?
[
'Chrome PDF Plugin',
'Chrome PDF Viewer',
'Microsoft Edge PDF Viewer',
'WebKit built-in PDF',
]
At what point do we start jockeying for the order of this list? 😄
I'm also happy to go the `navigator.pdfViewerSupported` route instead. It sounds like there isn't appetite for that from Mozilla, though.
> I'm also happy to go the `navigator.pdfViewerSupported` route instead. It sounds like there isn't appetite for that from Mozilla, though.
The issue here is that there is no guarantee that all the websites in the world will get updated to support this new property, and even if they were, it would probably take many years if not decades to get there.
I worked on a pull request to get us back to something implementable. You can see the results, and a summary, in https://github.com/whatwg/html/pull/6738. TLDR it's very similar to Safari, but in a few places where Safari uses the empty string I ported over some values from Chrome for maximum compat.
However, that PR has one bit of implementation-defined behavior, the `name` of the `Plugin` object. Could we clean that up?
We have examples of code in the wild that UA-sniffs Chrome and then depends on specific names (one of 'Chrome PDF Viewer', 'Microsoft Edge PDF Viewer', 'Chromium PDF Viewer'). So we (as in the Chrome team) need to preserve the ability to have at least one plugin with one of these names in the `navigator.plugins` array.
Given that constraint, I think the best we could do is instead of having 1 plugin, have N plugins. Perhaps 5? They would all be the same as the one specified in #6738, except they would have the following names (in order): `PDF Viewer`, `Chrome PDF Viewer`, `Chromium PDF Viewer`, `Microsoft Edge PDF Viewer`, `WebKit built-in PDF`. They would all point to those two `MimeType`s, and those two `MimeType`s would point to the 0th plugin (the `PDF Viewer` one). And everyone would implement the same list of 5 (or 0), and there'd be no way to use `navigator.plugins` as a sneaky user agent detector.
Notes about such an approach:
- If anyone is depending on the 0th plugin having a specific UA-correlated `name`, this will not work. We have not seen such code, and it seems unlikely someone would write it, but you never know...
- I specifically went for maximalism with this list and am happy to add more. We could also try to pare it down to a minimum (probably `Chrome PDF Viewer` plus `WebKit built-in PDF`), but see the next point.
- We need to pick a specific "privileged" plugin for the `MimeType`s to point to (via their `enabledPlugin` property). I specifically inserted a generic `PDF Viewer` entry so that no brand would get this privilege. But maybe it'd be better to just be minimal, at the cost of privileging some brand?
- Apart from the privileged 0th position, I alphabetized the rest of the names. If people have other ideas I'm happy to do that instead.
What do people think? Should we try to go this extra mile to totally eliminate the entropy in `navigator.plugins`?
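To get a concrete sense of the proposed shape, here is a small sketch of the fixed data: five plugins sharing two MIME type entries, with every entry's `enabledPlugin` pointing at the 0th plugin. The plugin names and their order come from the comment above. The two MIME type strings, the `filename`, and the `description` values are assumptions for illustration (the comment does not spell them out), and `buildFixedPluginList` is a hypothetical helper, not spec text.

```typescript
interface FixedMimeType {
  readonly type: string;
  readonly suffixes: string;
  readonly description: string;
  readonly enabledPlugin: FixedPlugin;
}

interface FixedPlugin {
  readonly name: string;
  readonly filename: string;
  readonly description: string;
  readonly mimeTypes: FixedMimeType[];
}

// Names in the order proposed above: generic entry first, the rest alphabetized.
const FIXED_PLUGIN_NAMES: readonly string[] = [
  "PDF Viewer",
  "Chrome PDF Viewer",
  "Chromium PDF Viewer",
  "Microsoft Edge PDF Viewer",
  "WebKit built-in PDF",
];

// Assumption: the "two MimeTypes" are taken to be these for this sketch.
const ASSUMED_PDF_MIME_TYPES: readonly string[] = ["application/pdf", "text/pdf"];

function buildFixedPluginList(): { plugins: FixedPlugin[]; mimeTypes: FixedMimeType[] } {
  // filename/description values are assumed, borrowed from the Chrome values
  // quoted earlier in the thread.
  const plugins: FixedPlugin[] = FIXED_PLUGIN_NAMES.map((pluginName) => ({
    name: pluginName,
    filename: "internal-pdf-viewer",
    description: "Portable Document Format",
    mimeTypes: [],
  }));
  const primary = plugins[0];
  if (!primary) throw new Error("expected at least one plugin");
  // Every MIME type entry points back at the 0th ("PDF Viewer") plugin.
  const mimeTypes: FixedMimeType[] = ASSUMED_PDF_MIME_TYPES.map((type) => ({
    type,
    suffixes: "pdf",
    description: "Portable Document Format",
    enabledPlugin: primary,
  }));
  // Every plugin lists both MIME type entries.
  for (const plugin of plugins) {
    plugin.mimeTypes.push(...mimeTypes);
  }
  return { plugins, mimeTypes };
}
```

Because the structure is identical in every browser that implements it, reading it gives a page nothing beyond "this browser reports the fixed list" versus "this browser reports nothing".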
Thanks for working on this! I am strongly in favor of the "alternate" approach you laid out, with a fixed, unified list of N plugins and 2 MimeTypes. This eliminates the fingerprinting/UA sniffing vector, should retain nearly-full web compat, and hopefully improves interop.
Thanks! Any thoughts from @annevk @smaug---- @rniwa on the 1 Plugin/2 MimeTypes approach currently specified in #6738, vs. the N Plugins/2 MimeTypes approach described above?
We should do whichever is more web compatible, although it would be nice if all UAs could use the same set of plugins & MIME types. Maybe worth attempting a single fixed list approach & see if we find any compatibility issues.
https://github.com/whatwg/html/pull/6738 has been updated with the 5-Plugin/2-MimeType approach
I found https://bugzilla.mozilla.org/show_bug.cgi?id=840439 which indicates people are at least interested in this for Gecko, and might be worth looking at in more detail to see some of the impacts.
> The way Chrome and Firefox support PDF is as the result of a navigation, so plugin documents continue to exist, but I'm not sure they would be different from inline content that doesn't have a DOM as the result is basically a document with an opaque origin.
They are different for same-origin PDFs. In Firefox for a same-origin PDF you can introspect the resulting `Document` and see the `<embed>` element, etc. In Chrome (probably also Safari) you cannot.
I will spec this for now as allowing either the plugin document section (Firefox behavior) or the inline content that doesn't have a DOM section (Chrome behavior). We should tighten this up over time though.
I think we've finished the saga of `navigator.plugins` and `navigator.mimeTypes`.
At the next triage meeting I'd like to discuss if anyone is willing to spend time on the other issues in the OP.
Quick check in on this issue. Chromium implemented and shipped the "Fixed" set of plugins and mimetypes in M94, which has been in stable release for about a week at this point. It is still too early to declare a full success, but I can at least report that I do not know of any reported breakage bugs due to this change. Which at least means that this change doesn't break the world. I'll update here if something does come up that looks bad, but otherwise we can perhaps conclude that the fixed plugin lists are web compatible.
It has now been about a month and a half, and Chromium is almost 2 versions past M94 where these changes were made. I am unaware of any reported issues. I think it might now be safe to declare victory in removing this fingerprinting vector.
For reference, here are the related implementation bugs:
A status update:
Much of what the OP lists has been completed. What remains is:
It's tempting to entangle these tasks. E.g. do browsers really check for file extensions for PDFs? Do mutations of the `classid` attribute really cause PDFs to reload? Is the `type` attribute still relevant when PDFs are the only option? Etc.
Some sub-issues on this subject:
With Flash deprecation, it might be the case that browsers no longer implement plugins, and as such we could remove some of the spec complexity around them.
Note, however, that we do need to keep enough to allow rendering of PDFs, and to allow rendering of HTML pages (and other supported content types) in `embed` and `object`. So it's not like we can completely gut the `embed` and `object` sections.
And, it looks like at the time of https://github.com/whatwg/html/issues/3958, Chrome treated PDFs as plugins. So maybe we can't remove much.
Anyway, some candidates for removal:
- `NavigatorPlugins` and friends, but see #3462; PDF support might end up living here in existing browsers.
- `embed` and especially `object` could likely be simplified to be more like `iframe`.
- `param` element might be able to be totally neutered and moved to the obsolete section.
The `param` simplification likely holds even if PDF ends up staying around as a plugin. (And in turn we can close #387.)