Open Rob--W opened 7 years ago
Can you look here? https://bugzilla.mozilla.org/show_bug.cgi?id=1287264
@def00111 I looked (and I filed a new feature request at https://bugzilla.mozilla.org/show_bug.cgi?id=1425479). Why did you want me to look at that bug?
> Why did you want me to look at that bug?
I just wanted you to look at this bug :)
Maybe, we can also expose nsIChannel.contentDispositionFilename [1]?
I also have another idea. Can we add the value of the download attribute [1] to the webRequest.onBeforeRequest details [2], so that the filename can be read from the download attribute, just as it can be read from the Content-Disposition header in webRequest.onHeadersReceived [3]?
Look here please: https://github.com/def00111/always-preview/blob/master/content.js
[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a#attr-download
[2] https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onBeforeRequest#details
[3] https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/onHeadersReceived
> Maybe, we can also expose nsIChannel.contentDispositionFilename [1]?
This extension is a very specialized use case. While having such a property would make my life as an extension developer easier, I don't think that this convenience outweighs the maintenance cost of exposing the info through the webRequest extension API, especially since it can be fully implemented in JavaScript with minimal performance impact: https://github.com/Rob--W/open-in-browser/blob/05b80a3ce151737cfc7735eb1a714dfa84f3e3a5/extension/content-disposition.js
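To illustrate the point about doing this in JavaScript, here is a minimal sketch of extracting a filename from a `Content-Disposition` header value. It is deliberately simplified and is not the linked `content-disposition.js`: the function name `getFilenameFromContentDisposition` is hypothetical, and the sketch ignores parameter continuations, charsets other than UTF-8, and malformed headers.

```javascript
// Minimal sketch (hypothetical helper, not the linked implementation):
// extract a filename from a Content-Disposition header value.
function getFilenameFromContentDisposition(headerValue) {
  if (typeof headerValue !== "string") {
    return "";
  }
  // Prefer the extended form: filename*=UTF-8''percent-encoded-name
  const extended = /(?:^|;)\s*filename\*\s*=\s*UTF-8''([^;\s]+)/i.exec(headerValue);
  if (extended) {
    try {
      return decodeURIComponent(extended[1]);
    } catch (e) {
      // Invalid percent-encoding: fall through to the plain forms.
    }
  }
  // Quoted form: filename="name.ext" (a backslash escapes the next character).
  const quoted = /(?:^|;)\s*filename\s*=\s*"((?:[^"\\]|\\.)*)"/i.exec(headerValue);
  if (quoted) {
    return quoted[1].replace(/\\(.)/g, "$1");
  }
  // Unquoted token form: filename=name.ext
  const token = /(?:^|;)\s*filename\s*=\s*([^;\s"]+)/i.exec(headerValue);
  return token ? token[1] : "";
}

// Only runs inside a WebExtension background script with the "webRequest"
// permission; skipped everywhere else.
if (typeof browser !== "undefined" && browser.webRequest) {
  browser.webRequest.onHeadersReceived.addListener(
    (details) => {
      const header = (details.responseHeaders || []).find(
        (h) => h.name.toLowerCase() === "content-disposition"
      );
      const filename = getFilenameFromContentDisposition(header && header.value);
      console.log(details.url, filename);
    },
    { urls: ["<all_urls>"], types: ["main_frame", "sub_frame"] },
    ["responseHeaders"]
  );
}
```

The parsing is kept in a pure function so that it can be unit-tested outside the browser; the listener part only shows where the header value would come from.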
> Can we add the download [1] attribute value to webRequest.onBeforeRequest details [2]?
This, on the other hand, could be a good reason to support the API enhancement. But `<a download>` does not work for cross-origin resources, only same-origin resources. Furthermore, `<a download>` is more commonly used for JS-generated content (blob:/data:-URLs), which is not intercepted by my extension. So the value of an accessor for the value of `<a download>` is limited.
In the case of `<a download>` to a same-origin resource without a `Content-Disposition` response header (which I presume is rare), users can just open the link in a new tab to get the dialog (if they want to view it inline or trigger an Open in Browser dialog). In the worst case (e.g. if the link is not visible), they can use the extension menu in the Tools menu to force the dialog to appear anyway.
I appreciate your comments, but I'd like to keep the comments here on-topic. If you have more to say (unrelated to content sniffing), please open a new issue or continue via e-mail.
> `<a download>` is more commonly used for JS-generated content (blob:/data:-URLs), which is not intercepted by my extension. So the value of an accessor for the value of `<a download>` is limited.
Can I use content-disposition.js [1] in my add-on?
Is this the same as what Firefox does?
> Can I use content-disposition.js [1] in my add-on?
Yes. When you add a commit in your repo, do link back to the original source in the commit description. Then in the future it will be easier for others to check whether the implementation is still up-to-date.
> Is this the same as what Firefox does?
Yes, except for a few cases of malformed response headers (I don't think that you will ever find these in the wild). See the commit description and unit tests from https://github.com/Rob--W/open-in-browser/commit/6f3bbb8bbfc1e3e943200fffdb68d35075e82ddd
Last month I spent two weeks on implementing content sniffing, which was behaviorally identical to Firefox's implementation. Unfortunately, I lost the laptop before I pushed the changes, so I will document what's necessary in case anyone (maybe me?) is interested in implementing a content sniffer.
The full implementation (code and comments) consisted of about 3 - 5k lines of JS code (unit tests were written but not included in this count).
The implementation details are as follows (this is a brain dump from my recollection):
- The `webRequest.filterResponseData` API can be used to inspect and modify the response body. This filter is activated after the `webRequest.onHeadersReceived` event stage, for http(s) only. There are several bugs, see the list of bugs that I appended to the bug that introduced this new webRequest method: https://bugzilla.mozilla.org/show_bug.cgi?id=1255894#a48785057_447061
- [...] `NS_CONTENT_SNIFFER_CATEGORY` (aka `"net-content-sniffers"`) category are used to estimate the MIME type.
- [...] `nsUnknownDecoder::DetermineContentType` is used (which includes entries from the `NS_DATA_SNIFFER_CATEGORY` (aka `"content-sniffing-services"`) category).
- [...] `onHeadersReceived` by using the `webRequest.filterResponseData` to change the response body. For some types, prepending magic bytes can be done in a transparent way (e.g. HTML and plain text). For others, the response can be forced to HTML that in turn embeds a full-page iframe that requests the original URL (with cache buster). The extension can then intercept this request and pipe the original response to this new request. The reason for using an iframe is to ensure that the original response stream is not aborted. If the original response is not important, redirecting would work too.
- [...] `Content-Type` header.
- [...] `nsDocumentOpenInfo::DispatchContent` (as I mentioned at https://github.com/Rob--W/open-in-browser/issues/1#issuecomment-331710653)
- For the `text/plain`, `application/octet-stream` and `application/x-unknown-content-type` MIME types, Firefox MAY activate content sniffing, and open a download dialog even if the content would otherwise be displayed inline (text/plain), or display the content inline even though the content usually triggers a download dialog (application/octet-stream).
- [...] `application/octet-stream` or `application/x-unknown-content-type`, perform media sniffing:
  - [...] `webRequest.filterResponseData` method can NOT be used to modify the response stream. To replace the document, you must run a content script in this new media document.
- [...] `text/html`, `application/octet-stream` or containing "xml", then the feed sniffer is activated.
- If the `Content-Type` is a case-sensitive match for `text/plain`, `text/plain; charset=ISO-8859-1`, `text/plain; charset=iso-8859-1` or `text/plain; charset=UTF-8`, AND the `Content-Encoding` response header is NOT set, then the sniffer will either force a download dialog or display inline:
  - [...] `application/octet-stream` = download dialog.
- [...] `"application/x-unknown-content-type"` (or empty, as mentioned before), sniff magic bytes.
- [...] `NS_DATA_SNIFFER_CATEGORY` (aka `"content-sniffing-services"`) category
- [...] `text/plain` sniffing (which would result in `text/plain` or `application/octet-stream`).

Other notes relevant for the implementation:
- [...] `application/octet-stream`, and neither mentions the special `application/x-unknown-content-type` (this MIME type is an artefact of Firefox's implementation; internally it represents the default value for a MIME type in an HTTP channel).

Bugs in the `webRequest.filterResponseData` API that I haven't reported upstream (yet?):

- If the `Content-Type` is `application/x-unknown-content-type` and the response is content-encoded, then the filtered response must also be encoded using the same type (e.g. gzipped). For other types, e.g. `text/html`, the encoding is transparent, i.e. the value of the `Content-Encoding` header does not matter. The easiest way around this is to remove the `Accept-Encoding` request header or the `Content-Encoding` response header (or set it to "identity"). The more difficult way to get around this is to implement gzipping (and possibly other (obscure) encoding schemes such as deflate/brotli).
- When a `StreamFilter` is closed, Firefox will always commit a navigation to a new document, even if no data was written to that `StreamFilter`, and even if the tab/frame has navigated to a different page. The only work-around that I could think of is to keep the `StreamFilter` open forever (yuck).
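The "easiest way around" the content-encoding bug mentioned above (removing the `Accept-Encoding` request header) could look roughly like the sketch below. The helper name `removeAcceptEncoding` is hypothetical, and the sketch assumes a background script with the `webRequest` and `webRequestBlocking` permissions plus matching host permissions. It is not taken from the lost implementation.

```javascript
// Pure helper (hypothetical name): return a copy of the request headers
// without any Accept-Encoding entry, so that the server is not asked for a
// content-encoded (e.g. gzipped) response.
function removeAcceptEncoding(requestHeaders) {
  return requestHeaders.filter(
    (header) => header.name.toLowerCase() !== "accept-encoding"
  );
}

// Register the listener only inside a WebExtension background script;
// skipped everywhere else.
if (typeof browser !== "undefined" && browser.webRequest) {
  browser.webRequest.onBeforeSendHeaders.addListener(
    (details) => ({
      requestHeaders: removeAcceptEncoding(details.requestHeaders || []),
    }),
    // Limited to document requests here; widen or narrow as needed.
    { urls: ["<all_urls>"], types: ["main_frame", "sub_frame"] },
    ["blocking", "requestHeaders"]
  );
}
```

The trade-off is that affected responses are transferred uncompressed, so it may be worth restricting the filter to the requests that will actually be passed through a `StreamFilter`.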