Freeze-dry messes up if a stylesheet or framed document is encoded in UTF-16, UTF-32, or possibly other encodings. We use FileReader.readAsText to decode these resources, which by default assumes UTF-8 encoding. This assumption is adequate most of the time, but when it isn’t, the resource is effectively unreadable.
I do not know enough about the standards, but I suppose the decoder should look at the HTTP Content-Type header, the file’s byte order mark (BOM), and in-document declarations (@charset in CSS, <meta charset=…> in HTML).
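To make the guess above concrete, here is a minimal sketch of such a detection step. The helper names `sniffEncoding` and `decodeResource` are hypothetical, and the precedence (BOM first, then the Content-Type charset, then a CSS @charset rule, then UTF-8) is my assumption of a reasonable order, not something this repo implements. It does not handle <meta charset=…> in HTML.

```javascript
// Hypothetical helper: guess an encoding label for a fetched subresource.
// `bytes` is a Uint8Array, `contentType` the raw Content-Type header value.
function sniffEncoding(bytes, contentType = '') {
  // 1. Byte order mark.
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8';
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be';
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le';

  // 2. charset parameter of the Content-Type header.
  const fromHeader = /charset\s*=\s*"?([^";\s]+)/i.exec(contentType);
  if (fromHeader) return fromHeader[1].toLowerCase();

  // 3. An @charset rule at the very start of a stylesheet.
  //    Only the first bytes are inspected, read as single-byte characters.
  const head = String.fromCharCode(...bytes.subarray(0, 1024));
  const fromCss = /^@charset "([^"]+)";/.exec(head);
  if (fromCss) return fromCss[1].toLowerCase();

  // 4. Fall back to UTF-8.
  return 'utf-8';
}

// Hypothetical helper: decode the bytes with whatever label was detected.
function decodeResource(bytes, contentType) {
  const label = sniffEncoding(bytes, contentType);
  // TextDecoder throws a RangeError for labels it does not support.
  return new TextDecoder(label).decode(bytes);
}
```

A call such as `decodeResource(bytes, response.headers.get('Content-Type') || '')` would then replace the plain UTF-8 decode. Labels that TextDecoder does not recognise would still need separate handling.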
This detection and decoding issue seems so generic that it should not have to burden this repo, but I have not yet discovered the right tool. Some options I thought of:
The browser’s fetch, but unfortunately it appears not to help with decoding; its Response.text() is spec'd to "return the result of running UTF-8 decode on bytes".
XMLHttpRequest.responseText does seem to respect the HTTP header and the BOM, though I am not sure about in-document declarations. And it feels a little outdated, as I think fetch was supposed to make it obsolete; but perhaps not.
Some JavaScript module? I have not yet found anything that comes close.
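For the fetch option, one possible workaround is to skip Response.text() and decode the raw bytes ourselves. The sketch below is only an illustration: `decodeResponse` is a hypothetical name, it looks at the Content-Type charset only (no BOM or in-document sniffing), and it assumes the label is one that TextDecoder supports.

```javascript
// Hypothetical helper: decode a fetch Response using the charset from its
// Content-Type header instead of the UTF-8 decode that Response.text() performs.
async function decodeResponse(response, fallbackLabel = 'utf-8') {
  const contentType = response.headers.get('Content-Type') || '';
  const match = /charset\s*=\s*"?([^";\s]+)/i.exec(contentType);
  const label = match ? match[1] : fallbackLabel;

  // arrayBuffer() hands back the undecoded bytes.
  const bytes = await response.arrayBuffer();

  // Throws a RangeError if the label is unknown to TextDecoder.
  return new TextDecoder(label).decode(bytes);
}
```

Usage would look like `const text = await decodeResponse(await fetch(url))`.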
Tips welcome.
Note this issue is similar to issue #29, but that one concerns the DOM that the browser has already decoded for us; this issue is about subresources we fetch.