Closed minibiti closed 8 years ago
If you look in your browser console, you will likely see the following:
The character encoding of the plain text document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the file needs to be declared in the transfer protocol or file needs to use a byte order mark as an encoding signature.
I have yet to figure out why Firefox would use such a dumb default, and how to work around it. It is Firefox that is giving Asciidoctor garbled text, and it admits as much. Why on earth do they not assume UTF-8 by default?
The plugin retrieves the raw text using:
document.firstChild.textContent
If you run this in the console, you can see the garbage that Firefox is giving us.
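To illustrate what that garbage looks like: when Firefox decodes UTF-8 bytes with a legacy single-byte charset such as windows-1252, each multi-byte character turns into two or three Latin-1 characters. A minimal sketch (the sample string is mine, not from the add-on; runnable in Node or a browser console):

```javascript
// The UTF-8 bytes of a string containing a non-ASCII character...
const utf8Bytes = new TextEncoder().encode('café'); // bytes: 63 61 66 C3 A9

// ...decoded with the wrong (legacy) charset, as Firefox does here
const misread = new TextDecoder('windows-1252').decode(utf8Bytes);
console.log(misread); // "cafÃ©" -- the garbage seen via document.firstChild.textContent
```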
The plugin used to fall back to XMLHttpRequest to retrieve the text in UTF-8, automatically detecting the encoding error. However, that fallback seems to have been removed in recent versions (perhaps because Mozilla rejected it).
https://github.com/asciidoctor/asciidoctor-firefox-addon/blob/master/data/asciidocify.js#L53
Here's what it used to do:
```javascript
// if charset is not UTF-8, try techniques to coerce it to UTF-8
// likely used only for local files
if (document.characterSet.toUpperCase() != 'UTF-8') {
  try {
    // this technique works if all characters are in the standard ASCII set
    // see: http://www.ascii-code.com
    sanitizeAndShowHTML(convertToHTML(decodeURIComponent(escape(document.firstChild.textContent))));
  } catch (decodeError) {
    // XMLHttpRequest responseText is UTF-8 encoded by default
    var xhr = new XMLHttpRequest();
    xhr.open('GET', window.location.href, true);
    xhr.onload = function (evt) {
      if (xhr.readyState === 4) {
        // NOTE status is 0 for local files (i.e., file:// URIs)
        if (xhr.status === 200 || xhr.status === 0) {
          sanitizeAndShowHTML(convertToHTML(xhr.responseText));
        } else {
          console.error('Could not read AsciiDoc source. Reason: [' + xhr.status + '] ' + xhr.statusText);
        }
      }
    };
    xhr.onerror = function (evt) {
      console.error(xhr.statusText);
    };
    xhr.send();
  }
} else {
  sanitizeAndShowHTML(convertToHTML(document.firstChild.textContent));
}
```
I don't know any other way to force Firefox to give us UTF-8 encoded text.
...and what we used to do worked.
Thanks for your update, Dan! I just realized that I have the problem on Firefox only, indeed. It is working OK with Chrome. Interesting... :)
@Mogztter Is it true that Mozilla won't let us use XMLHttpRequest to fetch the source text? If so, can they offer a way to get UTF-8 encoded text from the document already loaded?
However, that seems to have been removed in recent versions (perhaps because Mozilla rejected it).
Correct.
Is it true that Mozilla won't let us use XMLHttpRequest to fetch the source text?
I think they don't want XMLHttpRequest because it is a synchronous method, but I can try to explain why we need it (and maybe they will give me a workaround to get UTF-8 encoded text).
The strategy I recommend when talking to them is to focus on the primary objective: UTF-8. That way, the conversation doesn't get derailed by a debate about XMLHttpRequest itself.
Tell them that we need to obtain the plain text in UTF-8 (regardless of the browser's default encoding) and that we are open to using any API that will get us that result.
You can also emphasize that without the UTF-8 text, we cannot properly support non-English languages. They should be sensitive to that need.
Well said.
Really tired of wasting my time... The Firefox SDK is pure nonsense! The documentation is getting better, but there are so many ways to write extensions: XUL, WebExtensions, SDK (a high-level API which is not compatible with the low-level API)...
Anyway, I put back the fallback to XMLHttpRequest to retrieve the text in UTF-8 :tada:
And please vote for this issue https://bugzilla.mozilla.org/show_bug.cgi?id=1071816 :smile:
\o/
Voted.
A user of my little asciidoctor-notetaking script had a similar issue with Windows 7: when the script opens a local note text file in Firefox, Firefox assumes the text encoding to be the default Windows locale, in his case something like "Windows-1252". Since the note text files are generated with a template in UTF-8, special characters are not shown correctly.
Does your patch also solve this issue?
Not sure. Is asciidoctor-notetaking built on top of the Asciidoctor Firefox Addon? If yes, I assume this will fix the issue.
If not, you will have to wait for Mozilla to resolve this issue in core, or create an add-on to fix it yourself. If you need help, feel free to ask ;)
A simple temporary workaround is to write the Unicode BOM in the source file (the Geany editor has a command for this, for example).
Yes but this is now fixed in the 0.5.0 release: https://github.com/asciidoctor/asciidoctor-firefox-addon/releases/tag/v0.5.0
@Mogztter thank you, I didn't know of the new release. Will you go back to signing? From version 46, signature overriding won't be possible anymore.
From version 46 signature overriding won't be possible anymore
Yes, I know :disappointed:
Currently the signing API is broken: https://bugzilla.mozilla.org/show_bug.cgi?id=1244644 but I will try to get 0.5.0 signed!
Hi, I have German and Norwegian characters in my document which do not render properly when I use the extension. Is there some custom attribute I could use to turn on UTF-8 support, or is it just not supported at the moment?
Thanks! JM.