Open Kikobeats opened 6 years ago
Hey Kiko, glad you like the library. I'm actually not really sure what "base64 html entities" you're talking about. There are regular HTML entities, but they are not base64 encoded; they use either a name or a number between "&" and ";", e.g.:
HTML Entity | What it means | Decoded value |
---|---|---|
`&lt;` | "less than" character | < |
`&gt;` | "greater than" character | > |
`&copy;` | "copy" character | © |
`&#8224;` | Unicode character number 8224 | † |
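To make the mapping in the table concrete, here is a minimal sketch of how entity references turn into characters. The `decodeEntities` helper and its tiny `NAMED` table are hypothetical and written only for illustration; they are not part of iconv-lite, and a real decoder knows far more named entities.

```javascript
// Tiny illustrative lookup table (assumption: only a handful of names are covered).
const NAMED = { lt: '<', gt: '>', amp: '&', quot: '"', copy: '©' };

// Hypothetical helper: decodes &name;, &#decimal; and &#xhex; references.
function decodeEntities(input) {
  return input.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Names missing from the table are left as-is.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeEntities('1 &lt; 2 &gt; 0 &copy; &#8224; &#x2020;'));
// prints "1 < 2 > 0 © † †"
```

Note that the input is plain ASCII text in every case; only the output contains non-ASCII characters.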
All HTML entities are ASCII, which is a proper subset of UTF-8, so there shouldn't be a problem decoding in either order. I'd still recommend decoding UTF-8 first to keep a clear process and mental model:
Use `iconv.decode` and get a JS string. A JS string contains Unicode characters, and you can work with it using all JS operations like search/replace, regex, etc.
Hi there, we have a possibly similar issue open in the MagicMirror repo: https://github.com/MichMich/MagicMirror/issues/2712
The problem is that the input contains HTML entities like "&ouml;" for the German "ö", which aren't decoded by iconv-lite. If I understand this issue and your reasoning correctly, @ashtuchkin, then the MagicMirror code should handle those characters, since they are not part of the encoding itself?
Yeah decoding html entities is outside of iconv-lite scope. Maybe there's another library that can do that?
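As a rough illustration of why the two steps are independent, the sketch below decodes ISO-8859-1 bytes that also contain an entity. It assumes Node.js and uses `Buffer#toString('latin1')` as a dependency-free stand-in for `iconv.decode(bytes, 'latin1')`; the sample bytes are made up for the example.

```javascript
// "Köln" as ISO-8859-1 bytes (0xF6 is "ö"), followed by the ASCII text " &ouml;".
const bytes = Buffer.concat([
  Buffer.from([0x4b, 0xf6, 0x6c, 0x6e]),
  Buffer.from(' &ouml;', 'ascii'),
]);

// Charset decoding only: the raw byte 0xF6 becomes "ö",
// while the entity stays as literal text in the resulting JS string.
const text = bytes.toString('latin1');
console.log(text); // prints "Köln &ouml;"

// The entity is pure ASCII, so its bytes are identical in UTF-8 and ISO-8859-1.
const entity = '&ouml;';
const sameBytes = Buffer.from(entity, 'utf8').equals(Buffer.from(entity, 'latin1'));
console.log(sameBytes); // prints true
```

Turning the remaining `&ouml;` into "ö" is then a string-to-string step for an entity decoder, applied after the charset decoding.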
Thanks for the clarification (and of course for your library). As it turned out, the Nunjucks templating we use was the culprit for our issue :-)
Hello,
Thanks for the library, it's very helpful 🙏 .
I'm afraid of doing something wrong with it, so I'd like to ask openly for advice.
I'm using iconv-lite for decoding HTML. I created html-encode for that purpose, and normally we are interested in getting a UTF-8 string.
My concern is about Base64 HTML encoding entities.
Let's say we have some simple HTML like this:
You can see two things in this markup:
Because Base64 is `ascii`, using the library I expect to decode the HTML by doing something like:
and then I use the output as input for decoding again into the target encoding, in this case UTF-8:
The order is important; if I do the ascii conversion as the final step, the output is not what I expect.
What I'm afraid of is that doing the ascii conversion first could decode something that belongs to the target charset.
So I want to ask: do you think this is a good workflow, or should I delegate to a library specific to base64 HTML entities, such as `he`?