Open natevw opened 5 years ago
This is a tough one. As you pointed out, this requires a lookup table for a portable implementation. A browser-specific implementation could leverage the DOM to transform HTML entities, but there would be performance implications. My usual approach is to pull the text out into strings as you described, but it seems like we'll want a solution for this in order to maintain some semblance of parity with JSX.
Most of the rather large HTML set seems more for "typing on ASCII keyboard" (convenience) than necessity. How about supporting only the "core" ones that are built into XML? They are the ones needed syntactically:
quot, amp, lt, gt, and apos*?
*iiuc &apos; was not standardized on the HTML side until the HTML5 spec, so maybe it could be left out?
There's also the decimal/hex forms; personally I don't see a strong need for those but they'd just be code rather than LUT entries.
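To make the "core entities plus numeric forms" idea concrete, here is a minimal sketch of what such a decoder could look like. The names CORE_ENTITIES and decodeCoreEntities are hypothetical and not part of htm; unknown named entities are deliberately left untouched.

```javascript
// Hypothetical helper, not part of htm: decodes only the five XML
// built-in entities plus decimal/hex character references.
const CORE_ENTITIES = { quot: '"', amp: '&', lt: '<', gt: '>', apos: "'" };

function decodeCoreEntities(text) {
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      // Numeric forms need code, not lookup-table entries.
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references as written instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Named forms: only the core five; anything else passes through unchanged.
    return Object.prototype.hasOwnProperty.call(CORE_ENTITIES, body)
      ? CORE_ENTITIES[body]
      : match;
  });
}
```

Because it is a single pass over the string, something like &amp;lt; decodes to the text &lt; rather than being decoded twice.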
Until someone merges the correct solution, here's what I'm using temporarily.
In my class, I added:
decode(str) {
  // Let the browser parse the entity, then read back the decoded text.
  const e = document.createElement("decodeIt");
  e.innerHTML = "<b>" + str + "</b>";
  return e.innerText;
}
And to use it:
render() {
  return html`<div>Hello ${this.decode('&middot;')} Goodbye</div>`;
}
It's probably not good code, but it works for me at least. Might help someone else too.
@thorie7912 Wouldn't that be super slow, since it creates a DOM node every time you render? I'm not sure it's a good idea. It would be better to use a package like unescape.
Yeah, it's very slow. But like I said, it's temporary. I hope this ticket gets resolved soon. I don't think I can use a package like unescape, because I'm not using Node.js. I'm directly pulling HTM from a CDN.
This could be optimized, if we keep one DOM node available for doing all conversions. Then we don't need to recreate a new DOM node every time. It's only using the text, and it can be replaced for each entity we want to convert.
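A rough sketch of that optimization, under some assumptions: createEntityDecoder is a made-up name, and it swaps the custom element for a textarea, so that markup in the input is treated as text rather than parsed into child elements. It still needs a DOM, so it will not work in a web worker.

```javascript
// Hypothetical helper: one scratch node is created up front and reused,
// and results are cached so repeated renders skip the DOM entirely.
function createEntityDecoder(doc = document) {
  const scratch = doc.createElement('textarea');
  const cache = new Map();
  return function decode(str) {
    let decoded = cache.get(str);
    if (decoded === undefined) {
      scratch.innerHTML = str; // the browser decodes the entities here
      decoded = scratch.value; // read back the decoded text
      cache.set(str, decoded);
    }
    return decoded;
  };
}
```

In a browser this would be set up once, e.g. const decode = createEntityDecoder(); and then decode('&middot;') can be called from render().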
@thorie7912 For character entities like that it would be better to manually decode them yourself. Assuming you can save/serve your file as UTF-8 then it will read well as simply:
render() {
  return html`<div>Hello · Goodbye</div>`;
}
If your content can only be served as ASCII (and you also can't <meta charset="utf-8"> inline):
render() {
  return html`<div>Hello \u00b7 Goodbye</div>`;
}
There are only a couple of characters where you can't always do this; e.g. the < character would get parsed as the opening of a tag in some contexts, even if "escaped" [at the source-code level] as \u003c. For those, again rather than sending the entity out to the DOM for decoding, simply pre-convert and "escape" them via the original workaround above:
render() {
  return html`<div>Hello ${'<'} Goodbye</div>`;
}
Just wanted to note that I've read this and am pondering what we could do to move forward.
@natevw I like your point about which ones are needed (vs wanted for compat). From a purely design perspective, HTM's parser interprets <>, but does not offer a mechanism for escapement. That seems worth rectifying to me, but I worry about special-casing characters.
I also wanted the output in an unescaped way and as @pcr910303 mentioned, decode(htm(some jsx here)) worked very well.
Ran into this as well. Please keep htm portable, I am using it in a web worker. 😀
Hello, I was looking into this issue and I stumbled into this. My use case looks like this:
render(html`
<style>
.selector > * {
padding-top: 0.75rem;
}
</style>
<h1>HTML</h1>
`);
And I use preact-render-to-string to convert it and use it in an 11ty file. Using ${'>'} still doesn't solve the problem, which makes it impossible to write inline styles that use >.
I understand the problem and I get why this should not really be fixed directly in HTM, but I wonder if there is a workaround for my issue above, or if we can import something extra to handle this scenario.
One way to fix my issue is to
render(html`...`).replace(/&gt;/gi, ">");
But I would rather not change all of them in my output html.
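A narrower variant is to touch only the contents of style elements in the rendered string. This is just a sketch: unescapeGtInStyleBlocks is a made-up name, and it assumes the rendered output is a plain string in which the selector's > arrives escaped as &gt;.

```javascript
// Hypothetical post-processing step for the rendered HTML string:
// restores ">" inside <style>...</style> only, leaving &gt; elsewhere alone.
function unescapeGtInStyleBlocks(markup) {
  return markup.replace(
    /(<style\b[^>]*>)([\s\S]*?)(<\/style>)/gi,
    (whole, open, css, close) => open + css.replace(/&gt;/g, '>') + close
  );
}
```

It would be used like unescapeGtInStyleBlocks(render(html`...`)), so escaped text in the rest of the page stays as it is.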
For those use cases, as they are text, you likely could opt-out of htm, right?
render(html`
<style>
${`.selector > * {
padding-top: 0.75rem;
}`}
</style>
<h1>HTML</h1>
`);
Reproduction
With the following code:
I get a VNode with .children = ["&lt;"], i.e. what I mean to render as < gets rendered as &lt; instead.
Expected results
Testing this in JSX via say https://jsx.egoist.moe/?mode=vue, <div>&lt;</div> gets transformed into h("div", ["<"]); as I originally expected.
I haven't poked into how they are doing this… seems like something that ultimately relies on a lookup table.
Workaround
I am able to escape via string interpolation (e.g. ${'<'} instead of &lt;). Changing the last line in my sample code to:
Results in a VNode with .children = ["<"] as I need. Is this the recommended style?