facebook / jsx

The JSX specification is a XML-like syntax extension to ECMAScript.
http://facebook.github.io/jsx/

Add TemplateLiteral support to JSXAttribute. #132

Closed tolmasky closed 2 years ago

tolmasky commented 2 years ago

Given the comments in this Babel thread, here is my attempt at resolving the template string issue. This PR takes the "decode strings in the normal JavaScript manner" approach, and my reasoning is below (there doesn't really seem to be a place to put the reasoning in the spec itself, although I have tried to clarify the encoding issues for future spec readers):

  1. As stated above, this spec change has template string attributes behave identically to their JavaScript counterparts with regard to string encoding. In other words, <tag name = `[value]`/> is identical to writing <tag name = { `[value]` } />.
  2. The main reasoning here is that the primary purpose of having <tag name = "[value]"/> use the HTML-style string encoding is that there is a matching syntax production in HTML that we want to emulate. As a simple example, copying <tag name = "&amp;"> should not produce two surprisingly different results depending on whether it appears in HTML or JSX.
  3. However, there is no such matching production in HTML that we are attempting to support with our template strings. Short of having the template string incorporate the delimiting back-ticks and ignore any internal template elements, it will never match the expected behavior of HTML. (That is to say, in HTML <tag name = `[value]`> is like writing <tag name = "`[value]`">, with additional complications since you don't have quote boundaries to clearly delimit where the value ends, etc.).
  4. Template strings thus provide a good escape hatch from the potentially surprising HTML entity encoding process, giving you normal JS encoding. When used without any internal template elements, you essentially get JSX attributes that encode the same as JavaScript without the additional curly brace noise.
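
To make the distinction in the list above concrete, here is a minimal sketch in plain JavaScript. It is not taken from the PR or from any transformer: `decodeQuotedAttribute` and the four-entry `ENTITIES` table are assumptions made up for illustration, standing in for the HTML-style decoding that existing implementations apply to quoted attribute values. The template literal needs no helper at all, which is the point of the proposal.

```javascript
// Hypothetical, deliberately tiny entity table (an assumption for this sketch);
// real transformers ship a much larger list.
const ENTITIES = new Map([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
]);

// Sketch of HTML-style decoding for "..." and '...' attribute values:
// character references are resolved, JavaScript escapes such as \n are not.
function decodeQuotedAttribute(raw) {
  return raw.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z0-9]+);/gi, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are left exactly as written.
    return ENTITIES.has(body) ? ENTITIES.get(body) : match;
  });
}

// String.raw keeps the backslash, so this is the text as it appears in source.
const sourceText = String.raw`a &amp; b\n`;

// <tag name="a &amp; b\n" />: the entity is decoded, backslash-n stays literal.
const quotedValue = decodeQuotedAttribute(sourceText);

// <tag name=`a &amp; b\n` /> under this proposal: an ordinary template literal,
// so the entity text is kept and \n becomes a newline.
const templateValue = `a &amp; b\n`;

console.log(JSON.stringify(quotedValue));
console.log(JSON.stringify(templateValue));
```

The same source characters therefore yield two different strings depending only on the delimiter, which is why the encoding question had to be answered for the new production.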

As @sebmarkbage mentioned in the above comment, it appears the sentiment is that we wish JSX simply used JS encoding across the board, so given that template strings are a new production that pose no backwards compatibility issues, it seems like a good opportunity to introduce a way to get this behavior without introducing breaking changes. If this is a direction we want JSX to go more generally, we also would have the opportunity to start telling people to use template strings across the board as a best practice, to clearly communicate in their code that they are using a JS version of strings and not an HTML one. Arguably, you could someday even deprecate double and single quoted attributes in favor of these and show a "JSX is not HTML" message when they are used, similar to how it currently says "JSX is not XML" when namespaced tags are used. All these options are of course outside the scope of this one addition, but I bring them up to demonstrate that this simple addition opens up a lot of flexibility for creating smoother transitional paths to more breaking changes that have been discussed here for years.

NOTES

I've made a few other changes here to hopefully help people in the future:

  1. I've linked Syntax productions to their ECMAScript pages.
  2. I've tried to do the same with ESTree node elements (although outside the code block since you can't link there).
  3. I've attempted to explicitly note the HTML attribute encoding for JSXSingleStringCharacter and JSXDoubleStringCharacter as this appears to currently be absent from the spec (please correct me if I'm wrong here). This also through omission implies that the TemplateLiteral syntax production behaves identically to the ECMAScript one.
  4. I updated the @babel/parser link.

Closes #25 Closes #112

sebmarkbage commented 2 years ago

This is a somewhat unique case in that it's not a breaking change by itself. It would be a breaking change if everyone also switched the encoding.

We don't actually have to make that breaking change but say that the intention is to make the change at some point in the future when there's an appropriate time to version or introduce modes (like strict mode was). In that case, this is just jumping ahead and then it wouldn't be a breaking change later either. In other words, if we declare the intention, we don't have to rewind this later on.

In that case I think it would be ok to land this.

@DanielRosenwasser is there anyone on the TypeScript side that has opinions about this?

DanielRosenwasser commented 2 years ago

I have to discuss with our team - but could you update the title to reflect the change in encoding? Otherwise, it isn't obvious that this goes beyond adding template strings.

tolmasky commented 2 years ago

Hi @DanielRosenwasser,

Just wanting to clarify here, to make sure I understand the request and make certain that there's no misunderstanding (including on my part!). This PR is specifically not changing the encoding of any existing JSX syntax construct. This PR does, however, include documentation of the existing use of HTML encoding in JSX attributes, which was previously not documented in the spec.

The only actual goal of this PR is to add template strings as supported values to JSX attributes. It was asked in the thread what encoding these should have, simply because double-quote and single-quote strings have this (undocumented) subtle behavior that diverges from JS proper, and this PR just answers the question that this specific element requires no such divergence and behaves identically to having placed a template string in curly braces. Does that make sense? I'm happy to add stuff to the title, but I don't want to confuse anyone into thinking I am changing encodings on existing features or anything like that, the entire discussion around encodings is just a meta-question that was asked about this, and that should also be included in the documentation so that this confusion doesn't come up again in the future.

tolmasky commented 2 years ago

Also, don't know if the title change question was directed at me or not. If it wasn't, then no worries and I'm sure @sebmarkbage knows the right title change to make!

DanielRosenwasser commented 2 years ago

Hi @tolmasky, thanks for clarifying! We just discussed the issue at our design meeting, and I now understand that this is a clarification of the intended behavior for how <el foo="&emdash;" /> and the like are interpreted (and I guess TypeScript diverges from the intended behavior).

Taking a little more time to read the other issue, it seems like aligning with JS string semantics would have been desirable. Given the complexity of supporting another way to interpret string characters, and the refactoring hazard between <el foo="&emdash;" /> and <el foo={"&emdash;"} />, I feel the same. I wonder if other implementations are also in a similar spot.

I'll bring this back up with others.

nicolo-ribaudo commented 2 years ago

Babel currently supports HTML entities in attributes, which makes it easier to copy-paste from HTML to JSX. However, we are not opposed to changing it and aligning with JS strings: we will have the next major in the next few months, and we could introduce this breaking change.

I agree that template strings should align with JS semantics: since HTML doesn't support JS templates the difference will never cause copy-paste problems. However, I think we should work with the eslint-plugin-react authors to introduce a warning when using HTML entities in template attributes (and potentially also in "/'-style attributes).

Given the difference between TS and Babel, I don't think that this PR should mention the WHATWG encoding and we can just consider it "compiler-dependent behavior".

RyanCavanaugh commented 2 years ago

Babel currently supports HTML entities in attributes

Is this an opt-in? In the repl today I see it transform <div title="&dash;"></div> to React.createElement("div", { title: "&dash;" });

nicolo-ribaudo commented 2 years ago

Uh well, &dash is missing: https://github.com/babel/babel/blob/main/packages/babel-parser/src/plugins/jsx/xhtml.js

RyanCavanaugh commented 2 years ago

LOL, I should buy a lottery ticket 👍

DanielRosenwasser commented 2 years ago

Don't know how you could have possibly missed the emdash 😄

nicolo-ribaudo commented 2 years ago

TS supports HTML entities too, and we are not the only ones who missed &dash; 😛 https://www.typescriptlang.org/play?#code/DwEwlgbgBAhgvAIgGQBsAuBuJIYGcAWGCUA9AHxA

Jokes aside, it looks like both Babel and TS support HTML 4 but not HTML 5.

RyanCavanaugh commented 2 years ago

Can't believe dash didn't make the cut for HTML4. Anyway the complete list of HTML5 entities seems untenably large for every JSX transformer to include as a lookup table (it appears to be possibly thousands of entries long?), and it also seems awkward to say "HTML4 entities are encoded for convenience but not any higher", or "Entity transformation is a host-dependent function".

We should find some way to spec something that Babel and TS can both agree on, even if it's just a hardcoded list of ~223 entries. I don't want people's display changing based on which transformer they're using.

Me1000 commented 2 years ago

The whatwg hosts a json file of all the HTML5 entities: https://html.spec.whatwg.org/entities.json It's 2231 entries (some of which are duplicative; e.g. &AMP;, &AMP, &amp;, and &amp), which honestly doesn't seem that terrible, especially considering the working group has said the list will not be expanded: https://github.com/whatwg/html/blob/main/FAQ.md#html-should-add-more-named-character-references
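
As a rough illustration of what consuming that file could look like, here is a small sketch. The inline `sampleEntities` object is an assumption: it is a hand-copied six-key excerpt in the same shape as entities.json (each key maps to `codepoints` and `characters`), used so the example runs without a network request. `buildEntityTable` and `groupByCharacters` are hypothetical helper names, not part of any transformer.

```javascript
// Inline sample in the shape of entities.json; the real file has 2231 keys.
const sampleEntities = {
  "&AMP;": { codepoints: [38], characters: "&" },
  "&AMP": { codepoints: [38], characters: "&" },
  "&amp;": { codepoints: [38], characters: "&" },
  "&amp": { codepoints: [38], characters: "&" },
  "&dash;": { codepoints: [8208], characters: "\u2010" },
  "&mdash;": { codepoints: [8212], characters: "\u2014" },
};

// Build a lookup from reference text (e.g. "&amp;") to its replacement.
function buildEntityTable(entities) {
  const table = new Map();
  for (const [reference, { characters }] of Object.entries(entities)) {
    table.set(reference, characters);
  }
  return table;
}

// Group references by replacement to see which entries are duplicative.
function groupByCharacters(entities) {
  const groups = new Map();
  for (const [reference, { characters }] of Object.entries(entities)) {
    const names = groups.get(characters) ?? [];
    names.push(reference);
    groups.set(characters, names);
  }
  return groups;
}

const table = buildEntityTable(sampleEntities);
const groups = groupByCharacters(sampleEntities);

// Number of references versus number of distinct replacements in the sample.
console.log(table.size, groups.size);
// All spellings in the sample that decode to "&".
console.log(groups.get("&"));
```

A transformer that generated its table this way would resolve `&dash;`, the entity that turned out to be missing from the HTML 4 era lists discussed above.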

From a developer's perspective I'd find it quite odd if I tried to use an html5 entity and it wasn't working, only to learn that JSX only supports HTML 4 entities.

tolmasky commented 2 years ago

Should we perhaps split out discussion on the entities and so forth to a separate issue, as it's somewhat orthogonal to the template string stuff, and arguably more subtle as it also means things like not resolving JavaScript unicode escape sequences, etc.?

tolmasky commented 2 years ago

Just wanted to follow up and confirm that there's nothing blocking this PR and that remaining open issues seem to be in existing syntax.

nicolo-ribaudo commented 2 years ago

I agree that the possible breaking change regarding "/' attributes is separate from this one. If I had to implement this in Babel behind a flag (while this is still a "proposal"), I'd start by throwing for HTML-entities used in template-like attributes so that then we can relax the restriction in either direction if we decide to align them with "/'-style attributes.
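
One way to picture the "throw first, relax later" idea is the sketch below. It is purely illustrative and is not Babel's API: `assertNoEntityLikeText` is a hypothetical helper, and `rawQuasis` is assumed to be an array holding the raw text of each static chunk of a template literal used as an attribute value.

```javascript
// Matches text that looks like a named or numeric HTML character reference.
const ENTITY_LIKE = /&(?:#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/i;

// Hypothetical check (not a Babel API): reject entity-like text so that either
// interpretation (decode it, or keep it literally) can be allowed later
// without changing the meaning of code that already parses.
function assertNoEntityLikeText(rawQuasis) {
  for (const raw of rawQuasis) {
    const found = ENTITY_LIKE.exec(raw);
    if (found !== null) {
      throw new SyntaxError(
        `HTML entity-like text "${found[0]}" is not allowed in a template attribute`
      );
    }
  }
}

// Wrapper that reports the outcome instead of throwing, for demonstration.
function check(rawQuasis) {
  try {
    assertNoEntityLikeText(rawQuasis);
    return "ok";
  } catch (error) {
    return error.message;
  }
}

console.log(check(["Tom & Jerry"]));
console.log(check(["Tom &amp; Jerry"]));
```

A bare ampersand passes, while `&amp;` is rejected, so no program accepted under the flag would depend on which encoding is eventually chosen.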

tolmasky commented 2 years ago

Just wanted to check in again as it's been a week, any reason not to merge this in?

tolmasky commented 2 years ago

Just checking in again. Are there any blockers here? Happy to work on anything left to do.

sebmarkbage commented 2 years ago

Given that I'm leaving Facebook/Meta, we're still figuring out merge rights in this repo. Let's pick back up after the holidays.

nicolo-ribaudo commented 2 years ago

@tolmasky If you want, you could start working on a draft PR for @babel/parser, that we can merge when this is merged.

tolmasky commented 2 years ago

I hope everyone had a good new year! I am hoping to pick this back up now so we don't lose the thread on this. If I recall correctly, there were no outstanding issues with this PR and we simply wanted to wait until the holidays were over. Are we ready to go now?

tolmasky commented 2 years ago

Just checking in again. My fear is that this is getting punted to whoever is coming onboard next and this will have to be relitigated all over again. Given that we've already determined this is a safe backward and forward-compatible change, and the unfortunate 7 year history of this request, I'm hoping we can avoid delaying this further.

tolmasky commented 2 years ago

Just checking in on this again, can we get it merged in?

tolmasky commented 2 years ago

Coming here for my weekly check in. Back in the Babel thread that kicked this off, it was implied that if the work was just put in, the result would be different than what's been going on for the last 7 years. I think we all agree this is a super low-risk addition, and I'm actually really excited to get it implemented in Babel once it's accepted. But I'm apprehensive of devoting any more time on this if the expected result is that it stays behind some flag forever because the spec never actually gets updated.

Huxpro commented 2 years ago

@tolmasky sorry for the late reply. The holiday season was a bit long. There was New Year, but then there is Lunar New Year...😛

My fear is that this is getting punted to whoever is coming onboard next and this will have to be relitigated all over again.

So apparently, this is getting punted to me, indeed someone newly onboard...

Jk aside, I'll need a bit of time to warm up and discuss with other folks, but I'll make sure I follow up soon 🤞

wooorm commented 2 years ago

One thing that I haven’t seen discussed before is how whitespace is handled in backticked strings. It doesn’t have to be handled. But it might be good to think about it due to:

tolmasky commented 2 years ago

Hi @wooorm, from a technical perspective that proposal is orthogonal to this one, since it requires a new syntax production that this proposal doesn't have (the @template literal, vs. the normal template literal). To use it you'd have to wrap it in curly braces, and it would thus be outside the scope of this issue, much like the question of "bare" numeric and boolean literals. I think the appropriate place to discuss it would be to open a separate bug to track this fourth type of string literal ('', "", ``, @``) to not further confuse this.

I will mention that I think there's really no concern though, as just like with normal template literals, it could behave identically to placing the same expression in curly braces, and we'd really have to go out of our way to come up with new behavior considering both of these exist at least in part specifically to define how to treat newlines.

tolmasky commented 2 years ago

@tolmasky sorry for the late reply. The holiday season was a bit long. There was New Year, but then there is Lunar New Year...😛

My fear is that this is getting punted to whoever is coming onboard next and this will have to be relitigated all over again.

So apparently, this is getting punted to me, indeed someone newly onboard...

Jk aside, I'll need a bit of time to warm up and discuss with other folks, but I'll make sure I follow up soon 🤞

Sounds good, and happy to have you here! Hopefully we can skip the relitigation part though 😉

tolmasky commented 2 years ago

Another week, another check-in. Is there a specific question that needs answering? The entire thread is just everyone agreeing this is a super minor change that is both forwards and backwards compatible. This "feature" is basically as close as a feature can get to just being a bug fix, and in fact contains fixes to the spec that describe the behavior it has already had for the entire existence of the language but no one ever bothered to document. It seems like we should be able to just hit the merge button on this thing and move forward.

Huxpro commented 2 years ago

@tolmasky after some ramping up, I think there are at least two concerns here:

1. Adding template literal

Yep it's mostly incremental on its own, but the concern is not about adding this one particular syntax. If we add this, then why not numeric literals? Why not array/boolean/null literals? Why not Object literal...oh we can't because it conflicts with JSXExpressionContainer so shall we just change everything like a JSX 2.0?

I think @sebmarkbage's response to numerical literal (#64) at #119 still holds:

This one is tied to the other forms we might want to allow here. There needs to be a story about how all these values tie together, not just added one-by-one. One possible solution is to allow a limited set of all expressions. If we can't do all expressions, what's the rule for what is included and what isn't?

Those minor things add up. The real cost here is that it breaks some people's perception of JSX being a very simple HTML/XML-like syntax sugar with only JS semantics presented in JSX identifiers and within {}.

The entire thread is just everyone agreeing this...

And don't forget "everyone" here is tiny compared to the actual JSX audience.

2. "Fixes" to the spec

I'm not sure if those changes are either correct or desired.

On one hand, if what we wanna fix is to be able to "describe the behavior it has already had", then it's incorrect because the status quo is that Babel/TSC only supports ~200+ XHTML/HTML4 entities, likely not what is captured by the living HTML standards. It's also strange to "NOTE" JSXSingleStringCharacter as an HTML string while having its production SourceCharacter reference ECMA-262.

On the other hand, if what we wanna fix is this poorly standardized situation, then these may not be the desired changes. We (React, Babel, TSC, etc.) may see this as an opportunity to enforce the current spec if we agree that it's better, which technically derives JSXSingleStringCharacter, JSXDoubleStringCharacter and JSXTextCharacter from ECMA-262 6th's SourceCharacter. So we shouldn't convert any HTML entities.

It doesn't mean we shouldn't move forward to add template literal, but I personally don't think it's that simple.

tolmasky commented 2 years ago

Yep it's mostly incremental on its own, but the concern is not about adding this one particular syntax. If we add this, then why not numeric literals? Why not array/boolean/null literals? Why not Object literal...oh we can't because it conflicts with JSXExpressionContainer so shall we just change everything like a JSX 2.0?

This is part of the problem of having the reviewer swapped out 3 months in. The original context that was presented to me was a strong desire to move to JavaScript strings. This change provides a great backwards compatible way of avoiding the very subtle and surprising (and undocumented) situation with HTML entities, and it is also a feature that people have been strongly asking for for years. It precisely avoids any of these weird "HTML comparison" issues since there's no analogue in HTML, unlike numeric literals, which are widely used in normal HTML and would have a divergent meaning (numeric vs. string). We've been teasing "JSX 2.0" (and using its theoretical future existence as an excuse to punt JSX questions) for 7 years with little to actually show for it. I think we can stop letting perfect be the enemy of good, and "a grand unified theory of JSX literals" seems like an unfair 11th-hour requirement for a feature that will offer strictly fewer roadblocks than the existing strings. And regardless, if the whole idea for JSX 2.0 is to be "the big breaking change", then we have nothing to worry about! Worse comes to worst it'll break this feature too (although I feel you'd have to go out of your way to figure out how to do that).

And don't forget "everyone" here is tiny compared to the actual JSX audience.

Well, there's also that bug that's been open for 7 years and numerous duplicates. I find it hard to believe people would be upset by supporting all 3 string literal types instead of just 2.

On one hand, if what we wanna fix is to be able to "describe the behavior it has already had", then it's incorrect because the status quo is that Babel/TSC only supports ~200+ XHTML/HTML4 entities, likely not what is captured by the living HTML standards.

The HTML spec specifically states that the list is static and will not be changed in the future, precisely for the reason that it would otherwise create endless updating work for implementers.

It's also strange to "NOTE" JSXSingleStringCharacter as an HTML string while having its production SourceCharacter reference ECMA-262.

It's a "NOTE" precisely because it's a stop-gap until we can do something better. You'll note in this very thread that people forget about it because there's no mention of this anywhere, it exists solely in the heads of certain individuals and in the various scattered implementations. Hey look, here's an 8 year old bug that asks about updating this very spec: "We should probably document how (X)HTML entities are parsed." I think perhaps adding my "NOTE" 8 years ago would have been more helpful than bike-shedding for 2 years only for the thread to become a ghost town. I would have greatly appreciated some call out about it, even if not perfect, and figured this note was better than nothing. But you know what, I agree. It's not worth needlessly slowing down this simple feature, so I'll happily get rid of it from this PR tonight to not further distract the conversation.

It doesn't mean we shouldn't move forward to add template literal, but I personally don't think it's that simple.

If you set aside the "controversial" NOTE that's unrelated to the rest of this change, I don't think anything brought up here makes it non-simple, unless we choose to endlessly expand the scope by making it dependent on some mythical 2.0 that seems very far from having any shape considering the current state of the repo.

I honestly mean no antagonism with this, and I do sympathize with the situation you've been placed in with respect to this project. However, the context here is that this feature has been asked for for 7 years, and in this thread the necessary work was laid out, and if you look above in our thread it seems like we got right next to the finish line only to then be abandoned as the main maintainer left the company. I was repeatedly warned to not spend any time on this and that nothing ever happens on this repo, and there are plenty of public GitHub issues displaying quite public frustration. I chose to believe the representations made to me instead, only to have the requirements changed again. This is ultimately your repo, and it's fine to have it be static! There's arguably some value in that, and there are other great options in Open Source. We can do a WhatWG-style parallel spec to avoid conflicting with the "stability" that's desired for this JSX spec that's essentially considered done. That's fine and has plenty of precedent, like the static markdown spec or the TypeScript spec that was finally mercifully closed. I just ask that if the plan is to have this be some lightly maintained repo with tons of stop energy and the absolutely highest standards of adding minor features I've ever seen, then be upfront about it, so we can stop investing time in it. It'd be worthwhile IMO to clearly state at the top of README that the spec is on indefinite hold, instead of having people reach that conclusion for themselves from a cursory reading of the GitHub issues. Again, you can't help the situation that was here before you, but you do have the power to change the way things work moving forward.

tolmasky commented 2 years ago

As promised, I have removed the controversial NOTE that attempts to explain the way JSX has always worked. This spec has been wrong for 8 years running, why not keep up the record? If not, we might be in danger of having to remove "DRAFT" from the title.

hax commented 2 years ago

The entire thread is just everyone agreeing this is a super minor change that is both forwards and backwards compatible.

It's too easy to say it's forwards/backwards compatible. The entity problem is a good hint warning us that there may be something we should be careful about.

Also, I'm curious: if we add template literals, should we add all other literals too (like attr = null)? I hope such a problem can be solved in a consistent way, or we may find some inconsistency in the future.

wooorm commented 2 years ago

@tolmasky perhaps you can try and land things like the Note on ES vs HTML you just removed and the new links in this PR, in new PRs? A big problem with this thread is that it’s all over the place because there’s a lot to unpack. Perhaps your goal, TemplateLiterals, can more easily land (or not) when those “dependencies” have landed?


I personally find @Huxpro’s comment/hesitations very reasonable. I agree that it’s not so simple. Trying to tally this thread, everyone’s talking about JS vs HTML and HTML 4 vs HTML 5. I can only find @sebmarkbage being explicitly 👍 and @nicolo-ribaudo implicitly 👍 on adding TemplateLiteral.

tolmasky commented 2 years ago

@tolmasky the new links in this PR, in new PRs?

Oh come on. Are you seriously now calling this PR "a lot to unpack" because it links the surrounding relevant parts of the ECMAScript spec? Really? This is gaslighting now. Yes, sorry for making this PR "a lot to unpack" by linking to the surrounding relevant parts of the spec:

[Screenshot: Screen Shot 2022-02-10 at 7 49 55 AM]

I should clearly have only linked to TemplateLiteral, because linking the neighboring Literal is "out of scope" and makes the diff completely unreadable! Casually fixing the Babel link that is 4 years out of date should clearly also be a separate PR. I 100% expect an apology for this ridiculous insinuation. If you don't like the TemplateLiteral proposal, fine. But insinuating that the PR is disorganized because it adds relevant links is insulting. To be absolutely clear here, that sentence is not a casual remark, I find this comment offensive and meriting an apology.

A big problem with this thread is that it’s all over the place because there’s a lot to unpack. Perhaps your goal, TemplateLiterals, can more easily land (or not) when those “dependencies” have landed?

The original concern brought up in every TemplateLiteral thread was the "secret" HTML entities feature. It was presented as a binary question: "If we add TemplateLiterals, should we continue the HTML entities feature or not". I explained here why it shouldn't be, which the original maintainer agreed with. The NOTE was thus relevant as a comparison point. No PR has been accepted in this repo (aside from code of conduct changes) in literal years, so why would I make a separate PR with a NOTE that @Huxpro has already criticized? For it to be nit-picked for 2 years like the last time this was attempted 8 years ago? Just for the status quo to remain having this existing intended feature that affects every single JSX user be a secret that only the elite few who have written a JSX parser know about? Talking about fixed documentation links as "dependencies" isn't normal. This repo has a clear reputation, and it was my mistake for believing the maintainer that this time would be different. As you can tell from my tone, I obviously no longer believe this is getting accepted. So no, I won't be making a separate PR that fixes the embarrassingly out of date Babel link since I don't have time to end up in a 2 year discussion about whether we want to link to implementations at all or not.

I can only find @sebmarkbage being explicitly 👍

Right, just, you know, the previous sole maintainer of this repo who told me what was needed to add it.

tolmasky commented 2 years ago

Also, I'm curious: if we add template literals, should we add all other literals too (like attr = null)? I hope such a problem can be solved in a consistent way, or we may find some inconsistency in the future.

Hi @hax, this has been discussed to death (even already appearing in this very thread), but I don't blame you for asking again, as it's not really easy to keep track of these issues as they come up and die off every year. So let me attempt to sum up the situation with literals, and the specific approach this PR takes as a result. Literals like null or numbers already have an existing, and this next part is critical, widely used meaning in HTML. If you look at something like <input type = "number" min = 10>, the min = 10 is unfortunately identical to writing min = "10", both give the attribute a value of the string "10". This is also the case with false, true, null, etc. of course. As such, adding this to the existing spec would diverge with JSX's current behavior of doing something halfway reasonable when you copy/paste from an actual HTML file. This is thus not a "breaking" change in the sense that any old code would be broken, but "breaking" in certain expectations of certain workflows. As such, it keeps being punted to some sort of "JSX 2.0" consideration, where everyone feels better about making breaking changes.

The special thing about template strings (which didn't exist when JSX was originally released), is that they don't really suffer from this problem. <div value = `hi there`>hi</div> is already a "broken" construction in HTML which would translate to something like <div value = "`hi" {there`}>hi</div> because the back ticks are ignored as delimiters and treated as any other character. In other words, no one out there is using back ticks as a replacement for single or double quotes because they don't get parsed like strings, so it is both technically not a breaking change in that no existing JSX code is broken, and it is also with 99% certainty not a "breaking copy/paste workflow expectations" change since it is unlikely anyone is copying HTML into JSX with an un-single-or-double-quoted-wrapped template string. There is thus some sense that if JSX had originally been released 2 years later, after TemplateLiterals had already existed, that it would have just supported all three string literal types. That's why this change specifically can sidestep the larger (never ending, no joke, we've been talking about it going on 5 years now) conversation of "what about all other literals???". All other literals have to go in the next breaking change, so the appropriate place to talk about them is there. Since that change is expected to be breaking (and let's be honest here, a long ways away), there's no problem with adding this non-breaking change. Which gets us to...
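
The HTML behavior described in the last two paragraphs can be sketched with a toy tokenizer. This is a heavily simplified assumption-laden model, not the WHATWG tokenization algorithm: `parseAttributes` is a hypothetical function that only handles the text between the tag name and the closing `>`, treats only `"` and `'` as quote delimiters, and ignores character references and error handling.

```javascript
// Simplified sketch of HTML attribute tokenization (not the full WHATWG
// algorithm). Input is the attribute text of a start tag, without "<tag" / ">".
function parseAttributes(source) {
  const attrs = [];
  let i = 0;
  const isSpace = (ch) => ch === " " || ch === "\t" || ch === "\n";
  const skipSpace = () => {
    while (i < source.length && isSpace(source[i])) i++;
  };

  while (true) {
    skipSpace();
    if (i >= source.length) break;

    // Attribute name: runs until whitespace or "=".
    let name = "";
    while (i < source.length && !isSpace(source[i]) && source[i] !== "=") {
      name += source[i++];
    }

    skipSpace();
    let value = "";
    if (source[i] === "=") {
      i++;
      skipSpace();
      const quote = source[i];
      if (quote === '"' || quote === "'") {
        // Quoted value: everything up to the matching quote.
        i++;
        while (i < source.length && source[i] !== quote) value += source[i++];
        i++; // skip the closing quote
      } else {
        // Unquoted value: runs until whitespace. A backtick is just a character.
        while (i < source.length && !isSpace(source[i])) value += source[i++];
      }
    }
    // Values are always strings; there are no numeric or boolean literals.
    attrs.push([name.toLowerCase(), value]);
  }
  return attrs;
}

// min = 10 and min = "10" both produce the string "10".
console.log(JSON.stringify(parseAttributes("min = 10")));
console.log(JSON.stringify(parseAttributes('min = "10"')));

// Backticks do not delimit anything: the value stops at the first space and
// the remainder becomes a second, oddly named attribute.
console.log(JSON.stringify(parseAttributes("value = `hi there`")));
```

Under this model, a bare `10` already has an established string meaning that JSX numeric literals would diverge from, whereas a backtick-delimited value has no sensible existing HTML meaning to preserve.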

It's too easy to say it's forwards/backwards compatible.

I hope you now at least agree that it's non-controversial that it's backwards compatible. This is just a verifiable fact. There is no existing JSX code that breaks with the addition of this new feature.

The entity problem is a good hint warning us that there may be something we should be careful about.

So that covers backwards compatibility, now let's talk forward compatibility. The entity problem is the forwards compatibility "problem" and is precisely what this PR addresses. The original reason for this PR was specifically to address the entity question brought up by the original maintainer 7 years ago, and then specifically again in this Babel thread that directly led to the creation of this PR (and for the record, he was happy with the solution, but then due to the curse of this repo, happened to also leave the company and thus reset this entire process back to step 1). Anyways, the "problem" is that there is a strong desire to remove this entity feature entirely. That is of course a breaking change. We can't do that in the current "version" without possibly breaking people's code that relies on it. However, if you read the PR summary (very first comment in this thread) it goes into excruciating detail about how adding TemplateLiterals makes the entity situation strictly more manageable: it of course doesn't break any existing code that relies on such entity behavior in normal strings, while providing a tool today for people to escape this entity stuff by just using the TemplateLiterals. As such, if the choice in the future is to get rid of HTML entities everywhere, then clearly this current proposal wouldn't conflict. Whereas, on the other hand, if the decision ends up being to keep them in existing strings for legacy purposes, then of course TemplateLiterals continue to not conflict the same way they currently don't conflict.

Despite the disingenuous way it's been presented by the newcomers to the thread, the question around HTML entities with respect to this specific feature is very clear and binary: do we add them to this new feature or not. We may disagree with the answer to that question, and I'm happy to lay out all the reasoning yet again, but the "confusion" in this thread stems from this question not being properly understood. Whether or not HTML entities behave the way they do in the current "spec" shouldn't be up for debate. They do. We have to live with that. This thread should absolutely not fall into a parallel discussion about whether we should fix them across the board. I guess I made a mistake by trying to, for the first time in 8 years, make some mention of this in the spec that indicates this, which was attacked for not being sufficiently thorough in this already sparse and incomplete spec. Fine, criticism accepted and I have removed it from the spec. I invite anyone else to go tackle that change, but I agree it is ultimately orthogonal to this feature that doesn't suffer from entity issues and would have zero backwards compatibility problems, or any problems, with respect to them.

tolmasky commented 2 years ago

Per @wooorm's request, I have removed those pesky links that deprived the reader of the opportunity of Googling these terms themselves while reading the spec. I also reverted the Babel link back to the out of date one. This previously unacceptably large 13 line change is now just two lines. I know, probably 2 lines longer than we wish it would be, but look at the bright side, it now follows the existing conventions of providing zero assistance when making your way through the spec:

[Screenshot of the resulting two-line diff, 2022-02-10]

Now that's a diff that Jony Ive would be proud of! No needless ports on that thing! Hopefully this 2-liner isn't too much to unpack.

I'm curious what the critique will be now though. My money is that we'll rehash some vague worry about consistency with future features that haven't been discussed in over 3 years, have already been addressed somewhere in this thread, and are moot since they'd be breaking changes anyways. Or perhaps this will now, lacking any surrounding context, be accused of lacking context? That'd be a great catch-22 that would put me in a bind as to how to proceed and lock up any progress without anyone explicitly saying no or having the courage to just close the bug. Either way, one thing I'm pretty sure of is that it won't be a specific request or question, but rather a purposefully open-ended concern that provides a good way of kicking the can another week.

RyanCavanaugh commented 2 years ago

@tolmasky speaking as a third party here, you're not writing about this in a way that makes it seem like you want to approach this constructively. Framing this as a change that must happen because you got some engagement from the prior maintainer isn't going to allay anyone's legitimate concerns about the future-looking impact of expanding the grammar. Spec maintainers are cautious by nature because mistakes are not undoable, and I'd encourage you to think about any change, particularly this one, with that mindset.

wooorm commented 2 years ago

@tolmasky I understand your frustration with the long, arduous process. I feel you’re reading my comment incorrectly: I said this thread was all-over-the-place, not your contributions. To clarify: a) I did not mean to cause you grief, b) I am not involved in this process, other than as an outsider who’s interested in seeing JSX improve. I never said that your notes, that you removed, or links, that you now removed, should not land. I proposed to split them exactly because I do think they should land. And that separating them might make everything easier to discuss. I don’t think anyone is against your links. But separating them makes this tough nut easier to grasp and discuss!

tolmasky commented 2 years ago

Hi @RyanCavanaugh, I am going to answer you in good faith and I would appreciate an answer back.

First off, you hit the nail on the head, as I explicitly stated actually:

As you can tell from my tone, I obviously no longer believe this is getting accepted.

My goal is not to get this accepted. It won't be. But you already know this. I'm not going to be the first person to pull off something larger than a typo fix in 8 years. I was temporarily convinced otherwise when the previous maintainer encouraged it, but then he fell silent for a month and ultimately left.

Framing this as a change that must happen because you got some engagement from the prior maintainer isn't going to allay anyone's legitimate concerns about the future-looking impact of expanding the grammar.

This is simply not the case. Specific, answerable issues have not been brought up in this repo that haven't either already been addressed in countless other places, or are unrelated to this feature. The "what about other literals in a hypothetical new version of JSX" is a good example. But I answered this one to the thread newcomer because it's in good faith and it's not his fault that all this is buried (I hold him to a different standard than the maintainer). But @Huxpro also brought this up:

Yep it's mostly incremental by its own, but the concern is not about adding this one particular syntax. If we add this, then why not numeric literals? Why not array/boolean/null literals? Why not Object literal...oh we can't because it conflicts with JSXExpressionContainer so shall we just change everything like a JSX 2.0?

Read that "slippery slope" sentence. There's no issue with TemplateLiteral, it just boils down to "but then why not change the entire language?" Because this is the one unique literal that distinctly doesn't require a breaking change and has repeatedly been requested in the repo, and the various parser repos, for over 7 years. This is covered both in my original summary and in sebmarkbage's first reply. These exact arguments would apply to adding double-quote strings if only single-quote strings had been previously allowed: "Well what about booleans? And then any object literal? Well then you'd have a conflict with JSXExpressionContainer!" It demonstrates a fundamental lack of appreciation for the history of this bug and PR.

Those minor things add up. The real cost here is that it breaks some people's perception of JSX being a very simple HTML/XML-like syntax sugar with only JS semantics presented in JSX identifiers and within {}.

And here's the kicker. The real argument is a vague reference to the "aggregate" of unspecified other issues. But there isn't actually anything else. It's all "well, sure, nothing wrong with this, but hypothetically, if we did other stuff similar to this that yes, you already explained why we don't do, but just bear with me, if we did do that, then people's perceptions would be affected".

That's a conversation killer. Am I supposed to conduct a survey of whether people feel that this simpler string system that arguably is more in line with the current JS language will make them perceive it differently? That's an intentionally non-technical and unwinnable discussion. There is no ask here. Read his reply. There's no ask. He didn't provide some series of steps to get to where we need to be.

That's my real issue. I'd be fine with:

  1. Closing the bug and flat out saying you don't like it and that's the end of it.
  2. Making a specific ask or providing specific requirements as to what would convince you.
  3. Saying that all changes are on hold until you come up with contributing guidelines (look, I'm giving you easy ways to get out of this!)
  4. Admitting that you're new to the repo and that you're just not ready to make this kind of change any time soon to a repo that has a history of rarely changing.

All of those responses are respectful of my time and the work already put in and set expectations accordingly. Coming up with random vague critiques is not. I promise that if any PR was left open for months after the maintainer had basically approved it, you'd have random people walking in throwing their two cents and making the history of that thread an unreadable mess too.

And this is what I think is really important: There is an unfortunate pattern in our community of tacitly accepting "passive" disrespectful behavior, but if you call it out, then people get flustered. It is disrespectful to waste someone's time like this. The Facebook employees may get paid to work on this repo, but I don't. People privately reached out to me to warn me against spending time on this, but the fact that the maintainer specifically laid out the requirements made me think that something had changed. Then he leaves and we just start rehashing everything all over again. Not even a "hey, I know this may seem unfair, but my standards are different and this is what I think should be done". That's not OK, and it has nothing to do with the technical aspects of the pull request, and that is what should be called out, not whether I am employing sufficiently advanced techniques of persuasion to achieve a different result than anyone else who has ever tried to do anything in this repo. So back to your original concern:

Framing this as a change that must happen because you got some engagement from the prior maintainer isn't going to allay anyone's legitimate concerns about the future-looking impact of expanding the grammar.

So why am I bothering? Because I think if this thread had existed in its current form before I started working on this feature, it would have provided sufficient warning to not bother wasting my time on this. One thing is a long bike-shedding thread from years ago, another is seeing how close this got by satisfying the precise requirements asked for, and then seeing the ultimate result. Feel free to read my entire GitHub comment history. I am more than happy to make changes I personally don't agree with in other people's repos. I am also a big believer that the repo owner has the ultimate final say. I am however not OK with providing vague carrots when you know nothing is actually going to change, and I am really disappointed that the comment posted here wasn't about how JSX could better treat the community that has invested time and labor and instead was directed at how I could navigate this PR that we all know is dead in the water.

tolmasky commented 2 years ago

@tolmasky I understand your frustration with the long, arduous process. I feel you’re reading my comment incorrectly: I said this thread was all-over-the-place, not your contributions. To clarify: a) I did not mean to cause you grief, b) I am not involved in this process, other than as an outsider who’s interested in seeing JSX improve. I never said that your notes, that you removed, or links, that you now removed, should not land. I proposed to split them exactly because I do think they should land. And that separating them might make everything easier to discuss. I don’t think anyone is against your links. But separating them makes this tough nut easier to grasp and discuss!

Hi @wooorm, I appreciate the reply and sorry if my tone was harsh with you. My actual frustration is around the kremlinology around the right ways to poke this PR in order to get it accepted. Especially because I think these links are actually very useful in reading and understanding that part of the spec. I certainly had another tab open with each one to double check that there wasn't any conflict, etc. And I understand that you simply think it should be split out. Perhaps in a repo with a different history I'd agree (well, weird hypothetical because such a repo would have just accepted the helpful links in this PR too, but I digress), but given this repo, I just don't think it will do anything other than create 2 PRs for me to track for weeks. Also, @huxpro can just add them if he thinks that subset of changes is worthwhile and the rest needs further consideration. It's ~10 lines, he could do it with the built-in GitHub editor, I don't care who gets the credit for the commit, I just want the next person who comes to be able to click on a link.

acdlite commented 2 years ago

Hi @tolmasky, I think your frustrations are legitimate, just want to chime in here with a few bits of context:

[...] I'd be fine with:

  • Closing the bug and flat out saying you don't like it and that's the end of it.
  • Making a specific ask or providing specific requirements as to what would convince you.
  • Saying that all changes are on hold until you come up with contributing guidelines (look, I'm giving you easy ways to get out of this!)
  • Admitting that you're new to the repo and that you're just not ready to make this kind of change any time soon to a repo that has a history of rarely changing.

We'll give a more definitive response soon. I'm sorry we haven't been as transparent or responsive as we could have been.

tolmasky commented 2 years ago

Hi @acdlite, thanks so much for the update. That sounds good to me!

magic-akari commented 2 years ago

I am new to the repo. It looks like there are some non-technical issues being discussed here.


Please allow me to ask for some more details about compatibility and inconsistency.

<div foo="&amp;" bar=`&amp;` baz={`&amp;`} />

This means that in some hypothetical future JSX 2.0, we would get a breaking change but a consistent version. Right?

React.createElement("div", {
  foo: "&amp;",
  bar: `&amp;`,
  baz: `&amp;`
});

foo, bar, and baz will all get the same value, &amp;, although foo gets a different result under the existing behavior. But since this is a breaking change and it keeps consistency, we are happy to accept it.

Personally, I prefer JS string encoding; HTML entities are annoying and confusing. But the inconsistency is more unacceptable.


For some time after this PR is merged, and before the release of JSX 2.0, we will get a non-breaking change but inconsistent behavior. Right?

React.createElement("div", {
  foo: "&",
  bar: `&amp;`,
  baz: `&amp;`
});

The behavior of bar is the same as baz's, but different from foo's. That's what I'm concerned about.
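The two outcomes above can be modeled in plain JavaScript. This is only a rough sketch: `decodeQuotedAttribute` and `compileProps` are hypothetical names, the entity table covers just four entities, and a real JSX compiler handles the full HTML entity set.

```javascript
// Hypothetical helper: decodes only a handful of HTML entities. It stands in
// for the entity handling a JSX compiler applies to quoted attribute strings.
const ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"' };

function decodeQuotedAttribute(raw) {
  return raw.replace(/&([a-z]+);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(ENTITIES, name) ? ENTITIES[name] : match
  );
}

// Models the props for <div foo="&amp;" bar=`&amp;` baz={`&amp;`} />.
// decodeQuoted: true  -> today's quoted-string behavior plus this PR
// decodeQuoted: false -> a hypothetical JSX 2.0 without entity decoding
function compileProps({ decodeQuoted }) {
  return {
    foo: decodeQuoted ? decodeQuotedAttribute("&amp;") : "&amp;", // quoted string
    bar: `&amp;`, // bare template literal (proposed): plain JS semantics
    baz: `&amp;`, // template literal inside an expression container
  };
}

const withThisPr = compileProps({ decodeQuoted: true });
const hypotheticalJsx2 = compileProps({ decodeQuoted: false });

console.log(withThisPr);
console.log(hypotheticalJsx2);
```

In this model only foo changes between the two regimes; bar and baz always agree with each other.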

Imagine a theoretical situation.

<div>
  <MySpan content={`&amp; Interpreted as ${userSelectedNote}`} />
</div>

Well, we are now ready to use the shiny JSX TemplateLiteral!

<div>
  <MySpan content=`&amp; Interpreted as ${userSelectedNote}` />
</div>

Hey! we want to color the userSelectedNote.

<div>
  <MySpan content=`&amp; Interpreted as ` />
  <MySpan color={themeColor} content={userSelectedNote} />
</div>

Since there are no substitutions in the TemplateLiteral, let's switch it back to a normal string! Maybe some formatter or linter will do it automatically.

<div>
  <MySpan content="&amp; Interpreted as " />
  <MySpan color={themeColor} content={userSelectedNote} />
</div>

And what happened now? I got unexpected results.


There is thus some sense that if JSX had originally been released 2 years later, after TemplateLiterals had already existed, that it would have just supported all three string literal types.

If that were true, we would get three consistent strings, since TemplateLiterals does not exist in HTML String and we choose JS string encoding.

Simply patching it with template string and leaving the other two strings alone will not take us back in time. We've gone into a different route, and the fix comes at a cost. The inconsistency was exacerbated by the introduction of TemplateLiterals.

Therefore I do not agree that this change is orthogonal to the entity problem.

What concerns me even more is if this PR is merged but JSX2.0 never arrive. Then we have to live with this inconsistency forever.

Kingwl commented 2 years ago

Hi @tolmasky. Thanks for your effort here.

My point of view from the user's perspective is (Opinions are my own):

  1. Is the motivation clear? Yes. 1. To avoid unnecessary JSXExpression ({}). 2. To avoid HTML4 entities.

  2. Is this useful? Yes, nice to have. It could save me some effort (even though not too much).

  3. Is this a breaking change? Nope. At least not breaking in syntax. It's illegal in most common parsing tools for now.

  4. Is this orthogonal to the existing JSX string entity issue? Yes. It's a new syntax. Personally, I'm OK with new syntax having new semantics (or behavior).

  5. Should we change the entity handling of existing JSX strings? Nope. It's a breaking change.

  6. Why just template literals? Why not boolean? Numeric? It's hard to answer:

    1. As a user, I am happy to see these too.
    2. As someone who tries to push something: I will keep the scope to a minimum so that I can get more consensus with everyone.
  7. Does anyone actually use entities inside JSX attribute values in the real world? Honestly, this issue is the first time I learned that there are entities in JSX string attribute values.

    1. And I asked some colleagues around me; they also did not know about this or have not used it. To clarify: this is not a basis for anything, nor does it make any constructive point.

IMO, the problem you need to convince someone of is: should we continue to block other literal expressions once template literals have been merged? And also: why now?

tolmasky commented 2 years ago

Hi @magic-akari, thanks for the question, scenarios like this are very helpful for thinking about this. As I'll show below, there is zero additional inconsistency from today's behavior. Let's go through your exact steps, only using the current JSX implementation. Let's begin with exactly where you started:

Imagine a theoretical situation.

<div>
  <MySpan content={`&amp; Interpreted as ${userSelectedNote}`} />
</div>

To be perfectly clear, the user would see &amp; in their rendering, not "&", both with today's JSX, and if this PR is accepted. Now, since we don't currently have a shiny new TemplateLiteral, we never go through the simplification step. But, just like before, eventually we want to color the note, so we break up the previous template expression above into two elements:

Hey! we want to color the userSelectedNote.

<div>
  <MySpan content={`&amp; Interpreted as `} />
  <MySpan color={themeColor} content={userSelectedNote} />
</div>

And, just like in your example, there's no longer any replacement in the TemplateLiteral, so we'll switch back to the normal string! Maybe some formatter or linter will do it automatically.

<div>
  <MySpan content="&amp; Interpreted as " />
  <MySpan color={themeColor} content={userSelectedNote} />
</div>

And what happened now? I got non-expected results... today. You can't tell me that this would be expected by your average JSX user, especially considering that this is an undocumented feature. I agree, it sucks... but the unexpectedness of this behavior has nothing to do with whether or not we remove the need for wrapping the template literal in an additional set of curly braces.

There is thus some sense that if JSX had originally been released 2 years later, after TemplateLiterals had already existed, that it would have just supported all three string literal types.

If that were true, we would get three consistent strings, since TemplateLiterals does not exist in HTML String and we choose JS string encoding.

No, there's a fairly good chance they would have the two sets of differing behavior, if you look at the original reasoning behind including HTML entities in the first place: single and double quoted strings appear in existing HTML, so the entire reason this HTML entities feature (that no one seems to like) exists is to preserve those single and double quoted strings if you haphazardly copy and paste them into your JSX. To be clear, since then, we've discovered that this is not really as important a use case as it was originally considered to be, especially considering that the "dream" of copy-pasting HTML never really manifested since things like class= and style= never worked. But, again, as an exercise of putting ourselves in their mindset at the time, there would have been this same reason for double and single-quoted strings, but not for template strings because there's no large existing plethora of backticked-strings out in HTML that would be nice to copy over to JSX. Additionally, HTML entities really can't work in template strings (consider for example the fact that <div value = `&copy;${"&copy;"}`> would have to generate ©&copy;, if this were the case). As such, there's a pretty good chance that the compromise of "html entities in html strings, normal JS strings for template literals" would have been reached as the ultimate solution. And arguably, this would have been better for the same reason I outlined above: it would allow you to message to people to "just stick to template strings to avoid this whole entity mess".
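To make the ©&copy; example concrete, here is a rough sketch assuming the substituted value is the string "&copy;". It uses a hypothetical `decodeEntities` helper that knows only one entity, and models "entity-decoding templates" with a tag function (`entityDecodingTemplate`, also a made-up name) that decodes the static chunks and leaves substitutions alone.

```javascript
// Hypothetical, deliberately tiny decoder: it only knows &copy;.
function decodeEntities(text) {
  return text.replace(/&copy;/g, "\u00A9");
}

// Models a template whose static text is entity-decoded while the
// substituted values are inserted untouched.
function entityDecodingTemplate(strings, ...values) {
  return strings.reduce(
    (out, chunk, i) =>
      out + decodeEntities(chunk) + (i < values.length ? String(values[i]) : ""),
    ""
  );
}

// Static part decoded, substitution left as-is.
const perChunk = entityDecodingTemplate`&copy;${"&copy;"}`;

// Decoding the finished string instead would also rewrite the substituted
// value, which a compiler cannot do in general because substitutions are
// only known at runtime.
const wholeString = decodeEntities(`&copy;${"&copy;"}`);

console.log(perChunk);    // static chunk decoded, substitution kept verbatim
console.log(wholeString); // both occurrences decoded
```

Two visually identical pieces of text inside the same literal end up with different values, which is the awkwardness being pointed at.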

What concerns me even more is if this PR is merged but JSX2.0 never arrive. Then we have to live with this inconsistency forever.

I will absolutely grant that if we merge this it will exist for a long time before JSX 2.0, because JSX 2.0 is probably never arriving (not unless Facebook is secretly working on unrelated and unannounced changes). You can look at this dead thread here where 2.0 was originally discussed 8 years ago. That being said, as I think I displayed above, this doesn't actually add any particular additional confusion IMO.

Kingwl commented 2 years ago

Hi @magic-akari , Thanks for your feedback.

But the inconsistency is more unacceptable.

It's the same situation between template literal and string literal, eg:

const stringLiteral = 'line
feed' // Illegal
const noSubstitutionTemplateLiteral = `line
feed` // line\nfeed
const templateLiteral = `line
feed${42}` // line\nfeed42

(And as far as I remember, there are some minor differences in escape sequences too. Sorry, I'm not completely sure.)
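A small sketch of two such differences, using a hypothetical `parses` helper that relies on the `Function` constructor to check whether a snippet is syntactically valid. It assumes non-strict parsing for the generated function body, which is what the `Function` constructor uses when the body has no "use strict" directive.

```javascript
// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// A raw line break is not allowed inside a quoted string literal...
const quotedWithNewline = parses("return 'line\nfeed';");
// ...but it is allowed inside a template literal.
const templateWithNewline = parses("return `line\nfeed`;");

// Legacy octal escapes are accepted in non-strict quoted strings...
const octalInQuoted = parses("return '\\101';");
// ...but rejected in untagged template literals.
const octalInTemplate = parses("return `\\101`;");

// Template literals also expose their raw, unescaped text.
const rawLength = String.raw`\n`.length; // backslash + "n"

console.log({ quotedWithNewline, templateWithNewline, octalInQuoted, octalInTemplate, rawLength });
```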

They are all something like strings, so why are they inconsistent? Because they are different concepts, even though they look the same in most cases.

It's also my point with this issue: they are different concepts.

If that were true, we would get three consistent strings, since TemplateLiterals does not exist in HTML String and we choose JS string encoding.

To be clear: JSX has never claimed HTML compatibility.

This specification does not attempt to comply with any XML or HTML specification. JSX is designed as an ECMAScript feature and the similarity to XML is only for familiarity.

Which also means: there's always JS.

The inconsistency was exacerbated by the introduction of TemplateLiterals.

It's the same as the inconsistency between string literals and template literals. Put another way, this is what we (this PR) want.

And it raises the same question as with template literals: why is that inconsistency acceptable but not this one?

@tolmasky

because JSX 2.0 is probably never arriving

lol.

magic-akari commented 2 years ago

I am a supporter of JS string encoding. I subconsciously treat JSX as sugar for JS.

So it surprises me when the following code behaves differently.

const foo = "&amp;";

<div foo={foo} bar="&amp;" />;

And supporters of HTML string encoding will tell me "It follows the html encoding rules. Use {} to get back to the world of JS." Okay, okay, it's logical and self-consistent, I'll just have to accept it.


Everything will change if this pull request is merged.

Hey, look at this! It behaves consistently, whether the string literal is inside or outside the JSX. Don't tell me to use {} to get back to the JS world. We are using the same string syntax as in JS, and it is a JS-encoded string, as we expect.

const foo = "&amp;";

<div foo={foo} bar=`&amp;` />;

So, Why not change the existed jsx string's? Sure, it is a breaking change. But this pull request destroys the arguments of html encoding supporters.

magic-akari commented 2 years ago

@Kingwl

It's the same situation between template literal and string literal

No, they use a different syntax that is more easily noticeable.

The JSX string uses the same syntax as the JS string. The difference is in their contexts which is hard to notice.

Multi-line normal strings would be handy. I would be happy to have them in JS. But that is another case.

Kingwl commented 2 years ago

@magic-akari Thanks for the response

Okay, okay, it's logical and self-consistent, I'll just have to accept it.

IMO, this is a concept that helps in understanding the behavior. But is it a spec or a design goal? I'm not sure.

In other words, can this concept, as the supporters of HTML string encoding described it to you, be used as a rule for our future designs?

I'll just have to accept it.

I'm not a big fan of that either. We might just accept it because it has already become reality.

Everything will change if this pull request is merged.

Well, I don't think so. Nothing changes if you are not using this new syntax. The difference will only happen if you are using this template literal.

So, Why not change the existed jsx string's?

As you said, it is a breaking change. It might happen after JSX 2.0, which will probably never arrive.

The JSX string uses the same syntax as the JS string. The difference is in their contexts which is hard to notice.

That's a problem with existing JSX strings, not what we are talking about in this issue. And that is what this proposal wants: it was designed to limit the scope in order to avoid the conflict and get more consensus.

BTW: noticeability or inconsistency is not a boolean. It's not only inconsistent or consistent. It's actually a scale, as in "I think this one is more consistent than that one."

IMO it might be hard for this to be a strong blocker.