Closed by mhassan1 1 year ago
CodePointAt returns a Record, and strings basically are code points (the [[CodePoint]] field). What value would you expect there?
The description of UTF16SurrogatePairToCodePoint (below) sets cp to a number and then returns cp (unless we are supposed to interpret "Return the code point cp." as "Return the code point whose numeric value is that of cp.").
- Let cp be (lead - 0xD800) × 0x400 + (trail - 0xDC00) + 0x10000.
- Return the code point cp.
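For illustration, here is a minimal sketch of those two steps in JavaScript, under the "code point is a number" reading. The function name and the range checks are my own assumptions, not something the spec provides to user code.

```javascript
// Hypothetical helper mirroring the two quoted spec steps.
// lead and trail are numeric code unit values, e.g. from String.prototype.charCodeAt.
function utf16SurrogatePairToCodePoint(lead, trail) {
  // The spec asserts these preconditions; here they are checked explicitly.
  if (lead < 0xD800 || lead > 0xDBFF) {
    throw new RangeError("lead is not a leading surrogate");
  }
  if (trail < 0xDC00 || trail > 0xDFFF) {
    throw new RangeError("trail is not a trailing surrogate");
  }
  // Let cp be (lead - 0xD800) × 0x400 + (trail - 0xDC00) + 0x10000.
  return (lead - 0xD800) * 0x400 + (trail - 0xDC00) + 0x10000;
}

const face = "\u{1F600}"; // encoded as the surrogate pair 0xD83D 0xDE00
const decoded = utf16SurrogatePairToCodePoint(face.charCodeAt(0), face.charCodeAt(1));
console.log(decoded === face.codePointAt(0)); // true
```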
Maybe this is just a matter of interpretation of what a code point could be (a number or a string), but I thought it was supposed to be a number.
I think either is certainly a reasonable interpretation - and yes, I interpreted that line as a transformation of the number; otherwise it would have said "Return _cp_".
Do you have a use case where it makes a difference?
I think part of the confusion is that String.prototype.codePointAt returns a number:
- Return 𝔽(cp.[[CodePoint]]).
For that reason, "code point" feels like a number, but I agree that the spec isn't explicit about that. FWIW, MDN also says it's a number, but I guess that's talking about the String.prototype.codePointAt result, not necessarily the abstract "code point."
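A quick check of the user-facing behaviour, for reference. This only shows what the built-in methods return and says nothing about how the abstract operations are meant to be read.

```javascript
const sample = "a\u{1F600}";

// codePointAt hands back a Number, even for a supplementary code point.
const value = sample.codePointAt(1);
console.log(typeof value);        // "number"
console.log(value.toString(16));  // "1f600"

// Going back to a string is a separate, explicit step.
const roundTrip = String.fromCodePoint(value);
console.log(roundTrip === "\u{1F600}"); // true
```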
I don't have a use case where it makes a difference. I just found it confusing while I was implementing a polyfill for String.prototype.toWellFormed. For example, UTF16EncodeCodePoint has this:
- If cp ≤ 0xFFFF, return the String value consisting of the code unit whose numeric value is cp.
Again, that ≤ could be interpreted as "the number representing the code point is less than or equal to."
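Under the numeric reading, that step could be sketched in a polyfill roughly like this. The function name is hypothetical, and the branch for values above 0xFFFF is my recollection of the rest of UTF16EncodeCodePoint (split into a leading and a trailing surrogate), so treat it as an assumption rather than a quote.

```javascript
// Hypothetical helper: takes a numeric code point, returns a String.
function utf16EncodeCodePoint(cp) {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10FFFF) {
    throw new RangeError("cp must be an integer in the range 0..0x10FFFF");
  }
  // If cp ≤ 0xFFFF, return the String consisting of that single code unit.
  if (cp <= 0xFFFF) {
    return String.fromCharCode(cp);
  }
  // Otherwise split into a surrogate pair (assumed from the remaining spec steps).
  const cu1 = Math.floor((cp - 0x10000) / 0x400) + 0xD800;
  const cu2 = ((cp - 0x10000) % 0x400) + 0xDC00;
  return String.fromCharCode(cu1, cu2);
}

console.log(utf16EncodeCodePoint(0x41) === "A");              // true
console.log(utf16EncodeCodePoint(0x1F600) === "\u{1F600}");   // true
```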
do note that https://npmjs.com/string.prototype.towellformed already exists :-p
indeed, <= in JS works on both strings and numbers, so that was my interpretation here as well.
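One caveat for anyone writing this in actual JS: <= accepts both types, but the results can differ, because string comparison goes code unit by code unit. A small sketch (variable names are just for illustration):

```javascript
const cpNumber = 0x1F600;      // the code point as a number
const cpString = "\u{1F600}";  // the same code point as a string (two code units)

// Number vs number: 0x1F600 is above 0xFFFF.
const numericResult = cpNumber <= 0xFFFF;
console.log(numericResult); // false

// String vs string: compares the first code units, 0xD83D vs 0xFFFF.
const stringResult = cpString <= "\uFFFF";
console.log(stringResult); // true

// String vs number: the string is converted to a Number (NaN here).
const mixedResult = cpString <= 0xFFFF;
console.log(mixedResult); // false
```

So a polyfill probably wants to settle on numbers before doing the comparison.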
Understood. Closing this. Thanks!
Thanks for opening the discussion!
Currently, the implementations of the following operations return strings, but they should return numbers (for example, https://tc39.es/ecma262/#sec-utf16decodesurrogatepair):
- UTF16SurrogatePairToCodePoint
- UTF16DecodeSurrogatePair
- UTF16Decode
- UTF16DecodeString
- CodePointAt
- StringToCodePoints
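To make the numeric reading concrete, here is a rough sketch of CodePointAt returning a record whose code point field is a number, plus a toWellFormed-style loop that consumes it. The function names and the lowercase property names are hypothetical stand-ins for the spec's [[CodePoint]], [[CodeUnitCount]], and [[IsUnpairedSurrogate]] fields, and the steps follow the spec only as I understand it.

```javascript
const isLeadingSurrogate = (cu) => cu >= 0xD800 && cu <= 0xDBFF;
const isTrailingSurrogate = (cu) => cu >= 0xDC00 && cu <= 0xDFFF;

// Hypothetical stand-in for the abstract operation CodePointAt(string, position).
function codePointAtRecord(string, position) {
  const size = string.length;
  if (!Number.isInteger(position) || position < 0 || position >= size) {
    throw new RangeError("position out of range");
  }
  const first = string.charCodeAt(position);
  if (!isLeadingSurrogate(first) && !isTrailingSurrogate(first)) {
    return { codePoint: first, codeUnitCount: 1, isUnpairedSurrogate: false };
  }
  if (isTrailingSurrogate(first) || position + 1 === size) {
    return { codePoint: first, codeUnitCount: 1, isUnpairedSurrogate: true };
  }
  const second = string.charCodeAt(position + 1);
  if (!isTrailingSurrogate(second)) {
    return { codePoint: first, codeUnitCount: 1, isUnpairedSurrogate: true };
  }
  // Combine the pair numerically, as in UTF16SurrogatePairToCodePoint.
  const cp = (first - 0xD800) * 0x400 + (second - 0xDC00) + 0x10000;
  return { codePoint: cp, codeUnitCount: 2, isUnpairedSurrogate: false };
}

// Sketch of a toWellFormed-style loop: lone surrogates become U+FFFD.
function toWellFormedSketch(str) {
  let result = "";
  for (let k = 0; k < str.length; ) {
    const record = codePointAtRecord(str, k);
    result += record.isUnpairedSurrogate
      ? "\uFFFD"
      : String.fromCodePoint(record.codePoint);
    k += record.codeUnitCount;
  }
  return result;
}

console.log(toWellFormedSketch("a\uD83Db") === "a\uFFFDb"); // true
```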