dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License

`String.fromCharCode` and `String.fromCharCodes` should be `const` constructors #49407

Open rakudrama opened 2 years ago

rakudrama commented 2 years ago

I have needed to construct constant expression strings from their code points. This usually happens in programs that do some string processing with code points. Sometimes it is desired to define named constants for both the code unit and the String.

It is currently impossible to construct a constant String from a code point, and impossible to construct a const code point from a String. This leads to constants that are not obviously consistent:

const codeA = 0x41;
const charA = 'A';

These could be correct by construction:

const codeA = 0x41;
const charA = String.fromCharCode(codeA);

https://github.com/dart-lang/language/issues/2219 proposes making aString.codeUnitAt(anInt) a potentially constant expression, permitting

const codeA = charA.codeUnitAt(0);
const charA = 'A';

However, using String.codeUnitAt has a UTF-16 pitfall. The following is correct:

const codePerson = 0x1F9D1;
const charPerson = String.fromCharCode(codePerson);

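whereas this is not, because codeUnitAt(0) yields only the leading surrogate (0xD83E) of the UTF-16 pair rather than the code point 0x1F9D1: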

const codePerson = charPerson.codeUnitAt(0);
const charPerson = '🧑';
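
The surrogate-pair pitfall can be observed at runtime today. The following is a minimal sketch that uses only existing `dart:core` members; the helper name `hex` is made up for this illustration and is not part of any proposal in this thread.

```dart
const charPerson = '🧑'; // U+1F9D1, outside the Basic Multilingual Plane

// Hypothetical helper for this sketch: formats an int as 0x-prefixed hex.
String hex(int value) => '0x${value.toRadixString(16).toUpperCase()}';

void main() {
  // The string holds two UTF-16 code units, so codeUnitAt(0) only sees
  // the leading surrogate, not the full code point.
  print(charPerson.length); // 2
  print(hex(charPerson.codeUnitAt(0))); // 0xD83E
  print(hex(charPerson.codeUnitAt(1))); // 0xDDD1

  // runes iterates over code points and recombines the surrogate pair.
  print(hex(charPerson.runes.first)); // 0x1F9D1

  // String.fromCharCode accepts a full code point and emits both units.
  print(String.fromCharCode(0x1F9D1) == charPerson); // true
}
```

None of these calls are allowed in a constant expression at present, which is the gap this issue is about.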

This request is broken out from https://github.com/dart-lang/language/issues/2219#issuecomment-1175981550

lrhn commented 2 years ago

Making String.fromCharCodes a const constructor is a lie. It takes an iterable, and there is no constant way to iterate an iterable. Even if we require the argument to be the result of a constant list literal, there is still no in-language way to iterate a constant list.

That doesn't mean we can't do it. It just requires special-casing in the compilers (the magic doesn't happen in the constructor), and we likely need to mention the restriction (the argument must be the result of a const list literal expression) in the language specification.

How about:

Then you do

const codePerson = char(0x1f9d1);
const charPerson = "$codePerson";

(Hmm, that does mean that views need to have const constructors. Might be an issue.)

rakudrama commented 2 years ago

@lrhn Some questions:

Q1: So can String.fromCharCode be const? I don't want String.fromCharCode to be delayed because of some issue with String.fromCharCodes.

Q2: I don't get why making String.fromCharCodes a const constructor is a 'lie'. We have other const constructors that are only const if the argument is const.

class Foo {
  final Iterable<int> ints;
  const Foo(this.ints);
}

const aFoo = Foo([1,2,3]);

As far as evaluation goes, if we can spread constant Iterables, there must be a mechanism in the compiler for accessing the elements of the constant operand.
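
To make the point about spreading concrete, here is a small sketch of what already works in constant contexts. The names `codes` and `more` are invented for this example; the final call to `String.fromCharCodes` runs at runtime, since it is not a const constructor today.

```dart
class Foo {
  final Iterable<int> ints;
  const Foo(this.ints);
}

// Constant list literals, the second built by spreading the first.
// The compiler has to read the elements of `codes` to evaluate `more`.
const codes = [0x48, 0x69];
const more = [...codes, 0x21];

// A const constructor invocation whose argument is a constant list.
const aFoo = Foo(codes);

void main() {
  print(more); // [72, 105, 33]
  print(identical(aFoo.ints, codes)); // true: same canonical constant

  // Runtime only for now: the constructor this issue asks to make const.
  print(String.fromCharCodes(more)); // Hi!
}
```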

lrhn commented 2 years ago

The String.fromCharCode also needs compiler cooperation. We can't write it in pure Dart code. The only way to allocate a string in constant Dart code is a string literal, and as you recognized, there is no way to go from code point to string, or vice versa, in constant (or potentially-constant) code.

A plain generative Dart const constructor is limited in what it can do. It can create an instance of the class it's on, and it can fill in (necessarily final) fields by evaluating only potentially constant expressions. The String.fromCharCode/String.fromCharCodes factory constructors do more than that.

There is no way to write a generative String constructor in plain Dart. Therefore there is no way to write a const constructor in plain Dart, because it must either be generative or be redirecting to another const constructor.
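
As an illustration of those limits, the sketch below shows what a const constructor written in plain Dart can do. The class `CodePoint` and its members are hypothetical and exist only for this example; it can wrap an int as a constant, but it cannot produce a String in constant code.

```dart
// Hypothetical wrapper class, used only to illustrate const constructor rules.
class CodePoint {
  final int value;

  // Generative const constructor: may only initialize final fields and
  // evaluate potentially constant expressions (here, an assert).
  const CodePoint(this.value) : assert(value >= 0 && value <= 0x10FFFF);

  // Redirecting const constructor: forwards to another const constructor,
  // again using only a potentially constant expression.
  const CodePoint.ascii(int v) : this(v & 0x7F);

  // Ordinary method: runs at runtime, so it may call String.fromCharCode.
  String asString() => String.fromCharCode(value);
}

const codeA = CodePoint(0x41);
const alsoA = CodePoint.ascii(0xC1); // 0xC1 & 0x7F == 0x41

void main() {
  print(codeA.asString()); // A
  print(alsoA.value == codeA.value); // true
  // `const charA = codeA.asString();` would be rejected: a method
  // invocation is not a constant expression.
}
```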

To have const String.fromCharCode(42) or const String.fromCharCodes([42]) requires the compilers (and analyzer) to recognize the calls, extract the arguments, and create a constant string instance from it without calling a String constructor, like it does for string literals.

That's not impossible. We do it for const Symbol already. It's just not a simple library change. It's not just a simple const constructor like other const constructors (that's why I say we "lie".)

It might be possible to handle it entirely in the front-end, effectively rewriting const String.fromCharCode(42) or const String.fromCharCodes([42]) into the string literal "\u002a". And we don't even need to make the constructors const to enable this rewrite, we can do it on any invocation to those constructors with suitable constant arguments. (Still need the const to allow the invocation in a constant context, though.)
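
The equivalence such a rewrite would rely on can be checked at runtime. This is only a sketch of the idea, not how any front-end is implemented; the name `literal` is chosen for the example.

```dart
// The string literal a front-end rewrite could substitute for the calls.
const literal = '\u002a';

void main() {
  // Today both constructor calls are evaluated at runtime.
  final fromOne = String.fromCharCode(42);
  final fromMany = String.fromCharCodes([42]);

  print(fromOne == literal); // true
  print(fromMany == literal); // true
  print(literal); // *
}
```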

So, this is a front-end request more than a library request.

eernstg commented 2 years ago

> this is a front-end request

I suggested at some point that this should be an SDK issue with 'area-library' (rather than a language repo issue) because I would not expect the language specification to mention it, not because I thought that it could be implemented without going outside the language. This is similar to, e.g., bool.hasEnvironment.

Maybe, if adopted and scheduled for implementation, it's an 'area-meta' with sub-issues in 'area-library' and in 'area-front-end'?