dart-lang / language

Design of the Dart language

Typings is a package that transpiles TypeScript to Dart and enables direct use of NPM packages. Let's make it sound #2995

Open jodinathan opened 1 year ago

jodinathan commented 1 year ago

Hi,

I am the author of js_bindings and now I've published Typings.

The package transpiles a TypeScript declaration file (.d.ts) to Dart JS interop.
It will replace js_bindings as it has full ES2023 interop.

The transpiling process can be run automatically on any NPM package that has a TS declaration file, and it also exposes a helper function that injects the JS into the DOM.
It basically allows a developer to use NPM packages without the burden of manually generating the JS interop or adding the script tag to the HTML file.

The builder tries to understand the TS code in an attempt to transpile sound Dart code.
However, this is no simple task.

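Some questions:

  • Can inline classes help here?
  • How to make js types like Array or [Symbol.iterator] loopable?
  • What types can be directly translated to Dart? You can check the current mappings here.
  • What to do with types that doesn't exist as feature in Dart? Like unions, intersections, type guards (predicates) etc.
  • Can Records be used for TS interfaces (anonymous classes)? They seem to be the best fit
  • What to do with overloading methods?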

I already address some of these questions; you can learn more in the README and the WIP Wiki.

As it is a new package intended to use full Dart 3, we can run tests to find the best way to transpile the code before releasing a stable version.

leafpetersen commented 1 year ago

cc @srujzs @joshualitt @sigmundch @eernstg

lrhn commented 1 year ago

Disclaimer, I don't actually know anything about Dart JS interop.

  • Can inline classes help here?

Probably. Providing a static Dart view on a different/foreign object representation is pretty much what inline classes are designed for. It should be usable for anything like this.

  • How to make js types like Array or [Symbol.iterator] loopable?

That's harder. Dart currently requires the target of a for/in to implement Iterable. So, you need an object which does that. Inline classes do not count, since they are entirely static and cannot implement (late-dispatched) interfaces. If you can make your interop class implement Iterable, it should work.

  • What types can be directly translated to Dart? You can check the current mappings here.

By "translated", do you mean "converted to" or "interpreted as"?

If the former, I'd convert bigint to Dart BigInt. (But I don't know the DOM types well enough to say anything useful.)

Promises to futures, and vice-versa, should be doable.

  • What to do with types that doesn't exist as feature in Dart? Like unions, intersections, type guards (predicates) etc.

The usual Dart approach to union types is to use dynamic and runtime type checks. Or creating separate functions, if the combinatorial explosion is limited.
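
To make the two options concrete, here is a minimal sketch that models both strategies in TypeScript, the source language under discussion. The names `A`, `B`, `fooDynamic`, `fooWithA`, and `fooWithB` are hypothetical. In Dart, the first form would presumably be a `dynamic` parameter guarded by `is` checks.

```typescript
// Hypothetical types standing in for the members of a union `A | B`.
class A { readonly kind = "a"; }
class B { readonly kind = "b"; }

// Strategy 1: accept an untyped value and check it at run time,
// mirroring a Dart parameter of type `dynamic` guarded by `is` tests.
function fooDynamic(bar: unknown): string {
  if (bar instanceof A) return "handled A";
  if (bar instanceof B) return "handled B";
  throw new TypeError("bar must be an A or a B");
}

// Strategy 2: one statically typed entry point per union member.
// This only scales while the number of combinations stays small.
function fooWithA(bar: A): string { return fooDynamic(bar); }
function fooWithB(bar: B): string { return fooDynamic(bar); }
```

The first form keeps a single entry point but defers every mistake to run time. The second keeps static checking at the cost of one function per union member.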

  • Can Records be used for TS interfaces (anonymous classes)? They seem to be the best fit

Possibly not. Interfaces can have optional properties, but Dart record fields cannot be optional, which means that you have to insert a default value at the creation point. (But if the TS compiler does that, it's probably fine.) It does seem that you're not allowed to have extra properties in the interface values, which would otherwise have been a problem.

The bigger issue is that interface properties can be non-readonly, whereas Dart records are unmodifiable. It's probably possible to work around that (use a record containing mutable value wrappers), but it's not as direct.

  • What to do with overloading methods?

Naming them differently, but that's hard to automate. You may also have to account for TS member names which do not translate to Dart.

var action = {while: "I'm OK", do: "Stay OK"};

This seems to be valid TS, because object properties are allowed to be reserved words. Both while and do are reserved words in Dart and cannot be used as record field names, because the parser won't allow reserved words where an identifier is expected.
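
A generator would therefore need some escaping rule for such names. The sketch below is only one possible approach: the function name `toDartIdentifier` and the choice of a `$` suffix are assumptions, not something Typings is known to do, and the word list is Dart's reserved words as I understand the language specification. It does not handle names that are invalid identifiers for other reasons (for example, names containing `-`).

```typescript
// Dart's reserved words (these can never be used as identifiers).
const dartReservedWords = new Set<string>([
  "assert", "break", "case", "catch", "class", "const", "continue",
  "default", "do", "else", "enum", "extends", "false", "final",
  "finally", "for", "if", "in", "is", "new", "null", "rethrow",
  "return", "super", "switch", "this", "throw", "true", "try",
  "var", "void", "while", "with",
]);

// Hypothetical escaping rule: append `$` to any name that collides.
function toDartIdentifier(tsName: string): string {
  return dartReservedWords.has(tsName) ? `${tsName}$` : tsName;
}

// Valid TypeScript: reserved words are allowed as property names.
const action = { while: "I'm OK", do: "Stay OK" };
const dartNames = Object.keys(action).map(toDartIdentifier);
```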

eernstg commented 1 year ago

Agreeing with @lrhn, I'll add a few extra comments.

@jodinathan wrote:

However, this is no simple task.

That is certainly also what I would expect. The type systems are very different, and even small differences can be hard to bridge when it comes to global properties like static typing.

Computer scientists and language designers have done many things over a period of decades to make static type checking modular. So it could be claimed that static typing is a local property. However, that's only because we are adding a lot of declared types to programs (such as formal parameter types and return types of functions, or types of instance variables, etc). It is probably necessary to deal with the declared as well as the computed types in a global manner in order to find "an equivalent typing" using a different type system. That said, I'd be really surprised if any such "equivalent typing" even exists, in all but the most trivial cases.

However, we can use some systematic techniques to express a compatible, but less precise, typing. For instance, we could start by using dynamic as the return type of every function, and as the type of every formal parameter, and as the type of every local variable.

The next step could be to handle some particularly safe cases, e.g., by translating basic types of TypeScript to Dart types. Further steps would provide a non-dynamic typing to larger and larger parts of the given JS code. However, I don't think it will be possible to make firm statements about the soundness of the typing (in particular, TypeScript isn't sound ;-). What we can do (and basically can't even avoid in Dart) is to ensure that the execution is sound at the object level: Every value which is considered to be a reference to an object is actually a reference to an object (that is, we won't have SIGSEGV or SIGBUS errors at run time). We may also be able to run JavaScript code to confirm that any given JavaScript object is actually an appropriate object for being labeled as having a given type in Dart; but that might not happen in practice, because nobody wants to pay for that work during execution of a deployed application. In summary: We're going to have some level of unsoundness no matter what.
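
The "start with dynamic, then tighten the safe cases" idea can be sketched as a lookup with a fallback. The table below is purely illustrative and is not the mapping that Typings actually uses; `basicTypeMap` and `toDartType` are hypothetical names, and mapping `number` to `num` is an assumption.

```typescript
// Illustrative mapping only; not the table that Typings actually uses.
const basicTypeMap = new Map<string, string>([
  ["string", "String"],
  ["number", "num"],
  ["boolean", "bool"],
  ["void", "void"],
  ["bigint", "BigInt"],
]);

// Translate a TS type name, falling back to `dynamic` for anything
// that has not (yet) been given a more precise Dart typing.
function toDartType(tsType: string): string {
  return basicTypeMap.get(tsType) ?? "dynamic";
}
```

Each later step would add entries or smarter rules, shrinking the set of inputs that still fall through to `dynamic`.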

There is ongoing work on a new version of JS interop (@srujzs and @joshualitt are at the core of this work), and they plan to use specialized Dart types to provide access to JS objects in a Dart program, based on special "native" classes like JSObject, JSNumber, and several others.

I think this new approach to JS interop would be a very useful component of your effort with Typings. Basically, I think you would be able to generate code using the new JS interop from the given *.d.ts source.

The ability to make the transition from a Dart int or double to/from a JSNumber will be part of the new JS interop. There are ongoing discussions about how to do that, smoothly and conveniently. This means that it would essentially be a feature that you get for free. This feature might not work 100% in the manner we (or any particular developer) might want. For instance, can we do JSNumber n = 1; or do we have to say JSNumber n = 1.asJSNumber;? However, I think it's one of those cases where a well-supported standard solution should be used, because it's simply not realistic or useful to deal with that task in a non-standard manner. So if you have to say JSNumber n = 1.asJSNumber; then so be it. ;-)

Some questions:

  • Can inline classes help here?

The plan is that the new JS interop will use inline classes to provide a statically typed way to access JS objects, and I'd assume that Typings would provide support for generating JS interop code which would otherwise be created in some other way (e.g., it could be handwritten, based on an informal type analysis of the underlying JS code).

Of course, this doesn't eliminate the difficulties I mentioned above, based on the fact that the type systems are deeply different. But it does provide a well-understood, widely used framework to express the chosen Dart typing which is the result of that difficult type-system-to-type-system translation process.

  • How to make js types like Array or [Symbol.iterator] loopable?

The exact same question has been raised in connection with the new JS interop.

Some possible designs were discussed last year, e.g., in https://github.com/dart-lang/language/issues/2150. The basic idea was that it should be possible to satisfy the requirements for being the iterable object of a 'for-in' statement without actually implementing Iterable<T> for any T.

We used to have a structural (typeless) approach: Any object with an iterator getter will do if it returns an object whose interface has a moveNext() method with return type bool and a current getter whose return type is T. However, nobody in the language team was happy about eliminating the element of "documented intent" that we get from the requirement that we must actually have an Iterable<T>.

We could also have a special exception for inline classes (so the iterable object could be an Iterable<T> for some T, or it could have a similarly blessed inline type).

I don't think there is an easy way out here, other than implementing a getter in the inline class that actually returns an object of type Iterable<T> and then using that: for (var v in myInlineThing.iterable) ....
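
The object behind such an iterable getter has to translate between two iterator protocols. The following sketch models that translation in TypeScript rather than Dart: `DartStyleIterator`, `JsIteratorAdapter`, and `drain` are hypothetical names, and the interface only imitates the moveNext()/current shape described above.

```typescript
// Dart-style iterator protocol, modeled as a TypeScript interface.
interface DartStyleIterator<T> {
  moveNext(): boolean;
  readonly current: T;
}

// Hypothetical adapter: drives a JS iterator (`next()` returning
// `{ done, value }`) through the `moveNext()` / `current` shape.
class JsIteratorAdapter<T> implements DartStyleIterator<T> {
  private readonly inner: Iterator<T>;
  private value: T | undefined;

  constructor(source: Iterable<T>) {
    this.inner = source[Symbol.iterator]();
  }

  moveNext(): boolean {
    const step = this.inner.next();
    if (step.done) {
      this.value = undefined;
      return false;
    }
    this.value = step.value;
    return true;
  }

  get current(): T {
    // Only meaningful after moveNext() has returned true.
    return this.value as T;
  }
}

// Collect every element by driving the adapter the way a for-in loop would.
function drain<T>(source: Iterable<T>): T[] {
  const out: T[] = [];
  const it = new JsIteratorAdapter(source);
  while (it.moveNext()) out.push(it.current);
  return out;
}
```

In Dart, the iterator returned by the Iterable<T> would presumably perform this same translation over the underlying JS object.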

However, there was a considerable amount of interest in the topic, so we might be able to come up with a better approach, just not in the short term.

  • What types can be directly translated to Dart? You can check the current mappings here.

I think the fact that the new JS interop uses JSObject, JSNumber etc. is a strong hint that there is a need for some "native" classes, such that we can access the raw JS objects in a Dart context. It might be convenient and easy to translate those objects to regular Dart objects (or maybe that's not even quite as easy as we would want), but there will most likely be a need to translate.

  • What to do with types that doesn't exist as feature in Dart? Like unions, intersections, type guards (predicates) etc.

That was the hard part. ;-)

jodinathan commented 1 year ago

@eernstg @lrhn There is already some logic that tries to make the transpiled Dart code sound.

For example, an overloaded method becomes a Record with one function field per overload:

interface Document {
  createElement<K extends keyof HTMLElementTagNameMap>(tagName: K, options?: ElementCreationOptions): HTMLElementTagNameMap[K];
  createElement(tagName: string, options?: ElementCreationOptions): HTMLElement;
}

Becomes a Record in Dart:

class Document {
  ({
    /// Creates an instance of the element for the specified tag.
    ///  @param tagName The name of an element.
    K$ Function<K$ extends _i3.Element>(
      HTMLElementTagNameMap<K$> tagName, [
      _i3.ElementCreationOptions? options,
    ]) $1,

    /// Creates an instance of the element for the specified tag.
    _i3.HTMLElement Function(
      _i2.String tagName, [
      _i3.ElementCreationOptions? options,
    ]) $2,
  }) get createElement => (
        $1: _createElement$1,
        $2: _createElement$2,
      );
}

To keep it as sound as I can, the first version, $1, of createElement relies on inference:

final div = js.document.createElement.$1(js.HTMLElementTagNameMap.div);

// div is of type HTMLDivElement, thanks to inference

The idea is not to transpile an exact version of the TS code, but a sound Dart version and let the inner code do the JS magic.
However, this must be done carefully so as not to diverge too much from the TS signature, and so that the inner code does not hurt performance.

For example, we could do some trick to make Union types sound in Dart:

interface Foo {
  foo(bar: A | B): void;
}

The bar arg would end up as dynamic: void foo(dynamic bar);.

However, we could transpile the union to a record and check for a non-null field in the call:

class Foo {
  void foo(({A? a, B? b}) bar) {
    js.callMethod(this, 'foo', [ bar.a ?? bar.b ]);
  }
}

Then the user will have a sound signature to use:

  final f = Foo();
  // Record fields cannot be omitted, so the unused member is passed as null.
  f.foo((a: A(), b: null));

Adding a condition within foo could hurt performance.
But not at all if dart2js can inline the call in the compiled JS:

Foo().foo(A()); // there isn't a record or condition in the compiled JS

Then we would be OK regarding performance; however, the question would remain whether this is the best approach for users.

Please, take a look at the Wiki as it contains some of the "tricks" I am already doing to try to make it sound.

It would be nice to know whether the current approach makes sense or what I could/should change.

The ability to make the transition from a Dart int or double to/from a JSNumber will be part of the new JS interop.

Weird. Can't we already use int or double for JS interop without JSNumber? I mean, all interop stuff I did always worked with number types directly.
Will this change?
For example, I currently map JS ArrayBuffer to dart:typed_data ByteBuffer, along with many other types.

We could also have a special exception for inline classes (so the iterable object could be an Iterable<T> for some T, or it could have a similarly blessed inline type).

This seems to be the best approach.
I could easily generate inline classes with the Iterable trait from the TS interfaces.

sigmundch commented 1 year ago

Thanks for sharing about your current efforts in package:typings!

In case you are not aware, our prior work in js_facade_gen is very related and may provide some insights. js_facade_gen uses the TS parser to read in .d.ts files and generate Dart JS-interop facade code. This predates @staticInterop, but the generated patterns are similar - the output has empty JS classes and static extension methods. At the time, we even used TS's dom.d.ts as an input to experiment with replacing dart:html (which heavily informed what we did for static interop and inline classes, and clearly overlaps with what both js_bindings and package:web do).

The team that worked on that package also faced questions similar to those you are addressing in package:typings. For example, they used a least-upper-bound to resolve unions and intersection types (see merge.ts). Note also that many of these questions have existed since the early days of dart:html. In dart:html we made some choices that sometimes ended up informing decisions in js_facade_gen (and recently in package:web).
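
As a rough illustration of what a least-upper-bound resolution means, the sketch below finds the nearest common supertype in a tiny single-inheritance hierarchy. It is not taken from merge.ts and makes no claim about how js_facade_gen works; `parentOf`, `supertypeChain`, `leastUpperBound`, and the fallback to "Object" are all assumptions.

```typescript
// Hypothetical class hierarchy: each type maps to its direct supertype.
const parentOf = new Map<string, string>([
  ["HTMLDivElement", "HTMLElement"],
  ["HTMLSpanElement", "HTMLElement"],
  ["HTMLElement", "Element"],
  ["Element", "Node"],
  ["Node", "Object"],
]);

// Chain from a type up to the root, starting with the type itself.
function supertypeChain(type: string): string[] {
  const chain = [type];
  let current = parentOf.get(type);
  while (current !== undefined) {
    chain.push(current);
    current = parentOf.get(current);
  }
  return chain;
}

// Least upper bound of a union: the nearest type shared by every chain.
// Falls back to "Object" when the members have nothing in common.
function leastUpperBound(members: string[]): string {
  const chains = members.map(supertypeChain);
  const first = chains[0];
  if (first === undefined) return "Object";
  const shared = first.find((candidate) =>
    chains.every((chain) => chain.indexOf(candidate) !== -1));
  return shared ?? "Object";
}
```

Under these assumptions, the union HTMLDivElement | HTMLSpanElement would be typed as HTMLElement, which is less precise than the union but still statically checkable.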

It's worth noting that there are scenarios where we couldn't pick a single strategy in the past. For example, dart:html took two approaches for method overloading. Sometimes it exposed each overload with a different name (see for example, send in RtcDataChannel which is exposed as multiple methods like sendBlob and sendByteBuffer) and sometimes it added a trampoline and used is/null checks to select the appropriate overload (see for example CanvasElement.getContext or createImageData).
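
The two overloading strategies can be sketched side by side. This is a simplified TypeScript model, not dart:html's actual implementation: `Payload`, `log`, and the function bodies are hypothetical, and only the naming pattern follows the examples above.

```typescript
// Hypothetical payload type standing in for the overloaded argument.
type Payload = string | ArrayBuffer;

// Records which overload ran, so the dispatch is observable.
const log: string[] = [];

// Strategy 1: one distinctly named function per overload.
function sendString(data: string): void {
  log.push(`string:${data.length}`);
}
function sendByteBuffer(data: ArrayBuffer): void {
  log.push(`bytes:${data.byteLength}`);
}

// Strategy 2: a single trampoline that selects the overload
// with runtime checks, similar in spirit to Dart's `is` tests.
function send(data: Payload): void {
  if (typeof data === "string") {
    sendString(data);
  } else {
    sendByteBuffer(data);
  }
}
```

Distinct names keep every call statically typed, while the trampoline preserves the original API shape at the cost of a runtime check per call.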

The ability to make the transition from a Dart int or double to/from a JSNumber will be part of the new JS interop.

Weird. Can't we already use int or double for JS interop without JSNumber? I mean, all interop stuff I did always worked with number types directly. Will this change? For example, I currently map JS ArrayBuffer to dart:typed_data ByteBuffer, along with many other types.

Great questions. This is changing in part due to the need to be clearer about the Dart/JS boundary. We are working to make JS interop work both on JS-based backends (DDC and dart2js) and in the WasmGC backend (dart2wasm). Without Wasm, it was not as noticeable, and we felt OK conflating the types: int and JSNumber were represented by the same value, so we didn't bother making a distinction. With the WasmGC backend this is changing: we have separate heaps and we need to better map the boundary. Moreover, we had cases in the past where conflation was problematic. For example, we allowed List to be passed, but in reality it only worked if you passed a JSArray. Similarly, we allowed arbitrary functions, when in practice only functions wrapped with allowInterop worked. Changes to enforce strict JS types at the JS-interop boundary will help detect these issues with List and Functions at compile-time.

That said, we know this would be a big regression in ergonomics if you have to write .toJS on every argument to every JS-interop API. That's why we have been pushing for a language change that would let us have automatic conversions as a feature. We would then use it to cover the common primitive values (int to JSNumber). We probably don't want to use automatic conversions everywhere, since for more expensive objects developers should be involved in deciding what to do. For example, should a list be copied over or proxied?