rescript-lang / rescript

ReScript is a robustly typed language that compiles to efficient and human-readable JavaScript.
https://rescript-lang.org

RFC: New GenType for TypeScript #6196

Open cometkim opened 1 year ago

cometkim commented 1 year ago

This RFC proposes a new GenType that is incompatible with the existing one (it will probably be named GenType v2, or something quite different).

Motivation

GenType is a must in today's JS ecosystem, starting with TypeScript. GenType is ending its earlier support for untyped JS and Flow, and is focusing more on TypeScript.

As ReScript v11 gets better interop capabilities and gets rid of runtime conversions, it may be a good time to explore new possibilities for GenType through aggressive changes.

Goals

Design

The key to the change is to assume the output of GenType is no longer the input of tsc. Instead, it becomes the direct output.

The only behavior of the new GenType is to emit an interface per module. So it is enabled per project, not specified per binding.

%%private(let value = 123)

let foo = () => value

will generate Module.bs.d.ts next to Module.bs.js:

export const foo: () => number;

This makes Module.bs.js usable in a TypeScript project without any additional build step. That's it. All we have to do is adjust the generated types so that they accurately describe the ReScript output.

No attributes

The new GenType will work for all sources if enabled. It no longer requires a @genType attribute on a per-binding basis, nor does it introduce a per-file attribute like @@genType.

The @genType attribute makes PPX compatibility difficult (https://github.com/rescript-lang/rescript-compiler/pull/6537) and adds complexity for tracking dependencies, because even when the user does not specify it, it is applied implicitly to related bindings.

Configuration

Don't inherit the existing gentypeconfig. Almost none of the existing options are required.

Determine the output format and location via package-specs.

Determine the output filename via suffix, e.g. .bs.js -> .bs.d.ts.
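As a rough illustration of the suffix rule, the sketch below derives the declaration filename from a compiled output path. The helper name `declarationPathFor` is hypothetical and not part of GenType or the ReScript build system. It assumes the configured suffix ends in `.js`, `.mjs`, or `.cjs`, and follows TypeScript's convention of pairing those with `.d.ts`, `.d.mts`, and `.d.cts`.

```typescript
// Hypothetical helper: maps a compiled output path to its sibling declaration path.
// Assumes the configured suffix ends in .js, .mjs, or .cjs.
function declarationPathFor(jsPath: string, suffix: string): string {
  if (!jsPath.endsWith(suffix)) {
    throw new Error(`"${jsPath}" does not end with suffix "${suffix}"`);
  }
  const match = /\.([mc]?)js$/.exec(suffix);
  if (match === null) {
    throw new Error(`unsupported suffix "${suffix}"`);
  }
  // TypeScript pairs .js with .d.ts, .mjs with .d.mts, and .cjs with .d.cts.
  const declSuffix = suffix.slice(0, match.index) + `.d.${match[1]}ts`;
  return jsPath.slice(0, jsPath.length - suffix.length) + declSuffix;
}

console.log(declarationPathFor("src/Module.bs.js", ".bs.js")); // src/Module.bs.d.ts
```

Keeping the user's own suffix prefix (`.bs`, `.res`, ...) means the declaration file always sits next to the JavaScript file it describes.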

Shims

Shims for external types are OK. Use them as-is.

If the project is a library, shims must be published together.

Nested / First-class modules

Namespaces might be useful for representing nested modules (see #6117). But we don't use namespaces here because they aren't compatible with type referencing and first-class module syntax.

Example: First-class modules

```res
@@genType

module type MT = {
  let x: int
  type t = int
  @module("foo") external f: int => int = "f"
  module type XXX = {
    type tt = string
  }
  module EmptyInnerModule: {
  }
  module InnerModule2: {
    let k: t
  }
  module InnerModule3: {
    type inner = int
    let k3: inner => inner
  }
  module type TT = {
    let u: (int, int)
  }
  module Z: TT
  let y: string
}

module M = {
  let y = "abc"
  module type XXX = {
    type tt = string
  }
  module EmptyInnerModule = {
  }
  module InnerModule2 = {
    let k = 4242
  }
  module InnerModule3 = {
    type inner = int
    let k3 = x => x + 1
  }
  module type TT = {
    let u: (int, int)
  }
  module Z = {
    let u = (0, 0)
  }
  type t = int
  @module("foo") external f: int => int = "f"
  let x = 42
}

type firstClassModule = module(MT)
let firstClassModule: firstClassModule = module(M)
```

will generate

```ts
export type MT = {
  x: number,
  t: number,
  XXX: MT_XXX,
  EmptyInnerModule: MT_EmptyInnerModule,
  InnerModule2: MT_InnerModule2,
  InnerModule3: MT_InnerModule3,
  TT: MT_TT,
  Z: MT_Z,
};

type MT_XXX = {
  tt: string,
};

type MT_EmptyInnerModule = {
};

type MT_InnerModule2 = {
  k: MT["t"],
};

type MT_InnerModule3 = {
  inner: number,
  k3: (arg1: MT_InnerModule3["inner"]) => MT_InnerModule3["inner"],
};

type MT_TT = {
  u: [number, number],
};

type MT_Z = MT_TT;

export const M: {
  x: number,
  t: number,
  XXX: M_XXX,
  EmptyInnerModule: M_EmptyInnerModule,
  InnerModule2: M_InnerModule2,
  InnerModule3: M_InnerModule3,
  TT: M_TT,
  Z: M_Z,
};

type M_XXX = {
  tt: string,
};

type M_EmptyInnerModule = {
};

type M_InnerModule2 = {
  k: M["t"],
};

type M_InnerModule3 = {
  inner: number,
  k3: (arg1: M_InnerModule3["inner"]) => M_InnerModule3["inner"],
};

type M_TT = {
  u: [number, number],
};

type M_Z = M_TT;

export type firstClassModule = MT;
export const firstClassModule: firstClassModule;
```

Opaque types

TBD

GADT

TBD

Others

Please leave comments on which cases it needs to cover.

cristianoc commented 1 year ago

Could you update some of the outdated info? As of master, gentype does not generate any runtime conversion, so we can take that out of the design considerations.

cristianoc commented 1 year ago

I would include removing shims as a goal. Do we need them?

cristianoc commented 1 year ago

Based on this goal https://github.com/rescript-lang/rescript-compiler/issues/6210, it seems sensible to only focus on export to JS, and move imports questions to the FFI layer, which can be revisited separately. So effectively, the concept of @genType.import could be removed.

In particular, type-checked bindings could be re-explored in the context of https://github.com/rescript-lang/rescript-compiler/issues/6211

Import: a bit more clarity is needed. It would be nice to remove them. Does it mean that imports will be essentially untyped? Or that it is not genType's job to make sure that imports are well typed? Not sure which one is meant, just asking for clarification of the proposal.

cristianoc commented 1 year ago

Not sure we need a new tool. It seems easier to just add configuration to turn on the new mode. And if successful, the new mode will replace the legacy mode over time which will be deprecated and removed.

gustavopch commented 12 months ago

Having GenType generate .bs.d.ts automatically would be very much appreciated. Until that happens, I'm using this js-post-build script, in case it helps anyone:

#!/usr/bin/env bash

bs_js_path="$1"
bs_ts_path="${bs_js_path/bs\.js/bs.ts}"
bs_dts_path="${bs_js_path/bs\.js/bs.d.ts}"
gen_tsx_path="${bs_js_path/bs\.js/gen.tsx}"

if [ -f "$gen_tsx_path" ]; then
  mv -f "$gen_tsx_path" "$bs_ts_path"
  yarn tsc --declaration --emitDeclarationOnly --isolatedModules --skipLibCheck "$(readlink -f "$bs_ts_path")"
  rm -f "$bs_ts_path"
else
  rm -f "$bs_ts_path" "$bs_dts_path" "$gen_tsx_path"
fi

cometkim commented 11 months ago

> I would include removing shims as a goal. Do we need them?

This is possible if we embed .d.ts for things that are treated first-class like Js/React/Belt, etc.

cometkim commented 5 months ago

Retreat Update: GenType

We (@cometkim, @JonoPrest, @cristianoc) discussed the roadmap and the detailed design of the output formats at the ReScript Retreat 2024, and we have settled most issues! If there are no remaining questions, implementation will begin as soon as possible.

Core concepts

Most are similar to existing plans but are more specific.

Drop attributes

Except in special cases, all @genType attributes will be ignored. Instead, enabling genType in the project will generate a type that matches any actual JavaScript value.

Not only is this clearer, but it also significantly reduces the complexity of the compiler codebase.

Drop shims

We haven't found a legitimate use case for shims without runtimes.

By supporting .d.ts, libraries can ship type definitions without any additional TypeScript toolchain. Alternatively, it can also be generated on the fly by the user.

Also, many cases may be covered by using @genType.import.

Keep supporting @genType.import

We found that @genType.import has an actual use case, unlike the other attributes.

@genType.import(("./modulePath", "TypeName"))
type typeName = {
  ...
};

This serves to connect back to external types in GenType output while using compatible types defined in ReScript. In other words, it's like FFI but for TypeScript types.

There was a suggestion to support this with syntax rather than an attribute,

@module("./modulePath") @type
external typeName: resType = "TypeName"

but that won't happen right away because it doesn't solve any additional problem and it adds some complexity.

New opaque format

We're removing special attributes like @genType.opaque, but we will still support opaque types.

We explored practical use cases leveraging opaque types in both ReScript and TypeScript worlds, and designed interoperable formats. An example:

type valid
type invalid

type t<'s> = string

let validate: t<invalid> => t<valid>

This pattern is well known as "phantom types" in many other type systems and as "branded types" in TypeScript.

The existing format uses TypeScript's abstract classes and protected fields to express this.

abstract class valid { protected opaque!: any }
abstract class invalid { protected opaque!: any }
abstract class t<a> { protected opaque!: a }

This is a well-known trick, but since a class has both value and type semantics, the identifier can be misused in a value position. IDE auto-completion can trigger this even when the user doesn't intend it.

The new proposed format is:

declare global {
  interface $$Module {
    const valid: unique symbol;
    const invalid: unique symbol;
    const t: unique symbol;
  }
}

export type valid = { [$$Module.valid]: [] };
export type invalid = { [$$Module.invalid]: [] };
export type t<a> = string & { [$$Module.t]: [a] };

It may look more complex, but it gives the best usability so far.

One downside is that TS error messages will be bloated even more. But that is not much different from utilities (e.g. ts-brand) that are popular today.
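To make the branding idea concrete, here is a minimal runnable sketch of the same phantom-type pattern. It is not the proposed output: it uses module-local `declare const` brand keys instead of the global `$$Module` container, and the names `fromString`, `validBrand`, `invalidBrand`, `tBrand`, as well as the non-empty validation rule, are assumptions made for illustration.

```typescript
// Brand keys exist only at the type level; `declare` emits no runtime code.
// Assumption: module-local keys stand in for the proposed global $$Module container.
declare const validBrand: unique symbol;
declare const invalidBrand: unique symbol;
declare const tBrand: unique symbol;

type Valid = { [validBrand]: [] };
type Invalid = { [invalidBrand]: [] };
// A plain string at runtime; the phantom parameter S lives only in the brand.
type T<S> = string & { [tBrand]: [S] };

// Hypothetical constructor: every raw string starts out unvalidated.
function fromString(raw: string): T<Invalid> {
  return raw as T<Invalid>;
}

// Hypothetical validation rule: the input must be non-empty after trimming.
function validate(input: T<Invalid>): T<Valid> {
  if (input.trim().length === 0) {
    throw new Error("empty input");
  }
  return input as string as T<Valid>;
}

const checked: T<Valid> = validate(fromString("hello"));
console.log(checked.toUpperCase()); // branded values still behave as strings

// validate(checked); // rejected by tsc: T<Valid> is not assignable to T<Invalid>
```

Because `Valid`, `Invalid`, and `T` are type aliases rather than classes, they cannot appear in a value position, which is the usability gain over the abstract-class format.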

Fix GADT format

GADT doesn't work today, but it's easy to fix.

Each specific type of a GADT gets its own name, formed by appending $Tag:

type rec t<_> =
  | Int(int): t<int>
  | Float(float): t<float>

let logInt: t<int> => unit = v => {
  switch v {
  | Int(v) => Js.log(v)
  }
}

let logFloat: t<float> => unit = v => {
  switch v {
  | Float(v) => Js.log(v)
  }
}

let log: type a. t<a> => unit = v => {
  switch v {
  | Int(_) => logInt(v)
  | Float(_) => logFloat(v)
  }
}

will generate

export type t$Int = { TAG: "Int"; _0: number };
export type t$Float = { TAG: "Float"; _0: number };
export type t = (
  | t$Int
  | t$Float
);

export const logInt: (v: t$Int) => void;

export const logFloat: (v: t$Float) => void;

export const log: (v: t) => void;

This format may not work for all parameterized GADTs, but at least it supports the most common and practical GADT use case.
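A small sketch of how a TypeScript consumer might use these shapes. The three type definitions are copied from the output above; the function bodies are hypothetical stand-ins for the compiled ReScript functions and return strings instead of logging, so the behavior is easy to check.

```typescript
// Shapes copied from the proposed output: one named type per constructor.
type t$Int = { TAG: "Int"; _0: number };
type t$Float = { TAG: "Float"; _0: number };
type t = t$Int | t$Float;

// Hypothetical stand-ins for the compiled ReScript functions.
const logInt = (v: t$Int): string => `int ${v._0}`;
const logFloat = (v: t$Float): string => `float ${v._0}`;

// Narrowing on TAG lets TypeScript pick the constructor-specific function.
const log = (v: t): string => {
  switch (v.TAG) {
    case "Int":
      return logInt(v);
    case "Float":
      return logFloat(v);
  }
};

console.log(log({ TAG: "Int", _0: 1 }));
console.log(log({ TAG: "Float", _0: 1.5 }));
// logInt({ TAG: "Float", _0: 1.5 }); // rejected by tsc: "Float" is not assignable to "Int"
```

The per-constructor names are what allow `logInt` to accept only the `Int` case, mirroring `t<int>` on the ReScript side.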

Modules format

(Note: This part was left out of the Retreat discussions, but I proposed it again as a result of a further PoC.)

ReScript modules can contain both values and types, and module types can be converted to first-class module values at any time. Therefore, we need to represent module values and types in a more flexible way.

To make this as simple as possible, I've suggested a few rules.

  1. Modules (types) always produce a pair of value representation and type representation.
  2. A value representation doesn't contain any type fields, but a type representation is defined as a superset containing both value fields and type fields.
  3. Value representations reference only other value representations; type representations reference only other type representations.
  4. First-class modules use value representation.

This approach allows us to cover first-class modules while at the same time implementing GenType in a much simpler way. Using these rules, GenType only needs to convert the name in a location without any complex dependency analysis.

See the example for more details.

Note: We don't generate it for module functions that are only used in ReScript. Although the TypeScript representation for module functions is not complex, there are no practical use cases.
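One possible reading of the four rules, sketched in TypeScript. The `$value` and `$type` suffixes and the `Inner` module are hypothetical and only illustrate the pairing; the actual naming scheme is not specified here.

```typescript
// Rule 1: each module yields a pair of representations (hypothetical names).
// Rule 2: the value representation has value fields only ...
type Inner$value = { k: number };
// ... while the type representation holds value fields plus type fields.
type Inner$type = { k: number; inner: number };

// Rule 3: value representations reference value representations only,
// and type representations reference type representations only.
type M$value = { x: number; Inner: Inner$value };
type M$type = { x: number; t: number; Inner: Inner$type };

// Rule 4: a first-class module is typed by the value representation.
const firstClassModule: M$value = { x: 42, Inner: { k: 4242 } };

// Type fields are read through indexed access on the type representation.
const next: M$type["Inner"]["inner"] = firstClassModule.Inner.k + 1;
console.log(next); // 4243
```

Since each reference stays within its own kind of representation, a generator following these rules could rewrite names in place without tracing dependencies.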

Implementation plan

Phase 1. Tweak the existing implementation

Add a configuration flag to create .d.ts by tweaking the current GenType implementation.

Apart from the fact that the original outputs contain runtime bindings, this proposal can be treated as a "bug fix" even for the existing formats.

Phase 2. Create a new GenType entry point as an isolated module

The new implementation is expected to be much simpler than the existing one, so we may eventually have a lighter-weight dedicated implementation.

Alternatively, if Phase 1 turns out to take too much time, we can skip it and reimplement the gen.ts(x) output and the .d.ts output from scratch.

Expected migration story

Typically the only migration task we require of our users is to change the existing .gen.ts(x) import path to the regular .res.js path. Then TypeScript can automatically find definitions from sibling .d.ts.
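The import-path change could be automated with a small codemod. The sketch below is a naive, string-based illustration: `migrateImports` is a hypothetical name, the compiled suffix is assumed to be `.res.js`, and the regular expression rewrites any quoted string ending in `.gen`, `.gen.ts`, or `.gen.tsx`, so a real tool would want to work on parsed import declarations instead.

```typescript
// Hypothetical codemod step: rewrite GenType import specifiers in a source string.
// Assumption: the project's compiled suffix is ".res.js".
function migrateImports(source: string, suffix: string = ".res.js"): string {
  // Matches quoted specifiers ending in .gen, .gen.ts, or .gen.tsx.
  const genSpecifier = /(["'])([^"']+)\.gen(?:\.tsx?)?\1/g;
  return source.replace(
    genSpecifier,
    (_match: string, quote: string, base: string) => `${quote}${base}${suffix}${quote}`,
  );
}

const before = 'import { foo } from "./Module.gen";';
console.log(migrateImports(before)); // import { foo } from "./Module.res.js";
```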

Users who depend on shims will need to migrate their code a bit. But in most cases we expect those shims to be for the core library; once they are no longer needed, users don't have any extra work.

Most existing attributes will be ignored. Eventually we will help users remove unnecessary attributes via the formatter or another codemod tool.

Key phrase

If you have any ideas for it, please comment!

JonoPrest commented 5 months ago

I still feel it would be useful to have @genType.as("...") attribute as well. It's helpful in that you can have idiomatic type names in both ReScript and TypeScript. For instance I use this to uppercase type names in TS but lowercase in Res. It's also helpful for reserved keywords etc.

cometkim commented 5 months ago

Ok, that makes sense. IMO, just @as should be fine. In the end, GenType needs to minimize having special grammar and to work well within regular ReScript codebase without any modification

cometkim commented 2 months ago

@genType.import would have a special form to assert type-safety

See https://github.com/rescript-lang/rescript-compiler/issues/6947#issuecomment-2299369675

gabriel-bezerra commented 2 months ago

@cometkim, if you are interested in use cases for the new design, I'm trying to use genType.import for type safety in bindings and am missing the ability to use it with JS classes.

TS side

export class C {
  // `private x` declares the field that method() reads
  constructor(private x: boolean) {}
  method() { console.log(this.x); }
  static staticFunction(x: number): [C, boolean] { return [new C(true), false]; }
}

export function moduleFunction(x: number, y: boolean): string { return `${x}${y}`; };

Res side

[@genType.import ("package", "C")]
type c;

// this works
[@genType.import ("package", "moduleFunction")]
external module_function_here: (int, bool) => string = "module_function_here";

// these don't work
[@genType.import] [@bs.new]
external constructor_here: bool => c = "constructor_here";

[@genType.import] [@bs.send]
external method_here: c => unit = "method_here";

[@genType.import ("package", "C.staticFunction")]
external static_here: int => (c, bool) = "static_here";

jmagaram commented 5 days ago

Overall, I'm very excited about the possibility of a new and simpler approach here. In particular, I have had difficulty using GenType with functors and include, which are the best way to wrap basic int and string types into domain-specific types with validation, construction, equality, ordering, and serialization functionality.

Please see https://github.com/rescript-lang/rescript-compiler/issues/7156