rescript-lang / rescript

ReScript is a robustly typed language that compiles to efficient and human-readable JavaScript.
https://rescript-lang.org
Other
6.76k stars 453 forks source link

Draft RFC: Simplifying config and user manifest #6475

Open cometkim opened 1 year ago

cometkim commented 1 year ago

Status: WIP. More research is required to complete this RFC.

This aims to simplify the current configuration schema and its parsing. Trying to explain its specifics and how to achieve it.

Motivation

ReScript v11 will support rescript.json for projects and their dependencies (See #6382), and so is deprecating bsconfig.json.

Although it was a good opportunity to become aware of its inherent problems, and many deprecated or undocumented features. ReScript v11, we want to clean up these to make config easier to use and simplify the parsing phase.

Goals

For end-users, it is confusing because it's too complex, and contains many deprecated or undocumented features. So, the goal is to re-organize this to reduce confusion and provide a simpler schema that fits most use cases today.

For maintainers, the goal is to reduce code complexity. This is achieved by major code refactoring for the long term rather than simply deleting features.

For 3rd-party developers, the goal is to make reusing it outside the compiler (and the compiler itself) easier.

Detailed design

Cleanup & Simplifying Schema

TBD, this should be asked to the forum to collect actual use cases of current living features.

Not all of the suggestions here will happen, but we'll list out possible options for exploration ahead of the discussion.

Remove bs prefixes:

Remove legacy specs:

Considerable redesign:

Remove undocumented, rarely used features:

Fix naming convention:

So ideal configuration might look like this:

{
  // TBD
}

Two-phase parsing

The current implementation (bsb_config) is interprets JSON input as it reads, with many IO operations. Since it's not pure and pretty heavy, it cannot be reused elsewhere it is needed such as syntax, gentype, reanalyze, etc.

This proposal makes a new concept "manifest" or "user manifest" or "project manifest" which is a pure representation of rescript.json input. It is generated by JSON parsing and not by any system calls. So the manifest's types and parsing logic can be reused with no restrictions where necessary.

Then the existing bsb_config represents the execution context of the bsb program and is initialized by the rescript_manifest module.

The program flow would be like this:

JSON -> rescript_manifest -> (bootstrap misc) -> bsb_config -> (bootstrap bsb)

Also rescript_manifest can be published to opam, to be consumed by third-party developers

Managing Codebase

To keep the new config logic sustainable, we need to isolate it from the old codebase. This is achieved by the separated module rescript_manifest described in the "Two-phase parsing" section.

To do this, we should extract the JSON-related part from the existing bsb_config code, and use it in bsb_config again.

JSON parsing logic is managed for each field within the rescript_manifest module. The rescript_manifest parser is a combination of field parsers.

Schema Detection

The rescript_manifest module maintains a snapshot of supported field combinations and exposes this as a version of the schema.

The manifest version exists independently of the compiler version and is mapped to the config version of the running compiler, although the compiler can still handle multiple versions of the manifest.

The manifest version is encoded in URL and managed in compatibility date format to facilitate incremental enhancement such as adding fields.

e.g. https://rescript-lang.org/manifest/2024-01-01/rescript.json

The manifest input must provide a $schema field for the compiler to identify the manifest version.

Users must select one manifest version. This means that users cannot use bs-dependencies and dependencies at the same time.

Implementation steps

  1. Building a base parser combinator library in OCaml,

    • It should have no dependencies
    • At the high level, it is a group of field parsers and its combination, the manifest parsers
    • Each field parser is independent so it shouldn't depend on each other
    • Each field parser can emit a JSON Schema representation for the field
  2. Extract the bsconfig schema from the bsb_config module, and isolate it to separate module rescript_manifest.

  3. Define a registry and workflow to experiment with new manifest schema, incrementally

Migration Strategy

Switching Schema Detection Behavior

For v11.x, If the $schema field is not given, the https://rescript-lang.org/manifest/bsconfig.json (legacy) schema is automatically selected.

For v12, it will select the latest version supported by the current compiler version. The compiler may drop bsconfig schema support entirely.

An explicit version is recommended, the compiler may warn users about this.

Tooling Support

TBD

on rescript init, on VSCode extension, migration command, etc

zth commented 1 year ago

This is good stuff @cometkim ! How do we move forward on this?

cometkim commented 1 year ago

I'm not actively promoting this to avoid adding any more noise before the v11 release. And still having time to think about better refactoring and migration strategies :grinning:

But please feel free to mention me if you have any thoughts on the config schema you rely on when working with other proposals, such as the #6408

jfrolich commented 1 year ago

Do we want to keep dev-dependencies? Is there a clear use-case for it?

cometkim commented 1 year ago

Do we want to keep dev-dependencies? Is there a clear use-case for it?

no use cases right now unless we plan to support a standalone package manager for rescript.json. However, I have explored this and it may still be useful information to third parties.

glennsl commented 1 year ago

dev-dependencies is useful for test frameworks and such.

cometkim commented 2 months ago

I briefly discussed with @cknitt about using a third-party library for parsing manifest in the last ReScript Retreat.

TL;DR: Why not?

We have a lot of "ext" code to achieve zero dependencies. It has become another gray area for us over time.

There are some planned changes to JSON parsing (removing comments, trailing comma support) but making the change has a very low ROI.

I suggest adding Yojson as a dependency to build a new manifest. Yojson is very actively maintained and used in the OCaml community. Avoid "too declarative" options like ppx_deriving_yojson, atdgen for structured validation and error integration. At least for now.

cknitt commented 2 months ago

Yes, in earlier days it was important to minimize dependencies at all costs as their source had to be vendored and shipped as part of the npm package to be able to build the compiler binaries on demand on npm install rescript.

These days, fortunately, we just ship precompiled binaries for all major platforms instead and can easily add dependencies via our opam/dune-based build workflow.

While I still think we should be conservative about adding dependencies, adding yojson for JSON parsing seems absolutely fine to me.