Metadata on the type system a module is using

giltayar commented 2 years ago

In the meeting on the 4th of March, the issue of knowing what type system a module is using came up.

The JavaScript runtime would not care, as it is agnostic to whatever is written in the places carved out for type systems. But the tooling would want to know. For example, the Flow type checker would want to ignore any modules that are using the TypeScript type system.

This will become very important once this becomes popular, as some packages will use one type system and others will use another, and tools will need a way to figure out which is which.

As I see it, this can be done in band (in the module code itself, e.g. in a comment, or in a "use type:flow" kind of construct) or out of band (in the filename extension, in a configuration file near the file itself, or any other way that is not in the module code itself).

Should the proposal define where this metadata resides? If so, where does it reside?

benjamingr commented 2 years ago

> Should the proposal define where this metadata resides? If so, where does it reside?

I believe this is interesting but belongs in a future proposal.

What do languages currently do? Is mixing Flow and TypeScript in the same project common and if so - what do users currently do?

orta commented 2 years ago

React Native projects are probably good examples of mixed projects, the 'main' in 'react-native' is an index.js file which has flow specific syntax, then most libraries ship with transpiled JS (but they don't have to) and lots of react native projects are then written in TypeScript.

The assumption is that all react-native projects use the metro bundler which has some of these assumptions baked in and generates .js files for the runtime. As a user, this is largely invisible to you as the bundler handles the complexity.

giltayar commented 2 years ago

@benjamingr currently, all packages export JS only, and so this is not a concern. The missing type information is exported out of band by TS using .d.ts.

But if JS files include the type information, theoretically we wouldn't need any .d.ts. But this assumes the type checker (e.g. TS or Flow) can know what type system a module/package is using. This is the concern raised in the call we had last Friday: that without this in-band information, the DX will be lacking, or the type checker will need to "bake in" heuristics that figure out whether this file has types from TS, Flow, or whatever.

ljharb commented 2 years ago

It seems like something that would be easy to do out of band, as a field in package.json.

RN's approach is massively problematic for a ton of reasons; shipping non-JS code has proven to be an antipattern in a myriad of ways (which envs that JS should target is a quite different discussion, of course). We should be designing a world where people can ship JS, with any additional metadata needed in accompanying files and out-of-band metadata, for maximal compatibility.
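
As a rough illustration of the out-of-band option, here is a minimal sketch of how a tool might read the dialect from a manifest. The field name `typeSystem` and the function name are hypothetical; nothing in this thread or in package.json conventions defines them.

```typescript
// Sketch only: "typeSystem" is an assumed, hypothetical package.json field.
// Returns the declared dialect name, or undefined when the manifest is not
// valid JSON, is not an object, or does not declare the field as a string.
function typeSystemFromManifest(manifestJson: string): string | undefined {
  let manifest: unknown;
  try {
    manifest = JSON.parse(manifestJson);
  } catch {
    return undefined; // unreadable manifest: nothing can be inferred
  }
  if (typeof manifest !== "object" || manifest === null) {
    return undefined;
  }
  const value = (manifest as Record<string, unknown>)["typeSystem"];
  return typeof value === "string" ? value : undefined;
}
```

A tool would still have to locate the manifest for a given module, which is the part that later comments point out is not always possible.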

lucacasonato commented 2 years ago

I disagree that this should happen out of band. Not all runtime environments have out of band places to put this (eg Deno does not use package.json).

Also, formatters would likely need to know the type dialect to be able to use the correct parser and emitter for the type comments. Do we really want a world where every js file needs to be accompanied by a config file to let formatters/ide/etc know what type dialect is being used in the comments?

ljharb commented 2 years ago

Deno has the filesystem, so it has plenty of places to put it - just like import maps and the cache of dependencies, i believe.

I’m suggesting a manifest file per-package/project, not per-file.

lucacasonato commented 2 years ago

No, it doesn't.

a) Deno can run individual JS/TS files without requiring any additional metadata or config files. We'd like to keep it that way.

b) Deno supports URL imports. How should a type checker know if a remote imported file uses TS or Flow type annotations? Probe the network until we find a config file? (we can't)

c) Deno has no concept of a package (package being a collection of modules). To Deno a program is just a graph of ES modules. Individual modules are never augmented with additional per module out of band information. The only out of band information we use at runtime right now are import maps, but those are global (only a single one per program), not per module, or per package.

Information about the type dialect needs to be provided in-band.

lucacasonato commented 2 years ago

> I’m suggesting a manifest file per-package/project, not per-file.

So you want to require a manifest file for this out of band info, even if you are writing a single file tooling script?

ljharb commented 2 years ago

Yes, that’s what I’m suggesting.

How would you suggest the information be provided in-band? If a comment, that can be done without language cooperation; if something syntactic, how do we agree on the meaning of the metadata? Would it be a land grab, for whoever can gain the most adoption for their flavor first? What if there ends up being two popular type systems named “awesome”?

lucacasonato commented 2 years ago

I see two main ways of doing this:

- a top-of-file comment (// @ts-check, // @flow, etc.)
  - upside: it adds no new syntax
  - downside: we cannot enforce that type checkers require these kinds of pragmas to be present to parse their "dialects"
- start-of-file syntax (e.g. module type "ts";)
  - downside: we add new syntax
  - upside: we can enforce the constraint that files wishing to use "types as comments" must specify the module type at the start of the file
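
To make the comment option concrete, here is a minimal sketch of how a tool could detect a top-of-file pragma with no other context. `// @flow` and `// @ts-check` are existing pragmas; the `Dialect` type and `detectDialect` function are hypothetical names, and the sketch only handles `//` line comments, not block comments.

```typescript
// Sketch only: the dialect names and this function are illustrative.
type Dialect = "flow" | "typescript" | "unknown";

function detectDialect(source: string): Dialect {
  const lines = source.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = (lines[i] ?? "").trim();
    if (line === "") continue;                      // skip blank lines
    if (i === 0 && line.startsWith("#!")) continue; // skip a shebang on line 1
    if (!line.startsWith("//")) break;              // first code line ends the header
    const text = line.slice(2).trim();
    if (text === "@flow") return "flow";
    if (text === "@ts-check") return "typescript";
  }
  return "unknown";
}
```

A formatter or editor opening a single file could call this before choosing a parser, which is the isolated-file case that an out-of-band manifest does not cover.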

And to re-iterate on your other comment: Deno can not have out of band manifest files for remote URL dependencies that are discovered via heuristics like relative location to the module. As such, we will need this information in band to be able to provide a good user experience.

ljharb commented 2 years ago

Who defines what “ts” means? Do we need a registry, or do we risk collisions? Do we repeat the mistake of module specifiers and leave it up to hosts, or do we specify it somehow?

wparad commented 2 years ago

I don't think this should be configurable. The type system should be coupled and canonical to JS, and if other tools such as flow or tsc want to handle it, they can. There doesn't need to be extra metadata or explicit configuration. I don't understand why the default proposal would need to introduce this. And even in the rare case where we could argue for something, it is completely independent from the proposal of adding canonical types.

ljharb commented 2 years ago

For that, the proposal would have to contain and define a type system - currently it does not.

Knagis commented 2 years ago

Couldn't import assertions be the solution for this aspect?

Yes, it puts the definition on the consumer of the module, not the module itself, but it also means that the consumer could choose if it wants to incur the cost of verifying that module in its tooling.

lucacasonato commented 2 years ago

No. Import assertions assert existing properties of a module, not add new metadata. Specifying the metadata at the consumer also does not address the issues I listed above.

Knagis commented 2 years ago

Isn't "uses typescript types" an existing property of that module? Seems similar to how the JSON assertion works - it is the module that itself is a JSON file, but it is the responsibility of the consumer to declare that you want to use it as such.

import { } from "https://foo/foo.js" assert { validation: "typescript" };

Also an option - with the isolated modules, a "package" could export an index file that declares these typesystems with reexports from other files.

lucacasonato commented 2 years ago

> Isn't "uses typescript types" an existing property of that module?

Maybe in the mind of the programmer, but not for the tooling. How does the tooling know the file uses typescript before it encounters the assertion? It doesn't (that's the point of this issue). Import assertions do not hint or give metadata to the runtime at all. They just assert (check) that the metadata that the runtime found and the one that the programmer expected are the same. Also see https://twitter.com/lcasdev/status/1495684778443644929.

In any case, this still does not solve the problem of how a formatter would figure out in complete isolation that the type dialect of a given file is TypeScript so it can format it correctly.
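
A small sketch of that distinction, using hypothetical names throughout: an assertion can only be compared against a dialect the tool has already detected by some other means, so it supplies nothing when detection fails.

```typescript
// Sketch only: CheckResult and checkAssertion are illustrative names.
type CheckResult = "ok" | "mismatch" | "unverifiable";

// `detected` is whatever the tool learned about the module on its own
// (undefined when it could not tell); `asserted` is what the importer wrote.
function checkAssertion(detected: string | undefined, asserted: string): CheckResult {
  if (detected === undefined) {
    return "unverifiable"; // the assertion does not supply the missing metadata
  }
  return detected === asserted ? "ok" : "mismatch";
}
```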

Knagis commented 2 years ago

Looking at the same import from "./foo.json" example - part of tooling such as IDE would look at the file extension and work from there. While others (bundlers etc.) would follow the import statement and likely verify that the target file is indeed a JSON that can be imported. Wouldn't the same be applicable to types?

ljharb commented 2 years ago

The format of the module (like JSON) is also something the author should be asserting, not the importer, but web constraints forced our hand. We shouldn't repeat that mistake; the type system in use is solely up to a module author, and shouldn't need to be stated by consumers.

theScottyJam commented 2 years ago

As for this point:

> Would it be a land grab, for whoever can gain the most adoption for their flavor first? What if there ends up being two popular type systems named “awesome”?

Would it not be a land grab, first come, first served? Whether it's done in band or out of band, that's how these sorts of things tend to work. There's no official committee that decides what the "ts" file extension means. TypeScript just decided to use it (even if it were possible that some other people had used it in the past), then they became popular, and editors now support the ts file extension under the assumption that they're loading a TypeScript file. Similarly, consider what happens with a shebang: if multiple people created a product called "typescript" and both decided to install it at "/bin/typescript", your shebang could end up executing some random typescript program that someone happened to have installed on their machine.

I would be interested to see some sort of consideration given to this problem. The last thing I'd want to see is each type-safe language providing its own mechanism to denote that "this is a flow file" or "this is a TypeScript file" to help editors (i.e. flow has you put a // @flow at the top of a file, while TypeScript has you put a 'using typescript' string at the top, or something like that). Sometimes, I open up a single file from a project in my editor (not a whole project), and I'd like the editor to still know how to properly syntax highlight and provide minimal tooling support, even if it doesn't have the context of the whole project or the project's configuration to help out.

giltayar commented 2 years ago

I would guess that even if an in-band mechanism isn't selected, one such mechanism would be very quickly adopted by all existing type systems and would become the de facto standard. So I'm optimistic that one way or another, this will be solved.

As I can see it, there are three options for an in-band way to define what type system is used in a module:

Mode, e.g. "use type flow";
Comment, e.g. // @types typescript
Metadata addition to JavaScript, e.g. @#@types: hegel

(the suggestions above are strawpersons. please don't pick on them specifically)

Defining such in-band solutions is a minefield. TC39 have historically been averse to "modes", and using comments to denote something has never been done before. Which leaves the third option, which would probably be very difficult, even as a separate proposal, and definitely not in a proposal of this size!

wparad commented 2 years ago

There are at least four other options as well, which are all better, so I'm not sure why you wouldn't include them:

Convention based file properties package.json props
Environment variables
Global object with method handler to be called in index.js
reserved words syntax added to require or import

With the latter two being significantly better.

ljharb commented 2 years ago

The last one puts the burden on the importer instead of the author, so i find that to be significantly worse.

giltayar commented 2 years ago

@wparad the first two are out of band options, which is why I haven't included them. I'm not sure that TC39 could include them in the standard even if they wanted to!

The fourth option, if I understand it correctly, puts the burden of choosing the type system on the importer of the module? Would that make sense? If the module decides to change the type system, do we now need to change all the imports everywhere?

The third one I don't understand.

wparad commented 2 years ago

module and require are, for example, globally defined; it would be possible to call module.setTypeConfiguration({}) in the module and run with it, or to make it part of a modified export statement.

giltayar commented 2 years ago

So that would mean the type checker would need to run the module to figure out what type system it needs? That doesn't feel right.

jethrolarson commented 2 years ago

I proposed a pragma syntax in this branch: https://github.com/jethrolarson/proposal-types-as-comments/blob/generic_annotations/README.md as part of discussion on #80

::@typescript

where :: is the preceding sigil for all annotations. I just picked a JS identifier as an arbitrary requirement for the name, but it's conceivable to allow dashes.

tc39 / proposal-type-annotations

Metadata on the type system a module is using #36