google / starlark-go

Starlark in Go: the Starlark configuration language, implemented in Go
BSD 3-Clause "New" or "Revised" License
2.32k stars 212 forks source link

Proposal: Add Function Type Annotation Syntax #439

Open carlverge opened 1 year ago

carlverge commented 1 year ago

I'd like to re-visit the proposal for python-like function type annotations in starlark, at least in the starlark-go implementation. This has been discussed before in:

An example of what this looks like in practice:

def func_one(a, b: str, c: list[int]) -> bool:
    pass

This proposal is only for function definition type hints. Other kinds (such as assignment annotations) could be discussed, but I believe function annotations provide the most value with least effort, and can be done without changing any runtime semantics. Function type annotation syntax has already been implemented in buildifier (https://github.com/bazelbuild/buildtools/pull/946) and starlark-rust (https://github.com/facebookexperimental/starlark-rust/blob/main/docs/types.md).

The proposed syntax follows a subset of python type annotations, as well as the same syntax already supported by buildifier and starlark-rust. Type annotations are Test syntax elements, and are parsed as expressions.

-DefStmt = 'def' identifier '(' [Parameters [',']] ')' ':' Suite .
+DefStmt = 'def' identifier '(' [Parameters [',']] ')' ['->' Test] ':' Suite .

 Parameters = Parameter {',' Parameter}.

-Parameter = identifier | identifier '=' Test | '*' | '*' identifier | '**' identifier .
+Parameter = identifier [':' Test] | identifier [':' Test] '=' Test | '*' | '*' identifier [':' Test] | '**' identifier [':' Test] .

Requirements

Why Now?

Choices

Syntatic Element of a Type Annotation

Test / Expr. This is the same element chosen by starlark-rust and buildifier, and is the closest analogue to python type hints which allows arbitrary expressions.

Ident with TypeHint Field instead of TypedIdent

The buildifier implementation parses a TypedIdent with an ident and type hint:

type TypedIdent struct {
    Comments
    Ident   *Ident
    Type    Expr
}

Whereas the proposed implementation for starlark-go reuses type Ident struct with a TypeHint Expr field:

type Ident struct {
    commentsRef
    NamePos  Position
    Name     string
    TypeHint Expr // TypeHint can be nil

    Binding interface{} // a *resolver.Binding, set by resolver
}

There are two reasons for this:

syntax.Walk Support

This PR intentionally excludes syntax.Walk support for type hints, as it would change the behaviour for programs using starlark-go to parse starlark.

On-the-fence Choices

Syntax Feature Flag

Whether or not to lock the syntax behind a flag. Unlike other features behind flags, this does not change the output of programs and does not have backwards compatibility implications.

Resolving Type Annotation Identifiers

The PR adds a ResolveTypeHintIdents flag in the resolve package, which defaults to false. When it is false, type hints cannot cause resolution/compilation failures.

When it is true, type annotation identifiers are resolved and given binding information. If a type annotation refers to an unknown identifier, it could cause resolution to fail. Additionally, if a type hint in a nested function definition referred to an identifier from its parent scopes, it could cause local variables to be promoted to Cell. Because this can slightly change program behaviour, and cause compilation failures it is enabled behind a flag.

I consider this fairly desirable because:

Skip Type Hints for Assignment Syntax

These are supported by starlark-rust, but not buildifier. Python PEP-526 formalizes this for python, however the only applicable syntax for starlark would be variable assignment and definition:

x  # Binding failure
x: int  # Works in python, binding failure in starlark
y: str = "asdf" # Would work in starlark, but we could infer type anyways.

Python changed its behaviour slightly to allow binding unknown variables if given a typehint. This would be a runtime behaviour change in starlark. The currently supported behaviour would require assignment anyways, where we can often infer the type anyways.

While assignment annotations could provide some value, it's a lot less clear cut to me than function definition annotations. I'd like to table discussion of assignment type hints for a different proposal.

adonovan commented 1 year ago

Thanks for this proposal. There is definitely some interest within Google for experimentally evaluating the costs and benefits of type annotations within Starlark, but I expect any movement will be slow and cautious. In particular, I think it would be crucial for structs to be dealt with in some way.

Also, some Starlark files within Google's main repository are evaluated in multiple dialects of the language, even by interpreters in more than one implementation language, so we are very sensitive to deviations from the spec or differences between the implementations (and we still have work to do there) and wary about adding new ones.

If the purpose of the new annotations is to provide documentation, but not to change the dynamic semantics in any way, then one possibility (suggested by @laurentlb) is that you write a function from []byte to []byte that uses the buildifier parser and printer to replace type annotations by spaces. Then you can put types in your source files and run your static checking tools on those source files, but you can easily strip the annotations out before the starlark-go parser sees them. This would allow you to try them out without needing to change any code in this repo.

andponlin-canva commented 6 months ago

...so we are very sensitive to deviations from the spec or differences between the implementations...

Hello @adonovan ; in relation to this ticket, it appears there is some work happening on the Rust implementation with typing. Is that something that Google are interested to mirror in the Go-lang implementation?