proposal: remove notion of core types from the language spec.

griesemer commented 2 days ago

This issue replaces the investigative issue #63940 with a concrete proposal. To go straight to the proposal text, skip the Background and Motivation sections.

Background

The Go 1.18 release introduced generics and with that a number of new concepts, including type parameters and type constraints. A type constraint acts as the "type" of a type parameter by describing the type parameter's properties, similarly to how a struct type describes the properties of a variable of that struct type.

The language requires that any concrete type that instantiates a specific type parameter satisfies the type parameter's constraint. Because of this requirement one can be certain that a value whose type is a type parameter possesses all of that type parameter's properties, no matter what actual, concrete type is used to instantiate the type parameter.

In Go, type constraints are described through a mix of method and type requirements which together define a type set: a type set comprises all the types which satisfy all the requirements. Specifically, Go 1.18 uses a generalized form of interfaces for this purpose. An interface enumerates a set of methods and types, and the type set described by such an interface consists of all the types that implement those methods and that are included in the enumerated types.

For instance, the interface

type Constraint interface {
    ~[]byte | ~string
    Hash() uint64
}

describes a type set consisting of all the (possibly named) []byte and string types that also implement the Hash method.

Given these descriptions of type sets, which in turn describe the properties of type parameters, it is possible to write down the rules that govern operations on operands of type parameter type.

For instance, the rules for index expressions state that (among other things) for an operand of type parameter type P:

The index expression a[x] must be valid for values of all types in P's type set.

The element types of all types in P's type set must be identical. In this context, the element type of a string type is byte.

These rules enable the following code:

func at[P Constraint](x P, i int) byte {
    return x[i]
}

The indexing operation x[i] is permitted because the type of x is P, and P's type constraint (type set) contains []byte and string types for which indexing with i is valid.
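To make the example concrete, here is a minimal runnable sketch, assuming a Go 1.18 or later toolchain. The types Key and Blob and their FNV-based Hash methods are hypothetical and exist only to give Constraint some members; they are not part of the proposal.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Constraint is the interface from the text: byte slices and strings
// (possibly named) that also implement Hash.
type Constraint interface {
	~[]byte | ~string
	Hash() uint64
}

// Key is a hypothetical named string type that satisfies Constraint.
type Key string

func (k Key) Hash() uint64 {
	h := fnv.New64a()
	h.Write([]byte(k))
	return h.Sum64()
}

// Blob is a hypothetical named []byte type that satisfies Constraint.
type Blob []byte

func (b Blob) Hash() uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

// at indexes its operand. This is valid because indexing is valid for
// every type in P's type set and all element types are byte.
func at[P Constraint](x P, i int) byte {
	return x[i]
}

func main() {
	fmt.Printf("%c %c\n", at(Key("gopher"), 0), at(Blob("spec"), 3)) // prints "g c"
}
```

A plain string or []byte does not satisfy Constraint here because neither has a Hash method, which is why the sketch needs the named types.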

Motivation

The rules for index expressions have specific clauses for when the type of an operand is a type parameter. Similarly, the rules for unary and binary operations also have such clauses. For instance, in the section on Arithmetic operators, the spec says:

If the operand type is a type parameter, the operator must apply to each type in that type set.

This rule allows the operator + to add two operands of (identical) type parameter type, as long as + is valid for every type in the respective type parameter's constraint.
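As an illustration of the arithmetic rule, the following sketch uses a hypothetical constraint Addable and a hypothetical function Sum; neither name comes from the spec. The + operator is accepted because it applies to every type in Addable's type set.

```go
package main

import "fmt"

// Addable is a hypothetical constraint. The + operator is defined for
// integers, floats, and strings, so it applies to each type in the set.
type Addable interface {
	~int | ~float64 | ~string
}

// Sum adds its arguments with +, starting from the zero value of T.
func Sum[T Addable](xs ...T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(Sum(1, 2, 3))              // prints 6
	fmt.Println(Sum("core", " ", "types")) // prints "core types"
}
```

Adding a type such as ~bool to Addable would make the body of Sum invalid, because + is not defined for boolean operands.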

This type set-based and individualized approach permits the most flexible application of operations on operands of type parameter type, and is in line with what the original generics proposal (Type Parameters Proposal) intended: an operation involving operands of generic type (i.e., whose type is a type parameter) should be valid if it is valid for every type in the respective type set(s).

Because of time constraints and the subtlety involved in devising appropriate rules for each language feature that may interact with generic operands, this approach was not chosen for many language features. For instance, for Send statements, the spec requires that

The channel expression's core type must be a channel, the channel direction must permit send operations, and the type of the value to be sent must be assignable to the channel's element type.

This rule relies on the notion of a core type. Core types offer a short cut for the spec: if a type is not a type parameter, its core type is just its underlying type. But for a type parameter, a core type exists if and only if all types in the type parameter's type set (that is, the type set described by the type parameter's constraint interface) have the same underlying type; that single type is the core type of the type parameter. For instance, interface{ ~[]int } has a core type ([]int), but the Constraint interface from above does not. (In reality, the definition of core types is subtly more complicated, see below.)
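For comparison, here is a small sketch of a constraint that does have a core type under the simplified definition above. The names IntSlice, IDs, head, and total are hypothetical. Because every type in the type set has the underlying type []int, operations whose rules are phrased in terms of core types (slicing and ranging here) are accepted.

```go
package main

import "fmt"

// IntSlice is a hypothetical constraint. All types in its type set have
// the same underlying type, []int, which is therefore its core type.
type IntSlice interface{ ~[]int }

// IDs is a hypothetical named slice type in IntSlice's type set.
type IDs []int

// head slices its operand; the result has the same type as the operand.
func head[S IntSlice](s S, n int) S {
	return s[:n]
}

// total ranges over its operand and adds up the elements.
func total[S IntSlice](s S) int {
	sum := 0
	for _, v := range s {
		sum += v
	}
	return sum
}

func main() {
	ids := IDs{10, 20, 30}
	fmt.Println(head(ids, 2), total(ids)) // prints "[10 20] 60"
}
```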

The notion of a core type is a generalization of the notion of an underlying type. Because of that, most spec rules that relied on underlying types before generics now rely on core types, with a few important exceptions like the ones mentioned earlier. If the rules for index expressions relied on core types, the at example above would not be valid code. Because the rules for Slice expressions do rely on core types, slicing an operand of type P constrained by Constraint is not permitted, even though it could be valid and might be useful.

When it comes to channel operations and certain built-in calls (append, copy) the simplistic definition of core types is insufficient. The actual rules have adjustments that allow for differing channel directions and type sets containing both []byte and string types. These adjustments make the definition of core types rather complicated, and are only present to work around unacceptable restrictions that would be imposed by the language otherwise.
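The two adjustments mentioned above can be sketched as follows. The constraint names Sender and Text and the functions send and toBytes are hypothetical. The sketch assumes the behavior described in the text: a type set mixing a bidirectional and a send-only channel still permits sending, and copy accepts a source whose type set contains both []byte and string types.

```go
package main

import "fmt"

// Sender is a hypothetical constraint whose channel types differ in
// direction but share the element type int and both permit sends.
type Sender interface{ chan int | chan<- int }

// send performs a send on a generic channel operand.
func send[C Sender](ch C, v int) {
	ch <- v
}

// Text is a hypothetical constraint containing both []byte and string types.
type Text interface{ ~[]byte | ~string }

// toBytes copies the bytes of s into a fresh []byte using the built-in copy.
func toBytes[S Text](s S) []byte {
	buf := make([]byte, len(s))
	copy(buf, s)
	return buf
}

func main() {
	ch := make(chan int, 2) // buffered so the sends below do not block
	send(ch, 1)
	var sendOnly chan<- int = ch
	send(sendOnly, 2)
	fmt.Println(<-ch, <-ch) // prints "1 2"

	fmt.Println(string(toBytes("core")), string(toBytes([]byte("types")))) // prints "core types"
}
```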

Proposal

Summary: Remove the notion of core types from the language specification in favor of dedicated prose in each section that previously relied on core types.

For example, rather than using a rule based on core types in the section on slice expressions, the proposal is to use appropriate prose similar to what is used for index expressions, which does not rely on core types (and which is more flexible as a result).

The proposed approach is as follows:

For each operation/language feature with rules based on core types, revert the relevant language spec section to essentially the Go 1.17 (pre-generics) prose, and add a type-parameter specific paragraph that describes how the rules apply to generic operands.
Remove the section on core types from the language spec.
Implement the necessary changes in the compiler.

The proposed changes to the spec can be reviewed in CL 621919 and are considered part of this proposal. (The exact prose is up for discussion and expected to be fine-tuned.)

Note: CL 621919 still contains references to core types in the section on type inference and unification. We plan to rewrite those sections as needed and remove those references as well. Since these parts of the spec are highly technical and detailed, we are less concerned about their exact prose: these sections are unlikely to be consulted by non-experts in the first place. To get started, we may simply replicate the notion of core types "in line" until we understand better what changes to type inference preserve backward-compatibility.

Discussion

Removing the notion of core types from the language specification has multiple benefits:

There is one less core concept (no pun intended) that needs to be learned and understood.
The specification of most language features can again be understood without worrying about generics.
The spec becomes easier to read and understand.
The individualized approach (specific rules for specific operations) opens the door to more flexible rules.

The changes are designed to be 100% backward-compatible.

Implementation

The relevant implementation changes primarily affect the compiler's type checker. The proposed changes to for-range statements currently include an implementation restriction for the range-over-func case; loosening or removing that restriction may require compiler back-end changes for an efficient implementation (currently not planned).

The relevant type checker changes have been prototyped and can be found in a stack of CLs ending in CL 618376.

Impact

Because the changes are designed to be 100% backward-compatible, implementing this proposal is expected to have no noticeable effect on existing Go code.

Some code that currently is not permitted will become valid with this change. For instance, slice expressions, composite literals, and for-range statements will accept generic operands and types with less restricted type sets.

Analysis tools may need to be updated. We believe that this can be done incrementally, on a (language) feature-by-feature basis.

Tentative timeline

This timeline assumes that this proposal is uncontroversial and accepted fairly quickly.

Early November 2024: Proposal published.
End of 2024: Proposal accepted (hopefully).
Early February 2025: Proposal implemented at start of development cycle for Go 1.25.
August 2025: Proposal released in Go 1.25.

Future directions

The use of core types in the spec implied a somewhat rigid framework within which rules for language features were considered.

For instance, proposal #48522 is about permitting access to a struct field that is present in all structs in a type set. This is currently not permitted and the proposal was closed in favor of #63940, the precursor issue for this proposal.

If we accept this proposal, we will follow up with a proposal for more flexible field access, along the lines of #48522.
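To illustrate the kind of code #48522 is about, here is a hedged sketch. The types Cat and Dog, the constraint Pet, and the function name are hypothetical. The direct field access appears only in a comment, since the text above states that it is currently not permitted; the compiling version goes through an accessor method instead, which is one way to express the same intent today and is not something this proposal prescribes.

```go
package main

import "fmt"

// Cat and Dog are hypothetical struct types that share a Name field.
type Cat struct{ Name string }
type Dog struct{ Name string }

func (c Cat) GetName() string { return c.Name }
func (d Dog) GetName() string { return d.Name }

// Pet is a hypothetical constraint: either struct type, plus an accessor.
type Pet interface {
	Cat | Dog
	GetName() string
}

// The form discussed in #48522 would read the shared field directly:
//
//	func name[P Cat | Dog](p P) string { return p.Name }
//
// That is currently not permitted, so this sketch calls a method instead.
func name[P Pet](p P) string {
	return p.GetName()
}

func main() {
	fmt.Println(name(Cat{Name: "Tom"}), name(Dog{Name: "Rex"})) // prints "Tom Rex"
}
```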

Type inference and type unification also rely on core types. Removing this dependency may enable type inference in some cases (such as #69153) where it currently fails.
