golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: Go 2: Add struct and interface properties in the style of C# #53155

Closed jsshapiro closed 2 years ago

jsshapiro commented 2 years ago

Summary: the absence of properties impedes source-level backwards compatibility, and somewhat restricts what can be expressed in interfaces without added syntactic cruft.

Compatibility concerns: Go source level: none. Cross-language and binary level: potentially.

Various forms of getter/setter patterns have been proposed and rejected before, and I suspect this one will be no different. The intended contribution here is to clearly describe a real-world use case that illustrates a source-level API compatibility problem in Go. I do not see a way to address this compatibility limitation without something like properties. In the interest of concreteness, I'll describe the specific issue and context where I tripped on this, but the problem I describe is a general problem for source-level API backwards compatibility.

Problem Statement

For various reasons, I've been poking at a successor to pigeon. It is a transitional goal that existing pigeon grammars should migrate with minimal change. In particular, existing user-supplied code blocks should not require modification in order to be processed by the new tool. Pigeon code blocks are passed the pigeon parse state object (the *current type), which directly exposes structure fields. Because they are not guarded by getters and setters (of any sort), these fields have become part of a de facto API. Some of them were not especially well thought out from a space or runtime efficiency perspective, and are rarely accessed in practice. The new tool will maintain parse state a bit differently, but legacy API compatibility requires that these fields continue to "work" from a source-level perspective.

This is a "compatibility pattern" that eventually arises whenever a concrete type having fields is exposed by an API. At some level, the root problem is that a concrete type was exposed where an interface should have been exposed instead. Once that is done, source compatibility perpetuates the API design error indefinitely.
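As a rough illustration of the pattern in today's Go, here is a minimal sketch. The `current` type, its `text` and `line` fields, and `userAction` are hypothetical stand-ins chosen for this example; they are not pigeon's actual definitions.

```go
package main

import "fmt"

// current is a hypothetical parse-state type whose fields are read
// directly by user-supplied code. It is NOT pigeon's real type.
type current struct {
	text []byte // matched input, accessed directly as c.text
	line int    // line number, accessed directly as c.line
}

// userAction stands in for an existing user-written code block. It depends
// on the field names and types, so both are now part of the de facto API.
func userAction(c *current) string {
	return fmt.Sprintf("%d:%s", c.line, c.text)
}

func main() {
	c := &current{text: []byte("abc"), line: 7}
	fmt.Println(userAction(c)) // plain field reads, no method call involved
}
```

If a later version wanted to compute `line` lazily behind a method such as `Line()`, every existing `c.line` would have to be edited to `c.Line()`. That edit is the source-level break described above.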

Solution Sketch

In C#, there is a notion of properties. These implement a getter/setter pattern without requiring function call syntax at the point of access. Use occurrences of the property name are transparently translated to getter calls. Update occurrences (assignments) are transparently translated to setter calls. Their implementation is defined in terms of a high-level syntactic rewrite to methods with a specific name rewriting convention.

This approach can be lifted wholesale and unchanged from C# (right down to the syntax) for use in Go. For those not familiar with this corner of C# syntax, it is presented here in the C# guide. The surface syntax would want to be adapted to a more Go-like syntax.

Doing so addresses or mitigates four problems:

  1. It provides a recovery path for inadequately encapsulated APIs (at least in many cases).
  2. It implements the intermittently requested feature of fields within interfaces.
  3. It eliminates the syntactic overheads of current alternatives at use and update occurrences.
  4. It simplifies some concurrency patterns by making lock acquisition transparent.
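For item 4, this is roughly what the explicit form looks like in current Go. It is a sketch only: `Counter`, `Value`, and `SetValue` are names invented for this example. Under the proposal, the two methods would presumably sit behind a property, so callers would write a plain field read or assignment and the lock would be taken for them.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type whose value must only be touched under mu.
type Counter struct {
	mu    sync.Mutex
	value int
}

// Value is the explicit getter; it takes the lock for the caller.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.value
}

// SetValue is the explicit setter; it also takes the lock.
func (c *Counter) SetValue(v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value = v
}

func main() {
	var c Counter
	var wg sync.WaitGroup
	for i := 1; i <= 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			c.SetValue(n) // locking is hidden inside the method
		}(i)
	}
	wg.Wait()
	fmt.Println(c.Value() >= 1) // some goroutine's value was stored
}
```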

I'm sure other objections will be raised, but here are the objections I see to adding properties in Go:

  1. They feel like a bell and whistle addition to an intentionally spare language. This, I suspect, is likely to be the main objection, and I think it is an objection worth considering carefully.
  2. In contrast to the .Net environment, properties are not a compatible replacement for fields when viewed from other languages. From the C perspective, getters and setters look like methods on an object, and the underlying concrete field is still (or at least may still be) exposed.[^1] So they address a set of issues within Go code, but the solution will not extend naturally across some very popular language boundaries.
  3. This means that they introduce a new "level" of compatibility objectives to be considered by Go designers: Go source compatibility or cross-language compatibility.
  4. Go prefers parsimonious surface syntax. I suspect one could do better than C# with some thought, but some syntactic bulk seems to be inherent in defining what amounts to a pair of procedures.
  5. If properties are treated as a type, rather than handled entirely as a high-level syntactic transformation, some work in the type checker may be needed. Offhand, I see no advantage to treating them as a type.
  6. The keywords get and set would either need to be reserved, or would need to be specified as syntactically significant only in the syntactic context of property definition. Either is straightforward. Given the conceptual weight often placed on these identifiers as prefixes or suffixes, the "only in property syntactic context" approach may be preferable.

These objections being noted, some recurring issues would be simplified by introducing properties:

  1. They make source-level backwards compatible revisions possible in the face of incompletely encapsulated APIs.
  2. They address the intermittently requested feature of fields within interfaces.
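For comparison, "fields within interfaces" can only be approximated today with a pair of accessor methods. The sketch below assumes a hypothetical `Named` interface and `user` type; with properties, the interface would presumably declare `Name` once and callers would use field syntax.

```go
package main

import "fmt"

// Named is a hypothetical interface that stands in for "an interface
// with a Name field", spelled as two methods.
type Named interface {
	Name() string
	SetName(string)
}

// user is one concrete implementation backed by a real field.
type user struct{ name string }

func (u *user) Name() string     { return u.name }
func (u *user) SetName(n string) { u.name = n }

// rename works with any Named value and returns the previous name.
func rename(n Named, to string) string {
	old := n.Name()
	n.SetName(to)
	return old
}

func main() {
	u := &user{name: "ann"}
	old := rename(u, "bob")
	fmt.Println(old, u.Name())
}
```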

Closing

Since I suspect this will be rejected quickly, I haven't yet attempted to adapt the C# property surface syntax to Go. If interest is strong enough, I'm happy to do so, and I suspect I'd be able to create a suitable change set for the Go compiler.

[^1]: Modern versions of C permit anonymous struct fields that enable low-level compatible structure layout without exposing private fields. I do not know whether this is effectively utilized at current Go/C boundaries as a way to enforce Go field visibility rules across language boundaries.

seankhliao commented 2 years ago

Please fill out https://github.com/golang/proposal/blob/master/go2-language-changes.md when proposing language changes

jsshapiro commented 2 years ago

Updated to use language changes template.

ianlancetaylor commented 2 years ago

I think we need to see some examples of what this would look like in Go. You say that we can use the C# syntax verbatim, which may well be true, but when I look at the C# code I see that it starts with public class Person and I assume that you are not proposing that. Thanks.

jsshapiro commented 2 years ago

Fair. And I definitely think the C# syntax would need to be adapted to be Go-like. Here's an attempted translation:

In C#, you would define a property within a class as follows:

   public double Hours
   {
       get { return _seconds / 3600; }
       set {
          if (value < 0 || value > 24)
             throw new ArgumentOutOfRangeException(
                   $"{nameof(value)} must be between 0 and 24.");

          _seconds = value * 3600;
       }
   }

As a first step to a Go-like surface syntax:

// We assume here that _seconds is an already-defined private field of the same struct.
struct TimePeriod {
  _seconds int
  Hours double {
    get { return _seconds / 3600 }
    set {
      if value < 0 || value > 24 {
        return error... // I actually don't know the Go idiom for out-of-range, sorry
      }

      _seconds = value * 3600
    }
  }
}

The problem with this, mainly, is that Go doesn't hold with this style of method definition. Offhand (and I'm very much making this up as I go), one approach might be:

struct TimePeriod {
  _seconds int
  // declare Hours to be a property - this syntax also okay in interfaces
  Hours double { get; set } // Has both a getter and a setter; either may be absent
}

// The get/set declaration means the compiler will do the needed rewrite
// at use/update occurrences, to GetHours and SetHours, which in turn
// requires definitions to resolve the references:
func (tp *TimePeriod) GetHours() int {
  return tp._seconds / 3600
}

// IIRC Go treats assignments as statements rather than expressions, so no return value here
func (tp *TimePeriod) SetHours(value int) {
  if value < 0 || value > 24 {
    // Not clear how to translate the raised exception, but that's a separate topic
  }

  tp._seconds = value * 3600
}

So the declaration identifies the field as a property, which triggers special handling at use and update occurrences, and also signals the requirement to supply the associated functions.

I'm tempted to suggest that the declaration part could be reduced to something more spare, but I'm of two minds on that:

As I say, I'm making this up as I go. Hopefully this is "good enough to suck" and we might iterate on it if we think such a feature is conceptually desirable.

The interesting part, really, is the change in behavior at field use and update occurrences. The thing the declaration part is doing that is actually important is signaling that the compiler has to do the rewrites at these locations.
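To make the rewrite concrete, here is a hedged sketch in today's Go of what the compiler would presumably emit at each use and update occurrence. The `GetHours`/`SetHours` names follow the convention suggested above. The `float64` field, the panic on out-of-range input, and the handling of `+=` are assumptions made for this example, since the thread leaves those points open.

```go
package main

import "fmt"

// TimePeriod is written without any property syntax; the methods below
// are what the proposed rewrite would target.
type TimePeriod struct {
	seconds float64
}

// GetHours is the target of a rewritten read of tp.Hours.
func (tp *TimePeriod) GetHours() float64 { return tp.seconds / 3600 }

// SetHours is the target of a rewritten assignment to tp.Hours.
// Panicking on bad input is an assumption, not part of the proposal.
func (tp *TimePeriod) SetHours(value float64) {
	if value < 0 || value > 24 {
		panic("hours must be between 0 and 24")
	}
	tp.seconds = value * 3600
}

func main() {
	tp := &TimePeriod{}

	// Proposed source: tp.Hours = 1.5
	tp.SetHours(1.5)

	// Proposed source: h := tp.Hours
	h := tp.GetHours()

	// Proposed source: tp.Hours += 0.5 (assumed: a get followed by a set)
	tp.SetHours(tp.GetHours() + 0.5)

	fmt.Println(h, tp.GetHours())
}
```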

beoran commented 2 years ago

You can already implement setters and getters in Go that work like this, minus a user-defined operator for =. Go has no user-defined operators, and I think that is a good thing.

https://go.dev/play/p/RPOi7UU0u2u

package main

import "fmt"

type Hours struct {
    _seconds *int
}

func (h Hours) Get() int {
    return *(h._seconds) / 3600
}

func (h *Hours) Set(val int) {
    *(h._seconds) = val * 3600
}

type TimePeriod struct {
    _seconds int
    Hours
}

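// Return a pointer so the embedded Hours keeps pointing at the _seconds
// field of the value the caller actually holds, not at a discarded copy.
func MakeTimePeriod() *TimePeriod {
    t := &TimePeriod{}
    t.Hours._seconds = &t._seconds
    return t
}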

func main() {
    t := MakeTimePeriod()
    t.Hours.Set(2)
    fmt.Printf("Hours: %d\n", t.Hours.Get())
}

jsshapiro commented 2 years ago

@beoran: I'm certainly aware that the pattern you suggest is possible. It does not address the API compatibility issue that I have raised, which is the motivation for the proposal. Given a pre-existing API that has exposed fields, your proposal does not provide an evolution path that is source compatible with the existing API.

I mostly share your reservations about the operator overloading rabbit hole, because it is incredibly easy to get overloading wrong (in a whole bunch of ways). Properties aren't the same thing as operator overloading, and they do not carry the same design risk. Offhand, I cannot think of any "property enabled" language that has encountered major issues because properties are present in the language. In some cases, the ability to convert fields to properties has enabled very interesting behavior.

There are a number of Go APIs out there whose authors did not fully internalize the requirements for future proofing and didn't come up with perfect designs the first time. There will be more such APIs over time, if only because many new programmers will create new APIs that don't deal with future proofing either.

So the questions, to my mind, are:

  1. Can this kind of API be evolved without a language change? I do not see how to do it within the current language specification.
  2. If a language change is needed, what change should it be? Speaking as a programming language architect myself, my bias in this case is "don't invent something new when something that exists already solves the problem well."
  3. Is the issue important enough that we should address it at all or at this time?

The last one may be the most interesting. Because the API evolution issue will become a source of increasing "design pressure" over time, some solution will eventually be needed. It's the type of thing that is part of the price of success for any programming language.

Reasonable people could certainly disagree. Eventually, I think those voices are going to get overridden by accreted code and evolution requirements. That doesn't necessarily have to mean today, but I believe that the question is "when and how" rather than "if".

beoran commented 2 years ago

If an API has some fields exposed which turn out to be undesirable afterwards, then I would slap a deprecation comment on them and then make a next module version where they are gone.

It's not a smooth migration path, but converting the field to a getter/setter only for backwards compatibility with a mistake seems like a mistake as well.

ianlancetaylor commented 2 years ago

In Go we generally want the code on the page to indicate the execution cost. A function call may take some unknown amount of time. A reference to a variable or struct field, on the other hand, will not. When the user just refers to a field, they expect that it will simply load the field, and similarly for an assignment. This proposal would break this property.

There are various idioms for handling this in general. For example, name the field f and add methods F and SetF.
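A minimal sketch of that idiom follows, assuming a hypothetical `Account` type with a field `limit`. The getter drops the "Get" prefix, as is conventional in Go, and the setter is free to validate and return an error because it is an ordinary method call.

```go
package main

import (
	"errors"
	"fmt"
)

// Account is a hypothetical type used only for illustration.
type Account struct {
	limit int // unexported: code outside the package cannot touch it directly
}

// Limit is the getter, named after the field with no "Get" prefix.
func (a *Account) Limit() int { return a.limit }

// SetLimit is the setter. The call syntax makes the possible cost and
// the possible failure visible at the call site.
func (a *Account) SetLimit(n int) error {
	if n < 0 {
		return errors.New("limit must not be negative")
	}
	a.limit = n
	return nil
}

func main() {
	a := &Account{}
	if err := a.SetLimit(-1); err != nil {
		fmt.Println("rejected:", err)
	}
	if err := a.SetLimit(5); err != nil {
		fmt.Println("unexpected:", err)
	}
	fmt.Println(a.Limit())
}
```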

The emoji voting on the proposal is not in favor.

Therefore, this is a likely decline. Leaving open for four weeks for final comments.

apparentlymart commented 2 years ago

When I consider this from the (very reasonable) perspective of evolving an existing API with new capabilities while staying source compatible, it does still seem to have some rough spots:


Thinking about the above potential problems reminded me tangentially of the design of mutable index overloading in Rust. Notice that this trait is required to return a mutable borrow, which for our purposes here is essentially analogous to returning a pointer in Go. This guarantees that there must be some real location in memory that this index refers to.

If we instead allowed only hooking the "address of" for a field and required that code to return a pointer to a memory location then it could potentially return either a pointer into part of the receiver or a pointer to something on the heap that could then be read or written through, but again there would be no guarantee that two accesses would yield the same pointer. And even if that's fine, it does kinda seem to miss the point of allowing the type to hook into reads and writes of the field: once the type has exposed a pointer to a memory location, anything holding that pointer can read and write arbitrarily from that location without any opportunity to intervene.
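The last point can be shown in today's Go with a small sketch. `HoursPtr` is a hypothetical stand-in for an "address of field" hook, and `SetHours` is a validating setter included only for contrast; neither is part of the proposal as written.

```go
package main

import "fmt"

// TimePeriod is a hypothetical type used only for illustration.
type TimePeriod struct {
	hours int
}

// HoursPtr plays the role of an address-of hook: it must hand out a
// real memory location, here a pointer into the receiver.
func (tp *TimePeriod) HoursPtr() *int { return &tp.hours }

// SetHours validates its input and reports whether the value was stored.
func (tp *TimePeriod) SetHours(h int) bool {
	if h < 0 || h > 24 {
		return false
	}
	tp.hours = h
	return true
}

func main() {
	tp := &TimePeriod{}

	fmt.Println(tp.SetHours(99)) // false: the setter can intervene

	p := tp.HoursPtr()
	*p = 99               // no hook runs on this write
	fmt.Println(tp.hours) // 99: the invalid value got through
}
```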

Of course this proposal is talking about (essentially) overloading member access rather than indexing, but it seems like it turns up a comparable set of design challenges either way. The Rust community has been debating whether and how to allow non-reference-based indexing for a long time with many questions still unanswered; rust-lang/rfcs#997 seems like the best entry-point into all of those discussions.


Overall it seems to me like accessing a field in Go is just a fundamentally different thing to calling a function, and so I'm having trouble imagining ways to make a hidden method call behave exactly like reading from or writing to a field, such that I would be confident in asserting that my change from a regular field to a getter/setter pair would not be a breaking change to any existing caller. :thinking:

ianlancetaylor commented 2 years ago

No change in consensus.