proposal: spec: read-only types

jba commented 6 years ago

I propose adding read-only types to Go. Read-only types have two related benefits:

The compiler guarantees that values of read-only type cannot be changed, eliminating unintended modifications that can cause subtle bugs.
Copying as a defense against modification can be reduced, improving efficiency.

An additional minor benefit is the ability to take the address of constants.

This proposal makes significant changes to the language, so it is intended for Go 2.

All new syntax in this proposal is provisional and subject to bikeshedding.

Basics

All types have one of two permissions: read-only or read-write. Permission is a property of types, but I sometimes write "read-only value" to mean a value of read-only type.

A type preceded by ro is a read-only type. The identifier ro is pronounced row. It is a keyword. There is no notation for the read-write permission; any type not marked with ro is read-write.

The ro modifier can be applied to slices, arrays, maps, pointers, structs, channels and interfaces. It cannot be applied to any other type, including a read-only type: ro ro T is illegal.

It is a compile-time error to

modify a value of read-only type,
pass a read-only slice as the first argument of append,
use slicing to extend the length of a read-only slice,
or send to or receive from a read-only channel.

A value of read-only type is not necessarily immutable, because it may also be referenced through another type that is not read-only.
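The distinction between read-only and immutable can be approximated in today's Go. The following is only a sketch: `ReadOnlyInts` and `View` are hypothetical names, and the wrapper restricts access through its method set rather than through the type system proposed here. It shows that a read-only view still observes writes made through the original read-write slice.

```go
package main

import "fmt"

// ReadOnlyInts is a hypothetical wrapper that exposes only read operations
// on an underlying []int. It loosely approximates "ro []int" in current Go.
// Code inside this package could still reach the field s directly.
type ReadOnlyInts struct {
	s []int
}

// View wraps s without copying it.
func View(s []int) ReadOnlyInts { return ReadOnlyInts{s: s} }

// Len reports the number of elements in the view.
func (r ReadOnlyInts) Len() int { return len(r.s) }

// At returns the element at index i.
func (r ReadOnlyInts) At(i int) int { return r.s[i] }

func main() {
	data := []int{1, 2, 3}
	v := View(data)
	fmt.Println(v.At(0)) // 1

	// The read-write reference can still modify the shared memory,
	// so the view is read-only but not immutable.
	data[0] = 42
	fmt.Println(v.At(0)) // 42
}
```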

Examples:

A function can assert that it will not modify its argument.
```
func transmit(data ro []byte) { ... }
```
The compiler guarantees that the bytes of data will not be altered by transmit.
A method can return an unexported field of its type without fear that it will be changed by the caller.
```
type BufferedReader struct {
    buf []byte
}

func (b *BufferedReader) Buffer() ro []byte {
    return b.buf
}
```
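For comparison, here is a sketch of what the same type has to do in current Go, where there is no `ro`. The method name `BufferCopy` is made up for illustration. Returning the field directly lets the caller modify internal state; returning a copy is safe but costs an allocation, which is the copying this proposal aims to reduce.

```go
package main

import "fmt"

type BufferedReader struct {
	buf []byte
}

// Buffer returns the internal slice directly; callers can modify it.
func (b *BufferedReader) Buffer() []byte { return b.buf }

// BufferCopy (hypothetical name) returns a defensive copy instead.
func (b *BufferedReader) BufferCopy() []byte {
	return append([]byte(nil), b.buf...)
}

func main() {
	r := &BufferedReader{buf: []byte("abc")}

	c := r.BufferCopy()
	c[0] = 'X' // modifies only the copy
	fmt.Println(string(r.buf)) // abc

	a := r.Buffer()
	a[0] = 'X' // modifies the reader's internal state
	fmt.Println(string(r.buf)) // Xbc
}
```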

This proposal is concerned exclusively with avoiding modifications to values, not variables. Thus it allows assignment to variables of read-only type.

var EOF ro error = errors.New("EOF")
...
EOF = nil

One could imagine a companion proposal that also used ro, but to restrict assignment:

ro var EOF = ... // cannot assign to EOF

I don't pursue that idea here.

Conversions

There is an automatic conversion from T to ro T. For instance, an actual parameter of type []int can be passed to a formal parameter of type ro []int. This conversion operates at any level: a [][]int can be converted to a []ro []int for example.

There is an automatic conversion from string to ro []byte. It does not apply to nested occurrences: there is no conversion from [][]string to []ro []byte, for example.

(Rationale: ro does not change the representation of a type, so there is no cost to adding ro to any type, at any depth. A constant-time change in representation is required to convert from string to ro []byte because the latter is one word larger. Applying this change to every element of a slice, array or map would require a complete copy.)
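The size difference mentioned in the rationale can be checked with `unsafe.Sizeof`. This is a small sketch assuming the standard gc toolchain, where a string header is a pointer plus a length and a slice header adds a capacity; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of a machine word on the current platform.
const wordSize = unsafe.Sizeof(uintptr(0))

// stringWords reports how many words a string header occupies.
func stringWords() int {
	var s string
	return int(unsafe.Sizeof(s) / wordSize)
}

// sliceWords reports how many words a []byte header occupies.
func sliceWords() int {
	var b []byte
	return int(unsafe.Sizeof(b) / wordSize)
}

func main() {
	// With the gc toolchain: pointer+len for strings, pointer+len+cap for slices.
	fmt.Println(stringWords(), sliceWords()) // 2 3
}
```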

Transitivity

Permissions are transitive: a component retrieved from a read-only value is treated as read-only.

For example, consider var a ro []*int. It is not only illegal to assign to a[i]; it is also illegal to assign to *a[i].

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.

The Address Operator

If v has type ro T, then &v has type *ro T.

If v has type T, then ro &v has type ro *T. This bit of syntax simplifies constructing read-only pointers to struct literals, like ro &S{a: 1, b: 2}.

Taking the address of constants is permitted, including constant literals. If c is a constant of type T, then &c is of type ro *T and is equivalent to

func() ro *T { v := c; return &v }()

Read-Only Interfaces

Any method of an interface may be preceded by ro. This indicates that the receiver of the method must have read-only type.

type S interface {
   ro Marshal() ([]byte, error)
   Unmarshal(ro []byte) error
}

If I is an interface type, then ro I is effectively the sub-interface that contains just the read-only methods of I. If type T implements I, then type ro T implements ro I.

Read-only interfaces can prevent code duplication that might otherwise result from the combination of read-only types and interfaces. Consider the following code from the sort package:

type Interface interface {
    Less(i, j int) bool
    Len() int
    Swap(i, j int)
}

func Sort(data Interface) {
    … code using Less, Len, and Swap …
}

func IsSorted(data Interface) bool {
    … code using only Less and Len …
}

type IntSlice []int
func (x IntSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x IntSlice) Len() int { return len(x) }
func (x IntSlice) Swap(i, j int) { x[i], x[j] = x[j], x[i] }

func Ints(a []int) { // invoked as sort.Ints
    Sort(IntSlice(a))
}

func IntsAreSorted(a []int) bool {
    return IsSorted(IntSlice(a))
}

We would like to allow IntsAreSorted to accept a read-only slice, since it does not change its argument. But we cannot convert ro []int to IntSlice, because the Swap method modifies its receiver. It seems we must copy code somewhere.

The solution is to mark the first two methods of the interface as read-only:

type Interface interface {
    ro Less(i, j int) bool
    ro Len() int
    Swap(i, j int)
}

func (x ro IntSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x ro IntSlice) Len() int { return len(x) }

Now we can write IsSorted in terms of the read-only sub-interface:

func IsSorted(data ro Interface) bool {
    … code using only Less and Len …
}

and call it on a read-only slice:

func IntsAreSorted(a ro []int) bool {
    return IsSorted(ro IntSlice(a))
}
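The idea that ro Interface is "just the read-only methods" can be approximated in current Go by splitting the interface by hand. The sketch below is not the proposal's mechanism: `Reader` and `ROIntSlice` are hypothetical names, and nothing stops code from indexing and assigning to an `ROIntSlice` directly. It only shows how IsSorted can be written against the smaller method set.

```go
package main

import "fmt"

// Reader is a hypothetical name for the non-mutating half of sort.Interface.
type Reader interface {
	Len() int
	Less(i, j int) bool
}

// Interface adds the mutating method on top of Reader.
type Interface interface {
	Reader
	Swap(i, j int)
}

// ROIntSlice has only the read methods, so it satisfies Reader
// but not Interface.
type ROIntSlice []int

func (x ROIntSlice) Len() int           { return len(x) }
func (x ROIntSlice) Less(i, j int) bool { return x[i] < x[j] }

// IsSorted needs only Len and Less, so it accepts a Reader.
func IsSorted(data Reader) bool {
	for i := data.Len() - 1; i > 0; i-- {
		if data.Less(i, i-1) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(IsSorted(ROIntSlice{1, 2, 3})) // true
	fmt.Println(IsSorted(ROIntSlice{3, 1, 2})) // false
}
```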

Permission Genericity

One of the problems with read-only types is that they lead to duplicate functions. For example, consider this trivial function, ignoring its obvious problem with zero-length slices:

func tail1(x []int) []int { return x[1:] }

We cannot call tail1 on values of type ro []int, but we can take advantage of the automatic conversion to write

func tail2(x ro []int) ro []int { return x[1:] }

Thanks to the conversion from read-write to read-only types, tail2 can be passed an []int. But it loses type information, because the return type is always ro []int. So the first of these calls is legal but the second is not:

var a = []int{1,2,3}
a = tail1(a)
a = tail2(a) // illegal: attempt to assign ro []int to []int

If we had to write two variants of every function like this, the benefits of read-only types would be outweighed by the pain they cause.
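The same duplication shows up today when read-only access is emulated with a wrapper type. In this sketch, `ROInts`, `tailRW` and `tailRO` are hypothetical names; the point is only that one logical operation needs two nearly identical functions.

```go
package main

import "fmt"

// ROInts is a hypothetical read-only wrapper around []int.
type ROInts struct{ s []int }

func (r ROInts) Len() int     { return len(r.s) }
func (r ROInts) At(i int) int { return r.s[i] }

// tailRW is the read-write variant.
func tailRW(x []int) []int { return x[1:] }

// tailRO is the read-only variant; the body is the same operation again.
func tailRO(x ROInts) ROInts { return ROInts{s: x.s[1:]} }

func main() {
	a := []int{1, 2, 3}
	a = tailRW(a)

	r := tailRO(ROInts{s: []int{1, 2, 3}})

	fmt.Println(a, r.Len(), r.At(0)) // [2 3] 2 2
}
```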

To deal with this problem, most programming languages rely on overloading. If Go had overloading, we would name both of the above functions tail, and the compiler would choose which to call based on the argument type. But we do not want to add overloading to Go.

Instead, we can add generics to Go—but just for permissions. Hence permission genericity.

Any type inside a function, including a return type, may be preceded by ro? instead of ro. If ro? appears in a function, it must appear in the function's argument list.

A function with an ro? argument a must type-check in two ways:

a has type ro T and ro? is treated as ro.
a has type T and ro? is treated as absent.

In calls to a function with a return type ro? T, the effective return type is T if the ro? argument a is a read-write type, and ro T if a is a read-only type.

Here is tail using this feature:

func tail(x ro? []int) ro? []int { return x[1:] }

tail type-checks because:

With x declared as ro []int, the slice expression can be assigned to the effective return type ro []int.
With x declared as []int, the slice expression can be assigned to the effective return type []int.

This call succeeds because the effective return type of tail is ro []int when the argument is ro []int:

var a = ro []int{1,2,3}
a = tail(a)

This call also succeeds, because tail returns []int when its argument is []int:

var b = []int{1,2,3}
b = tail(b)

Multiple, independent permissions can be expressed by using ro?, ro??, etc. (If the only feasible type-checking algorithm is exponential, implementations may restrict the number of distinct ro?... forms in the same function to a reasonable maximum, like ten.)

In an interface declaration, ro? may be used before the method name to refer to the receiver.

type I interface {
  ro? Tail() ro? I
}

There are no automatic conversions from function signatures using ro? to signatures that do not use ro?. Such conversions can be written explicitly. Examples:

func tail(x ro? []int) ro? []int { return x[1:] }

var (
    f1 func(x ro? []int) ro? []int = tail  // legal: same type
    f2 func(ro []int) ro []int = tail      // illegal: attempted automatic conversion
    f3 = (func(ro []int) ro []int)(tail)   // legal: explicit conversion
)

Permission genericity can be implemented completely within the compiler. It requires no run-time support. A function annotated with ro? requires only a single implementation.

Strengths of This Proposal

Fewer Bugs

The use of ro should reduce the number of bugs where memory is inadvertently modified. There will be fewer race conditions where two goroutines modify the same memory. One goroutine can still modify the memory that another goroutine reads, so not all race conditions will be eliminated.

Less Copying

Returning a reference to a value's unexported state can safely be done without copying the state, as shown in the second example (BufferedReader) above.

Many functions take []byte arguments. Passing a string to such a function requires a copy. If the argument can be changed to ro []byte, the copy won't be necessary.
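A short sketch of why the copy is needed today: the converted slice must be independent of the string, because the slice is writable and the string is not. The helper name `convertAndMutate` is made up for this example.

```go
package main

import "fmt"

// convertAndMutate converts s to a byte slice, changes the first byte of
// the slice, and returns both the original string and the modified bytes.
// It assumes s is non-empty.
func convertAndMutate(s string) (string, string) {
	b := []byte(s) // the result must not share memory with s
	b[0] = 'J'
	return s, string(b)
}

func main() {
	orig, modified := convertAndMutate("hello")
	fmt.Println(orig, modified) // hello Jello
}
```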

Clearer Documentation

Function documentation often states conditions promising that the function doesn't modify its argument, or extracting a promise from the caller not to modify a return value. If ro arguments and return types are used, those conditions are enforced by the compiler, so they can be deleted from the documentation. Furthermore, readers know that in a well-designed function, a non-ro argument will be written along at least one code path.

Better Static Analysis Tools

Read-only annotations will make it easier for some tools to do their job. For example, consider a tool that checks whether a piece of memory is modified by a goroutine after it sends it on a channel, which may indicate a race condition. Of course if the value is itself read-only, there is nothing to do. But even if it isn't, the tool can do its job by checking for writes locally, and also observing that the value is passed to other functions only via read-only arguments. Without ro annotations, the check would be difficult (requiring examination of the code of functions not in the current package) or impossible (if the call was through an interface).

Less Duplication in the Standard Library

Many functions in the standard library can be removed, or implemented as wrappers over other functions. Many of these involve the string and []byte types.

If the io.Writer.Write method's argument becomes read-only, then io.WriteString is no longer necessary.

Functions in the strings package that do not return strings can be eliminated if the corresponding bytes function uses ro. For example, strings.Index(string, string) int can be eliminated in favor of (or can trivially wrap) bytes.Index(ro []byte, ro []byte) int. This amounts to 18 functions (including Replacer.WriteString). Also, the strings.Reader type can be eliminated.

Functions that return string cannot be eliminated, but they can be implemented as wrappers around the corresponding bytes function. For example, bytes.ToLower would have the signature func ToLower(s ro? []byte) ro? []byte, and the strings version could look like

func ToLower(s string) string {
    return string(bytes.ToLower(s))
}

The conversion to string involves a copy, but ToLower already contains a conversion from []byte to string, so there is no change in efficiency.

Not all strings functions can wrap a bytes function with no loss of efficiency. For instance, strings.TrimSpace currently does not copy, but wrapping it around bytes.TrimSpace would require a conversion from []byte to string.

Adding ro to the language without some sort of permission genericity would result in additional duplication in the bytes package, since functions that returned a []byte would need a corresponding function returning ro []byte. Permission genericity avoids this additional duplication, as described above.

Pointers to Literals

Sometimes it's useful to distinguish the absence of a value from the zero value. For example, in the original Google protobuf implementation (still used widely within Google), a primitive-typed field of a message may contain its default value, or may be absent.

The best translation of this feature into Go is to use pointers, so that, for example, an integer protobuf field maps to the Go type *int. That works well except for initialization: without pointers to literals, one must write

i := 3
m := &Message{I: &i}

or use a helper function.
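Such a helper might look like the following sketch. It assumes Go 1.18 or later for type parameters, and `ptr` is a hypothetical name. Unlike the ro *int discussed in this proposal, the pointer it returns is an ordinary read-write pointer to a fresh copy.

```go
package main

import "fmt"

// Message stands in for a generated protobuf struct with an optional field.
type Message struct {
	I *int
}

// ptr is a hypothetical helper that returns a pointer to a copy of v.
func ptr[T any](v T) *T { return &v }

func main() {
	m := &Message{I: ptr(3)} // field present with value 3
	unset := &Message{}      // field absent

	fmt.Println(*m.I, unset.I == nil) // 3 true
}
```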

In Go as it currently stands, an expression like &3 cannot be permitted because assignment through the resulting pointer would be problematic. But if we stipulate that &3 has type ro *int, then assignment is impossible and the problem goes away.

Weaknesses of This Proposal

Loss of Generality

Having both T and ro T in the language reduces the opportunities for writing general code. For example, an interface method with a []int parameter cannot be satisfied by a concrete method that takes ro []int. A function variable of type func() ro []int cannot be assigned a function of type func() []int. Supporting these cases would start Go down the road of covariance/contravariance, which would be another large change to the language.

Problems Going from string to ro []byte

When we change an argument from string to ro []byte, we may eliminate copying at the call site, but it can reappear elsewhere because the guarantee is weaker: the argument is no longer immutable, so it is subject to change by code outside the function. For example, os.Open returns an error that contains the filename. If the filename were not immutable, it would have to be copied into the error message. Data structures like caches that need to remember their methods' arguments would also have to copy.
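The cache case can be illustrated in current Go with plain []byte arguments, which have the same weakness as ro []byte would: the caller keeps a writable reference. This is a sketch with hypothetical names (`Cache`, `RememberAlias`, `RememberCopy`).

```go
package main

import "fmt"

// Cache is a hypothetical structure that remembers the keys it was given.
type Cache struct {
	keys [][]byte
}

// RememberAlias stores the caller's slice without copying it.
func (c *Cache) RememberAlias(k []byte) {
	c.keys = append(c.keys, k)
}

// RememberCopy stores a private copy of the caller's slice.
func (c *Cache) RememberCopy(k []byte) {
	c.keys = append(c.keys, append([]byte(nil), k...))
}

func main() {
	var c Cache
	k := []byte("alpha")
	c.RememberAlias(k)
	c.RememberCopy(k)

	// The caller reuses its buffer after the calls return.
	copy(k, "omega")

	// The aliased entry changed underneath the cache; the copy did not.
	fmt.Println(string(c.keys[0]), string(c.keys[1])) // omega alpha
}
```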

Also, replacing string with ro []byte would mean that implementers could no longer compare via operators, range over Unicode runes, or use values as map keys.

Subsumed by Generics

Permission genericity could be subsumed by a suitably general design for generics. No such design for Go exists today. All known constraints on generic types use interfaces to express that satisfying types must provide all the interface's methods. The only other form of constraint is syntactic: for instance, one can write []T, where T is a generic type variable, enforcing that only slice types can match. What is needed is a constraint of the form "T is either []S or ro []S", that is, permission genericity. A generics proposal that included permissions would probably drop the syntax of this proposal and use identifiers for permissions, e.g.

gen <T, perm Ro> func tail(x Ro []T) Ro []T { return x[1:] }

Missing Immutability

This proposal lacks a permission for immutability. Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

The problem is how to construct immutable values. Literals of immutable type would only get one so far. For example, how could a program construct an immutable slice of the first N primes, where N is a parameter? The two easy answers—deep copying, or letting the programmer assert immutability—are both unpalatable. Other solutions exist, but they would require additional features on top of this proposal. Simply adding an im keyword would not be enough.

Does Not Prevent Data Races

A value cannot be modified through a read-only reference, but there may be other references to it that can be modified concurrently. So this proposal prevents some but not all data races. Modern languages like Rust, Pony and Midori have shown that it is possible to eliminate all data races at compile time. But the cost in complexity is high, and the value unclear—there would still be many opportunities for race conditions. If Go wanted to explore this route, I would argue that the current proposal is a good starting point.

References

Brad Fitzpatrick's read-only slice proposal

Russ Cox's evaluation of the proposal. This document identifies the problem with the sort package discussed above, and raises the problem of loss of generality as well as the issues that arise in moving from string to ro []byte.

Discussion on golang-dev

ianlancetaylor commented 6 years ago

I understand the desire for this kind of thing, but I am not particularly fond of this kind of proposal. This approach seems very similar to the const qualifier in C, with the useful addition of permission genericity. I wrote about some of my concerns with const in https://www.airs.com/blog/archives/428.

You've identified the problems well: this does not provide immutability, and it does not avoid data races. I would like to see a workable proposal for immutability, and I would love to see one that avoids data races. This is not those proposals.

Using ro in a function parameter amounts to a promise that the function does not change the contents of that argument. That is a useful promise, but it is one of many possible useful promises. Is there a reason beyond familiarity with C that we should elevate this promise into the language? Go programs often rely on documentation rather than enforcement. There are many structs with exported fields with documentation about who is permitted to modify those fields. Similarly we document that a Write method that implements io.Writer may not modify its argument slice. Why put one promise into the language but not the other?

In general this is an area where experience reports can help guide Go 2 development. Does this proposal help with real problems that Go programmers have encountered?

jba commented 6 years ago

I would like to see a workable proposal for immutability, and I would love to see one that avoids data races. This is not those proposals.

I'm continuing to think about those things, but I wanted to get this proposal out there for two reasons. One, I think any proposal for immutability will have this as a subset. ro T is a subtype of both T and im T, so it will likely show up in any reasonable proposal involving im. (Permission genericity gets around using ro for functions, but you still might want it for data. Consider a data structure that wants to store both T and im T.) It's probably not an accident that Rust, Pony and Midori all have read-only types in addition to immutable ones.

The second reason I wanted to share this is that it serves as a counterexample to anyone who thinks adding read-only types to Go is just a matter of adding a keyword.

Does this proposal help with real problems that Go programmers have encountered?

Yes. At the recent Google-internal Go conference, @thockin specifically asked for const, citing bugs in Kubernetes due to inadvertent modification of values returned from caches. I think Alex Turcu also mentioned that he wanted something like this for an internal video ads system.

ianlancetaylor commented 6 years ago

What do you think of a builtin freeze function that returns an immutable shallow copy of an object? That would fix the cache problem without modifying the type system. (The returned value would be immutable in that any attempt to modify it would cause a run time panic.)

jba commented 6 years ago

Out of curiosity, how does that work? And how does it detect modification of a nested value?

ianlancetaylor commented 6 years ago

I don't know exactly how it works, which is why I haven't written a proposal for it. One conceivable implementation would be to mmap a new page, copy the object in, and then to mprotect that page, but the difficulties are obvious.

For a nested value, you use freeze multiple times, as desired.

neild commented 6 years ago

I'm not following the distinction between values and variables in this proposal. Why is the modification of the value stored in a permitted below?

var a ro int
a = 1 // Modifying an ro int via a variable.

var b *ro int = &a
*b = 1 // Modifying an ro int via a pointer.

A nitpick: Pointers-to-constants are entirely orthogonal to the rest of this proposal and (IMO) distract from the meat of it. Go already has syntax for constructing non-zero pointers to compound types; providing a similar facility for non-compound types does not require the addition of read-only values to the language. e.g., #19966.

willfaught commented 6 years ago

@jba Could you accomplish the same thing with overriding type operations? Is it important that the read-only property be at the type level? For example, string (basically a read-only []byte) could be defined as something like this:

type string []byte

func (s string) []=(index int) byte {
    panic("not supported")
}

This doesn't require any changes to the type system, and seems to be backward-compatible at first glance.

jba commented 6 years ago

@neild ro int is the same thing as int (actually, I disallow it, but that could go either way). ints are already immutable: you can't modify an int, only copy and change it. So your code is equivalent to

var a int
a = 1 // Modifying an ro int via a variable.

var b *int = &a
*b = 1 // Modifying an ro int via a pointer.

and of course both of those assignments are equally legal. The assignment in

var c ro *int = &a
*c = 1

would not be, but c itself could be changed.

I'm trying to avoid proposing both a type modifier and what C would call a "storage class," out of hygiene. (See Ian's blog post that he linked to above for a criticism of how C const conflates those.)

jba commented 6 years ago

@willfaught I don't think operator overloading is a good fit for Go. One of the nice things about the language is that every bit of syntax has a fixed meaning.

willfaught commented 6 years ago

It seems identical to how methods and embedding work. Like the selector a.b, the operation a[b] could also be overridden. It would simplify things for operators to just be methods (that can be aggressively inlined).

neild commented 6 years ago

@jba Your proposal says:

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.

The existence of *ro int implies the existence of ro int, doesn't it? If not, why not and what is the type of *p where p is a *ro int?

You also say:

It is a compile-time error to modify a value of read-only type,

I can't square this with it being legal to modify the value of c in your example:

var c ro *int = &a

The variable c has type ro *int. The value contained within c is a value of read-only type. Why can it be modified?

ianlancetaylor commented 6 years ago

@willfaught Operator overloading is a very different idea that should be discussed in a separate proposal, not this one.

jba commented 6 years ago

Transitivity increases safety, and it can also simplify reasoning about read-only types. For example, what is the difference between ro *int and *ro int? With transitivity, the first is equivalent to ro *ro int, so the difference is just the permission of the full type.

The existence of *ro int implies the existence of ro int, doesn't it? If not, why not and what is the type of *p where p is a *ro int?

That's a bug in my proposal. I chose a poor example. Replace int with []int.

It is a compile-time error to modify a value of read-only type,

I can't square this with it being legal to modify the value of c in your example:

var c ro *int = &a

The variable c has type ro *int. The value contained within c is a value of read-only type. Why can it be modified?

The it in your last sentence refers to the value of read-only type, the ro *int. That value cannot be modified; *c = 3 is illegal. But you can change the binding of c. There is nothing in my proposal that restricts the semantics of variable bindings.

The situation is analogous to

var s string = "x"
s = "y"

The value is immutable, but the variable binding is not.

neild commented 6 years ago

It is possible that I have misunderstood the spec, but this is not consistent with my understanding of variable assignment. s = "y" does not change the binding of s; it changes the value of the variable bound to s.
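One way to see the point in current Go is to hold a pointer to the variable across the assignment. This is only an illustrative sketch, and `observe` is a made-up helper: the pointer still refers to the same variable afterwards and sees the new value, which suggests that the variable's contents changed rather than the identifier being rebound.

```go
package main

import "fmt"

// observe assigns a new value to a string variable while holding a pointer
// to it, then reports what the pointer sees and whether it still refers to
// the same variable.
func observe() (string, bool) {
	s := "x"
	p := &s // p points at the variable s, not at the string value "x"
	s = "y" // stores a new value in the same variable
	return *p, p == &s
}

func main() {
	v, sameVariable := observe()
	fmt.Println(v, sameVariable) // y true
}
```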

jba commented 6 years ago

I guess I'm using the word "binding" wrong. I was thinking variables are bound to their values, and you're saying identifiers are bound to variables, which have values. Anyway, you can change variable-value associations, but some values cannot be modified.

Spriithy commented 6 years ago

Why not reuse the already existing const keyword to ease readability and stick with Go's spirit of not obfuscating intent ?

My point here is that ro is an obfuscating keyword that hides intent to non-aware readers. Again, as you stated earlier, this is merely a suggestion and syntax comes last.

Other point, say I have a read only type for ints. Is such type declarable (as in type T = ro int) ?

If yes, do I declare an instance of such type using the var or const keyword since it is a non modifiable type ?

type T = ro int

var x T = 55

// or

const y T = 98

Moreover, wouldn't it be enough to allow constant pointers ?

Other point, what about compound types ? Say, using these declarations

type S struct {
    Exported int
}

type RoS = ro S

Does this snippet compile ? If not, what errors are thrown ? If yes, what is the expected behavior ? Does it panic ? If yes, how does the runtime detect this ?

func main() {
    ros := &RoS{Exported: 55}
    p := &ros.Exported
    *p = 98
}

What about this one ?

func main() {
    ros := &RoS{Exported: 55}
    p := (*int)(unsafe.Pointer(&ros.Exported))
    *p = 98
}

jaekwon commented 6 years ago

I just want to point to two proposals, one for immutable slices and one for pointerized structs, that I think in combination amount to a simpler set of language changes than what is proposed here. Please take a look!

What is needed is a constraint of the form "T is either []S or ro []S", that is, permission genericity.

Check out the any modifier in the immutable slices proposal.

Pointerized structs

Here's a concrete example. Here is one way to control write-access to structs. Copying is trivial, you can just do var g Foo = f from anywhere, even outside the module that declares Foo.

type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}

The other way is to protect the struct with a mutex:

type Foo struct {
  mtx sync.RWMutex
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock

Here's a full pointerized struct version:

type foo struct* {
  Value interface{}
}
func (f foo) GetValue() interface{} {...}

type Foo struct {
  mtx sync.RWMutex
  foo
}
func (f *Foo) SetValue(interface{}) {...} // Lock/Unlock
func (f *Foo) GetValue() interface{} {...} // RLock/RUnlock

f = Foo{...}
f.SetValue(...) // ok, f is addressable
g := f
g.SetValue(...) // ok, g is addressable
func id(f Foo) Foo { return f } // returns a non-addressable copy
id(g).SetValue(...) // compile-error, not addressable.
id(g).GetValue(...) // calls foo.GetValue, mtx not needed

Q: So why readonly slices? It seems natural to create a "view" into an arbitrary-length array of objects without copying. For one, it's a required performance optimization. Second, there's no way to mark any items of a slice to be externally immutable, as can be done with private struct fields. For these reasons, readonly slices appear to be natural and necessary (for lack of any alternative).

wora commented 6 years ago

I think this design would lead to significant complexity in practice, similar to C++ const. A couple of key issues:

The caller is free to modify the value while it looks like a constant to the callee.
If you read a field of ro T, what is the type of the field value? F or ro F?
Getting libraries to use this new feature consistently can be very challenging and costly.

One cheap alternative is to introduce a documentary type annotation, which just documents that the value should not be changed. There is no enforcement, but it offers a design contract between caller and callee. Go doesn't provide in-process security anyway; a bad library can do arbitrary damage. I am not sure whether we need to guard against it at the language level.

jba commented 6 years ago

@Spriithy:

Why not reuse the already existing const keyword to ease readability and stick with Go's spirit of not obfuscating intent ?

const is about the identifier-value binding, while ro modifies types. I think it would be more confusing to conflate the two.

.. do I declare an instance of [an ro type] using the var or const keyword since it is a non modifiable type ?

var, because the variable can still be set to a different value.

Moreover, wouldn't it be enough to allow constant pointers ?

No, ro is useful for anything that has structure, like maps and slices. You might want to return a map from a function without worrying that your callers will modify it, for example.

Does this snippet compile ? If not, what errors are thrown ? If yes, what is the expected behavior ? Does it panic ? If yes, how does the runtime detect this ?

func main() {
    ros := &RoS{Exported: 55}
    p := &ros.Exported
    *p = 98
}

It fails to compile. p has type ro *int, so the assignment *p = 98 is illegal.

What about [using unsafe]?

Of course, all bets are off with unsafe.

jba commented 6 years ago

@jaekwon:

Check out the any modifier in the immutable slices proposal.

I don't see how any actually works. Say I have x, which may be an roPeeker or an rwPeeker. Now I do

if y, ok  := x.(interface{ Peek(int) any []byte }); ok {
   b := y.Peek(3)
   b[1] = 17 // ???
}

Can I assign to elements of b or not? Hopefully the compiler somehow knows and reports an error just in case x was an roPeeker. But I don't see how it knows that.

Here is one way to create immutable structs:

type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}

I don't understand this. What is immutable? Certainly not Foo—you can set its value field. (The field may as well be exported.) Is the thing I put in value immutable? Maybe; depends what I put there:

var f Foo
f.SetValue([]int{1})
x := f.GetValue()
x.([]int)[0] = 2 // Nope, not immutable.

jba commented 6 years ago

@wora:

I think this design would lead to significant complexity in practice, similar to C++ const.

I think it's a little less complex, but yes, I basically agree.

The caller is free to modify the value while it looks like a constant to the callee.

It doesn't look like a constant, it looks like a readonly value.

If you read a field of ro T, what is the type of the field value? F or ro F?

ro F

jaekwon commented 6 years ago

@jba

I don't see how any actually works. Say I have x, which may be an roPeeker or an rwPeeker. Now I do

if y, ok  := x.(interface{ Peek(int) any []byte }); ok {
   b := y.Peek(3)
   b[1] = 17 // ???
}

Can I assign to elements of b or not? Hopefully the compiler somehow knows and reports an error just in case x was an roPeeker. But I don't see how it knows that.

No, you can't. any means it might be read-only, so first you must cast to a writeable.

y  := x.(interface{ Peek(int) any []byte })
if wy, ok  := x.(interface{ Peek(int) []byte }); ok {
   b := wy.Peek(3)
   b[1] = 17
}

@jba

Here is one way to create immutable structs:
type Foo struct {
  value interface{}
}
func (f *Foo) SetValue(interface{}) {...}
func (f Foo) GetValue() interface{} {...}
I don't understand this. What is immutable? Certainly not Foo—you can set its value field. (The field may as well be exported.)

I meant, you pass by value (e.g. copy) to prevent others from writing to it. Immutable is an overloaded word... I was using it to refer to pass-by-copy semantics.

f := &Foo{value: "somevalue"}
f.SetValue("othervalue") // `f` is a pointer
g := *f
g.SetValue("another") // can't, g is a readonly copy.

The use-cases for ro struct{} overlap significantly for use-cases for g := *f, and the latter already exists. We don't need transitive ro as long as all field values are non-pointer types.
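The copy semantics described above can be sketched in today's Go. The type `Config`, its fields, and the function `mutate` are hypothetical names chosen for illustration. The sketch suggests why the claim is limited to non-pointer field values: a copy protects the struct's direct fields, but a slice field still shares its backing array with the original.

```go
package main

import "fmt"

// Config is a hypothetical struct used only for illustration.
type Config struct {
	Name string
	Tags []string // a copied struct still shares this slice's backing array
}

// mutate receives a copy of the struct, so assigning to c.Name does not
// affect the caller, but writing through c.Tags does.
func mutate(c Config) {
	c.Name = "changed"
	if len(c.Tags) > 0 {
		c.Tags[0] = "changed"
	}
}

func main() {
	orig := Config{Name: "orig", Tags: []string{"a", "b"}}
	mutate(orig)
	fmt.Println(orig.Name)    // the direct field is untouched
	fmt.Println(orig.Tags[0]) // the shared backing array was modified
}
```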

But I also acknowledge that Golang1 isn't perfectly suited for this kind of usage, because it forces you to write verbose and type-unsafe syntax to get the behavior you want... Here's an example with an (immutable) tree-like structure:

type Node interface {
    AssertIsNode()
}
type node struct {
  Left Node
  Right Node
}
func (_ node) AssertIsNode() {}

// Using the struct is cumbersome, but overall this has the behavior we want.

// Interfaces are pointer-like in how they are represented in memory,
// so copying is quick and efficient.
var tree Node = ...
maliciousCode(tree) // cannot mutate my copy

// But using this as a struct is cumbersome and type-unsafe.
var leftValue = tree.(node).Left.(node).Value

Maybe one way to make this easier is to declare a struct to be "pointerized"...

type Node struct* {
  Left Node
  Right Node
}

var n Node = nil // not the same as a zero value.
n.Left = ... // runtime error
n = Node{}
n.Left = ...
n.Right = ...

var n2 = n
n2.Left = ... // This won't affect `n`.
n2.Left.Left = ... // compile-time error, n2.Left is not addressable.

n.Left = n // circular references are OK.

Please check out https://github.com/golang/go/issues/23162

@jba Please check out the update to the last comment: https://github.com/golang/go/issues/22876#issuecomment-354379524

andlabs commented 6 years ago

Has anyone listed all the existing proposals for declaring a block of data to be stored in read-only memory?

iand commented 6 years ago

This recent blog post on const in C++ and D has some relevant discussion of the difficulties of implementing useful const/immutable concepts in programming languages.

ianlancetaylor commented 6 years ago

We are not certain about adding a type qualifier to the language. Go in general has a very simple type system. There is only one type qualifier at present: marking a channel as send-only or receive-only.

Also, I don't think anybody has addressed my comment from above:

Using ro in a function parameter amounts to a promise that the function does not change the contents of that argument. That is a useful promise, but it is one of many possible useful promises. Is there a reason beyond familiarity with C that we should elevate this promise into the language? Go programs often rely on documentation rather than enforcement. There are many structs with exported fields with documentation about who is permitted to modify those fields. Similarly we document that a Write method that implements io.Writer may not modify its argument slice. Why put one promise into the language but not the other?

Still, there is something to the ideas here, and the general concept might be worth pursuing. Keeping this proposal open for further discussion.

jba commented 6 years ago

Ian, let me try to address your comment (why this promise and not another?) while also putting this proposal in context. This is an expansion of my earlier comment.

There's a lot of interest in type systems that offer data-race freedom. (Russ even mentioned them in his Go Resolutions for 2017). The languages Rust, Pony and Midori all have different ways of eliminating data races, but they share the idea of using type modifiers that restrict access to values.

The trick is picking the right set of modifiers so that programs are both expressive and not too painful to write. For example, if you chose to add just an immutability modifier to an imperative language like Go, you'd find that it is hard to construct immutable values that have structure. How would you create an immutable slice containing non-zero values, or an immutable linked list?

So these languages all have a few different modifiers, that when used together in certain patterns let you prove little theorems (at least, that's how I think of it). For example, if I create a reference to a value from fresh memory at a point in the program, only put immutable values into it, and never let a copy of the reference out of scope, then I can convert that reference to an immutable reference. That particular theorem lets me create an immutable []int containing any values I like.

But that theorem is weaker than it needs to be. If I create a reference to a value from fresh memory, only put immutable values into it, and never let a writable copy of the reference out of scope, then I can still convert that reference to an immutable reference. That extra power lets me pass my reference to other functions while I build it up, provided they promise not to modify it.

So "can't modify this value" seems to be a useful building block in the machinery of constructing data-race free programs. Of course, my one example doesn't prove that, but it is interesting that all three of those languages, Rust, Pony and Midori, have something like the ro of this proposal.

In short, while this proposal might not add enough value to pull its weight, I think it's a necessary stepping stone to data-race freedom.
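The build-then-freeze pattern described above can only be approximated in today's Go, by convention rather than by a compiler-checked permission. In this minimal sketch, `frozenInts` and `firstSquares` are hypothetical names: the slice is built from fresh memory, the writable reference never leaves the constructor, and callers only get read accessors. Code inside the same package could still reach the unexported field, which is the gap a type modifier like ro would close.

```go
package main

import "fmt"

// frozenInts is a hypothetical wrapper: the slice is unexported and no method
// writes to it, so code outside this package cannot modify the elements.
type frozenInts struct {
	vals []int
}

// Len reports the number of elements.
func (f frozenInts) Len() int { return len(f.vals) }

// At returns the element at index i.
func (f frozenInts) At(i int) int { return f.vals[i] }

// firstSquares builds the slice from fresh memory, never lets the writable
// reference escape, and returns only the read-only view.
func firstSquares(n int) frozenInts {
	vals := make([]int, n)
	for i := range vals {
		vals[i] = i * i
	}
	return frozenInts{vals: vals}
}

func main() {
	sq := firstSquares(4)
	for i := 0; i < sq.Len(); i++ {
		fmt.Println(sq.At(i))
	}
}
```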

ianlancetaylor commented 6 years ago

See also #20443.

wora commented 6 years ago

I think the concept of read-only is much more general than other promises. Anyone who uses a computer, not just developers, has a good understanding of read-only files or read-only documents. People have very little problem dealing with such things in their daily life.

Extending the concept to programming languages is not a big cost for developers. C++ added constexpr much later in its lifecycle and it works reasonably well.

bcmills commented 6 years ago

Reading through @rsc's critique of @bradfitz's 2013 proposal, I'm struck by just how many of the issues Russ raised boil down to parametricity and/or metaprogramming.

The TrimSpace example in “Duplication and triplication”

  func bytes.TrimSpace(s []byte) []byte
  func strings.TrimSpace(s string) string
cannot be replaced by
  func strings.TrimSpace(s readonly []byte) readonly []byte
seems like it would be fixed by using parametricity instead:
func [T] TrimSpace(s T) T

The Join example

  func bytes.Join(x [][]byte, sep readonly []byte) []byte
  func strings.Join(x []string, sep readonly []byte) string
  func robytes.Join(x []readonly []byte, sep readonly []byte) []byte

is similar, and requires only that readonly string collapse to string:

func [T] Join(x []readonly T, sep readonly []byte) T
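For comparison, the TrimSpace case can be written with the type parameters that later shipped in Go 1.18. This is a sketch under that assumption, not the syntax proposed in this comment, and it does not model readonly at all. The conversions copy the data in the []byte case, so it trades efficiency for brevity.

```go
package main

import (
	"fmt"
	"strings"
)

// TrimSpace is a sketch of one generic function covering both string and
// []byte arguments. Converting through string copies the bytes when T is a
// byte slice, so this is illustrative rather than efficient.
func TrimSpace[T ~string | ~[]byte](s T) T {
	return T(strings.TrimSpace(string(s)))
}

func main() {
	fmt.Printf("%q\n", TrimSpace("  hello  "))
	fmt.Printf("%q\n", TrimSpace([]byte("  world  ")))
}
```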

The issue of “Immutability and memory allocation” is more difficult: a notion of “read-only” without some stronger notion of “strict” or “unowned” does not suffice to prevent subtle aliasing bugs. (The race detector will catch a few of those, but certainly not all.) That hints at a stronger (and larger) version of this proposal, but the stronger and larger the proposal, the less likely it is to fit into the scale of changes for Go 2.

On the other hand, given that Go programs are already susceptible to subtle aliasing bugs, that strikes me as kind of a silly reason to reject an incremental safety improvement.

In the “Loss of generality” section, the TrimSpace/ToUpper example seems too trivial (even more so if https://github.com/golang/go/issues/21498 is accepted): it's easy enough to make the types match up using a function literal.

    var convert = robytes.TrimSpace
    if wackyContrivedExample {
        convert = func(x []byte) []byte { return robytes.ToUpper(x) }
    }

We could even encode that directly in the type system by treating functions and methods that accept a readonly T as a subtype of functions that accept a T (and treating functions that return a T as a subtype of functions that return a readonly T). That would also address the example of Bytes and Peek methods in interfaces. We'd just need to be careful to limit covariance to functions (and not repeat the Java array-covariance mistake).

The final “Loss of generality” example, sort.IntSlice, is moot because of sort.SliceIsSorted, which would work with read-only slices as-is. The sort API seems to break with readonly slices only because it is already overspecified: it uses the same type for both reading and writing.
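As a small illustration of the point about sort.SliceIsSorted, the check below only reads the slice: it needs a less function and never swaps elements. The helper name `isSorted` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// isSorted only reads xs. sort.SliceIsSorted takes a less function and does
// not need Swap, so nothing here writes to the slice.
func isSorted(xs []int) bool {
	return sort.SliceIsSorted(xs, func(i, j int) bool { return xs[i] < xs[j] })
}

func main() {
	fmt.Println(isSorted([]int{1, 2, 3}))
	fmt.Println(isSorted([]int{3, 1, 2}))
}
```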

That said, it could also be addressed with a bit more (IMO simple) metaprogramming: a compile-time conditional to provide Swap only if the slice type is mutable.

type [T] CmpSlice T
func (x CmpSlice) Less(i, j int) bool { return x[i] < x[j] }
func (x CmpSlice) Len() int { return len(x) }
[if Mutable(T)] func (x CmpSlice[T]) Swap(i, j int) { x[i], x[j] = x[j], x[i] }

Where I'm going with all this is that, in my opinion, the decision for this proposal should depend on the outcome of https://golang.org/issue/15292. If Go 2 has workable generics, many of the problems Russ observed will become moot; if Go 2 has workable generics with compile-time reflection or other metaprogramming, then the only remaining issue will be that this proposal does not go far enough.

jba commented 6 years ago

@bcmills, my "permission genericity" was an attempt to address Russ's points without invoking full genericity. If we had full genericity, then permission genericity might not be necessary—but Midori found that both were needed. See http://joeduffyblog.com/2016/11/30/15-years-of-concurrency, search for "generic parameterization over permissions".

ianlancetaylor commented 6 years ago

@wora I don't really accept that kind of argument from analogy in general, but in this case I think the analogy is flawed anyhow. This proposal is not about constexpr (which is a kind of immutability, not a read-only type qualifier) and it's not about documents that can not be changed. It's about saying that a value can not be changed using a specific reference to that value, but the value can still be changed using other references.

@bcmills I have to say that I think it is extremely unlikely that Go 2 will provide generics with compile-time reflection or metaprogramming. Yours is the only generics proposal I've ever seen with anything close to that. All of my proposals have specifically not included it, which I personally regard as a feature.

bcmills commented 6 years ago

@jba, I guess part of my point is that this proposal can be made quite a bit simpler if we already have some form of generics in the language anyway.

bcmills commented 6 years ago

@ianlancetaylor Yes, I would be surprised if Go 2 had generics with any sort of general-purpose metaprogramming mechanism. However, I think this case is worth mentioning for two reasons:

Even without metaprogramming, simple generics address many of Russ's objections to Brad's original proposal.
This use-case, and potentially others like it, may be considerations when we decide how much (if any) metaprogramming to support (either in Go 2, or further in the future if we do not choose a design that precludes them in Go 2).

celestiallake commented 6 years ago

You have to take a look at closures. Go supports them widely. Don't know if there's any nice use for const keyword since that's just an expensive compile-time check.

JavierZunzunegui commented 5 years ago

Not sure if this is still under debate, but weighing in:

Focusing on Permission Genericity:

A function with an ro? argument a must type-check in two ways:

a has type ro T and ro? is treated as ro.

a has type T and ro? is treated as absent.

In calls to a function with a return type ro? T, the effective return type is T if the ro? argument a is a read-write type, and ro T if a is a read-only type.

Using tail as in your definition,

func tail(x ro? []int) ro? []int { return x[1:] }

what is the type of x in x := tail?

I doubt you want func(ro? []int) ro? []int, as you are only introducing ro-qualified types and not 'optional' ro?-qualified types. Which means the type of x must be specified somehow and, more significantly, you have tail being different to x, i.e. type tail(a) may differ from type x(a).

As I see it you have to remove the genericity. There are two natural options. 1) limit to one ro? (no ro??) 2) remove ro? altogether

Either way the key difference is that any ro-able function foo (or even for every function, as an identity) has two forms: foo and ro foo. The key difference is the argument does not define the function (no genericity), i.e.

_ = foo // type without extra ro
_ = ro foo // type with additional ro

var a A // assume ro-able
_ = foo(a) // output is non-ro
_ = foo(ro A(a)) // output is still non-ro (no genericity)

// note in the below the syntax is (ro foo)(...), NOT ro (foo(...))
_ = ro foo(a) // BAD! won't compile
_ = ro foo(ro A(a)) // output is ro (no genericity)

I am not sure how to choose between 1 (limit to one ro?, no ro??) and 2 (remove ro? altogether); 1 is more powerful but more inconvenient for the developer.

JavierZunzunegui commented 5 years ago

Focusing on Missing Immutability:

This proposal lacks a permission for immutability. Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

The problem is how to construct immutable values. Literals of immutable type would only get one so far. For example, how could a program construct an immutable slice of the first N primes, where N is a parameter? The two easy answers—deep copying, or letting the programmer assert immutability—are both unpalatable. Other solutions exist, but they would require additional features on top of this proposal. Simply adding an im keyword would not be enough.

While correct, I think this statement misses that ro could actually bring us a form of immutability. The key is while ro is a developer-level feature, immutability would be a compiler-level feature, existing only for performance purposes. This means: no ~im~ or comparable syntax, and the developer effectively not knowing if their ro is also immutable (in that all immutable are ro but not all ros are immutable). The difference is only in performance, the logic for the two must be identical and any immutable ro could be treated as a plain ro and produce identical results.

The immutability could be summarized as: IF the compiler can assert all references to a ro variable are also ro themselves, it can treat the variable as immutable (in the absence of unsafe).

To achieve this, the compiler must effectively do escape analysis on ro's, i.e. a non ro-escaping ro can be treated as immutable, but one that ro-escapes can't.

An example:

func foo1() []int {
  return append([]int{1}, 2)
}
// x is ro, and can also be considered immutable
x := ro []int(foo1())

func bar2 func([]int){}
func foo2() []int {
  out := append([]int{1}, 2)
  bar2(out)
  return out
}
// y is ro but can't be considered immutable, the variable ro-escapes in bar2
// (the example is so simple a smart compiler could actually identify that it doesn't escape, but I am not making that case here given Go prioritizes compile time)
y := ro []int(foo2())

func bar3 func(ro []int){}
func foo3() []int {
  out := append([]int{1}, 2)
  bar3(out)
  return out
}
// z is ro and considered immutable, the variable escapes but as a ro, and that's OK
z := ro []int(foo3())

Note this means that immutable ros are not actually (necessarily) created as immutable. In its simplest terms:

func foo() ro []int {
  out := []int{1} // not immutable, not even ro
  out = append(out, 1) // not immutable or ro (and in fact is being modified!)
  return out // ro and immutable, despite having been non-immutable (and non-ro) before. Has NOT been deep-copied
}

In this sense immutable ros are not identical to const (in the current Go sense, not referring to C-style const). The guarantee is not once first allocated, this memory is unchanged, but rather from this point on, this memory is unchanged.

The benefits of this are exactly as defined in this proposal:

Such a permission has obvious charms: immutable values are goroutine-safe, and conversion between strings and immutable byte slices would work in both directions.

It is only that the gatekeeper to these performance gains is the compiler, not the developer, at least not explicitly. Write good ro code, and you'll be likely to get the advantages.

In conclusion, ro as defined in this proposal does (or rather may, if we choose such an immutability approach) bring immutability performance gains. And what's best, since those gains are the responsibility of the compiler and don't change the code output, we can have ro added to Go 2 and start writing ro-compliant code, and only progressively add this kind of immutability support.

Quoting from the proposal:

[...] or letting the programmer assert immutability

Replace programmer with compiler, that's all I'm trying to say.

jba commented 5 years ago

what is the type of x in x := tail?

It is the exact type of tail, with the ro?s. I answer that in the original proposal. See the paragraph beginning "There are no automatic conversions...".

immutability would be a compiler-level feature, existing only for performance purposes.

But those performance gains come from how programmers write code. Say I'm writing a cache that accepts ro T values. Even if the compiler can prove immutability, I can't, so I have to copy them. Or say I'm calling a function Foo(int) ro []int. The compiler may be able to prove that the return value is immutable, but I have to code assuming it isn't.
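The cache scenario can be sketched in today's Go, which has no ro, so plain []byte stands in for ro T. The names `Cache`, `NewCache`, `Put`, and `Get` are hypothetical. Because the cache cannot prove that the caller holds no writable reference, it copies on the way in; the same reasoning would apply to a value of type ro []byte.

```go
package main

import "fmt"

// Cache is a hypothetical cache used only for illustration.
type Cache struct {
	entries map[string][]byte
}

// NewCache returns an empty cache.
func NewCache() *Cache {
	return &Cache{entries: make(map[string][]byte)}
}

// Put stores a private copy of val, because the caller may still hold a
// writable reference to the same backing array.
func (c *Cache) Put(key string, val []byte) {
	c.entries[key] = append([]byte(nil), val...)
}

// Get returns a copy so that callers cannot modify the cached bytes.
func (c *Cache) Get(key string) []byte {
	return append([]byte(nil), c.entries[key]...)
}

func main() {
	c := NewCache()
	buf := []byte("abc")
	c.Put("k", buf)
	buf[0] = 'X' // the caller modifies its own slice after the Put
	fmt.Println(string(c.Get("k")))
}
```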

JavierZunzunegui commented 5 years ago

But those performance gains come from how programmers write code. Say I'm writing a cache that accepts ro T values. Even if the compiler can prove immutability, I can't, so I have to copy them. Or say I'm calling a function Foo(int) ro []int. The compiler may be able to prove that the return value is immutable, but I have to code assuming it isn't.

Yes, the programmers still have a part to play in writing both safe and performant code, more so than if there was im in the language, but less than without ro support (at least with ro the caller of Foo can't change the []int, a decent gain).

Even if the compiler can prove immutability, I can't, so I have to copy them

No more than you may have to do in current (non-ro) golang. My suggestion on building immutability on top of ro is purely a performance concept, if the code was wrong with ro then my point about immutability changes nothing. If you can't trust the callers of your ro method to be sensible you have no choice but to copy it. Having said that, if you do copy it you will (or should) at least generate an immutable ro []int, so may still get some performance advantages whatever path you take.

If you want to ensure the ro is immutable to avoid copying (the ideal situation) maybe we can add something like:

//go:immutable

but I think generally that may introduce many issues as the compiler makes no promises that logically immutable ros will be treated as such (and it may take time before the compiler does a good job at it).

ghost commented 5 years ago

It would be nice to have read-only maps. I asked in Slack how to do the following:

const database_config := map[string]string{
    "host": "localhost",
    "port": "2114",
    "username": "foo",
    "password": "bar",
    "name": "db_wiki",
}

And was pointed to here since I'm totally new to Go.

ianlancetaylor commented 5 years ago

@Ookma-Kyi Constant map variables are an aspect of immutability and are covered by #6386. This proposal is less about immutability than it is about ensuring that functions don't change certain values.
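Go currently has no constant maps, so the declaration in the question does not compile. One common workaround, which is not part of this proposal, is to keep the map unexported and expose only a lookup function. In this sketch the names `databaseConfig` and `DatabaseConfig` are hypothetical; protection holds only across the package boundary, since code in the same package can still write to the map.

```go
package main

import "fmt"

// databaseConfig is unexported, so other packages cannot assign to it or
// modify its entries directly.
var databaseConfig = map[string]string{
	"host":     "localhost",
	"port":     "2114",
	"username": "foo",
	"password": "bar",
	"name":     "db_wiki",
}

// DatabaseConfig returns the value stored for key and whether it was present.
func DatabaseConfig(key string) (string, bool) {
	v, ok := databaseConfig[key]
	return v, ok
}

func main() {
	host, ok := DatabaseConfig("host")
	fmt.Println(host, ok)
}
```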

jcburley commented 3 years ago

Interesting idea, but please don't repeat the bodge of having ro, like C's const, apply to whatever is to the left or right of the keyword.

As have others, decades of wrestling with C's const taught me to consistently place it so it modified only what was to the left, as in * const int meaning "a pointer via which an int will not be modified".

In that sense, var ro ... would logically mean the same thing as const, which is what consistent right-to-left reading (as Go already requires for things like [4][8]byte) would indicate, and thus would probably just be disallowed, as would func foo(a ro int) and func foo(a ro *int).

Having already worked on large codebases with plenty of const * const * const int and such, because the programmers didn't quite understand the rules or weren't sure readers would, I believe it'd simplify things to be strict about the direction to which ro pertains, versus C's "to the left, unless there's 'nothing there' [which means what, to the naive reader?], in which case to the right".

wora commented 3 years ago

I think a read only value is more along the lines of C++ constexpr. const is more or less a const alias to a value, but the value can be changed via other aliases. A constexpr is kinda immutable value ensured by the compiler and the runtime.

Given the simplicity focus of Go, I am not sure if this feature fits well with Go.

mihaigalos commented 2 years ago

Interesting idea. Here are some thoughts:

`ro` vs `mut`

I would like to see a harmonizing of read-only types in Golang with Rust and Vlang. The last 2 use mut instead to denote modifiable types. All other types are read-only by default and can only be initialized once.

The reasoning is that if you forget to specify any qualifier, the type is by default non-mutable (const, read-only) and adheres to the principle of "Make it easy to use correctly and hard to be used incorrectly".

Mutable receivers

This proposal is, imho, very nice because having a mutable or read-only datatype would simplify receivers to a great extent.

The pass-by-value or pass-by-reference receiver would not be needed anymore and can be deprecated in favor of mutable/non-mutable receivers. This is a much simpler concept to explain to new developers than say, memory locations and pointers.

Only pass-by-reference would be used and the compiler would error in the case of mutating a non-mutable receiver. The benefit here is obviously the elimination of an unnecessary copy, since pass-by-value receivers would be obsolete. The syntax can then be simplified to remove the pointer asterisk for receivers (or keep it for legacy purposes).

tv42 commented 2 years ago

@mihaigalos The language you are proposing seems to have very little intersection with Go1 as it already exists. Practically no existing Go program would work.

mihaigalos commented 2 years ago

Hi @tv42. Is this not the correct thread to discuss breaking changes? I thought that was what Go2 was all about - ignore the part with legacy purposes in my original post.

josharian commented 2 years ago

@mihaigalos useful background on breaking changes and Go2: https://go.googlesource.com/proposal/+/master/design/28221-go2-transitions.md

amery commented 1 year ago

having a ro or readonly behaving like const in C wouldn't be a breaking change and makes code more secure

jba commented 1 year ago

@amery, it would be a breaking change in some cases:

var f func(*T) = F

This breaks if

func F(*T) {...}

is changed to

func F(ro *T) {...}
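The same kind of break can be reproduced in today's Go with channel direction, the one type qualifier the language already has. This is an analogy only, not the proposed ro semantics: narrowing F's parameter from chan int to <-chan int changes F's function type, so an existing assignment to a func(chan int) variable stops compiling and needs a wrapper.

```go
package main

import "fmt"

// F only receives from c, so its parameter was narrowed to a receive-only
// channel. Direct calls with a bidirectional channel still compile.
func F(c <-chan int) int {
	return <-c
}

func main() {
	// var f func(chan int) int = F
	// The line above would not compile: func(<-chan int) int is a different
	// type from func(chan int) int, even though chan int converts to <-chan int.

	// A wrapper with the old signature is needed instead.
	var f func(chan int) int = func(c chan int) int { return F(c) }

	c := make(chan int, 1)
	c <- 7
	fmt.Println(f(c))
}
```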

amery commented 1 year ago

@amery, it would be a breaking change in some cases:
var f func(*T) = F
This breaks if
func F(*T) {...}
is changed to
func F(ro *T) {...}

even without new features you can't assume you can change F's signature and expect everyone using it to remain happy. to me, what makes something a breaking change is that old code stops working with the new release of the compiler. this is not the case here
