romshark / Go-1-2-Proposal---Immutability

A Go 1/2 language feature proposal for immutability
https://github.com/golang/go/issues/27975

Immutability and engineering #21

Open beoran opened 6 years ago

beoran commented 6 years ago

This proposal covers a lot of ground and would add a lot of complexity to Go. That is why it is unlikely to be accepted unless you can make the engineering case that, in large code bases, the benefits outweigh the complexity.

Also, it covers several types or use cases of "immutability" which are actually different. It might be a good idea to split this proposal up between those different use cases.

The first use case I see is preventing accidental mutation of function argument values for types passed by pointer, and indirectly for arrays, slices, maps and structs that contain pointers, through function argument variables. As I said before, I think the engineering case for this is relatively weak, since in my experience accidental mutation is rare.

The second use case I see, though, is solving the string and []byte duplication we now have in Go, which exists because the former has immutable values while the latter doesn't. A proposal that could simplify the language, on the other hand, would be awesome: we could write something like type string = immutable_value []byte and do away with the duplication between the strings and bytes packages. For that, though, this proposal would have to solve the problem of result type covariance somehow. You would need to be able to say something like func Split(s, sep immutable_value_or_mutable_value []byte) mutable_only_if_argument_where_mutable(s, sep)[]byte... I think that is a very hard nut to crack though.
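To make the duplication concrete, here is a minimal sketch in today's Go. The helper names fieldsOfString and fieldsOfBytes are made up for illustration; strings.Split and bytes.Split are the real standard library functions that mirror each other.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// fieldsOfString and fieldsOfBytes are hypothetical helpers. They do the
// same job, but each needs its own standard library package because
// string values are immutable and []byte values are not.
func fieldsOfString(s string) int {
	return len(strings.Split(s, ","))
}

func fieldsOfBytes(b []byte) int {
	return len(bytes.Split(b, []byte(",")))
}

func main() {
	s := "a,b,c"
	b := []byte(s) // the conversion copies the data
	b[0] = 'X'     // allowed for []byte, impossible for string

	fmt.Println(fieldsOfString(s), fieldsOfBytes(b)) // 3 3
	fmt.Println(s, string(b))                        // a,b,c X,b,c
}
```

Moving between the two worlds means a conversion, which in general copies the bytes.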

Edit:

A third use case would be immutable variables, that is, variables that can be assigned to only once. I don't think that is a very useful use case inside functions, but it might be useful to avoid re-assignments to public package-scope variables.
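A small sketch of the package-scope case, assuming a hypothetical exported sentinel error ErrNotFound and hypothetical functions lookup and breakSentinel. In a real program the reassignment would happen from another package.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical exported sentinel. Because it is a
// variable, any code that can see it can also assign to it.
var ErrNotFound = errors.New("not found")

// lookup is a hypothetical function that always reports the sentinel.
func lookup(key string) error {
	return ErrNotFound
}

// breakSentinel stands in for code elsewhere that reassigns the variable.
func breakSentinel() {
	ErrNotFound = errors.New("not found")
}

func main() {
	saved := ErrNotFound // a caller remembers the original value
	breakSentinel()

	err := lookup("k")
	fmt.Println(err == saved) // false: the comparison silently stopped matching
}
```

The program compiles without complaint, which is the kind of re-assignment an assign-once variable would rule out.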

romshark commented 6 years ago

I'd like to begin my answer with a quote from Russ Cox, one of the core Google Go team members, from his talk at GopherCon SG in May 2018:

https://youtu.be/F8nrpe0XWRg?t=3m12s

Software engineering is what happens to programming when you add time and other programmers. Programming means getting a program working. You have a problem to solve, you write some Go code, you run it, you get your answer, and you're done. That's programming, and that's difficult enough by itself.

But what if that code needs to keep working day after day? What if 5 other programmers need to work on the code too? What if the code must adapt gracefully as requirements change? Well, then you start thinking about:

  • version control systems, to track how the code changes over time and to coordinate with the other programmers.
  • you add unit tests to make sure that the bugs you fix are not reintroduced over time, neither by you 6 months from now, nor by that new team member who's unfamiliar with the code.
  • you think about modularity and design patterns to divide the program into parts that team members can work on mostly independently.
  • you use tools to help you find bugs earlier.
  • you look for ways to make programs as clear as possible, so that bugs are less likely.
  • you make sure that small changes can be tested quickly even in large programs.

You're doing all of this, because your programming has turned into software engineering.

Nearly all of Go's distinctive design decisions were motivated by concerns about software engineering.

The concept of immutable types fits the goals of the Google Go team just perfectly:

We just need to refine the proposal to take into account past mistakes made by other languages which are described in more detail below.

Simple Languages vs Software Engineering

Humans are prone to errors! Engineering always tries to remove the human factor from an equation. Making humans responsible for remembering what's safe to be mutated and what's not is inherently error-prone and not a proper engineering approach. Human-written documentation can't be relied on either.

Bugs caused by mutable shared state may be rare, but they're among the nastiest ones because you might not even know you have them! They can remain undiscovered for years, corrupting your data. The race detector and static code analyzers won't help you. In fact, no algorithmic intelligence will be able to deduce your intentions, and humans could misinterpret them, because neither machines nor humans can make any assumptions based on ambiguous code:

A good programming language should provide instruments to fight mutable shared state, especially one that focuses on concurrency. Go was designed for both experienced software engineers and less experienced programmers, thus:
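One possible illustration of such ambiguity in current Go, using a hypothetical function normalize. Its signature gives no hint whether the argument is read-only input or is modified in place.

```go
package main

import "fmt"

// normalize is a hypothetical helper. Nothing in its signature says
// whether scores is treated as read-only or modified in place.
func normalize(scores []float64) []float64 {
	peak := 0.0
	for _, s := range scores {
		if s > peak {
			peak = s
		}
	}
	if peak == 0 {
		return scores
	}
	for i := range scores {
		scores[i] /= peak // writes into the caller's backing array
	}
	return scores
}

func main() {
	raw := []float64{2, 4, 8}
	scaled := normalize(raw)

	fmt.Println(scaled) // [0.25 0.5 1]
	fmt.Println(raw)    // [0.25 0.5 1]: the caller's data changed too
}
```

A caller who assumed normalize returns a fresh slice keeps using raw as if it were untouched, and neither the compiler nor the race detector has anything to report.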

Consistency is Key

I wrote this proposal because I was frustrated by how other proposals tried to introduce exceptional behavior only for certain situations like arguments or only certain types like slices and maps (completely leaving out pointer types).

If you think about arguments, variables, return values, fields, package-scope variables and receivers differently, then you'll end up with a very inconsistent and complex concept with more exceptions than rules. Inconsistency leads to much greater complexity and frustration, that's why immutability should be a property of types in general, no matter where those types are used, so whenever you see an immutable composite/scalar/interface type you know those rules universally apply:

Those rules consistently apply to any immutable arguments, variables, return values, fields, package-scope variables and function receivers. This gives code authors a high level of accuracy when describing their intent, both to other developers and to themselves, which increases safety, readability and performance.

Performance Sacrifice

The most common way of safely passing a slice as a function argument that I've seen involves copying, but copying is horribly inefficient. Consider the following benchmark: https://play.golang.org/p/X4ARzdqqDal

goos: darwin
goarch: amd64
pkg: test
BenchmarkRef-8      2000000000           0.90 ns/op
BenchmarkCopy-8       300000          4152 ns/op
PASS
ok      test    3.187s

Copying is roughly 4,600 times slower than passing the slice by reference (it's passed by value, but this value is essentially just a pointer). Even with a small slice of 10 items it's still about 100 times slower.
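This is not the benchmark code from the playground link, just a minimal sketch of the two options being compared. The function names sumShared and defensiveCopy are made up for illustration.

```go
package main

import "fmt"

// sumShared reads the slice through the caller's backing array.
// Only the small slice header (pointer, length, capacity) is copied.
func sumShared(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// defensiveCopy returns a new slice with its own backing array, so a
// callee can no longer touch the caller's data. The cost grows with len(xs).
func defensiveCopy(xs []int) []int {
	out := make([]int, len(xs))
	copy(out, xs)
	return out
}

func main() {
	data := []int{1, 2, 3}
	clone := defensiveCopy(data)
	clone[0] = 100 // only the copy changes

	fmt.Println(data, clone, sumShared(data)) // [1 2 3] [100 2 3] 6
}
```

The copy buys safety at the price of an allocation plus a pass over every element, which is where the gap in the numbers above comes from.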

Sometimes you can fix this by wrapping the slice in a wrapper-struct that protects access to it: https://play.golang.org/p/ZRbGaGbP9us

goos: darwin
goarch: amd64
pkg: test
BenchmarkRef-8           2000000           851 ns/op
BenchmarkCopy-8           300000          5165 ns/op
BenchmarkWrapper-8       1000000          1097 ns/op
PASS
ok      test    5.270s

Not only is this still slower, it's not even always possible. Sometimes you have no control over the reader function and you can't just make it accept a custom wrapper type. In those cases you have no option but to either rely on the function not to accidentally mutate your slice (which is dangerous) or create a copy (which is verbose and slow). You also lose the ability to iterate over the slice using the range operator.
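A minimal sketch of what such a wrapper could look like. The type ReadOnlyInts and its constructor and methods are hypothetical names, not the code from the playground link.

```go
package main

import "fmt"

// ReadOnlyInts is a hypothetical wrapper. The field is unexported, so
// code in other packages can read elements but cannot assign to them.
// (Inside the defining package the field is still fully accessible.)
type ReadOnlyInts struct {
	items []int
}

// Wrap does not copy; whoever still holds the original slice can mutate it.
func Wrap(items []int) ReadOnlyInts {
	return ReadOnlyInts{items: items}
}

// Len returns the number of elements.
func (r ReadOnlyInts) Len() int { return len(r.items) }

// At returns the element at index i by value.
func (r ReadOnlyInts) At(i int) int { return r.items[i] }

// sum is a reader that has to be written against the wrapper type.
func sum(r ReadOnlyInts) int {
	total := 0
	for i := 0; i < r.Len(); i++ { // an index loop, since range does not work here
		total += r.At(i)
	}
	return total
}

func main() {
	fmt.Println(sum(Wrap([]int{1, 2, 3}))) // 6
}
```

Every element access goes through a method call, and every function that should read the data has to accept the wrapper type instead of a plain []int.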

Making Immutability Great Again

I should really have called this proposal "Make Immutability Great Again", not because of the current political agenda though, but because of the many mistakes other languages have made before.

JavaScript

JavaScript, for example, has a const qualifier, which is totally useless in most cases. It doesn't prevent objects from being aliased and then mutated later on from a different part of the program:

const a = {name: "John"}
a.name = "Joe"
let b = a
b.name = "Jessy"
console.log(a.name) // Jessy

But the most common bugs are caused by aliasing and shared state, not by somebody reassigning a variable. To solve this issue, one would use Object.freeze(), which freezes an object and makes any further modification of it impossible. This approach has its downsides too: it freezes the object only shallowly, meaning that any objects referenced from inside the frozen object are still mutable! This can in turn be solved by deep freezing, which has a significant effect on runtime performance and should thus be used with the utmost caution!

C/C++

C made const overcomplicated and unreadable. Both C and C++:

Conclusion

The concept of immutable types offers a consistent set of rules with almost no exceptions and perfectly fits the design philosophy of the Go programming language and the motivations of its core developer team at Google. It just needs to take into account the experience from failed implementation attempts in other languages. If we manage to solve the problems mentioned above, then immutable types could become an integral part of Go and make software engineering in Go both safer and more efficient.

P.S. Check out #20, it proposes a potential solution to the qualifier verbosity problem.

200sc commented 6 years ago

I'll respond to a few points you make, noting first that these things may be true for languages where immutability was built in from the start.

It's a way to make programs as clear as possible through clearly defined intentions, so that bugs are less likely

You should resolve the verbosity problem for this proposal to add any clarity. Right now spreading const everywhere hurts readability in a way that similarly hurts clarity, and all of the clarity gains can be achieved with documentation today.

It also makes sure that small changes can be tested quickly, because you don't need to write extensive automated tests, just to verify that you have no shared mutable state running amok. Your compilation will just fail and tell you what you're doing wrong (your code linter could tell this even earlier).

I have never seen a test in Go that matches what you are describing. Can you point to examples?

It makes sure that something that was initially meant to be immutable - remains immutable over time (over thousands and thousands of commits and hundreds of pull requests) and doesn't get mutably aliased or directly mutated... neither by you 6 months from now nor by that new team member who's unfamiliar with the code nor by the open source contributor, who's totally unfamiliar with the code.

Not if you can make immutable type aliases. So long as you can make type ImmutableMap const map[string]string, someone down the line can go in and remove that const without breaking any code that expects that type.

The most common way of safely passing a slice as a function argument that I've seen involves copying, but copying is horribly inefficient. Consider the following benchmark: https://play.golang.org/p/X4ARzdqqDal

You're choosing to manually copy elements instead of using the builtin copy, which hurts your performance.

suffer from const-poisoning

So does your proposal.

make immutable type definitions very verbose and hard to both read and write.

Subjective, but I don't think your definitions read any easier. You've essentially changed a left-bias to a right-bias.

beoran commented 6 years ago

Thanks for your enthusiastic reply! I think our main disagreement is centered on this point:

Bugs caused by mutable shared state may be rare, but they're among the nastiest ones because you might not even know you have them!

I think that because this kind of problem is rare, the costs outweigh the benefits.

If all types and variables became immutable by default in Go, all Go programs would have to be fully rewritten and littered with 'mut' everywhere. Furthermore, your proposal still has some of the same semantic problems const in C has, especially the problem of function result covariance. I don't mind a single instance of verbose syntax, but I do mind it when the same syntax has to be repeated over and over again just to try to avoid a problem that rarely occurs in reality.

What would convince me is if you or anyone else could point me to at least 3 large open source Go programs that experienced long-standing bugs due to accidental mutation.

romshark commented 6 years ago

@200sc

You should resolve the verbosity problem for this proposal to add any clarity. Right now spreading const everywhere hurts readability in a way that similarly hurts clarity, and all of the clarity gains can be achieved with documentation today.

I totally agree about the verbosity of const, it really sucks, but so far it's the only option I can imagine for a backward-compatible Go 1.x implementation. For Go 2, however, I already have an alternative idea I call "Mutability Qualification Propagation", which is explained in full detail in #20. Essentially, MQP allows us to avoid verbosity in the most common cases without losing the ability to define mixed-mutability types (which is what happens with the transitive immutability proposed by Jonathan Amsterdam). It turns mut [] mut [] mut * mut T into mut [][]*T, and the mixed version mut [] mut [] * T into mut [][] immut *T. Simply put: qualification is propagated to the right until it's canceled out.

Please consider checking out the full specification at #20

and all of the clarity gains can be achieved with documentation today.

I disagree on that one, and I've described why in the proposal. Documentation represents claims, not guarantees; relying on claims is the opposite of safety.

I have never seen a test in Go that matches what you are describing. Can you point to examples?

I don't think anyone writes tests like these. Mutating aliasing is a hell of a problem and testing it reliably is quite difficult.

Not if you can make immutable type aliases. So long as you can make type ImmutableMap const map[string]string, someone down the line can go in and remove that const without breaking any code that expects that type.

It's still easier to find fraudulent code changes by just checking what types and APIs have been changed. Currently Go code is ambiguous and finding a fraudulent change is noticeably harder.

BTW, to make your map definition safe you'd actually have to write:

type ImmutableMap = const map [const string]const string

Yes, I can hear you cursing, and I totally agree that this is just ridiculously verbose! But again, it's the only backward-compatible way, unfortunately. #18 describes why it's necessary to make the strings const, by the way.

You're choosing to manually copy elements instead of using the builtin copy, which hurts your performance.

I am aware of the builtin copy function, but it's not always possible to use it, for example when you've got a slice of Object struct instances that need to be copied with their Clone() Object method to avoid shallow copies (optional fields are usually pointers, so the Clone method is necessary to avoid aliasing those fields).

Even though the builtin copy is slightly faster than a manual loop, it's still almost 4 times slower than passing a plain reference, which should come as no surprise. It doesn't sound like much, but it adds up in certain situations. https://play.golang.org/p/pAviW2GdnX7
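A sketch of the difference, assuming a hypothetical Object type with one optional pointer field. The helper names shallowCopy and deepCopy are made up for illustration.

```go
package main

import "fmt"

// Object is a hypothetical type with an optional field stored as a pointer.
type Object struct {
	Name  string
	Limit *int // optional: nil means "not set"
}

// Clone returns a deep copy so the pointer field is not shared.
func (o Object) Clone() Object {
	c := o
	if o.Limit != nil {
		v := *o.Limit
		c.Limit = &v
	}
	return c
}

// shallowCopy uses the builtin copy: struct values are duplicated,
// but pointer fields still refer to the same memory as the source.
func shallowCopy(src []Object) []Object {
	dst := make([]Object, len(src))
	copy(dst, src)
	return dst
}

// deepCopy clones every element individually.
func deepCopy(src []Object) []Object {
	dst := make([]Object, len(src))
	for i, o := range src {
		dst[i] = o.Clone()
	}
	return dst
}

func main() {
	limit := 10
	orig := []Object{{Name: "a", Limit: &limit}}

	shallow := shallowCopy(orig)
	deep := deepCopy(orig)

	*shallow[0].Limit = 99 // writes through the shared pointer

	fmt.Println(*orig[0].Limit, *deep[0].Limit) // 99 10
}
```

The builtin copy leaves the original exposed through the pointer field, while the per-element Clone loop is the variant that shows up as the slowest line in a benchmark.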

goos: darwin
goarch: amd64
pkg: test
BenchmarkRef-8                   2000000           688 ns/op
BenchmarkManualDeepCopy-8          50000         27007 ns/op
BenchmarkManualCopy-8             500000          3263 ns/op
BenchmarkBuiltinCopy-8            500000          2643 ns/op
PASS
ok      test    6.729s

When I was writing a server API, for example, I had to return the list of sessions of all currently connected clients as deep copies to avoid aliasing and mutations. This is very costly and could have been solved with immutable types.

suffer from const-poisoning

So does your proposal.

I'm working on it.

Subjective, but I don't think your definitions read any easier. You've essentially changed a left-bias to a right-bias.

C is rather Clockwise/Spiral and not left-biased.

Again, please consider checking out MQP in #20

romshark commented 6 years ago

@beoran

I think that because this kind of problem is rare, the costs outweigh the benefits.

This is a controversial statement. I personally disagree with it because experience has taught me that a month from now, all detailed knowledge about the code base I'm currently working on will be lost, and the probability of shooting myself in the foot is pretty high if the code is poorly structured and ambiguous.

If all types and variables became immutable by default in Go, all Go programs would have to be fully rewritten and littered with 'mut' everywhere.

Immutability by default would only be possible in Go 2. With Go 2.x you'll have to rewrite all your Go 1.x code anyway; a second version implies backward-incompatible changes.

MQP #20 could reduce the number of muts you'll have to read & write.

Furthermore, your proposal still has some of the same semantic problems const in C has, especially the problem of function result covariance. I don't mind a single instance of verbose syntax, but I do mind it when the same syntax has to be repeated over and over again just to try to avoid a problem that rarely occurs in reality.

True, working on it. Jonathan Amsterdam proposed the "Permission Genericity" concept and I'm thinking about integrating it into this proposal.

What would convince me is if you or anyone else could point me to at least 3 large open source Go programs that experienced long-standing bugs due to accidental mutation.

That'd be a research project of its own, which I unfortunately have no time for at the moment, but perhaps somebody will point out concrete, detailed examples when reading this proposal, who knows?

beoran commented 6 years ago

This is a controversial statement.

I think we both had quite different experiences. Which is exactly why I am asking for concrete evidence. I would love to be proven wrong, but for now I remain unconvinced.

with Go 2.x you'll have to rewrite all your Go 1.x code anyway

Not really, the likely proposals for Go 2 are generics and better error handling, which will only require minimal changes to Go 1 code that can probably be applied through a tool. Not so for mutation: that is undecidable by a parser, unless you just put mut everywhere, in which case it is useless.

Permission genericity looks interesting; however, it will probably have to take Go 2 generics and contracts into consideration.

I understand collecting evidence is hard work, but it is necessary for a good design. Would you at least be willing to provide concrete examples where you experienced problems with accidental mutation in your own projects, so we can think about what the minimal solution for those problems could be?