golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.46k stars 17.59k forks source link

proposal: spec: cannot assign to a field of a map element directly: m["foo"].f = x #3117

Open griesemer opened 12 years ago

griesemer commented 12 years ago
The non-addressability requirement of map index expressions makes sense by itself since
we don't want a surviving pointer to a map element.

But surely it should be ok to have an invisible temporary pointer p to the respective
field which is just used during the assignment (*p = x). It seems that this doesn't
introduce any more issues than what we have already when assigning to a larger map
element that cannot be assigned to atomically.

E.g., the following program should be legal:

package main

import "fmt"

var m = map[string]struct{x, y int} {
    "foo": {2, 3},
}

func main() {
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m)
}
gopherbot commented 11 years ago

Comment 1 by Robert.nr1:

Hi, what would be the best way to work around this until it gets fixed?
Is this the way to go? (pun intended.. ;)
var tmp = m["foo"]
tmp.x = 4 
m["foo"] = tmp
griesemer commented 11 years ago

Comment 2:

Yes, this is the "work-around". If m doesn't contain an entry with key "foo", tmp will
be the zero value for the map value type and this code still works (by setting the
respective entry for the first time).
Note that if direct assignment as in
m["foo"].x = 4
were permitted, one of the issues (in the language t.b.d.) would be what to do if
m["foo"] doesn't exist (non-trivial).
gopherbot commented 11 years ago

Comment 3 by Robert.nr1:

Okay, thanks for the quick reply!
Yes, I guess that requires some thinking, it's not obvious what should happen. 
Maybe it could be defined like this, if 'm["foo"]' does not exist when assigning
'm["foo"].x':
* Create the new "map member" 'm["foo"]'
* Then try to assign "x" to it
* It will fail, but that does not really mater that much since we would still get a
reasonable error message out of it?
Just a thought, thanks again for the quick reply.
ianlancetaylor commented 11 years ago

Comment 4:

The two cases really are different.  Given a map holding a struct
    m[0] = s
is a write.
    m[0].f = 1
is a read-modify-write.  That is, we can implement
    m[0] = s
by passing 0 and s to the map insert routine.  That doesn't work for
    m[0].f
Instead we have to fetch a pointer, as you suggest, and assign through the pointer.
Right now maps are unsafe when GOMAXPROCS > 1 because multiple threads writing to the
map simultaneously can corrupt the data structure.  This is an unfortunate violation of
Go's general memory safeness.  This suggests that we will want to implement a
possibly-optional safe mode, in which the map is locked during access.  That will work
straightforwardly with the current semantics.  But it won't work with your proposed
addition.
That said, there is another possible implementation.  We could pass in the key, an
offset into the value type, the size of the value we are passing, and the value we are
passing.
A more serious issue: if we permit field access, can we justify prohibiting method
access?
    m[0].M()
But method access really does require addressability.
And let's not ignore
    m[0]++
    m[0][:]
extemporalgenome commented 11 years ago

Comment 5:

I don't think this buys _too_ much, since even with this there'd be no way to "merge"
values which happen to be structs/maps. For example:
type S struct { f, g, h int }
m := map[int]S{0: S{1,2,3}}
// I want to update f and g, but not h
m[0] #= S{f:5, g:4} // some sort of squirrely 'merge' syntax
I'm certainly not proposing such a thing (besides which, it doesn't fit with any other
part of the language) -- the fact does remain that:
m[0].f = 5
m[0].g = 4
is shorter, but likely much less efficient even if Ian's offset-set functionality were
added, compared to:
s := m[0]
s.f = 5
s.g = 4
m[0] = s
The performance gap would certainly increase with the number of fields you want to
update.
All that said, if this feature is particularly desirable, then I think the simpler way
to handle the semantics is to keep the addressability requirement, and consider some
cases to be syntactic sugar:
m[0][:] already works for []int and *[...]int. Because values can be moved around in the
map implementation, m[0][:] should never work for [...]int.
`m[0].f = 1` already works if the value is a pointer type. Why is that a big deal? If
you want less memory fragmentation and better cache lining, use "2nd order allocators".
The hashmap implementation already uses pointer indirection anyway.
Alternatively, `m[0].f = 1` for 'non addressable' values could be _sugar_ for `tmp :=
m[0]; tmp.f = 1; m[0] = tmp`. Ian's optimization would be transparent, and
implementation dependent. Same for m[0]++
m[0].M() already works for value receivers and pointer values. For the same reason array
values shouldn't be sliceable, pointer receivers on non-pointer values should not be
allowed.
&m[0].f should be prohibited in all cases where it's currently prohibited.
rsc commented 11 years ago

Comment 6:

[The time for maybe has passed.]
rsc commented 10 years ago

Comment 7:

Labels changed: added go1.3maybe.

rsc commented 10 years ago

Comment 8:

Labels changed: added release-none, removed go1.3maybe.

rsc commented 10 years ago

Comment 9:

Labels changed: added repo-main.

dkinzer commented 9 years ago

I just had to use this work around in order to be able to assign some values to two struct fields simultaneously from a function that returns multiple values. Seems kind of an ugly work around. But I'm glad it works.

yaoshengzhe commented 9 years ago

I feel upset that Go still keep pointers, perhaps to better integrate with C ? However, issue like this could be solved elegantly if Go could take Java's approach, reference types (not exact the reference in C++, more like Java's) for all non-primitive types (including struct). Nothing offense, but as a Go developer, I was bitten several times by interface, struct and pointer... Well, you can blame me as I am an inexperience Go developer...

minux commented 9 years ago

This issue is not about pointers in Go. Please discuss that on the golang-nuts mailing list (but please note that we are not going to remove pointers from Go)

rsc commented 8 years ago

(Locked due to recent spam.)

ianlancetaylor commented 6 years ago

We should also consider array indexes for maps whose value type is an array type. m["key"][1] = 2.

vkuzmin-uber commented 6 years ago

I am willing to work on it if there are no objections. First I want to collect requirements, use-case, concerns. Then see what are the limitations of the current implementation (I have heard that in past Cgo implementation allowed it because implementation just didn't restrict it).

I will update this thread with details I get soon.

bradfitz commented 6 years ago

@vkuzmin-uber, this isn't a compiler or runtime implementation bug. This is a spec bug, changing the language specification to permit such code.

vkuzmin-uber commented 6 years ago

Yes, I completely understand it. Just mentioned that I have seen somewhere that CGo allowed it unintentionally. I meant that I need to check if current map API (runtime.map*) can support this feature w/o significant changes.

vkuzmin-uber commented 6 years ago

I want to refresh and even explain motivation of this proposal.

As first comment says Golang doesn’t allow the following code:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    m["foo"].x = 4 // cannot assign to m["foo"].x
    fmt.Println(m["foo"].x)
}

“cannot assign to struct field m["foo"].x in map”

As result, compiler forces you to use copy of whole object:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    t := m["foo"]
    t.x = 4
    m["foo"] = t
    fmt.Println(m["foo"].x)
}

This solution has overhead: 2 lookups, 2 copies instead of 1 lookup and in-place change. Note that you can’t take reference to m[k] because map values are not addressable https://github.com/golang/go/issues/11865

Lets consider another workaround - use *T that is allowed by Go 1.11:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{1, 5} // Initialization is required, otherwise it contains “nil” pointer
    m["foo"].x = 4
    fmt.Println(m["foo"].x)
}

You see, that you need to initialize m[k] always before you can read it or change. Inconvenience is more visible in the following example:

func main() {
    type T struct{x, y int}
    m := make(map[string]*T)
    m["foo"] = &T{}
    fmt.Println(m["foo"].x)
}

Here we initialize m[“foo”] with reference of default T{} value just to be able read it then.

Look at the code with map of structures:

func main() {
    type T struct{x, y int}
    m := make(map[string]T)
    fmt.Println(m["foo"].x)
}

It works and returns valid result 0.

We see here feature of map of structures - "auto-initialization" of elements of map (in fact, mapaccess returns "default zero object" but this is idea of what we get). But we still want to be able to update fields of elements if necessary in efficient way, w/o copy of whole object.

As result, I observe 2 Motivations to allow it:

  1. Elegant syntax. When we don’t need to do explicit lookup, copy, update and assign then.
  2. Efficiency. Even if we implement 1st as implicit actions, it may generate expensive code that is not obvious for author of code. Especially for cases like: m[“foo”].x = 4 m[“foo”].x += 1 m[“foo”].y = 3 and etc.

This is why I look for implementation of this proposal where 1 and 2 can be achieved together.

*Note, comment above mentioned that "map of structures" assumes more efficient memory access but this is too general assumption to be "motivation" at this moment.

gopherbot commented 6 years ago

Change https://golang.org/cl/141647 mentions this issue: cmd/compile: assign to a field of map element

vkuzmin-uber commented 6 years ago

Current map's internal API makes this change trivial. I have implemented a Proof of Concept and added simple test. Test passed.

randall77 commented 6 years ago

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated:

m := map[K]V{}
v := mapassign(m, k)  // type *V
v.f = 7

If we still did it the old way, passing the updated value in, it would be harder. We'd have to pass the offset and type of the portion of the value we'd like to update.

m := map[K]V{}
mapassign(m, k, offsetof(V.f), typeof(V.f), &7)

Further, if we allow op= changes to fields, the op would also need to be passed in.

Not the end of the world, but it's one more thing that constrains the map implementation.

mdempsky commented 6 years ago

I'm fine with this proposal, but don't feel strongly about it.

Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too. Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

vkuzmin-uber commented 5 years ago

Ian> We should also consider array indexes for maps whose value type is an array type. `m["key"][1] =

Matt> Also, assuming it's to be accepted, I second @ianlancetaylor's suggestion that array indexing should be allowed too.

I tried PoC it and think this deserves a separate Issue/Proposal where we can better consider pros/cons.

The reason - I see that map of fixed array doesn't behave like a map of structs. It seems:

m := map[int][10]int{} m[k][5] = 1

does some lazy allocation of fixed array and simple PoC change hits panic on assigning to "nil". Sure I can dig into it to understand, just thought it's better to separate and make a progress with map of structures first. Let me know if you are ok with separate proposal, and I will create a new one with link to this one.

Matt>Moreover, I'd suggest that an arbitrary number of struct field and array indexing operations (including mixing them) are allowed. (The latter doesn't appear to have been explicitly suggested yet in the thread.)

I updated PoC with tests that covers valid and invalid cases. I think this is what you meant (except map of fixed arrays cases):

Valid: m[k].x.y = v m[k].a[i] = v m[k].t.a[i].x.y = v

where a is a fixed array, i index of array

Invalid cases: m[k][i] = v m[k].t.another_map[key][i] = v

where i is an index of fixed array.

vkuzmin-uber commented 5 years ago

I'm in favor of this proposal.

The thing to worry about is that although this change is trivial to do today (thanks, Vladimir!) it might not be trivial with maps implemented some other way. For instance, in the current implementation map assignment returns the value by reference so it can be updated: Not the end of the world, but it's one more thing that constrains the map implementation.

Hi Keith,

I understand this concern and kept it in mind as one of the cons. But frankly, this limitation of language is so unusual so I think it was caused by initial implementation of map. So, first we need to understand if there is a value of the feature and I think yes:

ianlancetaylor commented 5 years ago

@vkuzmin-uber The language limitation is not caused by the initial implementation of map. It's caused by the fact that a map element is not addressable. A map element can not be addressable because it leads to confusing situations, like what should happen if you take the address of an element and then delete the key from the map, and then add it back again.

The language specifies that you can't assign to a value that is not addressable. That makes sense for a language like Go: if you can't take the address of a value, how can you change it? Of course that would make maps useless, so there is an exception: you can assign to a value if it is addressable, or if it is a map index (https://golang.org/ref/spec#Assignments).

This issue is about changing the language to add an additional exception to the assignment rules.

Thanks for being willing to work on it, but I want to stress that this is a language change proposal and that proposal has not yet been accepted. That status is not going to change just because we have an implementation. This language change is going to be accepted or rejected as part of an overall Go 2 process.

In particular, from a language perspective, I do not think that we can separate field assignment to map elements from indexing of map elements. You seem to suggest that they should be handled separately, but I don't think that is OK. When I say that I'm not referring to the implementation details, but rather to how the language is changed. We are going to make a language change decision first, then an implementation, not the other way around.

Thanks for your interest in this.

mdempsky commented 5 years ago

It occurs to me that allowing array indexing involves an ambiguity over whether assignments happen before or after array bounds checking. For example:

m := make(map[string][10]int)
i := 100

func() {
    defer func() { recover() }()
    m["k"][i] = 42
}()

println(len(m))

Does this print 0 (we bounds check i < 10 before adding an element to the map) or 1 (bounds check after)? Or is it up to the implementation to decide?

randall77 commented 5 years ago

@mdempsky We definitely want the bounds checks to happen before the slot in the map is allocated. It's very similar to

   m[k] = f()

If f panics. We never want to expose an allocated-but-not-yet-assigned map value.

mdempsky commented 5 years ago

@randall77 Yeah, I'm inclined to agree the bounds check should happen before the map element is allocated. It's not obvious to me though that the alternative (inserting a zero element into the map) is unreasonable. (I think exposing uninitialized memory is obviously unreasonable; not sure if that's what you had in mind by "not-yet-assigned".)

I think a better example for comparison is the first example from #23735, which emphasizes the question of whether (*p)[i] checks for nil or array index out of bounds first (and notably cmd/compile and gccgo differed). Except there we dodged addressing the question by pointing out the spec doesn't guarantee panic error text anyway.

vkuzmin-uber commented 5 years ago

@mdempsky , @randall77

Just to clarify, do we expect the same behavior of a similar case, when array is wrapped into struct:

m := make(map[string]struct{a [10]int})
i := 100

func() {
    defer func() { recover() }()
    m["k"].a[i] = 42
}()

println(len(m))
randall77 commented 5 years ago

@vkuzmin-uber : yes. Your example should print 0.

rsc commented 5 years ago

There's a lot of subtle corner cases here. To move forward with this, we need a real design doc.

pyang-1981 commented 5 years ago

Anyway, I think we should be fine to use m[key].field++ and like. It is programmer's responsibility to protect against data structure corruption, or refer to invalid addresses by using locks. Right now Go is forbiding a valid use that can be solved by the synchronization structure it already offers.

ianlancetaylor commented 4 years ago

Putting this on hold until and unless someone decides to write a complete design doc for the change.

seebs commented 4 years ago

Wow, this is surprisingly hard to phrase cleanly.

I think the goal is: Given any valid expression that includes a map index and would otherwise be addressable, we want to be able to use it in assignments, increments, and so on, as if it were addressable. One of the challenges is that a simplistic description of this might imply that this should apply to expressions which wouldn't be addressable for other reasons not involving the map index. (Like, if you have a map of function-returning-int, we don't want m[k]()++ to be allowed, even though m[k]() is a reasonable expression.)

Thought experiment: What about pointer receiver methods?

type Counter {
    X int
}
var seebsno *Counter
func (c *Counter) Inc() { c.X++ }
func (c *Counter) Stash() { seebsno = c } 
var m map[int]Counter
m[0].X++ // allowed by this proposal
m[0].Inc() // sure
m[0].Stash() // oops

If you don't allow pointer-receiver methods, it's Annoying. If you do, you have opened the floodgates and made the value act as though it's addressable.

Possible resolution: If you do a thing to a map-index expression that would require it to be addressable, the thing happens on an addressable copy, and then the contents of that copy are assigned back into the map. If you stashed a copy of the address and then later tried to use it, this is considered Own Fault.

But... That feels like a sneaky gotcha.

So I think the right answer may be "still don't allow pointer-receiver methods". In which case... it's tricky to express the thing that is being allowed. Assignment, increment, etc, say that the thing must be addressable or a map index expression. Go doesn't really have a codified distinction between "lvalue" and "addressable", which is what I think is at issue.

Hmm. Given m[k].sel, where m[k] is a map index expression, and .sel is any combination of selectors or index expressions, the result can be used in assignment or increment operations, and works as though m[k] were extracted to a temporary variable and then stored back.

m[k].sel += 3
// equivalent to
tmp := m[k]
tmp.sel += 3
m[k] = tmp

Except that the map index expression doesn't get evaluated twice.

mdempsky commented 4 years ago

Thought experiment: What about pointer receiver methods?

No, we don't want to allow users to create pointers to the variables within a map. If we allowed that, then we might as well just make map indexing addressable. In your example, neither m[0].Inc() nor m[0].Stash() should be allowed.

The Go spec already states "Each left-hand side operand must be addressable, a map index expression, or (for = assignments only) the blank identifier." We probably just need to extend "a map index expression" here similarly to how we already define addressability: "The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array."

We would need to define somewhere what happens when assigning to just a sub-variable within a map somewhere. Probably the easiest way would be to just define that when assigning to a key not already in the map, a corresponding value is first zero-initialized just before the assignment.

There's also nothing special about ++. m[0].X++ is defines as meaning the same thing as m[0].X += 1, which in turn means the same as m[0].X = m[0].X + 1. If we allow assigning to m[0].X, then m[0].X += 1 and m[0].X++ are implicitly allowed too (since m[0].X as a value expression is already valid).

seebs commented 4 years ago

I think the behavior with uninitialized values automatically falls out of the way a map yields a zero value when you access a nonexistent key.

I agree that pointer receivers are a bad idea, but they don't necessarily strictly imply making the content of the map addressable; they might imply making addressable copies of the content.

I'm not actually sure what I expect to "really" happen under the hood from something like m[0].X++; I would not be surprised if it resulted in exactly what currently happens when we write the code explicitly, which is that the entire contents of m[0] are copied, X is modified in the copy, and then the entire copy is written back. On the other hand, if it could arrange to only have to copy (and write back) that individual value, that would be a potentially significant performance win in some cases.

theanine commented 1 year ago

I think it would also be useful to improve the error messages around these situations so it's clear what the problem is.

type foo struct {
    bar []time.Time
}

func baz(f map[int]foo, id int) {
    f[id].bar = append(f[id].bar, time.Now())
}

func main() {
    foobar := make(map[int]foo)
    baz(foobar, 1)
}

./main.go:14:2: cannot assign to struct field f[id].bar in map

^ useful error message, understandable how to fix the issue.

type foo struct {
    bar [1]time.Time
}

func baz(f map[int]foo, id int) {
    f[id].bar[0] = time.Now()
}

func main() {
    foobar := make(map[int]foo)
    baz(foobar, 1)
}

./main.go:14:2: cannot assign to f[id].bar[0] (value of type time.Time)

^ confusing error message, as the issue is related to the assignment to the map value and not a type issue.

gopherbot commented 1 year ago

Change https://go.dev/cl/492875 mentions this issue: go/types, types2: better error message for bad assignment

jieqiyue commented 2 months ago

666啊