godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
90.21k stars 21.04k forks source link

Pool*Array accessed by reference (class property or array/dictionary value) can't be mutated with e.g. append/set/remove #15265

Closed aidanhs closed 3 years ago

aidanhs commented 6 years ago

Godot version: v3.0-beta2

OS/device including version: Linux x86_64

Issue description: When append_array is called on a PoolByteArray within a dictionary, nothing happens to the original array. If called on a variable, it works as expected.

Steps to reproduce: See minimal repro below

Minimal reproduction project:

  1. New project
  2. New node2d node
  3. Attach script
  4. extends Node2D
    func _ready():
    var x = {}
    x["a"] = PoolByteArray([])
    x["a"].append_array(PoolByteArray([1,2,3]))
    print(x["a"].size())
    
    var a = PoolByteArray([])
    a.append_array(PoolByteArray([1,2,3]))
    print(a.size())

Will print 0 and then 3.

I found https://github.com/godotengine/godot/issues/7073 and https://github.com/godotengine/godot/issues/5277, both of which are closed and seem to indicate that COW shouldn't happen in 3.0.

reduz commented 6 years ago

This is not really a bug, the problem is that PoolArrays are immutable (they are not shared when passed). It works this way to avoid having difficult to trace bugs when sending and retrieving data from servers.

To append data you generally just use the + operator, but some people complained they wanted something a bit more efficient, so the append functions were added, which breaks the immutability so they only work for local or class variables, not something inside a dictionary or array.

This should probably be better docummented.

aidanhs commented 6 years ago

Ok, after searching around for the efficiency complaints I've now discovered StreamPeerBuffer (on https://github.com/godotengine/godot/issues/9662) which looks far more like what I want.

I must admit, this is one of the most surprising behaviours I've seen in a programming language and it's tricky to get a mental model of how this works. After looking at the code, it seems that as soon as you access an item in array or dictionary, the value is immediately copied out and this copy may be by value (integer, Pool*Array) [a] or by reference (StreamPeerBuffer). The method is then called on the copied variable [b]. But...Pool*Array has some methods on it permitting mutation.

So it's the combination of three things: 1) having a copy by value type (instead of by reference) that 2) has methods on to allow you to mutate and 3) is copied out of collections by just accessing it, whether you're about to send it somewhere else or just run a method on it. If you remove any one of these you can point to loads of languages that do the same thing, but the combination seems fairly...unique and surprising.

To be honest it seems a bit odd to add a method for efficiency that requires you to do inefficient copying around to use it in certain cases - if you have a Pool*Array in a dict, the only way to append to it efficiently is with a fairly opaque dance requiring implementation knowledge [c]. At that point it feels like a pretty weak addition to the language - you can only use it "as intended" in some cases, in others your first attempt will do the wrong thing (my original post), your second attempt will do the inefficient thing (effectively +=) and only your third attempt with a good understanding of the underlying implementation will do what you want.

Anyway, those are my thoughts, make of them what you will :) It looks like at one point you were going to turn off COW by default (https://github.com/godotengine/godot/issues/5277#issuecomment-227136972) Did something change? (in fact, now that I've tracked down the rename from RawArray -> PoolBytesArray, I realise https://github.com/godotengine/godot/issues/7073 is an exact duplicate)

[a] Actually in the case of Pool*Array the copy is by reference, but it's reference counted and the copy-by-value is only actually 'realised' when modifying the data. For the purpose of resulting behaviour (though not efficiency), this is identical to copy by value

[b] There is one oddity - in Python x[1] = 5 desugars to a call to __setitem__. By contrast, in GDScript x[1] = 5 does not desugar to x.set for a PoolByteArray - the former works when a PoolByteArray is in a dictionary, the latter doesn't. I've not dug enough into the code to figure why this is.

[c] Example (untested) code:

var x = { "a": PoolByteArray([1,2,3]) } # setup
var tmp = x["a"] # reference copy
x["a"] = null # squash the original reference
tmp.append_array(PoolByteArray([1,2,3])) # in-place append to array with one ref
x["a"] = tmp # copy reference back in
tmp = null # squash temporary reference
reduz commented 6 years ago

Probably the way to fix this is making these reference counted in variant. They are not meant to be used much and generally GDScript rarely needs to access them, which is why not much work was put into this.

On Jan 5, 2018 11:21 PM, "aidanhs" notifications@github.com wrote:

Ok, after searching around for the efficiency complaints I've now discovered StreamPeerBuffer (on #9662 https://github.com/godotengine/godot/issues/9662) which looks far more like what I want.

I must admit, this is one of the most surprising behaviours I've seen in a programming language and it's tricky to get a mental model of how this works. After looking at the code, it seems that as soon as you access an item in array or dictionary, the value is immediately copied out and this copy may be by value (integer, PoolArray) [a] or by reference (StreamPeerBuffer). The method is then called on the copied variable [b]. But...PoolArray has some methods on it permitting mutation.

So it's the combination of three things: 1) having a copy by value type (instead of by reference) that 2) has methods on to allow you to mutate and 3) is copied out of collections by just accessing it, whether you're about to send it somewhere else or just run a method on it. If you remove any one of these you can point to loads of languages that do the same thing, but the combination seems fairly...unique and surprising.

To be honest it seems a bit odd to add a method for efficiency that requires you to do inefficient copying around to use it in certain cases - if you have a Pool*Array in a dict, the only way to append to it efficiently is with a fairly opaque dance requiring implementation knowledge [c]. At that point it feels like a pretty weak addition to the language - you can only use it "as intended" in some cases, in others your first attempt will do the wrong thing (my original post), your second attempt will do the inefficient thing (effectively +=) and only your third attempt with a good understanding of the underlying implementation will do what you want.

Anyway, those are my thoughts, make of them what you will :) It looks like at one point you were going to turn off COW by default (#5277 (comment) https://github.com/godotengine/godot/issues/5277#issuecomment-227136972) Did something change? (in fact, now that I've tracked down the rename from RawArray -> PoolBytesArray, I realise #7073 https://github.com/godotengine/godot/issues/7073 is an exact duplicate)

[a] Actually in the case of Pool*Array the copy is by reference, but it's reference counted and the copy-by-value is only actually 'realised' when modifying the data. For the purpose of resulting behaviour (though not efficiency), this is identical to copy by value

[b] There is one oddity - in Python x[1] = 5 desugars to a call to setitem. By contrast, in GDScript x[1] = 5 does not desugar to x.set for a PoolByteArray - the former works when a PoolByteArray is in a dictionary, the latter doesn't. I've not dug enough into the code to figure why this is.

[c] Example (untested) code:

var x = { "a": PoolByteArray([1,2,3]) } # setup var tmp = x["a"] # reference copy x["a"] = null # squash the original reference tmp.append_array(PoolByteArray([1,2,3])) # in-place append to array with one ref x["a"] = tmp # copy reference back in tmp = null # squash temporary reference

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/godotengine/godot/issues/15265#issuecomment-355716174, or mute the thread https://github.com/notifications/unsubscribe-auth/AF-Z2w0U2QUyZS_ve1UktKDdbJ3urkF4ks5tHti0gaJpZM4RQ7oy .

akien-mga commented 6 years ago

Not critical for the upcoming 3.0 release, so moving to the next milestone. A fix can still be cherry-picked for 3.0.x maintenance releases once available in the master branch.

timower commented 6 years ago

Another way to solve this in gdnative at least is to add the ability to get a reference to the array from the dictionary. The dictionary currently returns a variant reference in it's operator[], but there is no way to turn this reference in a reference to a pool array. Methods could be added to Variant to allow conversion to PoolByteArray references and such.

ghost commented 5 years ago

Is the intention to make this behavior possible at some point?

If not it should probably be throwing an error. I don't use pools often, less so in dictionaries, so when I do I always forget about this fact, and often waste a lot of time tracing back a trail of empty data to an attempt to append() a pool array in a dictionary.

akien-mga commented 4 years ago

I'll add some docs for this until it's properly fixed (if nobody beats me to it).

Beuc commented 3 years ago

(Hereby adding the PoolStringArray keyword so this shows up in github search results.)

IMHO "silently passing objects as values" needn't be documented, it needs to go away. This also goes against the base documentation:

In GDScript, only base types (int, float, string and the vector types) are passed by value to functions (value is copied). Everything else (instances, arrays, dictionaries, etc) is passed as reference.

KoBeWi commented 3 years ago

Fixed in 4.0

jitspoe commented 1 year ago

Any chance of this fix getting backported to 3.x? Running into this issue where I have a PoolVector3Array as part of a class storing a history of player positions (storing a lot of data, so I want it to be optimized), but when I access it from a class, appending does nothing:

func RecordPlayerPosition(PlayerPosition : Vector3):
    if (RecordPositionsThisFrame):
        var TestThing : PoolVector3Array = []
        TestThing.append(Vector3.UP)
        print("Test: ", TestThing)
        ContinuousInfo.PlayerPositions.append(PlayerPosition)
        print("PlayerPositions: ", ContinuousInfo.PlayerPositions)

TestThing works, ContinuousInfo.PlayerPositions does not -- always stays empty. Is there a way to work around this at least?

Edit: Seems I've found a workaround. I can add a function to the class to append this data:

    func AppendPlayerPosition(Position : Vector3):
        PlayerPositions.append(Position)
...
        ContinuousInfo.AppendPlayerPosition(PlayerPosition)
akien-mga commented 1 year ago

This can't be fixed in 3.x without rewriting a huge part of the engine, which is among other things what led to making 4.0.

The workaround is to used a temporary variable:

var pp = ContinuousInfo.PlayerPositions
pp.append(PlayerPosition)
ContinuousInfo.PlayerPositions = pp