Closed spurll closed 7 years ago
Here's an illustration of the issue that I'm talking about:
julia> str = "something"
"something"
julia> arr = [WeakRefString(pointer(str, 1), 4, 1), WeakRefString(pointer(str, 5), 5, 5)]
2-element Array{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na = NullableArray{WeakRefString{UInt8}, 1}(arr, [false, false], Vector{UInt8}(str))
2-element NullableArrays.NullableArray{WeakRefString{UInt8},1}:
"some"
"thing"
julia> scp = copy(na)
2-element NullableArrays.NullableArray{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na.parent
9-element Array{UInt8,1}:
0x73
0x6f
0x6d
0x65
0x74
0x68
0x69
0x6e
0x67
julia> scp.parent
0-element Array{UInt8,1}
Once we no longer have the original na
(and str
), the WeakRefString
s contained in the copy scp
will be in danger of garbage collection.
The fix I'm testing replaces the existing copy
method with:
function Base.copy{T, N}(X::NullableArray{T, N})
return NullableArray{T, N}(copy(X.values), copy(X.isnull), X.parent)
end
With the fix, we can see that copying a NullableArray
now results in a copy that has a parent
field. Additionally, we can see that the parent field references the same memory block as the original.
julia> str = "something"
"something"
julia> arr = [WeakRefString(pointer(str, 1), 4, 1), WeakRefString(pointer(str, 5), 5, 5)]
2-element Array{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na = NullableArray{WeakRefString{UInt8}, 1}(arr, [false, false], Vector{UInt8}(str))
2-element NullableArrays.NullableArray{WeakRefString{UInt8},1}:
"some"
"thing"
julia> scp = copy(na)
2-element NullableArrays.NullableArray{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na.parent
9-element Array{UInt8,1}:
0x73
0x6f
0x6d
0x65
0x74
0x68
0x69
0x6e
0x67
julia> scp.parent
9-element Array{UInt8,1}:
0x73
0x6f
0x6d
0x65
0x74
0x68
0x69
0x6e
0x67
julia> pointer(na.parent)
Ptr{UInt8} @0x0000000116aa9f98
julia> pointer(scp.parent)
Ptr{UInt8} @0x0000000116aa9f98
julia> pointer(na.values)
Ptr{WeakRefString{UInt8}} @0x0000000109baccd0
julia> pointer(scp.values)
Ptr{WeakRefString{UInt8}} @0x0000000116ac8d50
I'm running some tests now to ensure that this change doesn't have any unexpected repercussions.
getindex
also fails to maintain the parent
field:
julia> str = "something"
"something"
julia> arr = [WeakRefString(pointer(str, 1), 4, 1), WeakRefString(pointer(str, 5), 5, 5)]
2-element Array{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na = NullableArray{WeakRefString{UInt8}, 1}(arr, [false, false], Vector{UInt8}(str))
2-element NullableArrays.NullableArray{WeakRefString{UInt8},1}:
"some"
"thing"
julia> na.parent
9-element Array{UInt8,1}:
0x73
0x6f
0x6d
0x65
0x74
0x68
0x69
0x6e
0x67
julia> na[:].parent
0-element Array{UInt8,1}
The PR I'm working on also solves this problem. All existing tests still pass on 0.5 and 0.6 (as well as the new ones I've added). Just making sure that I haven't missed anything else important, and that the changes don't interfere with anything else.
I noticed this issue when sorting a
DataTable
that contained a column of typeNullableArray{WeakRefString{UInt8}}
. Ultimatelysort
callscopy
, and after the smoke cleared I was left withWeakRefString
s that would occasionally become gibberish because garbage collection happened and there were no longer any open references to the block of data used by theWeakRefString
s.I'm putting together a PR that I hope will fix the issue with
copy
, but the issues withdeepcopy
(if you'll excuse the expression) run deeper.As the
parent
field is really just a reference, we don't want a deep copy of it. Currently, if youdeepcopy
, you just get an emptyparent
, which won't prevent garbage collection of any unsafe references (likeWeakRefString
s). If we wanteddeepcopy
to work in this case, it might end up looking something like this:In addition to not really being a deep copy, this has the unfortunate consequence that if you're copying a
NullableArray
that doesn't care about itsparent
field (which seems to be most of them), you end up with an original and a copy that share a reference to a singleparent
(which is just going to be an emptyVector{UInt8}
). It's maybe not a really big deal, but it certainly seems wrong.So my plan is to attempt to fix
copy
, and to leavedeepcopy
alone. I don't know if the long-term plan forNullableArrays
is to keepparent
around (my understanding is that it is not), but the fact that it isn't being copied is causing problems for us at the moment.