Closed by theogf 1 year ago
Oh no, there are actually more issues with the values of the `Dictionary` itself.
I see the problem.
What we can do is take a shortcut if `holes == 0`. Otherwise we need to use the kind of logic you see in `rehash!`. I suggest one of the two following options:
- Call `rehash!` if `holes != 0` and then continue the current logic.
- Branch on `holes != 0`.
The disadvantage of the first one is it won't be thread safe when you have multiple readers (I'm not even sure it is safe under single-threaded concurrency).
Probably should go with the second, with code like:

    serialize_type(s, T, false)
    if ind.holes == 0
        # No holes: the backing vector is already dense.
        return serialize(s, ind.values)
    else
        # Copy only the live entries, skipping holes.
        is = Vector{I}(undef, length(ind))
        @inbounds for (i, t) in enumerate(tokens(ind))
            is[i] = ind.values[t]
        end
        return serialize(s, is)
    end
It might be even better if we could stream out the indices in the second branch rather than collecting them in memory, but at least this is safe.
Will need equivalent code for dictionaries, and for deepcopy.
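
To make the "copy only the live entries" branch concrete, here is a minimal sketch in plain Base Julia. The names `slots`, `occupied`, and `compact` are hypothetical stand-ins chosen for illustration; they are not the Dictionaries.jl internals, which track holes differently.

```julia
# Toy model: a backing array in which some slots are holes left by deletions.
# `slots`, `occupied`, and `compact` are hypothetical, not Dictionaries.jl names.
slots    = [10, 20, 30, 40, 50]
occupied = [true, false, true, true, false]   # false marks a hole

# Collect only the live entries into a dense vector, preserving order.
function compact(slots::Vector{T}, occupied::Vector{Bool}) where {T}
    out = Vector{T}(undef, count(occupied))
    i = 0
    for t in eachindex(slots)
        occupied[t] || continue   # skip holes
        i += 1
        out[i] = slots[t]
    end
    return out
end

dense = compact(slots, occupied)
```

The dense vector can then be serialized or copied without the reader ever seeing a hole, at the cost of one temporary allocation.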
So I was thinking of an even more straightforward solution: dispatching `deepcopy_internal` on the `Dictionary` and collecting the `values` themselves.
Something like

    deepcopy(d::Dictionary{I,T}) where {I,T} = Dictionary{I,T}(deepcopy(collect(keys(d))), deepcopy(collect(d)))

This way we just build a whole new `Dictionary` and avoid shenanigans ensuring that the `holes` and `slots` are consistent?
So I just implemented my proposal.
Base: 78.86% // Head: 79.33% // Increases project coverage by +0.47% :tada:
Coverage data is based on head (ce7b08b) compared to base (7d596e8). Patch coverage: 100.00% of modified lines in pull request are covered.
@andyferris Would you rather have me use your solution instead?
@theogf news here?
Hi @andyferris, any update on this :) ?
Sorry I have been smashed.
Correct is more important than fast, so let's merge this. Thanks @theogf
Hey @andyferris!
It looks like my `deepcopy` and `serialize` implementations have a bug. I wrongly assumed that `indices.values == collect(indices)`. That is visibly not true, and it creates `undef` references when using non-`isbits` structures. I used `collect(indices)` instead and took care of updating the tests accordingly. (The patch version is bumped as well.)
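
For readers unfamiliar with why this only shows up for non-`isbits` element types, the following Base-only sketch may help. The vector `raw` is a hypothetical stand-in for a backing array with holes, not the actual `indices.values` field.

```julia
# Base-only illustration; `raw` stands in for a backing array with holes.
raw = Vector{String}(undef, 3)   # String is not isbits, so unset slots are #undef
raw[1] = "a"
raw[3] = "c"

# Slot 2 was never assigned, so it holds an undefined reference.
has_hole = !isassigned(raw, 2)

# Reading the hole throws an UndefRefError instead of returning a value.
threw = try
    raw[2]
    false
catch err
    err isa UndefRefError
end
```

With an `isbits` element type such as `Int`, the unset slot would instead hold arbitrary bits and reading it would not throw, which is presumably why the faulty assumption went unnoticed at first.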