Open SimonDanisch opened 5 years ago
This appears to give ~10% on my benchmarks, but it would be way higher for more numerical data I guess. Added it to my test branch, see #34
This doesn't seem safe in general? What if x
goes out of scope before the unsafe_wrap
Array.
I tried this on VGG19 from Metalhead and got
julia> @time Metalhead.weights("vgg19.bson")
0.662478 seconds (1.65 M allocations: 631.681 MiB, 15.95% gc time)
Dict{Symbol,Any} with 38 entries:
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1f353f3d -- getindex at .\array.jl:730 [inlined]
getindex at .\subarray.jl:228 [inlined]
_show_nonempty at .\arrayshow.jl:386
in expression starting at no file:0
getindex at .\array.jl:730 [inlined]
getindex at .\subarray.jl:228 [inlined]
_show_nonempty at .\arrayshow.jl:386
#383 at .\arrayshow.jl:404
jl_apply_generic at /home/Administrator/buildbot/worker/package_win64/build/src\gf.c:2219
show_nd at .\arrayshow.jl:294
_show_nonempty at .\arrayshow.jl:403 [inlined]
show at .\arrayshow.jl:421
Guess it's unsafe indeed :P I was assuming, it's only used just before sending it to write, but it's a bit hard to tell where things go. Maybe it'd be better to implement a write that works with an reinterpret array efficiently.
Maybe it'd be better to implement a write that works with an reinterpret array efficiently.
Yeah, probably. https://github.com/MikeInnes/BSON.jl/pull/46 is at least a lot better than what we have now.
Yup, I was thinking to just submit something like #46, but then I got greedy :D
I'd say just merge #46 and leave this unmerged until we find something safer ;)
Interestingly, just defining
function reinterpret_(::Type{T}, x) where T
reinterpret(T, x)
end
I can load VGG19 pretty damn fast
julia> @time Metalhead.weights("vgg19.bson")
0.269944 seconds (1.97 k allocations: 548.196 MiB, 37.98% gc time)
Dict{Symbol,Any} with 38 entries:
but it crashes when testing BSON itself.
@KristofferC probably because reinterpret returns ReinterpretArray
which wraps the original array and lazily casts its contents when accessed.
After benchmarking BSON to figure out if it's a good replacement for JSON, I noticed that its by far the slowest. This PR fixes the performance partly:
It's still slower than CBOR, which also isn't doing the fastest it could - so I guess there are still some other low hanging fruits in here ;)