JuliaIO / BSON.jl


Support large BSONs by using Int64 in case Int32 size range is exceeded. #74

Open racinmat opened 3 years ago

racinmat commented 3 years ago

Proposal for fixing #67, based on the discussion at https://julialang.slack.com/archives/C67TK21LJ/p1596027666030200 : test whether the data fits in the Int32 range and use Int32 if it does, otherwise use the 64-bit range. https://github.com/JuliaIO/BSON.jl/issues/67#issuecomment-616134874 describes where specifically the cast happens.

Rationale behind this: BSON.jl already violates the BSON specification by allowing arbitrary types to be saved. This follows the same idea: as long as the user makes sure the input data follows the BSON specification, the output is compliant with the specification, but it also allows persisting data that doesn't follow the BSON specification if that's required.
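A minimal sketch of the range check proposed above, written in Python purely for illustration; it is not the code in this PR and not BSON.jl's API. The name `encode_length` is hypothetical, and how a reader would tell a 4-byte length field from an 8-byte one is not covered in this thread, so the sketch leaves that open.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit length field can hold


def encode_length(n: int) -> bytes:
    """Pack a byte length as little-endian int32 when it fits, else as int64.

    Hypothetical helper: lengths within the Int32 range produce the same
    4 bytes a spec-compliant writer would emit; larger ones fall back to 8.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    if n <= INT32_MAX:
        return struct.pack("<i", n)  # spec-compliant 4-byte field
    return struct.pack("<q", n)      # non-standard 8-byte fallback
```

With this shape, data that stays within the Int32 range is written exactly as before, and only oversized data takes the non-compliant path.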

rofinn commented 3 years ago

An alternative approach, which might scale better, would be to partition large vectors/strings into multi-part objects and even break up large docs into multi-part files. This would allow other loaders to still load and combine the data as necessary. I'd argue that violating the core BSON spec should be avoided and that workarounds should remain backward compatible in some form.
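A rough sketch of the partitioning idea, again in Python for illustration only. The field names `part`, `total` and `data` and both function names are assumptions made up for this example; no such layout is defined by BSON or by BSON.jl. Each part stays below a chosen size limit, so every part could be stored as an ordinary spec-compliant value.

```python
def split_parts(data: bytes, max_part: int) -> list:
    """Split a byte string into numbered parts of at most max_part bytes."""
    if max_part <= 0:
        raise ValueError("max_part must be positive")
    # An empty input still yields one (empty) part so it can round-trip.
    chunks = [data[i:i + max_part] for i in range(0, len(data), max_part)] or [b""]
    return [{"part": i, "total": len(chunks), "data": c} for i, c in enumerate(chunks)]


def join_parts(parts: list) -> bytes:
    """Recombine parts produced by split_parts, in any order."""
    ordered = sorted(parts, key=lambda p: p["part"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing parts")
    return b"".join(p["data"] for p in ordered)
```

A loader that knows nothing about the convention would still see valid, smaller objects and could combine them itself.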

rofinn commented 3 years ago

Alternatively, Preferences.jl could perhaps be used to let projects decide whether they want strict BSON files?