JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia

File created with earlier JLD2 version can't be opened with version 0.4.38 or later #549

Closed yha closed 4 months ago

yha commented 4 months ago

This file can't be opened with recent JLD2 versions, even with the versions of the other packages used in the file held fixed: https://www.dropbox.com/scl/fi/qih6zelvvzs1lkklb60l7/bad-jld2-file.jld2?rlkey=2bdtoexqyvcmgadv8boppod5j&dl=0

I'm not sure at this point which version of JLD2 was originally used to save it, but it was probably either 0.4.35 or 0.4.37. I have many files with a similar structure; some of them fail in the same way with newer JLD2 versions, while others open successfully.

(@v1.9) pkg> activate --temp
  Activating new project at `/tmp/jl_5vlRa3`

julia> # Known good versions:

(jl_5vlRa3) pkg> add JLD2@0.4.37 FileIO@1.16.1 GeometryBasics@0.4.9 OffsetArrays@1.12.10
   Resolving package versions...
    Updating `/tmp/jl_5vlRa3/Project.toml`
⌃ [5789e2e9] + FileIO v1.16.1
⌃ [5c1252a2] + GeometryBasics v0.4.9
⌃ [033835bb] + JLD2 v0.4.37
⌃ [6fe1bfb0] + OffsetArrays v1.12.10
    Updating `/tmp/jl_5vlRa3/Manifest.toml`
⌅ [79e6a3ab] + Adapt v3.7.2
  [187b0558] + ConstructionBase v1.5.4
  [9a962f9c] + DataAPI v1.16.0
  [e2d170a0] + DataValueInterfaces v1.0.0
  [411431e0] + Extents v0.1.2
⌃ [5789e2e9] + FileIO v1.16.1
⌃ [46192b85] + GPUArraysCore v0.1.5
  [cf35fbd7] + GeoInterface v1.3.3
⌃ [5c1252a2] + GeometryBasics v0.4.9
  [c8e1da08] + IterTools v1.10.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
⌃ [033835bb] + JLD2 v0.4.37
  [692b3bcd] + JLLWrappers v1.5.0
  [1914dd2f] + MacroTools v0.5.13
⌃ [6fe1bfb0] + OffsetArrays v1.12.10
  [bac558e1] + OrderedCollections v1.6.3
  [aea7be01] + PrecompileTools v1.2.0
  [21216c6a] + Preferences v1.4.1
  [189a3867] + Reexport v1.2.2
  [ae029012] + Requires v1.3.0
  [90137ffa] + StaticArrays v1.9.2
  [1e83bf80] + StaticArraysCore v1.4.2
  [09ab397b] + StructArrays v0.6.17
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.11.1
  [3bb67fe8] + TranscodingStreams v0.10.3
  [5ae413db] + EarCut_jll v2.2.4+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.3
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.9.2
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.0.5+0
  [deac9b47] + LibCURL_jll v7.84.0+0
  [29816b5a] + LibSSH2_jll v1.10.2+0
  [c8ffd9c3] + MbedTLS_jll v2.28.2+0
  [14a3606d] + MozillaCACerts_jll v2022.10.11
  [4536629a] + OpenBLAS_jll v0.3.21+4
  [83775a58] + Zlib_jll v1.2.13+0
  [8e850b90] + libblastrampoline_jll v5.8.0+0
  [8e850ede] + nghttp2_jll v1.48.0+0
  [3f19e933] + p7zip_jll v17.4.0+0
        Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`

julia> using GeometryBasics, OffsetArrays, JLD2, FileIO

julia> load("bad-jld2-file.jld2")
Dict{String, Any} with 3 entries:
  "midpoints" => OrderedDict{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}(1=>[[NaN, NaN] [NaN, NaN] … [NaN, NaN] [NaN, NaN]; [NaN, NaN] [NaN, NaN] … [NaN, NaN] [NaN, NaN]; … ; [306.439, 2203…
  "conf"      => OrderedDict{Int64, OffsetVector{Union{Missing, Float64}, Vector{Union{Missing, Float64}}}}(1=>[missing, missing, missing, missing, missing, missing, missing, missing, missing, missing  …  0.1…
  "iters"     => OrderedDict{Int64, OffsetVector{Int64, Vector{Int64}}}(1=>[0, 0, 0, 0, 0, 0, 0, 0, 0, 0  …  2, 2, 3, 2, 0, 2, 0, 2, 3, 0])

In a new Julia session:

(@v1.9) pkg> activate --temp
  Activating new project at `/tmp/jl_k7nkp0`

(jl_k7nkp0) pkg> add JLD2@0.4.38 FileIO@1.16.1 GeometryBasics@0.4.9 OffsetArrays@1.12.10
   Resolving package versions...
    Updating `/tmp/jl_k7nkp0/Project.toml`
⌃ [5789e2e9] + FileIO v1.16.1
⌃ [5c1252a2] + GeometryBasics v0.4.9
⌃ [033835bb] + JLD2 v0.4.38
⌃ [6fe1bfb0] + OffsetArrays v1.12.10
    Updating `/tmp/jl_k7nkp0/Manifest.toml`
⌅ [79e6a3ab] + Adapt v3.7.2
  [187b0558] + ConstructionBase v1.5.4
  [9a962f9c] + DataAPI v1.16.0
  [e2d170a0] + DataValueInterfaces v1.0.0
  [411431e0] + Extents v0.1.2
⌃ [5789e2e9] + FileIO v1.16.1
⌃ [46192b85] + GPUArraysCore v0.1.5
  [cf35fbd7] + GeoInterface v1.3.3
⌃ [5c1252a2] + GeometryBasics v0.4.9
  [c8e1da08] + IterTools v1.10.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
⌃ [033835bb] + JLD2 v0.4.38
  [692b3bcd] + JLLWrappers v1.5.0
  [1914dd2f] + MacroTools v0.5.13
⌃ [6fe1bfb0] + OffsetArrays v1.12.10
  [bac558e1] + OrderedCollections v1.6.3
  [aea7be01] + PrecompileTools v1.2.0
  [21216c6a] + Preferences v1.4.1
  [189a3867] + Reexport v1.2.2
  [ae029012] + Requires v1.3.0
  [90137ffa] + StaticArrays v1.9.2
  [1e83bf80] + StaticArraysCore v1.4.2
  [09ab397b] + StructArrays v0.6.17
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.11.1
  [3bb67fe8] + TranscodingStreams v0.10.3
  [5ae413db] + EarCut_jll v2.2.4+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.3
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [a63ad114] + Mmap
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.9.2
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.0.5+0
  [deac9b47] + LibCURL_jll v7.84.0+0
  [29816b5a] + LibSSH2_jll v1.10.2+0
  [c8ffd9c3] + MbedTLS_jll v2.28.2+0
  [14a3606d] + MozillaCACerts_jll v2022.10.11
  [4536629a] + OpenBLAS_jll v0.3.21+4
  [83775a58] + Zlib_jll v1.2.13+0
  [8e850b90] + libblastrampoline_jll v5.8.0+0
  [8e850ede] + nghttp2_jll v1.48.0+0
  [3f19e933] + p7zip_jll v17.4.0+0
        Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`

julia>

julia> using GeometryBasics, OffsetArrays, JLD2, FileIO

julia> load("bad-jld2-file.jld2")
Error encountered while load File{DataFormat{:JLD2}, String}("bad-jld2-file.jld2").

Fatal error:
ERROR: MethodError: Cannot `convert` an object of type Vector{Pair{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}} to an object of type OrderedCollections.OrderedDict{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}

Closest candidates are:
  convert(::Type{OrderedCollections.OrderedDict{K, V}}, ::AbstractDict) where {K, V}
   @ OrderedCollections ~/.julia/packages/OrderedCollections/9C4Uz/src/ordered_dict.jl:100
  convert(::Type{T}, ::T) where T<:AbstractDict
   @ Base abstractdict.jl:565
  convert(::Type{T}, ::AbstractDict) where T<:AbstractDict
   @ Base abstractdict.jl:567
  ...

Stacktrace:
  [1] rconvert(T::Type, x::Vector{Pair{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/data/custom_serialization.jl:9
  [2] jlconvert(#unused#::JLD2.ReadRepresentation{OrderedCollections.OrderedDict{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}, JLD2.CustomSerialization{Array, JLD2.RelOffset}}, f::JLD2.JLDFile{JLD2.MmapIO}, ptr::Ptr{Nothing}, header_offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/data/custom_serialization.jl:56
  [3] read_scalar(f::JLD2.JLDFile{JLD2.MmapIO}, rr::JLD2.ReadRepresentation{OrderedCollections.OrderedDict{Int64, OffsetMatrix{Point2{Float64}, Matrix{Point2{Float64}}}}, JLD2.CustomSerialization{Array, JLD2.RelOffset}}, header_offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/dataio.jl:37
  [4] read_data(f::JLD2.JLDFile{JLD2.MmapIO}, rr::Any, read_dataspace::Tuple{JLD2.ReadDataspace, JLD2.RelOffset, JLD2.DataLayout, JLD2.FilterPipeline}, attributes::Vector{JLD2.ReadAttribute})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/datasets.jl:238
  [5] read_data(f::JLD2.JLDFile{JLD2.MmapIO}, dataspace::JLD2.ReadDataspace, datatype_class::UInt8, datatype_offset::Int64, layout::JLD2.DataLayout, filters::JLD2.FilterPipeline, header_offset::JLD2.RelOffset, attributes::Vector{JLD2.ReadAttribute})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/datasets.jl:194
  [6] load_dataset(f::JLD2.JLDFile{JLD2.MmapIO}, offset::JLD2.RelOffset)
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/datasets.jl:125
  [7] getindex(g::JLD2.Group{JLD2.JLDFile{JLD2.MmapIO}}, name::String)
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/groups.jl:109
  [8] getindex
    @ ~/.julia/packages/JLD2/u57Vt/src/JLD2.jl:483 [inlined]
  [9] loadtodict!(d::Dict{String, Any}, g::JLD2.JLDFile{JLD2.MmapIO}, prefix::String)
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/loadsave.jl:154
 [10] loadtodict!
    @ ~/.julia/packages/JLD2/u57Vt/src/loadsave.jl:153 [inlined]
 [11] (::JLD2.var"#100#101")(file::JLD2.JLDFile{JLD2.MmapIO})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/fileio.jl:39
 [12] jldopen(::Function, ::String, ::Vararg{String}; kws::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/loadsave.jl:4
 [13] jldopen
    @ ~/.julia/packages/JLD2/u57Vt/src/loadsave.jl:1 [inlined]
 [14] #fileio_load#99
    @ ~/.julia/packages/JLD2/u57Vt/src/fileio.jl:38 [inlined]
 [15] fileio_load(f::File{DataFormat{:JLD2}, String})
    @ JLD2 ~/.julia/packages/JLD2/u57Vt/src/fileio.jl:37
 [16] #invokelatest#2
    @ ./essentials.jl:819 [inlined]
 [17] invokelatest
    @ ./essentials.jl:816 [inlined]
 [18] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:219
 [19] action
    @ ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:196 [inlined]
 [20] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185
 [21] action
    @ ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 [inlined]
 [22] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:113
 [23] load(::String)
    @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:109
 [24] top-level scope
    @ REPL[4]:1
Stacktrace:
 [1] handle_error(e::MethodError, q::Base.PkgId, bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}})
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/error_handling.jl:61
 [2] handle_exceptions(exceptions::Vector{Tuple{Any, Union{Base.PkgId, Module}, Vector}}, action::String)
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/error_handling.jl:56
 [3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:228
 [4] action
   @ ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:196 [inlined]
 [5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185
 [6] action
   @ ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 [inlined]
 [7] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:113
 [8] load(::String)
   @ FileIO ~/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:109
 [9] top-level scope
   @ REPL[4]:1

JonasIsensee commented 4 months ago

Hi @yha, this is a problem with some erroneously converted AbstractDicts. The problem was introduced in some 0.4.3x version and removed shortly after.

To load your files you can define the conversion:

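# rconvert is not exported, so extend it through the JLD2 module (OrderedDict comes from OrderedCollections)
JLD2.rconvert(::Type{T}, x::Vector{<:Pair}) where {T<:OrderedDict} = T(x)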
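A minimal sketch of why defining such a method helps, using only Base. The module `StandIn` and its fallback `rconvert(T, x) = convert(T, x)` are hypothetical stand-ins inferred from the stack trace above (the `MethodError` from `convert` surfaces inside `rconvert`); they are not JLD2's actual source, and `Dict` stands in for `OrderedDict` here.

```julia
# Hypothetical stand-in for the relevant part of JLD2; not the real package.
module StandIn
# Generic fallback: defer to Base.convert (assumed to mirror the default behaviour).
rconvert(::Type{T}, x) where {T} = convert(T, x)
end

stored = [1 => "a", 2 => "b"]   # what the file effectively contains: a vector of pairs
Target = Dict{Int,String}       # the dictionary type the reader wants back

# Without a dedicated method, the fallback ends in a MethodError from `convert`.
failed = try
    StandIn.rconvert(Target, stored)
    false
catch err
    err isa MethodError
end

# The workaround: a more specific method that builds the dict from its pairs.
StandIn.rconvert(::Type{T}, x::Vector{<:Pair}) where {T<:AbstractDict} = T(x)

restored = StandIn.rconvert(Target, stored)
println(failed, " ", restored)
```

In a real session the same idea would presumably be applied to `JLD2.rconvert` before calling `load`, with `OrderedDict` brought in from OrderedCollections.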

EliLazarus commented 4 months ago

I'm getting a similar error, but only in the GitHub tests for my package (I added a unit test that uses data from a JLD2 file). Maybe it's a similar problem?

┌ Warning: type Core.Pair{Any,Any} does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/oYEEg/src/data/reconstructing_datatypes.jl:607
┌ Warning: type Array{JLD2.ReconstructedMutable{Symbol("Pair{Any,Any}"), (:first, :second), Tuple{Any, Any}},1} does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/oYEEg/src/data/reconstructing_datatypes.jl:492
Error encountered while load FileIO.File{FileIO.DataFormat{:JLD2}, String}("/home/runner/work/MPSGE.jl/MPSGE.jl/test/./gams/DAAData.jld2").

Fatal error:
ERROR: LoadError: LoadError: MethodError: Cannot `convert` an object of type JLD2.ReconstructedPrimitive{Symbol("Array{JLD2.ReconstructedMutable{Symbol(\"Pair{Any,Any}\"), (:first, :second), Tuple{Any, Any}},1}"), UInt64} to an object of type Dict{Any, Any}
Closest candidates are:
  convert(::Type{T}, ::T) where T<:AbstractDict at abstractdict.jl:523
  convert(::Type{T}, ::AbstractDict) where T<:AbstractDict at abstractdict.jl:525
  convert(::Type{T}, ::T) where T at essentials.jl:205

Because these are the automated GitHub tests, I'm not quite sure how to change the version, and the issue does not reproduce locally. The data is a Dict of DenseAxisArrays.

JonasIsensee commented 4 months ago

Hi @EliLazarus,

Your issue is different. You created the file with a recent version of Julia, where the type Pair is defined in Core, whereas in Julia 1.6 the type was still defined in Base. Therefore, JLD2 cannot find it when reconstructing the type on Julia 1.6.

You could avoid the problem by re-generating the file in Julia 1.6.

To "just" re-save, you can do

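# Run under Julia 1.6 (e.g. started as `julia +1.6`)
using FileIO, JLD2  # brings load/save into scope
# Alias Core.Pair to Base.Pair so the type name stored in the file can be resolved
@eval Core Pair=$(Base.Pair)
data = load("DAAData.jld2");
save("DAAData.jld2", data);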

(but I wouldn't recommend evalling names into Core as part of your CI...)
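To see which situation a given Julia installation is in, the owning module of `Pair` can be queried directly. This is only a small diagnostic sketch; it does not touch JLD2, and the printed result depends on the Julia version it runs under.

```julia
# Report which module owns the `Pair` type in the running Julia session.
owner = parentmodule(Pair)
println("Pair is defined in: ", owner)

# The module-qualified name, similar in form to the one in the warning above.
println("Qualified name: ", owner, ".", nameof(Pair))
```

Running this both locally and in the CI job would show whether the two environments disagree about where `Pair` lives.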

yha commented 4 months ago

> Hi @yha, this is a problem with some erroneously converted AbstractDicts. The problem was introduced in some 0.4.3x version and removed shortly after.

Is this referring to https://github.com/JuliaIO/JLD2.jl/issues/536? I tried solving my issue by sticking to version 0.4.37, to avoid having to convert a lot of older JLD2 files, but now it seems like I've hit #536 with some of those files when using this version.

> To load your files you can define the conversion:
>
> JLD2.rconvert(::Type{T}, x::Vector{<:Pair}) where {T<:OrderedDict} = T(x)

So to fix the non-ordered dicts that I have also saved, do I need this line for a general AbstractDict?

JLD2.rconvert(::Type{T}, x::Vector{<:Pair}) where {T<:AbstractDict} = T(x)

JonasIsensee commented 4 months ago

> > Hi @yha, this is a problem with some erroneously converted AbstractDicts. The problem was introduced in some 0.4.3x version and removed shortly after.
>
> Is this referring to #536? I tried solving my issue by sticking to version 0.4.37, to avoid having to convert a lot of older JLD2 files, but now it seems like I've hit #536 with some of those files when using this version.

No, this referred to #492, which defined a conversion that was applicable to too many types. This was later reverted.

> So to fix the non-ordered dicts that I have also saved, do I need this line for a general AbstractDict?
>
> JLD2.rconvert(::Type{T}, x::Vector{<:Pair}) where {T<:AbstractDict} = T(x)

Yes, that should help.
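With many older files, as mentioned earlier in the thread, it may help to find out which ones still fail once the conversion method is defined. The following is a rough sketch only: `find_unloadable` and `fake_load` are hypothetical names introduced for illustration, and in practice the loader passed in would be FileIO's `load`.

```julia
# Hypothetical helper: try `loader` on each path and collect the failures.
function find_unloadable(loader, paths)
    failures = Pair{String,Any}[]
    for path in paths
        try
            loader(path)
        catch err
            # Remember which file failed and with what exception.
            push!(failures, path => err)
        end
    end
    return failures
end

# Stand-in loader for illustration; it fails only for paths ending in "bad.jld2".
fake_load(path) = endswith(path, "bad.jld2") ? error("cannot convert") : Dict("ok" => 1)

failures = find_unloadable(fake_load, ["good.jld2", "bad.jld2"])
for (path, err) in failures
    println(path, " failed with ", typeof(err))
end
```

Passing the loader as an argument keeps the helper independent of any particular package, so the same loop can be run once per JLD2 version being compared.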