JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia

Saving and loading (anonymous) functions #258

Open hendri54 opened 3 years ago

hendri54 commented 3 years ago

When saving and then loading a struct that contains a function I now get a MethodError. MWE:

using FileIO, JLD2

f1() = 1.0;

mutable struct Foo
    x :: Float64
    f :: Function
end

x = Foo(1.0, f1);

fPath = "test1.jld2";
save(fPath, Dict("data" => x));

y = load(fPath)

The error is:

ERROR: LoadError: MethodError: no method matching typeof(f1)()
Stacktrace:
 [1] handle_error(::MethodError, ::File{DataFormat{:JLD2}}) at /Users/lutz/.julia/packages/FileIO/2fEu2/src/error_handling.jl:82
 [2] handle_exceptions(::Array{Any,1}, ::String) at /Users/lutz/.julia/packages/FileIO/2fEu2/src/error_handling.jl:77
 [3] load(::Formatted; options::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{,Tuple{}}}) at /Users/lutz/.julia/packages/FileIO/2fEu2/src/loadsave.jl:210
 [4] load at /Users/lutz/.julia/packages/FileIO/2fEu2/src/loadsave.jl:187 [inlined]
 [5] #load#16 at /Users/lutz/.julia/packages/FileIO/2fEu2/src/loadsave.jl:136 [inlined]
 [6] load(::String) at /Users/lutz/.julia/packages/FileIO/2fEu2/src/loadsave.jl:136
 [7] top-level scope at /Users/lutz/Documents/projects/p2019/college_stratification/CollegeStrat/src/temp1.jl:15
 [8] include(::String) at ./client.jl:457
 [9] top-level scope at REPL[9]:1
in expression starting at /Users/lutz/Documents/projects/p2019/college_stratification/CollegeStrat/src/temp1.jl:15

Replacing the function with, say, an Int avoids the error.

This is on Julia 1.5.2, JLD2 0.3.0, FileIO 1.4.4.

JonasIsensee commented 3 years ago

Hi @hendri54 ,

thanks for this bug report! The simple example made the problem very clear.

The problem was introduced in v0.3.0 and, until this is fixed, it might be best if you continue working with JLD2 v0.2.4.

The problem is in https://github.com/JuliaIO/JLD2.jl/blob/d6bd50889667312d90b5c80b830bb797354706a7/src/data.jl#L1339

This was changed to make custom serialization possible. Previously this was a @generated function.

I opened PR #260, which restores the correct behavior in this case, but I don't know enough about eval, functions, and code generation to know whether this is actually a good fix or will just produce more problems in the future.

hendri54 commented 3 years ago

Thank you for the suggestion. 0.2.4 works for now.

And thank you for all the work you are putting into the package.

SamuelBrand1 commented 3 years ago

@JonasIsensee I've had the same issue, and also want to thank you for your work on this package. Good luck dealing with this issue.

JonasIsensee commented 3 years ago

This should be fixed in v0.3.1, which I just tagged.

hendri54 commented 3 years ago

Thanks again!

bjarthur commented 1 year ago

saving functions is not really fixed: if you quit julia and start a fresh REPL, you cannot call the saved function.

so after executing the above code in the OP, if you then immediately call the loaded function it works:

julia> y["data"].f()
1.0

but if you quit and then re-load it doesn't:

julia> using FileIO, JLD2

julia> fPath = "test1.jld2";

julia> y = load(fPath)
┌ Warning: type Main.Foo does not exist in workspace; reconstructing
└ @ JLD2 /groups/scicompsoft/home/arthurb/.julia/packages/JLD2/Pi1Zq/src/data/reconstructing_datatypes.jl:461
┌ Warning: type Main.#f1 does not exist in workspace; reconstructing
└ @ JLD2 /groups/scicompsoft/home/arthurb/.julia/packages/JLD2/Pi1Zq/src/data/reconstructing_datatypes.jl:369
Dict{String, Any} with 1 entry:
  "data" => var"##Main.Foo#312"(1.0, var"##Main.#f1#313"())

julia> y["data"].f()
ERROR: MethodError: objects of type JLD2.ReconstructedTypes.var"##Main.#f1#313" are not callable
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

this is with julia 1.8, JLD2 0.4.23

bjarthur commented 1 year ago

and it gets worse if you redefine that struct before loading:

julia> mutable struct Foo
           x :: Float64
           f :: Function
       end

julia> using FileIO, JLD2

julia> fPath = "test1.jld2";

julia> y = load(fPath)
┌ Warning: type Main.#f1 does not exist in workspace; reconstructing
└ @ JLD2 /groups/scicompsoft/home/arthurb/.julia/packages/JLD2/Pi1Zq/src/data/reconstructing_datatypes.jl:369
Error encountered while load File{DataFormat{:JLD2}, String}("test1.jld2").

Fatal error:
ERROR: MethodError: Cannot `convert` an object of type JLD2.ReconstructedTypes.var"##Main.#f1#312" to an object of type Function
Closest candidates are:
  convert(::Type{T}, ::T) where T at Base.jl:61
Stacktrace:

bjarthur commented 1 year ago

the only hack i know of is to save the Expr corresponding to the function and eval it after loading:

julia> using JLD2

julia> struct Foo
         f :: Expr
       end

julia> f1 = :(()->1.0)
:(()->begin
          #= REPL[3]:1 =#
          1.0
      end)

julia> x = Foo(f1)
Foo(:(()->begin
          #= REPL[3]:1 =#
          1.0
      end))

julia> fPath = "test1.jld2";

julia> save(fPath, Dict("data" => x));

julia> y = load(fPath)
Dict{String, Any} with 1 entry:
  "data" => Foo(:(()->begin…

julia> f2 = eval(y["data"].f)
#1 (generic function with 1 method)

julia> f2()
1.0

and it works in a fresh REPL:

julia> using JLD2

julia> fPath = "test1.jld2";

julia> y = load(fPath)
┌ Warning: type Main.Foo does not exist in workspace; reconstructing
└ @ JLD2 /groups/scicompsoft/home/arthurb/.julia/packages/JLD2/Pi1Zq/src/data/reconstructing_datatypes.jl:461
Dict{String, Any} with 1 entry:
  "data" => var"##Main.Foo#312"(:(()->begin…

julia> f2 = eval(y["data"].f)
#1 (generic function with 1 method)

julia> f2()
1.0

and it doesn't break if you redefine the struct:

julia> struct Foo
         f :: Expr
       end

julia> using JLD2

julia> fPath = "test1.jld2";

julia> y = load(fPath)
Dict{String, Any} with 1 entry:
  "data" => Foo(:(()->begin…

julia> f2 = eval(y["data"].f)
#1 (generic function with 1 method)

julia> f2()
1.0
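
A note on the eval approach above: at the REPL, `f2()` is called at top level, so it sees the freshly evaluated function. Inside a function, calling the result of `eval` directly typically raises a world-age MethodError, and `Base.invokelatest` is the usual way around that. A minimal sketch under stated assumptions: `load_and_call` is a hypothetical helper, and the stdlib `Serialization` stands in for JLD2 only to keep the example free of package dependencies.

```julia
using Serialization  # stdlib; used here in place of JLD2 (assumption for brevity)

# Hypothetical helper: read a stored function expression, evaluate it, call it.
function load_and_call(path, args...)
    ex = deserialize(path)::Expr
    f = Core.eval(Main, ex)
    # `f` was created after this function started running, so a direct call
    # `f(args...)` would fail with a world-age MethodError.
    return Base.invokelatest(f, args...)
end

path = tempname()
serialize(path, :(x -> 2x + 1))   # store the expression, not the function
result = load_and_call(path, 3)
println(result)
```

The same caveats as for any `eval` of stored code apply: the expression runs with full privileges, and any globals or types it mentions must exist in the loading session.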

JonasIsensee commented 1 year ago

I did not mean to claim that saving functions was fixed.

Functions in julia are complex objects: they involve multiple dispatch, specializations, generated functions, wrapped (captured) variables, references to globals, and may even be retrieved from a system image. To make it worse, the internals change between minor julia releases (they are internals, after all). This makes long-term storage of functions pointless and general short-term storage an insurmountable task.

The julia serializer has limited support for storing functions but is not generally compatible with JLD2. If you would like to store objects containing anonymous functions, I would recommend using the serializer. (You can check out #377 for an attempt to embed binary blobs (jls files) into JLD2 files)

The error message you posted above does appear to be a bug. There should be a more graceful failure mode.
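
To illustrate why the value of a function carries so little of what is needed to call it, here is a small sketch using only Base reflection. The names `make_adder` and `add3` are made up for the example. It shows that a named function is an instance of a field-less type, that a closure is a struct whose fields are the captured variables, and that the methods live elsewhere. That matches the reconstructed `var"##Main.#f1#313"()` in the warning output above: a type and its (empty) fields come back, but no methods.

```julia
f1() = 1.0

# Hypothetical closure factory; `offset` is captured by the returned function.
make_adder(offset) = x -> x + offset
add3 = make_adder(3)

# A named function is the sole instance of its own type and has no fields.
println(typeof(f1))
println(fieldcount(typeof(f1)))      # 0

# A closure is a struct; its fields are the captured variables.
println(fieldnames(typeof(add3)))    # (:offset,)
println(add3.offset)                 # 3

# The code itself lives in methods attached to the type, not in the value.
println(length(methods(f1)))         # 1
```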

bjarthur commented 1 year ago

"The julia serializer has limited support for storing functions"

by limited do you mean none? i can't get even a simple case to work:

julia> using Serialization

julia> f1() = 1.0;

julia> serialize("f1.sth", f1)

quit and open a new REPL:

julia> using Serialization

julia> f1 = deserialize("f1.sth")
ERROR: UndefVarError: #f1 not defined
Stacktrace:
  [1] deserialize_datatype(s::Serializer{IOStream}, full::Bool)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:1364
  [2] handle_deserialize(s::Serializer{IOStream}, b::Int32)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:866
  [3] deserialize(s::Serializer{IOStream})
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:813
  [4] handle_deserialize(s::Serializer{IOStream}, b::Int32)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:873
  [5] deserialize(s::Serializer{IOStream})
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:813
  [6] handle_deserialize(s::Serializer{IOStream}, b::Int32)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:919
  [7] deserialize
    @ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:813 [inlined]
  [8] deserialize(s::IOStream)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:800
  [9] open(f::typeof(deserialize), args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./io.jl:384
 [10] open
    @ ./io.jl:381 [inlined]
 [11] deserialize(filename::String)
    @ Serialization /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Serialization/src/Serialization.jl:810
 [12] top-level scope
    @ REPL[2]:1

i'll also chime in with the others and thank you for your work on JLD2!

JonasIsensee commented 1 year ago

oh, I did not know that part. Anonymous functions "work":

julia> using Serialization

julia> λ = x->x^2;

julia> serialize("λ.jls", λ)

# New session

julia> using Serialization

julia> λ = deserialize("λ.jls")
#1 (generic function with 1 method)

julia> λ(2)
4

This looked fine, but the following still fails:

julia> struct A; x::Int; end

julia> λ = x->A(x)
#1 (generic function with 1 method)

julia> serialize("λ.jls", λ)

# New session

julia> using Serialization

julia> λ = deserialize("λ.jls")
#1 (generic function with 1 method)

julia> λ(2)
ERROR: UndefVarError: A not defined
Stacktrace:
   [1] (::Serialization.__deserialized_types__.var"#1#2")(x::Int64)
     @ Main ./REPL[5]:1