JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.68k stars 5.48k forks source link

Functions do not properly redefine with FunctionWrappers in v1.10? #52635

Open ChrisRackauckas opened 10 months ago

ChrisRackauckas commented 10 months ago

I don't quite understand what's going on here so it's a bit hard to title it. I found it in a test regression in OrdinaryDiffEq.jl https://github.com/SciML/OrdinaryDiffEq.jl/pull/2093 which started giving the wrong error. Seems innocuous, but digging in made it weirder. The MWE I cam up with is:

# Setup
using FunctionWrappers, ForwardDiff, OrdinaryDiffEq

const a = Float64[1.0]
function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = t
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
u = [1.0; 0.0; 0.0]
du = copy(u)
t = 0.0
dualgen(::Type{T}) where {T} = ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, T}, T, 1}
T = Float64
T1 = Float64
dualT = dualgen(T)

# Version 1
R = Nothing
A = Tuple{Vector{dualgen(T)}, Vector{dualgen(T)}, SciMLBase.NullParameters, Float64}
ff = FunctionWrappers.FunctionWrapper{R, A}(SciMLBase.Void(lorenz!))

ff(du,u,SciMLBase.NullParameters(),t)

Which gives:

julia> ff(du,u,SciMLBase.NullParameters(),t)
ERROR: TypeError: in cfunction, expected Union{}, got a value of type Nothing
Stacktrace:
 [1] macro expansion
   @ FunctionWrappers C:\Users\accou\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:137 [inlined]
 [2] do_ccall(f::FunctionWrappers.FunctionWrapper{…}, args::Tuple{…})
   @ FunctionWrappers C:\Users\accou\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:125
 [3] (::FunctionWrappers.FunctionWrapper{Nothing, Tuple{…}})(::Vector{Float64}, ::Vararg{Any})
   @ FunctionWrappers C:\Users\accou\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:144
 [4] top-level scope
   @ REPL[27]:1
Some type information was truncated. Use `show(err)` to see complete types.

And you might go, that's not an MWE! Delete stuff. Well... deleting stuff and replacing it with the same thing... is fine. Here's the definition of Void: https://github.com/SciML/SciMLBase.jl/blob/v2.11.7/src/utils.jl#L477-L483 . Let's just remove a bit of package code by copy pasting that into the MWE:

struct Void{F}
    f::F
end
function (f::Void)(args...)
    f.f(args...)
    nothing
end
ff = FunctionWrappers.FunctionWrapper{R, A}(Void(lorenz!))
ff(du,u,SciMLBase.NullParameters(),t)

Tada! That works fine. What about getting rid of SciMLBase.NullParameters, that's just a singleton so what about nothing:

R = Nothing
A = Tuple{Vector{dualgen(T)}, Vector{dualgen(T)}, Nothing, Float64}
ff = FunctionWrappers.FunctionWrapper{R, A}(SciMLBase.Void(lorenz!))

ff(du,u,nothing,t)

That works fine too!

So it seems this exact combination of arguments, and only using the version of things in the precompiled (?) package, causes the error. And I don't know what this error really is. It's a v1.10 only issue, so I'm posting here though it's also maybe FunctionWrappers.jl related?

ChrisRackauckas commented 10 months ago

Umm it went away now. Even more confused, but okay.

ChrisRackauckas commented 10 months ago

Okay nope, even more confused. You have to run the failing test before running the MWE?

using FunctionWrappers, ForwardDiff, OrdinaryDiffEq, Test

const a = Float64[1.0]
function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = u[2]
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
u0 = [1.0; 0.0; 0.0]
tspan = (0.0, 1.0)
prob = ODEProblem(lorenz!, u0, tspan)
@test_throws OrdinaryDiffEq.FirstAutodiffJacError solve(prob, Rosenbrock23())

function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = t
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
prob = ODEProblem(lorenz!, u0, tspan)

# Throws the wrong error
@test_throws OrdinaryDiffEq.FirstAutodiffTgradError solve(prob, Rosenbrock23())

u = [1.0; 0.0; 0.0]
du = copy(u)
t = 0.0
dualgen(::Type{T}) where {T} = ForwardDiff.Dual{ForwardDiff.Tag{DiffEqBase.OrdinaryDiffEqTag, T}, T, 1}
T = Float64
T1 = Float64
dualT = dualgen(T)

# Version 1
R = Nothing
A = Tuple{Vector{dualgen(T)}, Vector{dualgen(T)}, SciMLBase.NullParameters, Float64}
ff = FunctionWrappers.FunctionWrapper{R, A}(SciMLBase.Void(lorenz!))

ff(du,u,SciMLBase.NullParameters(),t)

If you omit the test runs before it, it will not error.

ChrisRackauckas commented 10 months ago
using FunctionWrappers, ForwardDiff, OrdinaryDiffEq, Test

const a = Float64[1.0]
function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = u[2]
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = t
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
u0 = [1.0; 0.0; 0.0]
tspan = (0.0,1.0)
prob = ODEProblem(lorenz!, u0, tspan)

# Throws the wrong error
@test_throws OrdinaryDiffEq.FirstAutodiffTgradError solve(prob, Rosenbrock23())

This throws the right error and the subsequent call works fine, but the following does not:

using FunctionWrappers, ForwardDiff, OrdinaryDiffEq, Test

const a = Float64[1.0]
function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = u[2]
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
u0 = [1.0; 0.0; 0.0]
tspan = (0.0, 1.0)
prob = ODEProblem(lorenz!, u0, tspan)
@test_throws OrdinaryDiffEq.FirstAutodiffJacError solve(prob, Rosenbrock23())

function lorenz!(du, u, p, t)
    du[1] = 10.0(u[2] - u[1])
    a[1] = t
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end
u0 = [1.0; 0.0; 0.0]
tspan = (0.0,1.0)
prob = ODEProblem(lorenz!, u0, tspan)

# Throws the wrong error
@test_throws OrdinaryDiffEq.FirstAutodiffTgradError solve(prob, Rosenbrock23())

and the subsequent call errors like in the OP.

I think what this is suggesting is that function re-definition #265 stuff with FunctionWrappers somehow broke. It requires that the internal function wrapper with SciMLBase.Void(lorenz!) is called with the first function, and then when the function is re-defined and the function wrapper is rebuilt it does not seem to be correctly using the redefinition of the function, but instead reverts back to the first definition.

vtjnash commented 10 months ago

Sounds like a FunctionWrappers issue then? I don't think it ever gained full correct support for #265

ChrisRackauckas commented 10 months ago

It wasn't miscompiling on v1.9. You can see from our AD and ModelingToolkit downstream tests here https://github.com/SciML/OrdinaryDiffEq.jl/actions/runs/7333458887/job/19968912271?pr=2093 that it's a v1.10 only issue and not v1.9. I'm not sure if that requires a fix here or in FunctionWrappers.jl but this is the kind of issue I'd like to find a way to patch ASAP.

pablosanjose commented 9 months ago

Perhaps related to the issue https://github.com/yuyichao/FunctionWrappers.jl/issues/30 ? There the problem takes the form of a segfault upon running this snippet in 1.10 (linux)

using LinearAlgebra, FunctionWrappers
function test()
    function sfunc()
        e = eigen([1.0 + 0im ;;])
        return e
    end
    E = LinearAlgebra.Eigen{Complex{Float64}, Complex{Float64}, Matrix{Complex{Float64}}, Vector{Complex{Float64}}}
    asolver = FunctionWrappers.FunctionWrapper{E,Tuple{}}(sfunc)
    return asolver
end
s = test()
s()
using Symbolics
s()
lxvm commented 4 months ago

I have also been writing a code that sometimes had the same error with a cfunction, but now reproducibly segfaults for me, like so

[225045] signal (11.1): Segmentation fault
in expression starting at /home/lxvm/.julia/dev/AutoBZCore/aps_example/aps_example.jl:78
unknown function (ip: 0x7f4ee54c3af0)
unknown function (ip: 0x7f4ee54c3971)
macro expansion at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:137 [inlined]
do_ccall at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:125 [inlined]
FunctionWrapper at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:144 [inlined]
...

So far I've narrowed it down to redefining a function contained in a mutable struct that gets wrapped with a FunctionWrapper, although in my case I only get the segfault when there are two layers of function wrappers. I'll try and distill it into a MWE since the actual code is a complicated recursive calculation. (This is also on Julia 1.10.4)

lxvm commented 4 months ago

I was able to reduce my segfault to the following, which I seem to trigger reliably with cis, although not with other functions like sin or cos. (I'm not sure why certain functions trigger this while others don't)

$ julia --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.4 (2024-06-04)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using FunctionWrappers: FunctionWrapper

julia> scalarize(x) = sum(x)
scalarize (generic function with 1 method)

julia> function work(x, dat)
           ws1 = dat[1] + dat[2]
           ws2 = dat[3] + dat[4]
           return cis(x[1]) * ws1 + cis(x[2]) * ws2
       end
work (generic function with 1 method)

julia> func = let dat = ones(4)
           x -> scalarize(work(x, dat))
       end
#1 (generic function with 1 method)

julia> x = (3.0,0.0)
(3.0, 0.0)

julia> y = func(x)
0.02001500679910917 + 0.2822400161197344im

julia> fw = FunctionWrapper{typeof(y),typeof((x,))}(func)
FunctionWrapper{ComplexF64, Tuple{Tuple{Float64, Float64}}}(Ptr{Nothing} @0x00007fe8d42829d0, Ptr{Nothing} @0x00007fe8cd51e930, Base.RefValue{var"#1#2"{Vector{Float64}}}(var"#1#2"{Vector{Float64}}([1.0, 1.0, 1.0, 1.0])), var"#1#2"{Vector{Float64}})

julia> fw(x)
0.02001500679910917 + 0.2822400161197344im

julia> scalarize(x) = sum(x)
scalarize (generic function with 1 method)

julia> fw(x)

Which crashes with

``` [315643] signal (11.1): Segmentation fault in expression starting at REPL[10]:1 unknown function (ip: 0x7fe8d4282b90) unknown function (ip: 0x7fe8d4282abb) macro expansion at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:137 [inlined] do_ccall at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:125 [inlined] FunctionWrapper at /home/lxvm/.julia/packages/FunctionWrappers/Q5cBx/src/FunctionWrappers.jl:144 unknown function (ip: 0x7fe8d4286189) _jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] do_call at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/interpreter.c:126 eval_value at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/interpreter.c:223 eval_stmt_value at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/interpreter.c:174 [inlined] eval_body at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/interpreter.c:617 jl_interpret_toplevel_thunk at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/interpreter.c:775 jl_toplevel_eval_flex at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/toplevel.c:934 jl_toplevel_eval_flex at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/toplevel.c:877 jl_toplevel_eval_flex at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/toplevel.c:877 jl_toplevel_eval_flex at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/toplevel.c:877 ijl_toplevel_eval_in at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/toplevel.c:985 eval at ./boot.jl:385 [inlined] eval_user_input at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150 repl_backend_loop at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246 #start_repl_backend#46 at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231 start_repl_backend at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228 _jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077 #run_repl#59 at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389 run_repl at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375 jfptr_run_repl_91737.1 at /local/home/lxvm/.julia/juliaup/julia-1.10.4+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077 #1013 at ./client.jl:432 jfptr_YY.1013_82703.1 at /local/home/lxvm/.julia/juliaup/julia-1.10.4+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] jl_f__call_latest at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/builtins.c:812 #invokelatest#2 at ./essentials.jl:892 [inlined] invokelatest at ./essentials.jl:889 [inlined] run_main_repl at ./client.jl:416 exec_options at ./client.jl:333 _start at ./client.jl:552 jfptr__start_82729.1 at /local/home/lxvm/.julia/juliaup/julia-1.10.4+0.x64.linux.gnu/lib/julia/sys.so (unknown line) _jl_invoke at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined] ijl_apply_generic at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/gf.c:3077 jl_apply at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined] true_main at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/jlapi.c:582 jl_repl_entrypoint at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/src/jlapi.c:731 main at /cache/build/builder-amdci4-0/julialang/julia-release-1-dot-10/cli/loader_exe.c:58 unknown function (ip: 0x7fe8d5246249) __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line) unknown function (ip: 0x4010b8) Allocations: 725817 (Pool: 725128; Big: 689); GC: 1 Segmentation fault ```

In a simpler example, I also noticed an incorrect result without a crash

julia> using FunctionWrappers: FunctionWrapper

julia> scalarize(x) = sum(x)
scalarize (generic function with 1 method)

julia> work(x) = cis(x[1]) + cis(x[2])
work (generic function with 2 methods)

julia> func = x -> scalarize(work(x))
#3 (generic function with 1 method)

julia> x = (3.0,0.0)
(3.0, 0.0)

julia> y = func(x)
0.010007503399554585 + 0.1411200080598672im

julia> fw = FunctionWrapper{typeof(y),typeof((x,))}(func)
FunctionWrapper{ComplexF64, Tuple{Tuple{Float64, Float64}}}(Ptr{Nothing} @0x00007fd996d08da0, Ptr{Nothing} @0x00007fd9951e8020, Base.RefValue{var"#3#4"}(var"#3#4"()), var"#3#4")

julia> fw(x)
0.010007503399554585 + 0.1411200080598672im

julia> scalarize(x) = sum(x)
scalarize (generic function with 1 method)

julia> fw(x)
6.9451963975005e-310 + 6.9451963974847e-310im

I don't get either error using v1.6.7, so this is probably due to 1.10, as mentioned above, and most likely not due to FunctionWrappers.jl?

P.S. Before simplifying to this segfault and correctness error, I had seen a variety of cryptic error messages, including the OP's in cfunction, expected Union{}... as well a BoundsError about indexing a 48-codeunit String that the stacktrace would say came from a call to size. I hope these observations can help narrow down the problem and possibly contribute to a patch fix in v1.10