SciML / SciMLSensitivity.jl

A component of the DiffEq ecosystem for enabling sensitivity analysis for scientific machine learning (SciML). Optimize-then-discretize, discretize-then-optimize, adjoint methods, and more for ODEs, SDEs, DDEs, DAEs, etc.
https://docs.sciml.ai/SciMLSensitivity/stable/

(Seemingly) trivial change to the [missing physics tutorial](https://docs.sciml.ai/Overview/stable/showcase/missing_physics/) example causes a `Warning: EnzymeVJP` to trigger #889

Closed by Sleort 1 year ago

Sleort commented 1 year ago

(I hope this is the right place to report this bug/peculiar behavior)

I was playing around with the missing physics tutorial in the documentation when I came across this (to an outsider) strange behavior:

If I add a third (dummy) component to the `u` vector in the Lotka-Volterra equations, like this:

function lotka!(du, u, p, t)
    α, β, γ, δ = p
    du[1] = α * u[1] - β * u[2] * u[1]
    du[2] = γ * u[1] * u[2] - δ * u[2]
    du[3] = 0 # dummy third component, held constant at zero
end

and update u0 to

u0 = [5.0 * rand(rng, 2); 0]

and the neural network U input and output dimensions to 3:

U = Lux.Chain(Lux.Dense(3, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 3))

and make the hybrid model I want to learn be

function ude_dynamics!(du, u, p, t, p_true)
    û = U(u, p, st)[1] # Network prediction
    du[1] = p_true[1] * u[1] + û[1]
    du[2] = -p_true[4] * u[2] + û[2]
    du[3] = 0 # dummy third component, held constant at zero
end

then the solution, ignoring the third dimension, should be the same as the original one, right? However, this change triggers a bunch of warnings (errors?) like the following:

┌ Warning: EnzymeVJP tried and failed in the automated AD choice algorithm with the following error. (To turn off this printing, add `verbose = false` to the `solve` call)
└ @ SciMLSensitivity ~/.julia/packages/SciMLSensitivity/zGhCS/src/concrete_solve.jl:23
Enzyme execution failed.
Mismatched activity for:   store i64 %9, i64 addrspace(10)* %8, align 8, !dbg !9, !tbaa !23 const val:   %9 = load i64, i64 addrspace(11)* %7, align 8, !dbg !9, !tbaa !23
Type tree: {[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer}
You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage). If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now

These appear when I try to train the network in the hybrid model.

Is this a bug? If not, an explanation in the tutorial would be helpful...
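
For reference, the workaround that the warning itself points to would presumably look like the following. This is just a sketch, untested by me, and only relevant if one actually wants the Enzyme code path:

import Enzyme
# Runtime-activity mode, as suggested by the warning text above
Enzyme.API.runtimeActivity!(true)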

ChrisRackauckas commented 1 year ago

> However, this change triggers a bunch of warnings (errors?) like the following:

They are just warnings. They are harmless, but they are there to tell you that an optimization (namely, using Enzyme) has been disabled. It should still automatically fall back to ReverseDiffVJP in this example (with tape compilation), so I presume it just runs fine?
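
If you want to avoid the Enzyme attempt (and the warning) altogether, you can pin the fallback yourself. A sketch, where `prob` and the solver are just placeholders standing in for the tutorial's setup:

using OrdinaryDiffEq, SciMLSensitivity

# Explicitly select the compiled-tape ReverseDiff VJP so the automated
# choice never tries Enzyme. Alternatively, `verbose = false` in `solve`
# just silences the printing without changing the algorithm choice.
sol = solve(prob, Tsit5();
    sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)))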

This warning was added recently to help track down where extra performance optimizations get turned off. The warning text is supposed to be benign; if you have any ideas for how we can improve the text, let us know.

Sleort commented 1 year ago

Got it. Yeah, sure, I got the expected output. It was just surprising that these warnings suddenly showed up when I changed the model in the way described above; there were no warnings before the change.

Also, the warnings were a bit confusing to me, since I (thought I) was using Zygote, not Enzyme, for AD in this example. Basically, the whole warning text threw me off: the references to Enzyme, all the technical details following "Enzyme execution failed", the message "You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage). If not, please open an issue [...]", and so on. Maybe it would help to state more clearly, at the beginning of the warning, when the following text can be ignored? E.g.: "If you are not using Enzyme for automatic differentiation, the following can be ignored." Or something like that?

ChrisRackauckas commented 1 year ago

> Also, the warnings were a bit confusing to me, since I (thought I) was using Zygote, not Enzyme, for AD in this example.

The adjoint for Zygote in this case is defined to solve the adjoint ODE, which is set up to try Enzyme for the VJP calculations and fall back to something slower if required.
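
Roughly, the layering looks like this. A sketch only: the loss function, time span, solver, and parameter values are placeholders, with `lotka!` and `u0` as defined above:

using OrdinaryDiffEq, SciMLSensitivity, Zygote

function loss(p)
    prob = ODEProblem(lotka!, u0, (0.0, 5.0), p)
    # Zygote drives the outer gradient; the adjoint ODE is solved internally,
    # and `autojacvec` decides which AD computes the VJPs inside that adjoint.
    sol = solve(prob, Tsit5();
        sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)))
    return sum(abs2, Array(sol))
end

grad = Zygote.gradient(loss, [1.3, 0.9, 0.8, 1.8])[1]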

> Maybe it would help to state more clearly, at the beginning of the warning, when the following text can be ignored? E.g.: "If you are not using Enzyme for automatic differentiation, the following can be ignored." Or something like that?

https://github.com/SciML/SciMLSensitivity.jl/pull/900 should make it more clear.

ChrisRackauckas commented 1 year ago

Closing due to the nicer warning from https://github.com/SciML/SciMLSensitivity.jl/pull/900.