Nondifferentiated positional args beyond x?

gdalle commented 1 year ago

@mohamed82008 and I had a council meeting before the next release, and we reconsidered the question of nondifferentiated positional arguments, introduced in #89 by @thorek1 (thanks for the contribution!).

Our current position is that the risk of a user mistake (expecting a derivative where there will be none) outweighs the practical benefits. This is reinforced by the fact that:

- we don't yet have a short MWE of the things that are harder to do without this feature
- while some things may be harder to do, they remain far from impossible with a bit of additional sweat

As things stand, the "positional args" feature will be rolled back before v0.5. Still, we are opening this issue to collect more pros and cons, so feel free to weigh in!

thorek1 commented 1 year ago

Here is a short MWE that works with the current version:


using ForwardDiff: ForwardDiff
using ImplicitDifferentiation: ImplicitFunction, identity_break_autodiff
using Zygote: Zygote, ZygoteRuleConfig
using RuntimeGeneratedFunctions
RuntimeGeneratedFunctions.init(@__MODULE__)

function mypower(x::AbstractArray, p)
    return identity_break_autodiff(abs.(x)) .^ p
end

RTGconditions = :(conditions(x, y, p) = y .^ (1 / p) .- abs.(x))
test_conditions = @RuntimeGeneratedFunction(RTGconditions)

RTGfunct = :(function test_implicit(X)
    impl = ImplicitFunction(mypower, test_conditions)
    return impl(X,0.5)
end)

test_implicit = @RuntimeGeneratedFunction(RTGfunct)

test_implicit([1.0,2.0])

ForwardDiff.jacobian(test_implicit,[1.0,2.0])

Zygote.jacobian(test_implicit,[1.0,2.0])[1]

If you don't have nondifferentiated positional arguments, you need to write something along these lines:

function implicit(X::Vector{<: Real},p::Float64,conditions::Function)
    ImplicitFunction(x->mypower(x,p), (x,y) -> conditions(x,y,p))
end

function mypower(x::AbstractArray, p)
    return identity_break_autodiff(abs.(x)) .^ p
end

RTGconditions = :(conditions(x, y, p) = y .^ (1 / p) .- abs.(x))
test_conditions = @RuntimeGeneratedFunction(RTGconditions)

RTGfunct = :(function test_implicit(X)
    impl = implicit(X,.5,test_conditions)
    return impl(X)
end)

test_implicit = @RuntimeGeneratedFunction(RTGfunct)

test_implicit([1.0,2.0])

ForwardDiff.jacobian(test_implicit,[1.0,2.0])

Zygote.jacobian(test_implicit,[1.0,2.0])[1]

I find the latter less intuitive and having more unnecessary syntax.

these examples work only in these ways because of the constraints of RuntimeGeneratedFunctions: no opaque closure, no kwargs

Irrespective of this, nondifferentiated positional arguments are a natural and intuitive extension and, in my view, add to usability. The risk of confusion can be reduced by documenting the interface. I think this risk is small, given that users of ImplicitDifferentiation will be aware of the syntax of other Julia AD interfaces and they allow derivatives wrt one element only.

gdalle commented 1 year ago

Thanks for the MWE!

> I find the latter less intuitive and having more unnecessary syntax.

That may be true but the difference in LOCs is very small, so for advanced Julia users I think this is okay. I had never even looked up RuntimeGeneratedFunctions.jl before today, and it definitely doesn't seem like the kind of thing a casual Julia user would employ.

> these examples work only in these ways because of the constraints of RuntimeGeneratedFunctions: no opaque closure, no kwargs

Outside of the framework of this weird package, can you think of another example where what is achievable with args... cannot be achieved with kwargs...?

> other Julia AD interfaces [...] allow derivatives wrt one element only.

That is not necessarily the case: ChainRules.jl allows derivatives for all positional arguments, but it returns meaningful tangents for each of them. Indeed ForwardDiff.jl only works with a single argument, but as a result it doesn't allow passing multiple. If we kept the feature we're discussing, we would be in a weird middle ground where we accept arguments but do not differentiate them, and that's what I fear will be confusing.
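The "middle ground" concern can be illustrated without any AD package. The sketch below is not the ImplicitDifferentiation.jl API: `derivative_wrt_first` is a hypothetical finite-difference helper and `power` is an illustrative function, both invented here. The helper accepts extra positional arguments but differentiates only with respect to the first one, so a derivative with respect to `p` has to be requested by explicitly reordering the arguments.

```julia
# Hypothetical helper (not part of any package): central finite-difference
# derivative with respect to the FIRST positional argument only.
# The remaining `args...` are forwarded to `f` but held fixed.
function derivative_wrt_first(f, x::Real, args...; h=1e-6)
    return (f(x + h, args...) - f(x - h, args...)) / (2h)
end

# Illustrative function of two positional arguments.
power(x, p) = x^p

# Derivative with respect to x at x = 2, p = 3 (analytically 3 * 2^2).
dx = derivative_wrt_first(power, 2.0, 3.0)

# Nothing is returned for p in the call above. To differentiate with respect
# to p, the caller has to make p the first argument, here through a closure.
dp = derivative_wrt_first((p, x) -> power(x, p), 3.0, 2.0)
```

A user who expects the first call to also return something for `p` gets no error and no derivative, which is the kind of silent mismatch described above.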

> users of ImplicitDifferentiation will be aware of the syntax

I don't want this to be a package for experts, the whole point is to make it as accessible as possible. So the standard that I'm aiming for is that even beginners should understand intuitively how it works.

thorek1 commented 1 year ago

Re ChainRules: cool feature :) I didn't know about it. I take it that the common ground in the interface between forward and reverse mode is: a diffable first positional arg, non-diffable further positional args, and non-diffable kwargs.

In that case, it is confusing for reverse-mode users that they have to use ComponentArrays or similar tricks to make their other positional args diffable, though I see that you documented that well.
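For readers unfamiliar with the trick: the idea is to pack every input that should be differentiated into a single array argument and unpack it inside the function. ComponentArrays.jl provides arrays with named components for this purpose. The sketch below does the packing by hand in plain Julia, with hypothetical names (`make_packed`, `objective`), only to show the shape of the workaround.

```julia
# Hypothetical helper: turn a two-argument function into a function of one
# packed vector, where the first `na` entries belong to `a` and the rest to `b`.
function make_packed(f, na::Integer)
    return x -> f(x[1:na], x[(na + 1):end])
end

# Illustrative objective with two array inputs.
objective(a, b) = sum(abs2, a) * b[1]

a = [1.0, 2.0]
b = [3.0]

# A single-argument function that an AD backend could differentiate as a whole.
packed_objective = make_packed(objective, length(a))

# The packed call and the direct call evaluate the same expression.
packed_value = packed_objective(vcat(a, b))
direct_value = objective(a, b)
```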

In my view, non-diffable args are a feature that forward-mode users might miss (even though they can work around it with kwargs). I see that providing it might confuse reverse-mode users, but you already explain how they have to handle other diffable args using ComponentArrays, and one more explanation wouldn't confuse users much more.

If you decide not to include non-diffable args, I would document how to pass them anyway (the kwargs option and the non-kwargs workaround) for users who need to.

From a non-expert perspective, I would argue that the package should be as feature-rich as reasonably possible ((non-)diffable args + kwargs, (automatic) backend and solver choice) and hide all the technical details behind a user-friendly interface.

Re LOCs: in my case I have about 10 arguments, so it gets a bit uglier (see the larger code blocks I posted in other threads).

gdalle commented 1 year ago

To be honest, I still wasn't convinced, but then I remembered the key difference between positional and keyword arguments: keyword arguments do not participate in dispatch. Therefore, if a user needs some form of multiple dispatch for the forward or conditions functions, positional arguments are the only way to enable it.
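The dispatch point can be checked in plain Julia. The sketch below uses hypothetical solver tags (`Newton`, `FixedPoint`) and illustrative function names that are not part of ImplicitDifferentiation.jl. It shows that methods can be selected by the type of a positional argument, whereas two definitions that differ only in a keyword argument's type collapse into a single method.

```julia
# Hypothetical solver tags, invented for this illustration only.
struct Newton end
struct FixedPoint end

# Positional arguments take part in dispatch: one method per solver type.
forward_positional(x, ::Newton) = "Newton branch"
forward_positional(x, ::FixedPoint) = "fixed-point branch"

# Keyword arguments do not take part in dispatch, so the branching has to be
# written by hand inside a single method.
function forward_keyword(x; solver)
    if solver isa Newton
        return "Newton branch"
    elseif solver isa FixedPoint
        return "fixed-point branch"
    else
        throw(ArgumentError("unsupported solver"))
    end
end

# Two definitions that differ only in the keyword's type annotation do not
# create two methods: the second definition replaces the first.
keyword_only(x; solver::Newton) = 1
keyword_only(x; solver::FixedPoint) = 2

n_positional_methods = length(methods(forward_positional))
n_keyword_methods = length(methods(keyword_only))
```

With the positional version, a user can add support for a new solver type by defining one more method, without touching existing code.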

gdalle commented 1 year ago

Case closed, we're keeping optional args :)

JuliaDecisionFocusedLearning / ImplicitDifferentiation.jl

Nondifferentiated positional args beyond x? #101