confused about interface

tpapp commented 6 years ago

Currently, the package does two things:

support an interface for retrieving values with derivatives; as the former are calculated anyway
allow preallocated buffers for results.

In my own benchmarks, I find the allocation cost dwarfed by the actual calculations. It can also lead to bugs, eg if the MutableDiffResult is accidentally reused. With the introduction of StaticArrays (https://github.com/JuliaDiff/DiffBase.jl/commit/1f15e9c3347be65e0e8fc4309a13507e4d24cdac) the interface became even more complicated, because they don't share structure, yet use an interface which pretends to modify its arguments.

Also, the relevant subtype of DiffResult is determined by the input type. It is conceivable that a function could take a StaticVector yet return a Vector.

Perhaps it would be better to use an interface which just takes an abstract type which tells a single evaluator function (eg in ForwardDiff.jl) what the user wants calculated, and returns that, eg

resulttype = DiffResult(; value = true, jacobian = true) # type information saves these
value_and_deriv = ForwardDiff.evaluate(resulttype, f, x) # result based in 1st argument

I came across this when writing a simple wrapper that just takes a function, and returns a function which returns value & derivative (short source here), and found it very complicated to support everything.

jrevels commented 6 years ago

yet use an interface which pretends to modify its arguments

There's no pretending involved. The interface does have the license to modify mutable arguments, hence the postfix !. The functions are always called in the same way (e.g. callers must always treat the output as a new object), regardless of implementation, so the API requires no DiffResult-subtype-dependent coding on the part of the user. The documentation is explicit about this, though that section title should probably be changed since it's confusing.

Looking at it again, the likely reason why it's confusing is that I revealed too much implementation information in those docs. Users shouldn't know about and shouldn't depend on mutability behavior. Those subtypes are essentially just internal performance optimizations. Maybe it would be a good idea to lie to the user and omit the ! for those functions...

Perhaps it would be better to use an interface which just takes an abstract type which tells a single evaluator function (eg in ForwardDiff.jl) what the user wants calculated, and returns that, eg

This is similar to what I originally did back in the days of ForwardDiff v0.1. At the time that I switched over to the DiffResult API, it made sense to couple result configuration with result buffer preallocation since mutability was implied. I guess in the face of all these immutability optimizations, though, we've come full circle 😛

tpapp commented 6 years ago

Sorry, I did not mean to imply that the documentation is not well written. My point was that an occasionally modifying ! function interface is more complex than a purely functional one.

I am wondering the buffer preallocation is a performance gain major enough to justify this interface. Your application may differ from mine, but when profiling (quite complex) functions via ForwardDiff, I found that the function core takes most of the time, and the processing of the results is practically immeasurable.

I understand that there is a trade-off between optimizations and clean interfaces. You are most likely in a better position to make this call, I was just making a suggestion from my perspective.

jrevels commented 6 years ago

No worries, I understood your point - I was commenting that I, myself, think the current docs/function names might be confusing 😛 I agree that decoupling result configuration from buffer preallocation is a good idea.

I am wondering the buffer preallocation is a performance gain major enough to justify this interface.

It definitely depends on the use case. The main case where this interface should be useful is when 1) one is evaluating the jacobian of the same function at many different points and 2) the number of operations in the target function is ~linear with respect to the dimensionality of the input/output.

JuliaDiff / DiffResults.jl

confused about interface #4