SciML / SciMLBase.jl

The Base interface of the SciML ecosystem
https://docs.sciml.ai/SciMLBase/stable
MIT License

add integrand interface #497

Closed: lxvm closed this 11 months ago

lxvm commented 11 months ago

Hi,

As discussed in https://github.com/SciML/Integrals.jl/issues/169 and https://github.com/SciML/Integrals.jl/pull/175, Integrals.jl needs integrand wrapper types to support features such as in-place evaluation, batched evaluation, and simultaneous in-place and batched evaluation. This PR adds three structs that support these three distinct interfaces and are intended as an eventual replacement for the batch and nout keywords to IntegralProblem. (I don't think those keywords need to be removed just yet, since we can dispatch on the wrapper types to opt into new behaviors.)

  1. InplaceIntegrand(f!, result::AbstractArray): the caller provides an in-place function f!(y, x, p) and an array result that will store the solution (i.e. it has the same shape as the integrand output array y, but could have a different element type due to the units of the limits).
  2. BatchIntegrand(f!, y::AbstractVector, x::AbstractVector; max_batch): the caller provides an in-place function f!(y, x, p) of the form y .= f.(x, Ref(p)) together with pre-allocated buffers, so that the integrand can be evaluated at multiple quadrature nodes in parallel. For multidimensional integrands, the elements of x can be StaticArrays.
  3. InplaceBatchIntegrand(f!, result::AbstractArray, y::AbstractArray, x::AbstractVector; max_batch): the caller provides an in-place function f!(y, x, p) that evaluates an array-valued in-place integrand at multiple points simultaneously. The solution is written to result. The buffer y has size (size(result)..., size(x)...) and should be resizable in its outer dimension (i.e. an ElasticArray).

I would appreciate any feedback on this PR. In particular, for multidimensional batched integrands it is nice to allow x to be a vector of SVectors instead of a Matrix, which is the current style. For the C libraries that only accept a Matrix we can use reinterpret(reshape, ...) to view it in SVector form.
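For illustration, here is a minimal sketch of integrand callbacks for the three calling conventions above. The integrands and buffers are hypothetical; the wrapper constructors (shown as comments) are the ones proposed in this PR and are not part of a released SciMLBase.

```julia
using StaticArrays

# 1. InplaceIntegrand: f!(y, x, p) fills the result array for a single point x.
f_inplace!(y, x, p) = (y .= p .* sin.(x .* (1:3)); nothing)

# 2. BatchIntegrand: f!(y, x, p) acts like y .= f.(x, Ref(p)) over a vector of nodes;
#    each node can be an SVector for multidimensional domains.
f_batch!(y, x, p) = (y .= map(xi -> p * sum(sin, xi), x); nothing)

# 3. InplaceBatchIntegrand: f!(y, x, p) fills a buffer whose last (resizable)
#    dimension indexes the batch of points.
function f_inplace_batch!(y, x, p)
    for (i, xi) in enumerate(x)
        y[:, i] .= p .* SVector(sin(xi[1]), cos(xi[2]), xi[1] * xi[2])
    end
    return nothing
end

# Hypothetical constructor calls matching the signatures in the list
# (requires this PR's branch; the batch buffer could be an ElasticArray):
# InplaceIntegrand(f_inplace!, zeros(3))
# BatchIntegrand(f_batch!, Float64[], SVector{2, Float64}[]; max_batch = 64)
# InplaceBatchIntegrand(f_inplace_batch!, zeros(3), zeros(3, 0), SVector{2, Float64}[]; max_batch = 64)
```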

ChrisRackauckas commented 11 months ago

There's no need for it to be a completely separate system from what already exists. Some uniformity would be good here. Make it IntegralFunction and BatchIntegralFunction, both as AbstractSciMLFunctions following a similar form to what is done in the others. In-place should just be the first argument and can be determined automatically using the machinery that's used in the other SciMLFunctions. That would make this feel much more in place (pun intended).

lxvm commented 11 months ago

I hadn't used the AbstractSciMLFunctions before, and I'm glad this is a solved problem. I'll try implementing the interface, although most of it may go unused.

On the note of uniformity across the ecosystem, is there a SciML integral operator? It would act on differential forms instead of arrays, and perhaps there are applications.

lxvm commented 11 months ago

@ChrisRackauckas I added IntegralFunction and BatchIntegralFunction. I wasn't very comfortable relying on isinplace, because for in-place integrands only the caller knows the array element type to allocate, and because BatchIntegralFunction requires a 3-argument function for both its in-place and out-of-place forms. So the interface assumes a function is out-of-place unless a result array is provided; this is all in the documentation. Does it look good to you?
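A minimal sketch of that convention (constructor signatures as I understand them at this point in the thread; exact names and keyword arguments may differ from the merged code):

```julia
using SciMLBase

# Out-of-place: f(x, p) returns the integrand value; no prototype is given.
fun_oop = IntegralFunction((x, p) -> p * sin(x))

# In-place: f!(y, x, p) writes into y; the prototype array tells the solver
# what to allocate for the result.
fun_iip = IntegralFunction((y, x, p) -> (y .= p .* sin.(x .* (1:3)); nothing), zeros(3))

# Batched in-place: the 3-argument form fills y across the batch of points in x;
# the prototype is an empty vector for a scalar integrand.
bfun = BatchIntegralFunction((y, x, p) -> (y .= p .* sin.(x); nothing), Float64[]; max_batch = 32)
```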

ChrisRackauckas commented 11 months ago

I took a pass through it:

  1. We will need some things like observed in the future, but for now leave it off. We generally add pieces as the interface is built out, and the MTK side of Integrals.jl is just missing for now, so we can ignore it until it's there.
  2. I'm a little wary that we cannot do proper dispatch checking on the batch version, but at least we can test for too many/too few arguments and throw appropriate errors.
  3. The extra values were renamed to prototype values to match other parts of the interface. What's expected then is that init will make a copy, so that the solves are thread-safe. That's something we need to keep in mind in the implementation (I don't think we've done much thread-safety testing, since there are also C programs involved).

Let me know if there's anything here you'd disagree with.

codecov[bot] commented 11 months ago

Codecov Report

Merging #497 (a6fd63a) into master (434e752) will increase coverage by 39.85%. Report is 29 commits behind head on master. The diff coverage is 84.74%.

@@             Coverage Diff             @@
##           master     #497       +/-   ##
===========================================
+ Coverage    0.00%   39.85%   +39.85%     
===========================================
  Files          50       50               
  Lines        3635     3756      +121     
===========================================
+ Hits            0     1497     +1497     
+ Misses       3635     2259     -1376     
Files Changed                     Coverage Δ
src/SciMLBase.jl                  57.14% <ø> (+57.14%)
src/scimlfunctions.jl             55.10% <79.48%> (+55.10%)
src/problems/basic_problems.jl    88.73% <95.00%> (+88.73%)

... and 39 files with indirect coverage changes


lxvm commented 11 months ago

Hi Chris, this looks greatly improved! Thank you for looking into it.

I made a few tweaks; here are my thoughts:

  1. Sounds good, and I think MTK would be useful, especially for distinguishing integrand parameters from integration variables.
  2. For the batch version, the only type check we can do is on the output_prototype, and I think we should decide right away how to handle it. I don't know of any pure-Julia libraries that offer in-place batched integrands, but the C libraries handle this by making the output a vector when batching out-of-place and a matrix when batching in-place. Tentatively, we could check output_prototype isa AbstractVector when batching out-of-place, and output_prototype isa ElasticArray && integral_prototype isa AbstractArray && ndims(output_prototype) == ndims(integral_prototype) + 1 when batching in-place (see the sketch after this comment). Packages like ArraysOfArrays.jl could wrap an ElasticArray into a vector of views, which is an elegant alternative we could also support, but I think we need to decide on the number of dimensions of output_prototype, and I vote for the former, i.e. ndims > 1 (in line with the C libraries), instead of the latter, i.e. ndims == 1 (which is a possibility in Julia). I might be backwards-thinking on this.
  3. I renamed integrand_prototype to integral_prototype, since I believe the program should write the result of the integral to the prototype, given that it is evaluated in place. The only other difference is that the integrand and the integral could differ in units. About thread safety: I don't think any integration library does multi-threaded integrand evaluation, because of thread-safety issues, but with the BatchIntegralFunction the user does the parallelization themselves.

Let me know if this looks good to you, especially the proposal in 2.
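A minimal sketch of the check proposed in point 2, using the output_prototype/integral_prototype names from this thread; it only illustrates the proposal, and the PR does not end up enforcing these checks.

```julia
using ElasticArrays

function check_batch_prototypes(output_prototype, integral_prototype; inplace::Bool)
    if inplace
        ok = output_prototype isa ElasticArray &&
             integral_prototype isa AbstractArray &&
             ndims(output_prototype) == ndims(integral_prototype) + 1
        ok || throw(ArgumentError("in-place batching expects a resizable output with one extra (batch) dimension"))
    else
        output_prototype isa AbstractVector ||
            throw(ArgumentError("out-of-place batching expects an AbstractVector output prototype"))
    end
    return nothing
end

# e.g. a 3-component integrand batched in place:
check_batch_prototypes(ElasticArray{Float64}(undef, 3, 0), zeros(3); inplace = true)
```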

ChrisRackauckas commented 11 months ago

About thread safety: I don't think any integration library does multi-threaded integrand evaluation, because of thread-safety issues, but with the BatchIntegralFunction the user does the parallelization themselves.

It's more if the user calls solve while multithreading. If this uses the same caches between the different solves (because of the same problem), then there will be clashing and incorrect results.
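To illustrate that scenario, here is an assumed setup (tuple-domain form discussed later in this thread, QuadGKJL from Integrals.jl): several threads solving from the same problem object. If init reused the caller's prototype buffers instead of copying them, these solves would race on shared memory.

```julia
using Integrals

prob = IntegralProblem((x, p) -> p * sin(x), (0.0, 1.0), 2.0)

results = Vector{Float64}(undef, 8)
Threads.@threads for i in 1:8
    # Each iteration builds its own cache via init/solve; sharing one cache
    # across these threads would give clashing, incorrect results.
    results[i] = solve(prob, QuadGKJL()).u
end
```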

For the batch version, the only type check we can do is on the output_prototype, and I think we should decide right away how to handle it. I don't know of any pure-Julia libraries that offer in-place batched integrands, but the C libraries handle this by making the output a vector when batching out-of-place and a matrix when batching in-place. Tentatively, we could check output_prototype isa AbstractVector when batching out-of-place, and output_prototype isa ElasticArray && integral_prototype isa AbstractArray && ndims(output_prototype) == ndims(integral_prototype) + 1 when batching in-place. Packages like ArraysOfArrays.jl could wrap an ElasticArray into a vector of views, which is an elegant alternative we could also support, but I think we need to decide on the number of dimensions of output_prototype, and I vote for the former, i.e. ndims > 1 (in line with the C libraries), instead of the latter, i.e. ndims == 1 (which is a possibility in Julia). I might be backwards-thinking on this.

That might be overthinking it. I think we can just match the C libraries unless we gain some extra feature or performance.

lxvm commented 11 months ago

It's more if the user calls solve while multithreading. If this uses the same caches between the different solves (because of the same problem), then there will be clashing and incorrect results.

I see; yes, we should avoid race conditions where convenient, and that fits the "prototype" naming convention. I reverted the name back to integrand_prototype, since that is what makes sense now.

That might be overthinking it. I think we can just match the C libraries unless we gain some extra feature or performance.

Sounds good. For now the documentation recommends that output_prototype have an extra batching dimension compared to integrand_prototype, but we don't check any types, since that may need to be done downstream by individual algorithms with their own APIs.
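Concretely, the recommended shapes look like this (names as used in this comment; the buffers are placeholders):

```julia
integrand_prototype = zeros(3)     # shape of the integrand at a single quadrature node
output_prototype    = zeros(3, 0)  # same shape plus one trailing batch dimension,
                                   # grown to the number of nodes at evaluation time
```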

I'm happy with how this looks and would be interested in following up with some PRs on the Integrals.jl side that use these containers to add features. Do most SciML packages wrap the user's input in a SciMLFunction, or do they expect the user to provide it themselves?

ChrisRackauckas commented 11 months ago

Do most SciML packages wrap the user's input into a SciMLFunction, or expect the user to provide it themselves?

I missed that part. The problem type always has a default dispatch that wraps the input in the simplest function type. IntegralProblem should default to wrapping f in an IntegralFunction.
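Assuming the merged interface behaves as described here, the effect looks like this (hedged usage sketch, not an excerpt from the PR):

```julia
using SciMLBase

prob_wrapped = IntegralProblem(IntegralFunction((x, p) -> p * sin(x)), (0.0, 1.0), 2.0)
prob_bare    = IntegralProblem((x, p) -> p * sin(x), (0.0, 1.0), 2.0)

prob_bare.f isa IntegralFunction  # true: the bare function is wrapped by the default dispatch
```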

lxvm commented 11 months ago

OK, I added the wrapper, but this would be breaking behavior unless we make IntegralFunction callable. I notice some function types have these backwards-compatibility methods; should I define those?

If we are OK with breaking changes, I would also introduce the following:

  1. remove the nout and batch keywords to IntegralProblem, since these can be obtained from output_prototypes and BatchIntegralFunctions
  2. Replace the limits [lb, ub] with a single domain argument, similar to how Symbolics.jl handles integrals. Also add a method for converting [lb, ub] arguments into a domain

ChrisRackauckas commented 11 months ago

I notice some function types have these backwards compatibility methods, so should I define those?

No need for those.

but this would be breaking behavior unless we make IntegralFunction callable.

Yes make it callable and just forward the args.
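Roughly what that forwarding looks like (a sketch of the pattern; SciMLBase defines the actual method itself):

```julia
using SciMLBase

# Forward calls on the wrapper to the wrapped integrand.
(fn::IntegralFunction)(args...) = fn.f(args...)
```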

remove the nout and batch keywords to IntegralProblem, since these can be obtained from output_prototypes and BatchIntegralFunctions

Yes let's go for it.

Replace the limits [lb, ub] with a single domain argument, similar to how Symbolics.jl handles integrals. Also add a method for converting [lb, ub] arguments into a domain

That sounds good.
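Assuming the merged interface, the new form and the converting method behave like this (usage sketch; field and constructor names as I understand them):

```julia
using SciMLBase

f = (x, p) -> p * sin(x)

prob = IntegralProblem(f, (0.0, 1.0), 2.0)  # single domain argument
prob.domain                                  # (0.0, 1.0)

# The converting method keeps the old positional style working by building the tuple:
# IntegralProblem(f, 0.0, 1.0, 2.0) stores the same domain.
```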

lxvm commented 11 months ago

Alright, I committed the breaking changes. Next I'll open a PR in Integrals.jl to migrate to this interface.

lxvm commented 11 months ago

@ChrisRackauckas I added the deprecation methods and the 2-argument out-of-place form for BatchIntegralFunction. In the deprecation method, I'm not sure I'm handling the out-of-place nout > 1 case correctly, so I might need some time to review and test that.

ChrisRackauckas commented 11 months ago

The simple IntegralProblem(f, (lb, ub), p) dispatch for a non-AbstractIntegralFunction should still be considered standard and should work without error or warning. I think that's handled correctly, but it's a bit hard to see.

However, I think it's safe to merge this into v2.0. The reason is that Integrals.jl is the only consumer (that I know of) of this part of SciMLBase.jl, so labeling this as a major release means we can ship the breaking change and it won't actually take effect until a PR bumps the lower bound in Integrals.jl. That gives us a little leeway to test it out in an implementation without breaking anyone's workflows until it's clear that it works.

Because of that, I'm going to merge. Let's do it, and we can fix whatever comes up during the bump of Integrals.jl.

lxvm commented 11 months ago

I think there will be a problem with how the batch keyword is handled. Like nout, it should be given a default value of zero if it is unset. I can't get to it for another hour, but with that fix I think we should merge.

lxvm commented 11 months ago

@ChrisRackauckas The obvious bugs in the IntegralProblem constructor are now fixed. I'm happy with how it is.