SciML / SciMLBase.jl

The Base interface of the SciML ecosystem
https://docs.sciml.ai/SciMLBase/stable
MIT License
127 stars 92 forks source link

Add `DataIntegralProblem` #491

Closed IlianPihlajamaa closed 11 months ago

IlianPihlajamaa commented 11 months ago

See this issue.

codecov[bot] commented 11 months ago

Codecov Report

Merging #491 (1e960d1) into master (03cdaed) will decrease coverage by 23.98%. Report is 7 commits behind head on master. The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           master     #491       +/-   ##
===========================================
- Coverage   57.27%   33.30%   -23.98%     
===========================================
  Files          50       50               
  Lines        3703     3681       -22     
===========================================
- Hits         2121     1226      -895     
- Misses       1582     2455      +873     
Files Changed Coverage Δ
src/SciMLBase.jl 68.42% <ø> (-3.01%) :arrow_down:
src/problems/basic_problems.jl 87.71% <100.00%> (-0.75%) :arrow_down:

... and 32 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

IlianPihlajamaa commented 11 months ago

Also added 2 constructors:

function IntegralProblem(f, x::AbstractVector{<:Number}, args...; kwargs...)
    IntegralProblem{isinplace(f, 3)}(f, x, args...; kwargs...)
end
function IntegralProblem(y::AbstractArray, x::AbstractVector{<:Number}, args...; dim::Int=1, kwargs...)
    IntegralProblem{false}(y, x, args...; dim=Val(dim), kwargs...)
end

The first for integrating a callable f over some set of sample points x, and the second for integrating an Array y over some dimension dim. I chose to put x second in both of these (contrary to what is usual in many packages/languages), because this way it is consistent with the IntegralProblem(f, lb, ub) ordering. Let me know if you agree.

IlianPihlajamaa commented 11 months ago

Oops, there was a small mistake, test SciMLBase now passes locally, sorry!

IlianPihlajamaa commented 11 months ago

There are still method ambiguities, but I don't know how to get rid of them...

lxvm commented 11 months ago

If I may comment on this pr, I think adding a separate DataIntegralProblem would be preferable to modifying the existing IntegralProblem. I see two reasons for this:

  1. These are distinct mathematical problems since an IntegralProblem is continuous, i.e. we are given a function f and a domain [lb, ub] and we evaluate f wherever we please for any quadrature, and a DataIntegralProblem is discrete, i.e. the values of x and f.(x) are the inputs. The difference is that in the first case, choosing the integration points and weights together carefully can produce high-order accurate quadratures that are much more accurate than those available for unstructured grids (e.g. trapezoidal rule). This is due to mathematical connections between integration and interpolation.
  2. The algorithms currently used to solve IntegralProblems only accept functions with domains, not data, so none of the existing implementation of IntegralProblems could be reused for DataIntegralProblems. The current design of this pr, which extends IntegralProblems with more options could lead to more confusion, errors, and complicated autodiff implementation.

I hope this is useful feedback and I am happy to see a pr like this because I think this feature would be very useful.

IlianPihlajamaa commented 11 months ago

Feedback is always welcome of course. Personally, i am impartial to whether or not it should be separated in the API. Perhaps others could weigh in? (@ChrisRackauckas ) From an implementation point of view, separate structs are definitely easier.

ChrisRackauckas commented 11 months ago

Separate it for the reasons @lxvm mentioned. When I last reviewed this PR it was separated to have a separate IntegralDataProblem for those reasons. What happened and why was it changed?

Indeed overlapping them doesn't make sense because their dispatches need to be separate, along with most of their data, so it's not quite clear what would be gained by keeping them together. The AD passes indeed need to be very different too, so I do not understand why it would be part of IntegralProblem. But again, I thought they were already separate in https://github.com/SciML/SciMLBase.jl/pull/491/commits/07e569877035b1d2490684c92ad9794c865dca76 ?

IlianPihlajamaa commented 11 months ago

The reason I merged them into the same struct was because, for the case

solve(IntegralProblem(f::Function, x::AbstractVector), Trapezoidal())

There needed to be an x-field in IntegralProblem anyway. And so I thought that it therefore wasnt necessary to split the two. I hadnt considered the AD angle.

In hindsight

  1. In the case of integrating a function, the specification of the precise grid (as opposed to a given number of points is probably never useful. So it isn't necessary to put this extra field in.
  2. For integrating data it is probably better abyway to have a separate api, as argued above.

So I will revert the changes (apart from the added tests) when I have time, reintroducing the DataIntegralProblem.

ChrisRackauckas commented 11 months ago

Thanks for the explanation! I'll wait for the revert to re-review.

IlianPihlajamaa commented 11 months ago

Do you prefer IntegralDataProblem or DataIntegralProblem? The last one sounds slightly better to me, but I'm not a native English speaker.

Anyway, it should be ready for review now.

lxvm commented 11 months ago

Perhaps SampledIntegralProblem? Both https://docs.scipy.org/doc/scipy/tutorial/integrate.html#integrating-using-samples and https://github.com/dextorious/NumericalIntegration.jl use the word 'sampling' so it might be more familiar to newcomers, and 'sampled' implies that the integrand has already been evaluated

IlianPihlajamaa commented 11 months ago

SampledIntegralProblem

I like it.

ChrisRackauckas commented 11 months ago

👍 on SampledIntegralProblem

ChrisRackauckas commented 11 months ago

I'm a bit weary on the dim part, but let's see how this goes.