Open ctessum opened 1 year ago
I guess I should link to the system that's causing the actual problem: https://data.earthsci.dev/dev/geosfp/#Using-data-from-GEOS-FP
This is also related to this PR: https://github.com/SciML/MethodOfLines.jl/pull/240
It's possible, and somewhat related to the issue of how lowering currently occurs via scalars and generates an O(n^2) amount of code on 2D PDEs. Function folding is required there; that's the reason for the large compile times, and it is the biggest piece of work going on in MTK right now. Once that is cleaned up, I think preserving structure in observation functions and Jacobians is next. But it's all one thread of preserving structure in the code generator.
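To make the code-size point concrete, here is a minimal sketch in plain Julia. It is not MTK's or MethodOfLines.jl's actual code generator. It assumes that n means the number of grid points per dimension, and it uses a hypothetical first-order upwind stencil. The names `scalarized_exprs` and `upwind!` are made up for illustration. The scalarized version emits one expression per interior grid point, so the amount of generated code grows with the square of n, while the structure-preserving version is a single loop nest whose size does not depend on n.

```julia
# Hypothetical scalarized lowering: one expression per interior grid point
# of a first-order upwind stencil on an n-by-n grid.
function scalarized_exprs(n)
    exprs = Expr[]
    for j in 2:n, i in 2:n
        push!(exprs, :(du[$i, $j] = -(u[$i, $j] - u[$(i - 1), $j]) - (u[$i, $j] - u[$i, $(j - 1)])))
    end
    return exprs
end

# Structure-preserving alternative: the same stencil written as one loop nest.
# The amount of code is fixed; only the loop bounds change with the grid size.
function upwind!(du, u)
    n1, n2 = size(u)
    for j in 2:n2, i in 2:n1
        du[i, j] = -(u[i, j] - u[i - 1, j]) - (u[i, j] - u[i, j - 1])
    end
    return du
end

# Number of generated scalar expressions for a few grid sizes: (n - 1)^2 each.
counts = [length(scalarized_exprs(n)) for n in (4, 8, 16)]
println(counts)
```

Doubling n roughly quadruples the number of expressions in the scalarized form, which is the growth that structure-preserving code generation is meant to avoid.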
Hello!
I'm trying to debug an issue where I have a PDE system that takes a very long time for `solve` to run at a low point count (equivalent to ~1000 equations) and results in an out-of-memory error, even on an HPC node with 40 GB of memory, when using a still relatively low point count (equivalent to ~25000 equations).

Trying to boil it down as much as possible, it's similar to this advection system:
The code above generates the following profile, zoomed in to the actual `solve` call:

If I'm reading this correctly, the actual time-stepping, represented by `solve_up`, is taking a little more than a quarter of the time, and the rest of the time is taken up by `PDETimeSeriesSolution`, which I understand to be everything that happens between finishing the time stepping and returning the solution.

I understand that a lot of this code is in `MethodOfLines.jl` instead of this package, but the `observed` function that's highlighted in the screenshot above is in this package, so I'm posting the issue here. In the larger system that this issue is trying to represent, `observed` takes up a much larger portion of the overall time (like, pretty much all of it). It seemed like what was happening was that a new `observed` function had to be generated and compiled for each variable at each grid point, and then type inference had to be run each time it was called, and that seemed to be taking up a lot of time.

So I guess my question is whether this seems like a possible explanation for the problem and, if so, whether there's a way to statically type the observed function and/or use one function that gets called for each variable rather than generating a new function for each variable.
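To illustrate the kind of difference I mean, here is a minimal sketch in plain Julia. It does not use the real `observed` machinery, and every name in it (`generated`, `observe`, the toy array `u`) is hypothetical. It assumes the per-variable behaviour looks roughly like evaluating a fresh anonymous function for each grid point. Each of those gets its own function type and therefore its own inference and compilation, whereas a single function that takes the grid index as an argument is compiled once and reused.

```julia
# Toy stand-in for a saved solution: u[i, k] is the state at grid point i, step k.
npoints, nsteps = 20, 3
u = [Float64(i + 100k) for i in 1:npoints, k in 1:nsteps]

# Pattern 1: generate a separate function for every grid point.
# Each `eval` creates a brand-new function type, so each one has to be
# inferred and compiled on its own before it can run.
generated = [eval(:((u, k) -> 2 * u[$i, k])) for i in 1:npoints]

# Pattern 2: a single function that takes the grid index as data.
# It is compiled once for (Matrix{Float64}, Int, Int) and then reused.
observe(u, i, k) = 2 * u[i, k]

# Count how many distinct function types pattern 1 produced.
n_generated_types = length(unique(typeof.(generated)))

# Both patterns compute the same values. `Base.invokelatest` is used so the
# freshly evaluated functions can be called regardless of world age.
same_values = all(Base.invokelatest(generated[i], u, k) == observe(u, i, k)
                  for i in 1:npoints, k in 1:nsteps)

println("distinct generated function types: ", n_generated_types)
println("same values from both patterns: ", same_values)
```

In this sketch the number of distinct function types equals the number of grid points, so the compilation work scales with the grid, while the shared function costs one compilation no matter how many points there are.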
I'd also be happy to post the larger system where the problem is more obvious, but I haven't done so here because it takes several hours (or more) to run each time, so it is difficult to reason about.
Thanks!