Evaluation of DistributedCellFields at points

emvanderlinde commented 1 year ago

I'm trying to evaluate my distributed solution at given x,y coordinates. In series the code works, but in parallel I cannot. I think the error is caused by the fact that the evaluation of CellFields is only implemented for DistributedCellPoints.

Has anyone faced this issue before? Or do you know how to solve this?

@fverdugo @amartinhuertas @JordiManyer @oriolcg

JordiManyer commented 1 year ago

We can probably find a way, but you should be more specific on what exactly you want to do.

In short, the algorithm to locate arbitrary points in a mesh is not implemented in parallel. Although I dont think it should be, since that would be quite inneficient. In general, you should avoid this kind of things.

So what you'll have to do is have each processor search in its part of the mesh. Or better still: pre-sort your points per processor and then have each processor work on its own points at the same time.

If you have an array of points, then you should distribute it so that every processor gets the points in its part of the mesh, then use the serial code in each processor to evaluate.

Again, I might be able to be more specific if you share some more details.

emvanderlinde commented 1 year ago

Hi Jordi,

Thank you for your quick response. I am modelling a domain which is split by a level set function (Navier-Stokes flow and ground water flow). The level set function changes during the computations based on empirical sediment transport equations. These sediment transport equations, require the velocity gradient at the given locations along the level set boundary. Because this boundary moves in time, the points at which I need to evaluate my results also change in time. (see overview image below)

What I am doing in series is:

1) I store the gradient from the previous time step.

2) I determine the change of level set function at evaluation points (based on the current location of the level set function).

oriolcg commented 1 year ago

Hi @JordiManyer, I believe that the most efficient way to implement this is, as you mentioned, do a first selection of points per process and then apply the serial interpolation. Do you have an idea on how this first search could be done in an efficient way?

fverdugo commented 1 year ago

I guess it will depend on the number of points. If the vector is small you provably can replicate it on all processes and find which points are actually in each domain:

x_points = ...
map_parts(uh.fields) do uh
    for x in x_points
        uh(x)
    end
end

If not, you will provably need to distribute x_points, compute the bounding boxes of all subdomains, replicate the bounding boxes on all processes, find the bounding boxes each point falls in (in parallel), redistribute x_points such that each point is local in a subdomain whose bounding box contains the point, etc.... You can even use a kde tree for the sub-domain bounding boxes ... not straight forward as you see.

oriolcg commented 1 year ago

Yes, this should work for our application of interest. @emvanderlinde You can try to implement it.

In any case, having the interpolate feature for a DistributedCellField is something would be nice to have.

JordiManyer commented 1 year ago

If the interface moves continuously between timepoints, you can probably take advantage of that:

At the first time step, you simply replicate the vector of points in all subdomains and try to find the cell to which every point belongs. This might be costly, but you only have to do it once.
At every other timestep k, you update the cell locations for each point by starting a local search (starting the search in the last known cell of the point) in the processor containing the point in iteration k-1 (and only do it for local points), then communicate ghosts accross nearest neighbors (I believe this is taken care of by the PVector object and a simple exchange! call).

So basically you keep updating a distributed vector of indices containing the (local) cell ids of each point, with which you can create a DistributedCellPoint instance where to evaluate your DistributedCellField. I think this is by far the most efficient way to do it, but it is also quite involved. Although I think all of the communications are taken care of by the PartitionedArrays interface.

gridap / GridapDistributed.jl

Evaluation of DistributedCellFields at points #113