fverdugo / PartitionedArrays.jl

Large-scale, distributed, sparse linear algebra in Julia.
MIT License

Missing feature: `find_owner` for `LocalIndices` #115

Open JordiManyer opened 1 year ago

JordiManyer commented 1 year ago

I believe the method find_owner is missing for the structure LocalIndices.

Is this correct, or am I missing something?

fverdugo commented 1 year ago

Hi @JordiManyer,

It is not implemented, since LocalIndices does not carry enough information to compute the result without communication.

Another option is to optionally keep a vector of global-to-owner ids replicated on all procs. This is not scalable, of course, but in some cases the user computes this quantity anyway, e.g., when partitioning a serial graph with METIS.
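A minimal sketch of this idea, using plain Julia vectors (the names `global_to_owner` and `find_owner_replicated` and the data are illustrative, not part of PartitionedArrays): if the full global-to-owner map is replicated on every process, finding owners reduces to a local table lookup with no communication.

```julia
# Replicated on every process: owner rank of each global id (made-up data).
global_to_owner = [1, 1, 2, 2, 3, 3]

# Hypothetical helper: owners of arbitrary global ids, found purely locally.
find_owner_replicated(global_ids) = global_to_owner[global_ids]

find_owner_replicated([2, 5, 3])  # → [1, 3, 2]
```

The memory cost is one integer per global id on every process, which is exactly why it is not scalable, but it matches what a METIS-based workflow already holds in memory.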

fverdugo commented 1 year ago

Related to this, I think a new constructor like this would be handy in some situations:

partition_from_colors(ranks,global_to_color)

global_to_color (aka global_to_owner) is the output of METIS, i.e., one sub-domain color per global cell id.

If the user uses this constructor, perhaps it is wise to store global_to_owner somewhere and then we can use it to implement find_owner.
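The core of the proposed constructor can be sketched in a few lines of plain Julia (the helper name `own_ids_from_colors` is hypothetical; the real constructor would additionally build the index-partition objects per rank): each rank keeps the global ids whose color matches its own id.

```julia
# Hypothetical building block of partition_from_colors: given the replicated
# METIS output, each rank's own global ids are those colored with its rank id.
function own_ids_from_colors(rank, global_to_color)
    findall(==(rank), global_to_color)
end

global_to_color = [1, 2, 1, 3, 2, 1]     # one sub-domain color per global id
own_ids_from_colors(1, global_to_color)  # → [1, 3, 6]
```

Storing `global_to_color` alongside the partition is what would make a communication-free find_owner possible for this case.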

JordiManyer commented 1 year ago

Hi @fverdugo, I understand the issue. However, I think something like this would be useful in cases where all global_ids are indeed known locally:

function find_owner(indices,global_ids,::Type{<:LocalIndices})
    # WARNING: When the global_ids are neither owned nor ghosts, the function will
    # return the owner as 0... Should we fetch unknown owners? This would require communications...
    map(indices,global_ids) do partition, gids
        lids = global_to_local(partition)[gids]
        local_to_owner(partition)[lids]
    end
end

When the gids are not known, the owner would be set to 0. We could maybe add an @assert or something similar that checks for consistency.

JordiManyer commented 1 year ago

For a bit of context, I am currently updating GridapPETSc to work with v0.3, and find myself having to translate the following:

A = PSparseMatrix(I,J,V,ids,ids,ids=:global)

for which I want to use the method psparse!. When trying to discover the rows, it is complaining about the missing method.

JordiManyer commented 1 year ago

Related to this, I think a new constructor like this would be handy in some situations

Yes, I've been doing something similar when building mock distributed models for GridapDistributed. It is indeed useful, if only for trying things out.

fverdugo commented 1 year ago

A = PSparseMatrix(I,J,V,ids,ids,ids=:global)

I see the problem. I would suggest building the sparse matrix via the low-level constructor instead of psparse!. It looks like your ids already contain the required ghost values, so it should be possible.

I am tempted to accept your solution above, but I find it a bit dangerous to return 0s from this function and then see the problem surface somewhere else, where it is perhaps more difficult to figure out what went wrong.

My preferred option would be to add a new kw-argument in psparse! (discover_cols=true by default) that, when set to false, skips the generation of the col ids (assuming that the user already provided the final ones), just like we already have for the row ids. I think it makes sense to have this option for the cols if we already have it for the rows, and it would probably also fix your problem.
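The skip logic being proposed can be illustrated with a small stand-in (this is not the actual PartitionedArrays implementation; `discover_columns` is a hypothetical helper, and plain vectors of global column ids stand in for index partitions): when the flag is false, the user-provided col ids are taken as final; when true, any columns referenced by J are merged in as ghosts.

```julia
# Hypothetical sketch of a discover_cols kw-arg: short-circuit column-id
# discovery when the user already supplies the final col ids.
function discover_columns(cols, J; discover_cols=true)
    discover_cols || return cols  # trust the user's col ids as-is
    union(cols, J)                # otherwise, add the referenced ghost cols
end

discover_columns([1, 2, 3], [2, 5]; discover_cols=false)  # → [1, 2, 3]
discover_columns([1, 2, 3], [2, 5])                       # → [1, 2, 3, 5]
```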

fverdugo commented 1 year ago

See here:

https://github.com/fverdugo/PartitionedArrays.jl/blob/74e04a3a39ea8b94a7b18f76a5e648b5e40c8636/src/p_sparse_matrix.jl#L481

JordiManyer commented 1 year ago

My preferred option would be to add a new kw-argument in psparse! (discover_cols=true by default) that, when set to false, skips the generation of the col ids (assuming that the user already provided the final ones), just like we already have for the row ids. I think it makes sense to have this option for the cols if we already have it for the rows, and it would probably also fix your problem.

This works, but you might end up with sub-optimal communications, since you are not selecting exactly the gids you need for the specific matrix entries. This is not an issue for small tests, but it is not ideal in the general case. I still think it's a good idea to implement it, and I'll prepare a PR with these changes.

I am tempted to accept your solution above, but I find it a bit dangerous to return 0s from this function and then see the problem surface somewhere else, where it is perhaps more difficult to figure out what went wrong.

What about something like this:

function find_owner(indices,global_ids,::Type{<:LocalIndices})
    msg = "Error: LocalIndices does not have enough local information to find owners."
    map(indices,global_ids) do partition, gids
        lids   = global_to_local(partition)[gids]
        owners = local_to_owner(partition)[lids]
        @assert all(owners .!= 0) msg
        return owners
    end
end

We could even add a kwarg check_consistency=true that allows the user to deactivate the @assert at their own risk (i.e., for production purposes).
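Factoring the check out makes the idea concrete (a sketch only; `check_owners` is a hypothetical helper, not part of the proposed find_owner signature): the zero-owner assertion runs by default and can be switched off for production runs.

```julia
# Hypothetical variant of the consistency check gated by a kw-arg.
function check_owners(owners; check_consistency=true)
    if check_consistency
        @assert all(owners .!= 0) "LocalIndices: some owners could not be found locally."
    end
    owners
end

check_owners([1, 2, 3])                           # passes the check
check_owners([0, 2]; check_consistency=false)     # check skipped, 0 passes through
```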

fverdugo commented 1 year ago

Hi @JordiManyer,

psparse! adds extra ghost ids if needed; it does not remove unnecessary ones. Even by modifying find_owner, you will not fix the issue of having extra ghosts, unnecessary communication, etc.

You probably need a new function, "remove_ghosts" (if not already available), and then to call psparse! with discover_rows = discover_cols = false, rather than changing find_owner.

JordiManyer commented 1 year ago

@fverdugo Ok, nice to know. Then I guess the PR is ready for review, with the new kwarg changes as agreed.

fverdugo commented 1 year ago

@JordiManyer

Even better: we can implement a new function intersection_ghost, the counterpart of union_ghost (i.e., same function signature), and then add a new kw-arg in psparse!, combine_ghost=union_ghost, which lets you choose between union and intersection (or whatever other logic rule the user wants to provide) to set up the row/col ids. (Same rule for both? Or do we want separate rules for rows and cols?) This is in addition to discover_row/col.
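The two combine rules can be sketched with plain vectors of ghost ids (the real union_ghost in PartitionedArrays operates on index partitions and owners; the names `union_rule` and `intersection_rule` here are illustrative stand-ins for the proposed pluggable combine_ghost argument):

```julia
# Two interchangeable combine rules for ghost ids, as the combine_ghost
# kw-arg would allow. Both take the same arguments and return ghost ids.
union_rule(ghosts_a, ghosts_b)        = union(ghosts_a, ghosts_b)
intersection_rule(ghosts_a, ghosts_b) = intersect(ghosts_a, ghosts_b)

union_rule([4, 7, 9], [7, 9, 11])         # → [4, 7, 9, 11]
intersection_rule([4, 7, 9], [7, 9, 11])  # → [7, 9]
```

Since both rules share a signature, the user could pass any function with the same shape, which is what makes the kw-arg design flexible.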

JordiManyer commented 1 year ago

That would indeed be nice, especially for FEM-related assemblies. I might have time to implement it at some point, and the same goes for partition_from_colors(ranks,global_to_color). I'll make a note and hopefully get back to it. Could you assign me this issue? Thanks!

fverdugo commented 1 year ago

See PR: https://github.com/fverdugo/PartitionedArrays.jl/pull/120