vatai / mpk

Matrix powers kernel
1 stars 0 forks source link

Implement LAPJV (input generaton) in PDMPK #33

Closed vatai closed 4 years ago

vatai commented 4 years ago

dmpk already generates the "communication table" between phases which is the input of the LAPJV algorithm which optimises the partition label assignment to reduce communication. In dmpk this is achieved by implementing (yet) another method to simulate the computation in a phase. The earlier version fixed a (new) partitioning, and counted the communication with to that (new) partitioning, by comparing the "current" partition and the source partition (i.e. where the needed data was stored), and counted communication if these were different. This resulted in 0 "self-communication" (0s in the diagonal). The updated version, also fixes a partitioning, but doesn't compare the current and source partitions, just counts everything needed from previous phases phases.

This should be implemented in the pdmpk as well.

vatai commented 4 years ago

A potential pitfall I encountered when updating dmpk was that I tried to skip the comm_table and sum up the communication directly. This is not possible, since there usually is some redundancy in the required data, i.e. multiple targets may need the same data. This resulted in approximately 10x larger values in the communication sums tables. Instead, the comm_table needed to be filled and then based on that, the communication counted.

vatai commented 4 years ago

Will have to use this: https://stackoverflow.com/questions/1939953/how-to-find-if-a-given-key-exists-in-a-c-stdmap

if ( m.find("f") != m.end() ) {
  //  Found! Add to comm_table!
}
vatai commented 4 years ago

I had some "pitfalls" with this in the C version. So first we'll have to create something like a std::map<src_tgt_t, std::set<idx_lvl_t>> comm_table, or a std::vector<std::set<idx_lvl_t>> comm_table.

The first case might be easier to understand, a map from (src, tgt) partition pairs to a set of (index, level) pairs, i.e. which vertices (=index+level) are transmitted from src to tgt partition. (The second case is similar, but there comm_table[src * npart + tgt] would be the set).

After can the comm_sums (communication sums) be computed, such that comm_sums[{src, tgt}] == comm_table[{src, tgt}].size(). Then finding the minimum (right?) of these values and subtracting it from every value in comm_sums[]. This can then be passed to lapjv().

utsavsinghal5 commented 4 years ago

I reread the communication storing part. So if any vertex is needed in the upcoming phase(partially or fully) the methods ProcAdjacent and AddToInit always ends up checking that if they are in same partition or not. So if and only if any of the vertex is in loop ProcAdjacent or AddToInit, it will be part of our required comm_table. So we can just do the calculation here.

vatai commented 4 years ago

Yes, but we cannot do the calculation in ProcAdjacent and AddToInit, or at least not without a few ifs which would avoid the side effects (modifications of the paritals, levels mcsr etc) of those methods. I really hope it can be implemented in such a way that nothing is modified, because if we need to modify something that would need to be reverted before the actual ProcPhase is called which does record/modify things.

I've added some proposed implementations skeleton in emil-WIP branch. That is approach IMHO.

It also implements a kind of verification (which should be removed eventually) which check if the actual recorded communication is equal to the counted communication (except the self-communication of course, which should be zero for the actual communication but usually positive for the counted communication).

vatai commented 4 years ago

I might be close to a solution. It is in the emil-WIP branch. It is not yet perfect but I think I know what is the problem and might also know the solution: I've implemented comm_table as a (src, tgt) -> set of (idx, level) pairs, but this is wrong. If there is a partial vertex (index idx level lvl) which is transmitted, then it is possible the the "non-partial" version of it might also be needed?

Now that I've written this down, I'm not so sure any more, anyway, I'm leaving this here as a reminder.

vatai commented 4 years ago

This is almost done. Two important things not to forget