Closed wolfram77 closed 2 weeks ago
There does not seem to be any issue with the logic for refinement phase.
Now I notice that RE-1
, which indicates the step after placing all vertices into isolated communities. However, see that RE-1
actually has 6
disconnected communities above. This means that the 6 disconnected communities are outside of the communities marked for refinement, and thus cannot be resolved by the refinement phase.
The question now is, why are those disconnected communities not marked for refinement?
It seems we must mark the communities touched by the batch update as changed - not surprising. Jai Shri Ganesh!
Still seeing some disconnected communities if I don't mark all touched communities.
Now, still seeing some disconnected communities, even with Static Leiden. Need to check. Is it a regression?
I replaced parallel refine with a sequential one, still seeing disconnected communities!
The code used to debug is as follows.
if (DYNAMIC && isFirst) {
size_t DC = countValue(communitiesDisconnectedOmp(x, ucom), char(1));
size_t C = communities(x, ucom).size();
printf("LM-1: %zu / %zu disconnected communities\n", DC, C);
}
if (DYNAMIC && isFirst) {
size_t DC = countValue(communitiesDisconnectedOmp(x, ucom), char(1));
size_t C = communities(x, ucom).size();
printf("LM-2: %zu / %zu disconnected communities\n", DC, C);
}
if (DYNAMIC && isFirst) {
size_t DC = countValue(communitiesDisconnectedOmp(x, ucom), char(1));
size_t C = communities(x, ucom).size();
size_t CC = countValue(cchg, B(1));
printf("RE-0: %zu / %zu disconnected communities [changed = %zu]\n", DC, C, CC);
}
if (DYNAMIC && isFirst) {
size_t DC = countValue(communitiesDisconnectedOmp(x, ucom), char(1));
size_t C = communities(x, ucom).size();
printf("RE-1: %zu / %zu disconnected communities\n", DC, C);
}
if (DYNAMIC && isFirst) {
size_t DC = countValue(communitiesDisconnectedOmp(x, ucom), char(1));
size_t C = communities(x, ucom).size();
printf("RE-2: %zu / %zu disconnected communities\n", DC, C);
}
Some disconnected communities can appear in the input due to edge deletions after applying the batch update (upon the graph). When the local-moving phase of the Leiden algorithm is run, some new disconnected communities may be introduced. However, the refinement phase should be able to split the disconnected communities. This is, however, not happening in some cases.
See below screenshot, for example.
Note that the disconnected are essentially random, and possibly different from the disconnected communities that were in the input (we do a subset renaming, so community IDs change).
Lets see what happens if we refine all communities instead.