Closed biowilliam closed 4 years ago
Just to be sure that I understand what the problem is. If you have an edgelist such as
0 1
3 4
node 2 is an isolate. The algorithm clusters nodes 0-4, and outputs a cluster for each node 0-4, and you end up with something like
0 1
1 1
2 3
3 2
4 2
You would have expected it to not output a cluster for node 2 in this case?
I think this is more easily solved as either a pre-processing step (make sure there are no isolates in your network) or a post-processing step (ignore the isolates). Other people may actually expect the output to include a cluster for node 2 in such cases (this would also be my expectation). Not outputting the isolates would only complicate matters, I think.
Or would you simply prefer to get a warning that there are isolates or something?
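For anyone who wants the post-processing route, here is a minimal Python sketch of it. It assumes both files hold whitespace-separated integer node IDs, with the two endpoints in the first two columns of the edgelist and `node cluster` pairs in the output. The function names `nodes_in_edge_list` and `drop_isolates` are made up for this example and are not part of the networkanalysis package.

```python
def nodes_in_edge_list(edge_lines):
    """Collect every node ID that appears in at least one edge."""
    nodes = set()
    for line in edge_lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            nodes.add(int(fields[0]))
            nodes.add(int(fields[1]))
    return nodes


def drop_isolates(cluster_lines, connected_nodes):
    """Keep only the (node, cluster) pairs whose node appears in an edge."""
    kept = []
    for line in cluster_lines:
        fields = line.split()
        if len(fields) >= 2 and int(fields[0]) in connected_nodes:
            kept.append((int(fields[0]), int(fields[1])))
    return kept


# The small example from this thread: node 2 never appears in an edge.
edges = ["0 1", "3 4"]
clusters = ["0 1", "1 1", "2 3", "3 2", "4 2"]

connected = nodes_in_edge_list(edges)
filtered = drop_isolates(clusters, connected)
print(filtered)
```

With real files, the lists above would simply be replaced by the lines read from the edgelist and clusters files.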
You understood the issue correctly. Yes, I agree that it is better to include the isolates in the final output, as it does now. A warning that highlights the isolated nodes would be even better, especially for those who are new to network analysis. I spent quite some time figuring out the mismatch and just wanted to share what I found with the community. Thanks for your wonderful, continuous improvement from SLM to Leiden.
Input: edgelist; output: cluster information for each node.
I have isolates in the network, so the edgelist does not contain any information about them. However, RunNetworkClustering.jar assigns cluster numbers sequentially to all nodes, including those that do not appear in the input edgelist. That is, it fills in the gap nodes of your input network as isolated clusters. I have attached my edgelist input (nodes range from 0 to 76, without 36) and the clusters output (which contains a cluster for node 36).
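A quick way to spot such gap nodes before running the clustering is sketched below in Python. It assumes integer node IDs starting at 0 in the first two whitespace-separated columns of the edgelist. The helper names `find_gap_nodes` and `warn_about_gap_nodes` are hypothetical, and the edgelist built at the bottom is a synthetic stand-in with the same gap (0 to 76 without 36), not the attached file.

```python
import warnings


def find_gap_nodes(edge_lines):
    """Return node IDs in 0..max that never appear in any edge."""
    seen = set()
    for line in edge_lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            seen.add(int(fields[0]))
            seen.add(int(fields[1]))
    if not seen:
        return []
    return [node for node in range(max(seen) + 1) if node not in seen]


def warn_about_gap_nodes(edge_lines):
    """Emit a warning listing the gap nodes, if there are any."""
    gaps = find_gap_nodes(edge_lines)
    if gaps:
        warnings.warn(
            f"{len(gaps)} node ID(s) missing from the edgelist; each will "
            f"be output as its own cluster: {gaps}"
        )
    return gaps


# Synthetic stand-in: a chain over nodes 0..76 that skips node 36.
ids = [n for n in range(77) if n != 36]
example_edges = [f"{a}\t{b}" for a, b in zip(ids, ids[1:])]

gaps = warn_about_gap_nodes(example_edges)
```

Any node reported here has no edges in the input, so a clusters file covering all IDs up to the maximum will contain one more line per reported node than the edgelist has distinct nodes.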
Networkcluster warning example-edgelist.txt clusters.txt