Open ghost opened 3 years ago
The problem seems to be caused by the edges with zero weights. For the time being you can circumvent the problem by simply removing those edges. This can be done without any problem since edges with zero weights have no effect anyway.
Nonetheless, this is a bug that should be corrected. We will provide a fix at some later time.
In particular, the root cause of the problem is that we check whether a neighboring cluster is already added in these conditionals:
We should probably use a boolean array isClusterAdded instead of relying on the edge weight. I think this is the most robust way to address the issue. @neesjanvaneck, what do you think?
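To make the proposal concrete, here is a minimal sketch of the two approaches. It is not the actual networkanalysis code: the class, method, and parameter names (other than isClusterAdded) are hypothetical, and it assumes the existing check treats "accumulated weight is zero" as "cluster not seen yet". Under that assumption, a cluster reached only through zero-weight edges gets registered once per edge, whereas an explicit flag registers it exactly once.

```java
// Hypothetical illustration only; names and structure are assumptions, not the library's code.
class NeighboringClusters {

    // Weight-based check: a cluster counts as "new" while its accumulated weight is zero.
    // Zero-weight edges never change that, so the same cluster can be added repeatedly.
    static int collectByWeight(int[] neighbors, double[] edgeWeights, int[] clusterOf,
                               double[] weightPerCluster, int[] neighboringClusters) {
        int n = 0;
        for (int k = 0; k < neighbors.length; k++) {
            int c = clusterOf[neighbors[k]];
            if (weightPerCluster[c] == 0) {
                neighboringClusters[n++] = c;
            }
            weightPerCluster[c] += edgeWeights[k];
        }
        return n;
    }

    // Flag-based check: a separate boolean array records which clusters were added,
    // independent of the edge weights.
    static int collectByFlag(int[] neighbors, double[] edgeWeights, int[] clusterOf,
                             double[] weightPerCluster, boolean[] isClusterAdded,
                             int[] neighboringClusters) {
        int n = 0;
        for (int k = 0; k < neighbors.length; k++) {
            int c = clusterOf[neighbors[k]];
            if (!isClusterAdded[c]) {
                isClusterAdded[c] = true;
                neighboringClusters[n++] = c;
            }
            weightPerCluster[c] += edgeWeights[k];
        }
        return n;
    }
}
```

In both variants the caller is assumed to reset weightPerCluster (and isClusterAdded) for the collected clusters before processing the next node.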
Ah, I see. Makes perfect sense. I didn't intend to have any edges with zero weight. I'm happy to check for that condition and remove those edges, which are meaningless. Would you like me to close this issue (as far as I'm concerned, it's already resolved)?
No problem, I imagine that you did not check this condition prior to running the algorithm. No, let's leave the issue open, as the program shouldn't crash on this input.
I suppose a smaller change would simply be to filter out edges with zero weight in the code that reads the edges from the file. This way, you can enforce your own precondition, and leave the downstream code unchanged. The advantage is that it would avoid adding another variable for book-keeping.
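As a rough sketch of that idea, the filter could look something like the following. This is not the project's reader code: the class and method names are hypothetical, and it assumes each edge line holds two node ids followed by an optional weight, separated by whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the input layout (node, node, optional weight) is an assumption.
class ZeroWeightEdgeFilter {

    // Returns the input lines without the edges whose weight column parses to zero.
    static List<String> dropZeroWeightEdges(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Lines without a third column carry no explicit weight; keep them unchanged.
            if (cols.length >= 3 && Double.parseDouble(cols[2]) == 0.0) {
                continue;
            }
            kept.add(line);
        }
        return kept;
    }
}
```

The same function could also be used as a one-off preprocessing step on the input file, which is the workaround suggested earlier in this thread.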
Yes, we would probably do that as well. However, the program still should not crash, even if a zero-weight edge is somehow included.
The following command
java -cp ~/Bioinformatics/networkanalysis/build/libs/networkanalysis-1.1.0-5-ga3f342d.jar nl.cwts.networkanalysis.run.RunNetworkClustering -q CPM -m 1 -w -o /tmp/edges_clustering.txt edges.txt
runs just fine. However, if the -q option is set to "Modularity", I get the following crash:
I've attached the input file here: edges.txt
I'm not sure what property of this input file would violate the preconditions of the software. The nodes are 0-indexed numbers, there are no duplicate edges, the left-hand node ID is always less than the right-hand node ID, etc.