aicenter / ShoDi

0 stars 0 forks source link

Incorrect distance matrix with the slow algorithm #8

Closed neumannjan closed 4 years ago

neumannjan commented 4 years ago

The fast algorithm produces correct results, whereas the slow algorithm produces incorrect results.

For the provided adjacency matrix, the first 10 values of the very first row of the distance matrix are:

:( I have yet to figure out why.

F-I-D-O commented 4 years ago

Ok, first question, how do you know which version is correct?

neumannjan commented 4 years ago

@F-I-D-O I compared the output of the fast version with the expected result (dm.csv file) you provided me via ownCloud. Also, the result for the slow algorithm contains lots of strangely repeated values, so I expect it to be a memory issue somewhere? Or maybe the algorithm itself is somehow broken

neumannjan commented 4 years ago

It works for smaller graphs, the results are identical for both algorithms

F-I-D-O commented 4 years ago

Ok, then please find the bug int the slow alg. and fix it. The slow algorithm is used inside other preprocessing algorithms, so we cannot discard it right now, can we?

neumannjan commented 4 years ago

Just found out that the slow algorithm operates correctly in c72bbd8. Edit: also newer b4b2f30 Edit: 8a88735 is the commit to blame

Ok, then please find the bug int the slow alg. and fix it. The slow algorithm is used inside other preprocessing algorithms, so we cannot discard it right now, can we?

We could probably replace the slow algorithm with the fast one in many places, but I'm not sure if all, since a very similar (basically identical) Dijkstra algorithm is also implemented separately elsewhere for other purposes.

F-I-D-O commented 4 years ago

Ok, so if it worked correctly before, please just fix it :)

neumannjan commented 4 years ago

FINALLY fixed in dfb43d3. This took me really long to figure out.

The issue was with the CSV reader. It was the following:

My solution:

F-I-D-O commented 4 years ago

Nice, interpreting text input is always tricky. I think that interpreting unexpected values as no edge is better than as zero weight edges. But it should be reported as a warning in log at least.

neumannjan commented 4 years ago

I think that interpreting unexpected values as no edge is better than as zero weight edges. But it should be reported as a warning in log at least.

I agree. If you look at my new solution though, it is done so.

F-I-D-O commented 4 years ago

I agree. If you look at my new solution though, it is done so.

No, it's not. The exception handling is correct. But there is no reporting. The incorrect input file format should be reported to the standard output.

neumannjan commented 4 years ago

Fixed in a7d57df.

F-I-D-O commented 4 years ago

solved