MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.33k stars 652 forks source link

Documentation for reconstructing square distance matrix from `self_distance_array` is incorrect. #4731

Open hmacdope opened 1 month ago

hmacdope commented 1 month ago

The documentation for self_distance_array has the following as the suggested way to reconstruct a full NxN distance matrix from the flattened upper triangular (-diagonal) N*(N-2) // 2 result.

for i in range(n):
    for j in range(i + 1, n):
        k += 1
        dist[i, j] = d[k]

This is incorrect as the increment must come after the assignment, otherwise you attempt to access N+1 the element of the array on final iteration.

The below is correct.

for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = d[k]
        k += 1
orbeckst commented 1 month ago

Shouldn't we also add the transpose right away

dist[i, j] = dist[j, i] = d[k]

if we want the full NxN matrix?

hmacdope commented 1 month ago

Yea that as well

tylerjereddy commented 1 month ago

Any reason not to suggest using squareform for this?

orbeckst commented 1 month ago

No good reason — I didn't know about squareform.

There's the pedagogical advantage of showing exactly what the relationship between input and output is but the scipy docs also contain it.

One would just need to check that our output of self_distance_array is actually compatible with the scipy convention.