Open mattwthompson opened 2 years ago
There is another project retworkx
that has some good benchmarks on distance/pair calculations and subgraph isomorphisms. It looks less mature that the graph-tool but they have a JOSS paper in review.
igraph
is also mentioned and appears to perform well.
@mattwthompson : It would be helpful to provide some timing and profiling data justifying the importance of this change. Do you have (1) a benchmark case, (2) timing data, and (3) profiling data clearly demonstrating the need and pinning blame on a specific NetworkX operation that would clearly be improved by an alternative implementation?
It's not clear to me, for example, that such a need actually exists, what its priority is, or whether it is an algorithmic (rather than a performance issue, where gains due to optimizations are usually much smaller).
There's no compelling need in the existing toolkit functionality, but some part of the topology refactor would benefit from this. There's some amount of NetworkX graph isomorphism used in Molecule.from_pdb
. Unfortunately that PR crashes my browser and even git diff
chokes locally for me, so it's hard for me to dig into the source code now, but I ran a quick snakeviz run on this file (with crystal waters trimmed out), which has only 2423 atoms. The overall runtime is slow (150s) because caching is not fully implemented yet, so the direct impact of NetworkX's slow algorithms is blurred. This can be revisited once cached and other speedups are implemented.
For some medium-sized systems, ~70% of Interchange.to_gromacs
is spent creating NetworkX graphs. This risks being an upper bound on how fast writing out GROMACS files can be for systems of more than a few thousand atoms. The toolkit already tries to do things like short-circuit before graph-matching when the molecules very obviously differ. Though perhaps there are other ideas to explore (more efficient way to create these graphs, other ways graph-matching could altogether be bypassed, etc.).
https://github.com/openforcefield/openff-interchange/issues/1014#issuecomment-2263459426
graph-tool seems to be a more performant replacement for
networkx
. It is available onconda-forge
, although it may or may not make our dependency footprint larger, i.e.