OpenFreeEnergy / gufe

grand unified free energy by OpenFE
https://gufe.readthedocs.io
MIT License
28 stars 7 forks source link

ProtocolDAG._iterate_dag_order is unstable #314

Open LilDojd opened 2 months ago

LilDojd commented 2 months ago

Expected Behavior

ProtocolDAG dag order and key are stable on roundtrip from one process to another

Current Behavior

When playing around with random graphs, I stumbled upon some odd behaviour.

There are some very special hard-to-reproduce cases when the ProtocolDAG key is unstable due to different ordering of protocol_units in nx.topological_sort on different hardware architectures or even different processes. This is related to an old networkx issue: https://github.com/networkx/networkx/issues/1181

Steps to Reproduce

test.json

from typing import List
from gufe import ProtocolDAG, ProtocolUnit
from gufe.tokenization import is_gufe_dict, modify_dependencies, from_dict
import gufe

def roundtrip_key(pdag):
    return ProtocolDAG.from_dict(pdag.to_dict()).key

def roundtrip_pus(pdag) -> List[ProtocolUnit]:
    return ProtocolDAG.from_dict(pdag.to_dict()).protocol_units

def is_reordered(list1, list2):

    if len(list1) != len(list2):
        return False

    freq1 = {}
    freq2 = {}

    for item in list1:
        freq1[item] = freq1.get(item, 0) + 1

    for item in list2:
        freq2[item] = freq2.get(item, 0) + 1

    if freq1 == freq2:
        return True
    else:
        return False

class DummyUnit(ProtocolUnit):
    def _execute(self):
        return {"foo": "bar"}

with open("../test.json", "r") as rf:
    pus = json.load(rf)
    pus = modify_dependencies(pus, from_dict, is_gufe_dict, mode="decode")

pdag = ProtocolDAG(protocol_units=pus, transformation_key="somekey")

# Roundtrip on other process
import multiprocessing

pool = multiprocessing.Pool(processes=1)
print(f"Original key: {pdag.key}, Roundtrip key: {pool.map(roundtrip_key, [pdag])[0]}")

print("Is reordered:", is_reordered(pdag.protocol_units, pool.map(roundtrip_pus, [pdag])[0]))

pool.close()

Outputs:

Original key: ProtocolDAG-a05ef9eab9006e410eb4b3ad0e9a3664, Roundtrip key: ProtocolDAG-a7555d4f141783f28da2d2a172b0bf74
Is reordered: True

Context (Environment)

python: 3.10 gufe: 1.0.0 networkx: 3.2.1

Possible Implementation

Either use OrderedDiGraph from networkx which guarantees consistent ordering or additionally sort ProtocolUnits lexicographically

From

https://github.com/OpenFreeEnergy/gufe/blob/2425b4cf7c08047dc27b559b0e5729215f05f002/gufe/protocols/protocoldag.py#L47-L49

To: d95fa5d9eb85e0e7c95a6f02fef30011b6b4a4c4