proteneer / timemachine

Differentiate all the things!
Other
138 stars 17 forks source link

Ensure consistent dihedral angle computations with OpenMM and mdanalysis #1247

Open proteneer opened 9 months ago

proteneer commented 9 months ago

This PR addresses an issue where we compute the dihedral angle in a way that's the negative of what OpenMM and mdanalysis computes. Note that this does not affect energy/force computations for currently used forcefields (both ligand and protein), since they explicitly define phases of 0 and pi, resulting in a symmetric energy profile. This is mostly a preventative measure for the future if we do implement non-symmetric periodic torsions (either via restraints or using a new forcefield).

proteneer commented 9 months ago

@badisa when you get a chance, not sure why this test is failing:

tests/test_local_move.py::test_local_md_particle_density[1.0-True] FAILED

============================================================================================================================================= FAILURES ==============================================================================================================================================
_____________________________________________________________________________________________________________________________ test_local_md_particle_density[1.0-True] ______________________________________________________________________________________________________________________________

freeze_reference = True, k = 1.0

    @pytest.mark.parametrize("freeze_reference", [True, False])
    @pytest.mark.parametrize("k", [1.0, 1000.0, 10000.0])
    def test_local_md_particle_density(freeze_reference, k):
        """Verify that the average particle density around a single particle is stable.

        In the naive implementation of local md, a vacuum can appear around the local idxs. See naive_local_resampling_move
        for what the incorrect implementation looks like. The vacuum is introduced due to discretization error where in a step
        a particle moves away from the local idxs and is frozen in the next round of local MD.
        """
        mol, _ = get_biphenyl()
        ff = Forcefield.load_from_file("smirnoff_1_1_0_sc.py")

        temperature = constants.DEFAULT_TEMP
        dt = 1.5e-3
        friction = 1.0
        seed = 2022
        cutoff = 1.2

        # Have to minimize, else there can be clashes and the local moves will cause crashes
        unbound_potentials, sys_params, masses, coords, box = get_solvent_phase_system(
            mol, ff, 0.0, box_width=4.0, margin=0.1
        )

        local_idxs = np.arange(len(coords) - mol.GetNumAtoms(), len(coords), dtype=np.int32)

        nblist = custom_ops.Neighborlist_f32(coords.shape[0])

        # Construct list of atoms in the inner shell
        nblist.set_row_idxs(local_idxs.astype(np.uint32))

        intg = LangevinIntegrator(temperature, dt, friction, masses, seed)

        intg_impl = intg.impl()

        v0 = np.zeros_like(coords)
        bps = []
        for p, bp in zip(sys_params, unbound_potentials):
            bps.append(bp.bind(p).to_gpu(np.float32).bound_impl)

        def num_particles_near_ligand(pair):
            new_coords, new_box = pair
            assert coords.shape == new_coords.shape
            assert box.shape == new_box.shape

            ixns = nblist.get_nblist(new_coords, new_box, cutoff)
            flattened = np.concatenate(ixns).reshape(-1)
            inner_shell_idxs = np.unique(flattened)

            # Combine all of the indices that are involved in the inner shell
            subsystem_idxs = np.unique(np.concatenate([inner_shell_idxs, local_idxs]))
            return len(subsystem_idxs)

        ctxt = custom_ops.Context(coords, v0, box, intg_impl, bps)
        # Equilibrate using global steps to start off from a reasonable place
        x0, boxes = ctxt.multiple_steps(1000)

        rng = np.random.default_rng(seed)

        def local_move(_, steps=500):
            local_seed = rng.integers(np.iinfo(np.int32).max)
            xs, boxes = ctxt.multiple_steps_local(steps, local_idxs, k=k, seed=local_seed)
            return (xs[-1], boxes[-1]), None

        # The threshold for this test is sensitive to the random seed. Selected by setting threshold that passes with 10 seeds
>       assert_no_drift(
            (x0[-1], boxes[-1]), local_move, num_particles_near_ligand, n_local_resampling_iterations=250, threshold=0.08
        )

tests/test_local_move.py:268: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

init_args = (array([[-1.44037598, -0.62489441, -0.56411787],
       [-1.46654334, -0.59426654, -0.47246863],
       [-1.49927859, ...   [ 0.13181651, -0.03992045, -0.00980461]]), array([[4.1, 0. , 0. ],
       [0. , 4.1, 0. ],
       [0. , 0. , 4.1]]))
move_fxn = <function test_local_md_particle_density.<locals>.local_move at 0x7f839aca7370>, observable_fxn = <function test_local_md_particle_density.<locals>.num_particles_near_ligand at 0x7f839aca7760>, n_local_resampling_iterations = 250, n_samples = 10, threshold = 0.08

    def assert_no_drift(
        init_args, move_fxn, observable_fxn, n_local_resampling_iterations=100, n_samples=10, threshold=0.5
    ):
        assert n_local_resampling_iterations > 2 * n_samples, "Need iterations to be 2x n_samples"
        assert 0.0 <= threshold <= 1.0, "Threshold must be in interval [0.0, 1.0]"

        traj = [init_args]
        aux_traj = []

        for _ in range(n_local_resampling_iterations):
            updated, aux = move_fxn(traj[-1])

            traj.append(updated)
            aux_traj.append(aux)

        expected_selection_fraction_traj = np.array([observable_fxn(x) for x in traj])

        # A sanity check that the early samples don't have a massive jump within them
        # in the case of unstable local MD this test can pass neighboring atoms go from ~10% of the system to ~100% of the system
        # in the first step
        differences_early = np.abs(
            np.diff(expected_selection_fraction_traj[:n_samples]) / expected_selection_fraction_traj[0]
        )
        assert (
            np.all(differences_early) < 2.0
        ), "Difference between first and last sample greater than 200%, likely unstable"

        avg_at_start = np.mean(expected_selection_fraction_traj[:n_samples])
        avg_at_end = np.mean(expected_selection_fraction_traj[-n_samples:])
        if avg_at_start == avg_at_end:
            assert not np.all(expected_selection_fraction_traj == avg_at_end), "Observable values all identical"

        percent_diff = np.abs(((avg_at_start - avg_at_end) / avg_at_start))
        if percent_diff > threshold:
            msg = f"""
                observable avg over start frames = {avg_at_start:.3f}
                observable avg over end frames = {avg_at_end:.3f}
                but averages of this (and all other observables) should be constant over time
            """
>           assert percent_diff <= threshold, msg
E           AssertionError: 
E                         observable avg over start frames = 1335.000
E                         observable avg over end frames = 1202.800
E                         but averages of this (and all other observables) should be constant over time
E                     
E           assert 0.09902621722846446 <= 0.08

tests/test_local_move.py:89: AssertionError
badisa commented 9 months ago

@badisa when you get a chance, not sure why this test is failing:

tests/test_local_move.py::test_local_md_particle_density[1.0-True] FAILED

============================================================================================================================================= FAILURES ==============================================================================================================================================
_____________________________________________________________________________________________________________________________ test_local_md_particle_density[1.0-True] ______________________________________________________________________________________________________________________________

freeze_reference = True, k = 1.0

    @pytest.mark.parametrize("freeze_reference", [True, False])
    @pytest.mark.parametrize("k", [1.0, 1000.0, 10000.0])
    def test_local_md_particle_density(freeze_reference, k):
        """Verify that the average particle density around a single particle is stable.

        In the naive implementation of local md, a vacuum can appear around the local idxs. See naive_local_resampling_move
        for what the incorrect implementation looks like. The vacuum is introduced due to discretization error where in a step
        a particle moves away from the local idxs and is frozen in the next round of local MD.
        """
        mol, _ = get_biphenyl()
        ff = Forcefield.load_from_file("smirnoff_1_1_0_sc.py")

        temperature = constants.DEFAULT_TEMP
        dt = 1.5e-3
        friction = 1.0
        seed = 2022
        cutoff = 1.2

        # Have to minimize, else there can be clashes and the local moves will cause crashes
        unbound_potentials, sys_params, masses, coords, box = get_solvent_phase_system(
            mol, ff, 0.0, box_width=4.0, margin=0.1
        )

        local_idxs = np.arange(len(coords) - mol.GetNumAtoms(), len(coords), dtype=np.int32)

        nblist = custom_ops.Neighborlist_f32(coords.shape[0])

        # Construct list of atoms in the inner shell
        nblist.set_row_idxs(local_idxs.astype(np.uint32))

        intg = LangevinIntegrator(temperature, dt, friction, masses, seed)

        intg_impl = intg.impl()

        v0 = np.zeros_like(coords)
        bps = []
        for p, bp in zip(sys_params, unbound_potentials):
            bps.append(bp.bind(p).to_gpu(np.float32).bound_impl)

        def num_particles_near_ligand(pair):
            new_coords, new_box = pair
            assert coords.shape == new_coords.shape
            assert box.shape == new_box.shape

            ixns = nblist.get_nblist(new_coords, new_box, cutoff)
            flattened = np.concatenate(ixns).reshape(-1)
            inner_shell_idxs = np.unique(flattened)

            # Combine all of the indices that are involved in the inner shell
            subsystem_idxs = np.unique(np.concatenate([inner_shell_idxs, local_idxs]))
            return len(subsystem_idxs)

        ctxt = custom_ops.Context(coords, v0, box, intg_impl, bps)
        # Equilibrate using global steps to start off from a reasonable place
        x0, boxes = ctxt.multiple_steps(1000)

        rng = np.random.default_rng(seed)

        def local_move(_, steps=500):
            local_seed = rng.integers(np.iinfo(np.int32).max)
            xs, boxes = ctxt.multiple_steps_local(steps, local_idxs, k=k, seed=local_seed)
            return (xs[-1], boxes[-1]), None

        # The threshold for this test is sensitive to the random seed. Selected by setting threshold that passes with 10 seeds
>       assert_no_drift(
            (x0[-1], boxes[-1]), local_move, num_particles_near_ligand, n_local_resampling_iterations=250, threshold=0.08
        )

tests/test_local_move.py:268: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

init_args = (array([[-1.44037598, -0.62489441, -0.56411787],
       [-1.46654334, -0.59426654, -0.47246863],
       [-1.49927859, ...   [ 0.13181651, -0.03992045, -0.00980461]]), array([[4.1, 0. , 0. ],
       [0. , 4.1, 0. ],
       [0. , 0. , 4.1]]))
move_fxn = <function test_local_md_particle_density.<locals>.local_move at 0x7f839aca7370>, observable_fxn = <function test_local_md_particle_density.<locals>.num_particles_near_ligand at 0x7f839aca7760>, n_local_resampling_iterations = 250, n_samples = 10, threshold = 0.08

    def assert_no_drift(
        init_args, move_fxn, observable_fxn, n_local_resampling_iterations=100, n_samples=10, threshold=0.5
    ):
        assert n_local_resampling_iterations > 2 * n_samples, "Need iterations to be 2x n_samples"
        assert 0.0 <= threshold <= 1.0, "Threshold must be in interval [0.0, 1.0]"

        traj = [init_args]
        aux_traj = []

        for _ in range(n_local_resampling_iterations):
            updated, aux = move_fxn(traj[-1])

            traj.append(updated)
            aux_traj.append(aux)

        expected_selection_fraction_traj = np.array([observable_fxn(x) for x in traj])

        # A sanity check that the early samples don't have a massive jump within them
        # in the case of unstable local MD this test can pass neighboring atoms go from ~10% of the system to ~100% of the system
        # in the first step
        differences_early = np.abs(
            np.diff(expected_selection_fraction_traj[:n_samples]) / expected_selection_fraction_traj[0]
        )
        assert (
            np.all(differences_early) < 2.0
        ), "Difference between first and last sample greater than 200%, likely unstable"

        avg_at_start = np.mean(expected_selection_fraction_traj[:n_samples])
        avg_at_end = np.mean(expected_selection_fraction_traj[-n_samples:])
        if avg_at_start == avg_at_end:
            assert not np.all(expected_selection_fraction_traj == avg_at_end), "Observable values all identical"

        percent_diff = np.abs(((avg_at_start - avg_at_end) / avg_at_start))
        if percent_diff > threshold:
            msg = f"""
                observable avg over start frames = {avg_at_start:.3f}
                observable avg over end frames = {avg_at_end:.3f}
                but averages of this (and all other observables) should be constant over time
            """
>           assert percent_diff <= threshold, msg
E           AssertionError: 
E                         observable avg over start frames = 1335.000
E                         observable avg over end frames = 1202.800
E                         but averages of this (and all other observables) should be constant over time
E                     
E           assert 0.09902621722846446 <= 0.08

tests/test_local_move.py:89: AssertionError

This test has been flaky in the past, will play around with it. Easy answer is to change the asertion to 0.1, but 10% seems pretty great.

proteneer commented 9 months ago

Sigh - the phase is defined such that theres a local maxima at theta=phase. If we want to define a minima, we'd need to offset with new_phase=pi + old_phase

badisa commented 7 months ago

@badisa when you get a chance, not sure why this test is failing:

Think this should now be fixed by https://github.com/proteneer/timemachine/pull/1275