GilsonLabUCSD / pAPRika

Advanced toolkit for binding free energy calculations
BSD 3-Clause "New" or "Revised" License
31 stars 14 forks source link

IndexError when building restraints #204

Open lgsmith opened 5 days ago

lgsmith commented 5 days ago

I've been trying to modify tutorial-08 to do a free energy ligand binding calculation on Bromodomain-BRD4, very similar to the one in the APR paper from 2017. With my setup, Tutorial-08 will run to completion. When I swap in my own input files and change the box dimensions and offset of the dummy atoms, and the atom masks that need changing in the static restraint cell, I get an error upon trying to build the first static restraint. I am copying my traceback here, but am happy to provide whatever other information is needed. This is a clean conda env I built yesterday to contain openmmforcefields, openff-nagl, openff-toolkit, and paprika.

The first restraint call is to make a restraint between host atom 1 and dummy atom 1, so it's giving a restraint mask list of length 2 to the static_DAT_restraint constructor. The first part of the cell looks like this:

# use statics from above for ligand (G1, G2)
G1 = f":{ligname}@N4x"
G2 = f":{ligname}@Cl1x"  # this atom will 'lead' the pulling.
# G1 = ":ROC@C14x"
# G2 = ":ROC@C8x"
# CA atoms for residues Ile110, Thr131, Phe157 from paper
# CA atoms for residues Ile69, Thr90, Phe116 our model
H1 = ":69@CA"
H2 = ":90@CA"
H3 = ":116@CA"
# these just need to match the above build step
D1 = ":DM1@DUM"
D2 = ":DM2@DUM"
D3 = ":DM3@DUM"

static_restraints = []
r = restraints.static_DAT_restraint(
    restraint_mask_list=[D1, H1],
    num_window_list=windows,
    ref_structure=f"{compdir}/aligned_solvated_dummy_structure.pdb",
    force_constant=5.0,
    amber_index=False,  # Because we're using OpenMM
)

And here's the traceback:

IndexError                                Traceback (most recent call last)
Cell In[22], line 16
     13 D3 = ":DM3@DUM"
     15 static_restraints = []
---> 16 r = restraints.static_DAT_restraint(
     17     restraint_mask_list=[D1, H1],
     18     num_window_list=windows,
     19     ref_structure=f"{compdir}/aligned_solvated_dummy_structure.pdb",
     20     force_constant=5.0,
     21     amber_index=False,  # Because we're using OpenMM
     22 )
     23 static_restraints.append(r)
     24 r = restraints.static_DAT_restraint(
     25     restraint_mask_list=[H1, H2, H3],
     26     num_window_list=windows,
   (...)
     29     amber_index=False,
     30 )

File ~/miniconda3/envs/paprika/lib/python3.12/site-packages/paprika/restraints/restraints.py:988, in static_DAT_restraint(restraint_mask_list, num_window_list, ref_structure, force_constant, continuous_apr, amber_index)
    985 if len(restraint_mask_list) == 2:
    986     # Distance restraint
    987     if reference_trajectory.top.has_box():
--> 988         target = pytraj.distance(reference_trajectory, mask_string, image=True)[0]
    989         logger.debug("Calculating distance with 'image = True' ...")
    990     else:

IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed

As you can see this probably means there's something wrong with the inputs to the pytraj distance calculation. This could be down to my masks being broken, although I am not sure how they would be. To me it looks like it thinks it's going to be operating on an array of frames, but of course per the tutorial there is just a single frame (when I visualize the aligned_solvated_dummy_structure.pdb in the compdir it looks like I would expect). Any troubleshooting ideas would be much appreciated.

slochower commented 5 days ago

It may be helpful to take your mask and your structure and just seeing what this command gives you pytraj.distance(reference_trajectory, mask_string, image=True).

jaketanderson commented 5 days ago

I recommend trying the suggestion given above. Also, could you share your input files? The tutorial masks are written with the expectation that each dummy atom has the same atom name (DUM) but a different residue name (DM1, DM2, DM3). Maybe your structure has the atom names and the residue names flipped?

Coincidentally, I have also written a notebook based on tutorial-8 that calculates ABFEs on BRD4. Maybe you can take a look and compare my code and structures to yours? If any of the structures or code is useful to you, please make use of it https://github.com/jaketanderson/PL-ABFE-BRD4

lgsmith commented 4 days ago

Thank you both for your responses. Yes, this is how this code reports a bad selection string, which uncovered an upstream problem with my notebook. For posterity, one can't have topology.to_openmm(ensure_unique_atom_names=True) for producing an openMM topology for an instance of Modeller if one is using openff for ligand parameters but wants to have a protein in the system. Maybe the openmm force field will know what to do with the protein atoms, or maybe it won't--it doesn't emit any errors--but if you're trying to find particular atoms on particular residues, say with the mask @CA, they'll return the empty array, which has zero dimension and therefore produces the traceback I got.

I would have really benefited from some better error handling here. Would you guys appreciate it if I added a try except clause in a few spots across restraints.py to catch these types of issues? If yes, do you have a sense for what you'd think a good intervention should be? A minimal solution could look something like this:

try:
    target = pytraj.distance(reference_trajectory, mask_string, image=True)[0]
except IndexError as e:
    print(mask_string, 
        'seems to have found an unexpected number of atoms (possibly zero). Validate these atom masks',  
        *restraint_mask_list)
    raise e

@jaketanderson Thanks for the alternative notebook. I'll definitely look through it.

jaketanderson commented 4 days ago

@lgsmith adding better checking here would be a good idea. Maybe something like this three-line implementation would do the job?

    # Target value
    rest_type = "distance"
    mask_string = " ".join(restraint_mask_list)
    if len(restraint_mask_list) != pytraj.select(restraint_mask_list, reference_trajectory):
        logger.error(f"There are {len(restraint_mask_list)} masks specified, but {pytraj.select(restraint_mask_list, reference_trajectory)} atoms match these masks.")
        raise IndexError("The number of selection masks doesn't match the number of atoms selected.")
    if len(restraint_mask_list) == 2:
        # Distance restraint

Original: https://github.com/GilsonLabUCSD/pAPRika/blob/c0e52bd9b9b4e77c0005180730fe8f5cb2f9c34c/paprika/restraints/restraints.py#L982-L986