mwinokan / PoseButcher

Pose butcher segments a ligand into categories
https://posebutcher.winokan.com

Provide expansion vector limits to Syndirella #29

Open mwinokan opened 7 months ago

mwinokan commented 7 months ago
kate-fie commented 7 months ago

I'm seeing this workflow

  1. PoseButcher identifies, for each atom index on the base compound, whether or not to elaborate from it
  2. PoseButcher derives an estimate of the number of atoms that can be added
  3. Syndirella maps atoms from the base compound to the reactant
  4. Filter reactants

I'm going to try the Kartograf atom mapping tool

mwinokan commented 7 months ago

Kate To-Do's

{ atom_index: { "num_atom_added": X, "destination": Y }, ... }

Where X is an integer and Y is in ['protein', 'solvent', 'pocket']
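
A minimal sketch of how the dictionary proposed above could be checked before it is handed over to Syndirella. The function name `validate_vector_limits` and the constant `ALLOWED_DESTINATIONS` are hypothetical and not part of PoseButcher or Syndirella; only the keys and allowed values come from the comment above.

```python
# Hypothetical helper: neither PoseButcher nor Syndirella is known to ship this.
ALLOWED_DESTINATIONS = {"protein", "solvent", "pocket"}


def validate_vector_limits(limits):
    """Return `limits` unchanged, or raise ValueError if it breaks the proposed schema."""
    for atom_index, entry in limits.items():
        if isinstance(atom_index, bool) or not isinstance(atom_index, int) or atom_index < 0:
            raise ValueError(f"atom index must be a non-negative int, got {atom_index!r}")

        n_added = entry["num_atom_added"]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(n_added, bool) or not isinstance(n_added, int) or n_added < 0:
            raise ValueError(f"num_atom_added must be a non-negative int, got {n_added!r}")

        if entry["destination"] not in ALLOWED_DESTINATIONS:
            raise ValueError(f"unknown destination {entry['destination']!r}")
    return limits


example = {
    3: {"num_atom_added": 7, "destination": "protein"},
    5: {"num_atom_added": 1, "destination": "pocket"},
}
validate_vector_limits(example)
```

This sticks to the integer-only definition of X given above, so an unbounded vector would need its own convention.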

mwinokan commented 7 months ago

Max To-Do's

mwinokan commented 7 months ago

I finally got around to creating a butcher for the relaxed Ax0310a. I had to get rid of pockets P5 & P6 for now, but it works. This is the butcher directory, which you can import with PoseButcher.from_directory(). You will need to get posebutcher==0.0.19 from PyPI.

butcher_2a_x0310_noP5P6.zip

I will also write up a doc page with an example procedure. For now, here is the sample butcher.explore output for the base LXINEYASRREWNB-VIFPVBQESA-N (N.B. the fields 'destination' and 'max_atoms_added'):

[{'atom_index': 0,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([ 0.7915107 , -0.34428629, -0.50495322]),
  'intersections': {5.925: ('BAD', 'solvent space')},
  'first_intersection_distance': 5.925,
  'new_pocket': False,
  'last_intersection_distance': 5.925,
  'destination': 'solvent space',
  'max_atoms_added': inf,
  'success': True},
 {'atom_index': 1},
 {'atom_index': 2},
 {'atom_index': 3,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([-0.54640907, -0.36690284,  0.75287411]),
  'intersections': {3.177: ('BAD', 'protein clash')},
  'first_intersection_distance': 3.177,
  'new_pocket': False,
  'last_intersection_distance': 3.177,
  'destination': 'protein clash',
  'max_atoms_added': 7,
  'success': True},
 {'atom_index': 4},
 {'atom_index': 5,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([-0.03015966,  0.02861301,  0.99913547]),
  'intersections': {2.21: ('BAD', 'protein clash')},
  'first_intersection_distance': 2.21,
  'new_pocket': False,
  'last_intersection_distance': 2.21,
  'destination': 'protein clash',
  'max_atoms_added': 1,
  'success': True},
 {'atom_index': 6},
 {'atom_index': 7,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([ 0.06275531,  0.23879619, -0.96903981]),
  'intersections': {0.887: ('GOOD', 'pocket', "P1'"),
   5.631: ('BAD', 'solvent space')},
  'first_intersection_distance': 0.887,
  'new_pocket': True,
  'last_intersection_distance': 5.631,
  'destination': 'solvent space',
  'max_atoms_added': inf,
  'success': True},
 {'atom_index': 8,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([-0.63736877, -0.5844069 ,  0.50222468]),
  'intersections': {1.903: ('GOOD', 'pocket', "P2'"),
   1.995: ('BAD', 'protein clash')},
  'first_intersection_distance': 1.903,
  'new_pocket': True,
  'last_intersection_distance': 1.995,
  'destination': 'protein clash',
  'max_atoms_added': 1,
  'success': True},
 {'atom_index': 9,
  'origin': ('GOOD', 'pocket', 'P2'),
  'direction': array([ 0.17790221,  0.93172511, -0.31660562]),
  'intersections': {0.027: ('GOOD', 'pocket', "P1'"),
   2.367: ('BAD', 'protein clash')},
  'first_intersection_distance': 0.027,
  'new_pocket': True,
  'last_intersection_distance': 2.367,
  'destination': 'protein clash',
  'max_atoms_added': 1,
  'success': True},
 {'atom_index': 10,
  'origin': ('GOOD', 'pocket', 'P2'),
  'direction': array([-0.13990879,  0.9836797 ,  0.11313612]),
  'intersections': {1.63: ('BAD', 'protein clash')},
  'first_intersection_distance': 1.63,
  'new_pocket': False,
  'last_intersection_distance': 1.63,
  'destination': 'protein clash',
  'max_atoms_added': 1,
  'success': True},
 {'atom_index': 11},
 {'atom_index': 12,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([ 0.32430707, -0.38268234, -0.8650891 ]),
  'intersections': {4.516: ('BAD', 'solvent space')},
  'first_intersection_distance': 4.516,
  'new_pocket': False,
  'last_intersection_distance': 4.516,
  'destination': 'solvent space',
  'max_atoms_added': 7,
  'success': True},
 {'atom_index': 13},
 {'atom_index': 14,
  'origin': ('GOOD', 'pocket', 'P1'),
  'direction': array([-0.29913694,  0.55005234,  0.77971759]),
  'intersections': {0.638: ('GOOD', 'pocket', 'P2'),
   7.817: ('BAD', 'protein clash')},
  'first_intersection_distance': 0.638,
  'new_pocket': True,
  'last_intersection_distance': 7.817,
  'destination': 'protein clash',
  'max_atoms_added': inf,
  'success': True},
 {'atom_index': 15},
 {'atom_index': 16,
  'origin': ('BAD', 'solvent space'),
  'direction': array([-0.94793021,  0.28411648,  0.14389633]),
  'intersections': {2.212: ('GOOD', 'pocket', "P1'"),
   4.422: ('BAD', 'protein clash')},
  'first_intersection_distance': 2.212,
  'new_pocket': True,
  'last_intersection_distance': 4.422,
  'destination': 'protein clash',
  'max_atoms_added': 7,
  'success': True},
 {'atom_index': 17},
 {'atom_index': 18},
 {'atom_index': 19,
  'origin': ('BAD', 'solvent space'),
  'direction': array([ 0.55288204, -0.7298518 ,  0.40204204]),
  'intersections': {4.862: ('BAD', 'protein clash')},
  'first_intersection_distance': 4.862,
  'new_pocket': False,
  'last_intersection_distance': 4.862,
  'destination': 'protein clash',
  'max_atoms_added': 14,
  'success': True}]
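
As a rough sketch, the two highlighted fields could be pulled out of a list shaped like the output above as follows. The function name `summarise_vectors` is hypothetical, and the `sample` list is a cut-down copy of three records from the output, with `math.inf` standing in for the bare `inf`. Records that carry only 'atom_index' are skipped here, on the assumption that they describe no usable vector.

```python
import math


def summarise_vectors(explore_output):
    """Map atom_index -> {'max_atoms_added', 'destination'} for successful records."""
    summary = {}
    for record in explore_output:
        # Skip records without limit fields (only 'atom_index') or without success
        if "max_atoms_added" not in record or not record.get("success", False):
            continue
        summary[record["atom_index"]] = {
            "max_atoms_added": record["max_atoms_added"],
            "destination": record["destination"],
        }
    return summary


# Cut-down copy of three records from the output above
sample = [
    {"atom_index": 0, "destination": "solvent space",
     "max_atoms_added": math.inf, "success": True},
    {"atom_index": 1},
    {"atom_index": 3, "destination": "protein clash",
     "max_atoms_added": 7, "success": True},
]

limits = summarise_vectors(sample)
```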

mwinokan commented 7 months ago

See the doc page I wrote up

You will need molparse==0.0.18 and posebutcher==0.0.20

mwinokan commented 6 months ago

@kate-fie here is a quick summary of what I suggested today:

To compare elaboration E to base B (could be reactant superstructure R' vs R too):

  1. Classify the vectors expanding from B using posebutcher.explore
  2. Calculate all the MCS mappings of E onto B
  3. For each of those mappings:
    • Evaluate which vector limits are exceeded
    • Dismiss / don't place any elaboration E where, for every one of its mappings, at least one vector limit is exceeded
    • All other elaborations should be placed
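
To evaluate a single mapping, one first needs the number of atoms that E adds at each atom of B. Below is a rough, library-free sketch of that count. It assumes the elaboration is given as a heavy-atom adjacency dictionary and the mapping as `{elaboration_atom: base_atom}`; both formats and the name `atoms_added_per_vector` are hypothetical, and this is not how Syndirella or PoseButcher represents molecules. A fragment that attaches to several base atoms (e.g. a new ring) is counted in full at each attachment point, which is a conservative choice rather than an established rule.

```python
from collections import deque


def atoms_added_per_vector(adjacency, mapping):
    """Return {base_atom_index: number of unmapped atoms grown from that atom}."""
    added = {}
    seen = set()
    for start in adjacency:
        if start in mapping or start in seen:
            continue
        # Flood-fill one connected fragment of unmapped (newly added) atoms
        fragment_size = 0
        anchors = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            atom = queue.popleft()
            fragment_size += 1
            for neighbour in adjacency[atom]:
                if neighbour in mapping:
                    # The fragment is attached to this base atom
                    anchors.add(mapping[neighbour])
                elif neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        for base_atom in anchors:
            added[base_atom] = added.get(base_atom, 0) + fragment_size
    return added


# Toy elaboration: atoms 0-1-2 match the base, 3-4 grow from atom 2, 5 grows from atom 0
adjacency = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [0]}
mapping = {0: 0, 1: 1, 2: 2}
counts = atoms_added_per_vector(adjacency, mapping)
```

The resulting counts can then be compared against the 'max_atoms_added' value of the matching atom_index.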

Just a clarification on the conditional, I think the following pseudocode should do the trick:

place = False
for mapping in elaboration.mappings:
    for vector in mapping.vectors:
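        if not is_vector_valid(vector):
            # limit exceeded: give up on this mapping and try the next one
            break
    else:
        # no break occurred, so all vectors in this mapping are valid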
        place = True
        break
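
The same conditional can be written compactly with `any`/`all`. This is only a sketch under assumptions: each mapping is represented as `{base_atom_index: atoms_added}`, `limits` as `{base_atom_index: max_atoms_added}`, and base atoms without a limit are treated as not growable (limit 0). The name `should_place` and both data layouts are hypothetical, and `is_vector_valid` is given a made-up two-argument form here.

```python
import math


def is_vector_valid(atoms_added, max_atoms_added):
    """A vector is valid while the elaboration stays within its limit."""
    return atoms_added <= max_atoms_added


def should_place(mappings, limits):
    """True if at least one mapping keeps every vector within its limit."""
    return any(
        all(
            # Assumption: atoms with no recorded limit may not be grown at all
            is_vector_valid(n_added, limits.get(atom_index, 0))
            for atom_index, n_added in mapping.items()
        )
        for mapping in mappings
    )


# Limits taken from atom indices 0, 3 and 5 of the sample output
limits = {0: math.inf, 3: 7, 5: 1}

placed = should_place([{3: 9}, {0: 9}], limits)     # second mapping is within limits
rejected = should_place([{3: 9}, {5: 2}], limits)   # every mapping exceeds a limit
```

With no mappings at all, `should_place` returns False, matching the `place = False` default in the pseudocode.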