openmm / pdbfixer

PDBFixer fixes problems in PDB files
Other
461 stars 115 forks source link

PDB clash pretty printer? #165

Closed nitroamos closed 6 years ago

nitroamos commented 6 years ago

Do you have a utility for reporting energies? I'm interested in something I could use to report the quality of the final structure wrt clashes, bad geometries, etc. Residue intra/inter energies is my ideal. The closest I could find was was a raw call to State.getForces.

peastman commented 6 years ago

There isn't a higher level routine for that. Depending on what quantity you want, it might (or might not) be easy to calculate. Just looking for any forces above a cutoff is a reasonable first pass at it.

nitroamos commented 6 years ago

For what it's worth, here's the code I wrote while debugging membrane placement issues:

from scipy.spatial.distance import pdist
import numpy

def getTag(a):
   return "%2s%-4s %3s %4s"%(a.residue.chain.id,
                             a.residue.id,
                             a.residue.name,a.name)

def printCloseDistances(topology,positions,cutoff=0.09*unit.nanometers):
   atoms = list(topology.atoms())
   print("Printing close distances:",cutoff,len(atoms))
   pos = positions.value_in_unit(unit.nanometers)
   mat = pdist(positions.value_in_unit(unit.nanometers),'sqeuclidean')
   cutoff2 = cutoff.value_in_unit(unit.nanometers)*cutoff.value_in_unit(unit.nanometers)
   close = numpy.where(mat < cutoff2)[0]
   print("Num close:",len(close))

   def calc_row_idx(k, n):
      return int(math.ceil((1/2.) * (- (-8*k + 4 *n**2 -4*n - 7)**0.5 + 2*n -1) - 1))

   def elem_in_i_rows(i, n):
      return i * (n - 1 - i) + (i*(i + 1))/2

   def calc_col_idx(k, i, n):
      return int(n - elem_in_i_rows(i + 1, n) + k)

   def condensed_to_square(k, n):
      i = calc_row_idx(k, n)
      j = calc_col_idx(k, i, n)
      return i, j

   closeDists = {}
   for k in close:
      closeDists[k] = mat[k]

   for k,dist2 in sorted(closeDists.items(), key=operator.itemgetter(1)):
      i,j = condensed_to_square(k,len(atoms))
      a1 = atoms[i]
      a2 = atoms[j]
      dist = unit.sqrt(dist2)
      print("%s : %s : %6s"%(getTag(a1),getTag(a2),dist),pos[i],pos[j])

   print("Done printing distances",len(close))

which may be invoked like this:

printCloseDistances(fixer.topology,fixer.positions,cutoff=0.05*unit.nanometers)

This isn't production ready code as it's extremely memory intensive. I think I was seeing >60gb when testing with waters and membrane! However, our headnode was able to accommodate it so it was met my need of needing something for debugging this Including it here in case someone else finds it useful.