Closed j-wags closed 8 years ago
Test out
Issue #56 should cover this much better in the interim while we await testing for a better program.
I'm pursuing the pymol route here. Key questions to answer will be: Which atoms are by default used for the alignment? Can this be changed? Could a pymol-based command line tool for the alignment offer more options than Schrodinger structalign?
Preliminary code, which performs the alignment using geometric (proximity to binding site) and secondary structure (helices and sheets) constraints. The atom selection is currently set to alpha carbons but can be modified. I do not see any option to apply different weights to different atoms. This will not run on nif1 as pymol is not yet available.
import __main__
__main__.pymol_argv = [ 'pymol', '-c'] #put -cq here to suppress pymol output
import pymol
pymol.finish_launching()
import sys
#### Helper function from http://pymolwiki.org/index.php/Ss
def ss(selection):
class SSE(object):
def __init__(self, start, typ):
self.start, self.typ = start, typ
self.end = -1
def __repr__(self):
return "%s-%s %s" % (self.start, self.end, self.typ)
pymol.stored.pairs = []
pymol.cmd.iterate("%s and n. ca" % selection, "stored.pairs.append((resi, ss))")
num, currentType = pymol.stored.pairs[0]
sses = [SSE(num, currentType)]
currentSSE = sses[0]
for resi, ssType in pymol.stored.pairs:
if ssType == currentType:
currentSSE.end = resi
else:
sses.append(SSE(resi, ssType))
currentSSE = sses[-1]
currentType = ssType
return sses
#for sse in sses:
# print sse
caAlignment = True
molPrefixes = ['largest_splitted_receptor1',
'smallest_splitted_receptor1',
'apo_splitted_receptor1',
'holo_splitted_receptor1',]
for molPrefix in molPrefixes:
pymol.cmd.load('/extra/banzai2/j5wagner/CELPP/2016_07_14_pymol_alignment/test_data/5ev8/%s.pdb' %(molPrefix))
### Geometric filter ###
# >>> help(pymol.cmd.pseudoatom)
#Help on function pseudoatom in module pymol.creating:
#
#pseudoatom(object='', selection='', name='PS1', resn='PSD', resi='1', chain='P', segi='PSDO', elem='PS', vdw=-1.0, hetatm=1, b=0.0, q=0.0, color='', label='', pos=None, state=0, mode='rms', quiet=1,)
center = open('/extra/banzai2/j5wagner/CELPP/2016_07_14_pymol_alignment/test_data/5ev8/center.txt').read().split()
center = [float(i.strip(',')) for i in center]
print center
pymol.cmd.pseudoatom('largest_lig_center', pos=center)
pymol.cmd.select('reference',"(/largest_splitted_receptor1//A//CA) and (byres (largest_lig_center around 15))" ) # Remove reference to chain A in actual deploy - Shuai's solution to issue 56 will cover this
for molPrefix in molPrefixes[1:]:
print
print
print '=========================='
print "PROCESSING", molPrefix
### Secondary structure filter ###
# Get resnums and their SS types - Can filter to "H" and "S" with list comp
#resiSS = pymol.cmd.iterate('/apo_splitted_receptor1////CA','(resi,resv,ss)')
sses = ss('/%s' %(molPrefix)) # not actually used, but cool
print sses # not actually used, but cool
pymol.stored.tuples = []
resiSS = pymol.cmd.iterate(molPrefix,'stored.tuples.append((resn,resv,ss))')
print pymol.stored.tuples
print "FILTERING FOR SS"
ss_near_bs = [i for i in pymol.stored.tuples if i[2] in ['H','S']]
print ss_near_bs
resv_sel_str = '+'.join([str(i[1]) for i in ss_near_bs])
print resv_sel_str
### 3D alignment ###
## API reference ##
## http://www.pymolwiki.org/index.php/Align
#cmd.align( string mobile, string target, float cutoff=2.0,
# int cycles=5, float gap=-10.0, float extend=-0.5,
# int max_gap=50, string object=None, string matrix='BLOSUM62',
# int mobile_state=0, int target_state=0, int quiet=1,
# int max_skip=0, int transform=1, int reset=0 )
# Remove the A chain label when this reaches production - Shuai's fix to issue 56 will only leave one chain.
if caAlignment:
alnOut = pymol.cmd.align('/%s//A//CA'%(molPrefix),
'reference'
#'/largest_splitted_receptor1//A/`%s/CA'%(resv_sel_str)
)# Remove reference to chain A in actual deploy - Shuai's solution to issue 56 will cover this
else:
alnOut = pymol.cmd.align('/%s//A//'%(molPrefix),
'reference',
#'/largest_splitted_receptor1//A//CA',
)# Remove reference to chain A in actual deploy - Shuai's solution to issue 56 will cover this
pymol.cmd.save('%s_aligned.pdb' %(molPrefix),
'/%s' %(molPrefix)
)
print alnOut
pymol.cmd.save('largest_reference.pdb' ,
'reference'
)
#print sys.stdout.read()
pymol.cmd.quit()
Meeting note: We want calpha, radius, and SS emphasis.
Implemented in current jeffdev, and appears to run properly and get qualitatively similar results to schrodinger structalign. Only currently being used for the chimera_proteinligprep stage. We should discuss if we want it in the evaluation stage, where we do another alignment.
nice! can you show some sample results at next celpp meeting?
On 7/28/2016 8:45 PM, j-wags wrote:
Implemented in current jeffdev, and appears to run properly and get qualitatively similar results to schrodinger structalign. Only currently being used for the chimera_proteinligprep stage. We should discuss if we want it in the evaluation stage, where we do another alignment.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/drugdata/D3R/issues/39#issuecomment-236087811, or mute the thread https://github.com/notifications/unsubscribe-auth/AQEJQHfV7ORpeKBfl4n4lYMd6Y9sKAtcks5qaXdwgaJpZM4IsN8r.
Sure. I'm out tomorrow but we can discuss wednesday
Pymol alignment implemented in pull request #71
In testing the updated docking stages, I've found a few weird alignment cases in the tarball.
For example: Week 19's apo-5b15_3vpe Week 19's apo-5e8z_4x2n (notably large distance, alignment may have skipped) Week 19's apo-5job_3t13 Week 22's apo-5b2z-1bkd Week 22's smallest-5ev8_2doo is aligned to the wrong domain of largest (docking box will not be drawn there, this is tough, maybe should be separate issue) Week 22's apo-5ewa_4f6h (again, wrong domain aligned, tough problem)