choderalab / ensembler

Automated omics-scale protein modeling and simulation setup.
http://ensembler.readthedocs.io/
GNU General Public License v2.0

Add a manual template? #72

Open rafwiewiora opened 8 years ago

rafwiewiora commented 8 years ago

I want to add a manual template to the modeling (unpublished structure). @sonyahanson John said you've done this before? Do you have any quick pointers / script?

Thanks so much!

rafwiewiora commented 8 years ago

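OK, so this is pretty simple, yet hacky unfortunately. After ensembler gather_templates and before ensembler align, all you need to do is add the protein-only structure to templates/structures-resolved/ and append its sequence to templates/templates-resolved-seq.fa.

So for now, I put the manual PDB in a manual_pdbs directory and run this between gather_templates and align: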

import os

import mdtraj as md

# Manual PDBs to be added go in manual_pdbs/
manual_pdbs = ['TDIX.pdb']

for pdb in manual_pdbs:
    # Template ID used for both the PDB filename and the FASTA header
    template_id = 'SETD8_HUMAN_%s_A' % os.path.splitext(pdb)[0]

    # Load the structure and keep only the protein atoms
    traj = md.load(os.path.join('manual_pdbs', pdb))
    traj = traj.atom_slice(traj.top.select('protein'))
    traj.save('templates/structures-resolved/%s.pdb' % template_id)

    # Append the sequence of the first chain to the resolved-sequence FASTA file
    resolved_seq = traj.top.to_fasta()[0]
    with open('templates/templates-resolved-seq.fa', 'a') as f:
        f.write('\n>%s\n%s\n' % (template_id, resolved_seq))
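
One caveat with appending to the FASTA file is that running the script twice adds the same template twice. A minimal sketch of a guard against that is below. The helper name append_fasta_record is hypothetical (it is not part of Ensembler), and it assumes each header line in templates-resolved-seq.fa is exactly '>' followed by the template ID.

```python
import os


def append_fasta_record(fasta_path, record_id, sequence):
    """Append a FASTA record unless record_id is already in the file.

    Hypothetical helper, not part of Ensembler. Assumes header lines are
    exactly '>' + record_id. Returns True if the record was written and
    False if it was skipped as a duplicate.
    """
    header = '>' + record_id
    needs_newline = False
    if os.path.exists(fasta_path):
        with open(fasta_path) as f:
            contents = f.read()
        # Skip the write if a record with this ID is already present
        if any(line.strip() == header for line in contents.splitlines()):
            return False
        # Make sure the new header starts on its own line
        needs_newline = bool(contents) and not contents.endswith('\n')
    with open(fasta_path, 'a') as f:
        if needs_newline:
            f.write('\n')
        f.write('%s\n%s\n' % (header, sequence))
    return True
```

In the loop above, this could stand in for the open/write step, e.g. append_fasta_record('templates/templates-resolved-seq.fa', 'SETD8_HUMAN_TDIX_A', resolved_seq).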

The only obstacle to just passing the PDB to the Ensembler API was its use of SIFTS files - I will need to write something that extracts the appropriate features from the PDB itself. Let me think about this a bit more and propose something (ultimately I think calling gather_templates twice would be good - e.g. first gather from UniProt, then ensembler gather_templates --gather_from manual_pdb --structure_path X).
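
As a rough illustration of what "extracting the appropriate features from the PDB" might look like, here is a sketch that reads the resolved residues of one chain straight from ATOM records, using the fixed column layout of the PDB format. Which features Ensembler actually takes from SIFTS is not spelled out in this thread, so this assumes residue numbers and one-letter codes are the minimum needed. The function names are hypothetical and not part of Ensembler.

```python
# Standard three-letter to one-letter amino acid codes
THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}


def resolved_residues(pdb_lines, chain_id):
    """Return [(residue_number, one_letter_code), ...] for one chain.

    Hypothetical helper. A residue counts as resolved if it has a CA atom
    in an ATOM record; HETATM records (e.g. modified residues) are ignored.
    Only the first model is read.
    """
    residues = []
    seen = set()
    for line in pdb_lines:
        if line.startswith('ENDMDL'):
            break
        if not line.startswith('ATOM'):
            continue
        # PDB columns: atom name 13-16, chain ID 22 (1-based)
        if line[12:16].strip() != 'CA' or line[21:22] != chain_id:
            continue
        # Residue number (columns 23-26) plus insertion code (column 27)
        key = (int(line[22:26]), line[26:27])
        if key in seen:
            continue  # alternate location of a residue already recorded
        seen.add(key)
        resname = line[17:20].strip()
        residues.append((key[0], THREE_TO_ONE.get(resname, 'X')))
    return residues


def resolved_sequence(pdb_lines, chain_id):
    """Join the one-letter codes of the resolved residues into a string."""
    return ''.join(code for _, code in resolved_residues(pdb_lines, chain_id))
```

With a file on disk this could be called as resolved_residues(open('manual_pdbs/TDIX.pdb'), 'A'). Mapping the residue numbers onto the UniProt sequence, which SIFTS normally provides, would still need a separate alignment step.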

@danielparton do you have any suggestions? I was wondering a couple of things about the code:

sonyahanson commented 8 years ago

Sorry for the delay here. I like your script for adding the new templates to the fasta file programmatically. I have just done this manually so far. I have a pretty complete description of how I've been using ensembler in the astrazeneca and dansu-dansu repos.

rafwiewiora commented 8 years ago

Cool, thanks!