samirelanduk / atomium

Python macromolecular parsing (with .pdb/.cif/.mmtf parsing and production)
https://atomium.bio
MIT License
102 stars 19 forks source link

Auto-creation of bonds. #40

Open BradyAJohnston opened 2 years ago

BradyAJohnston commented 2 years ago

I'm sorry if this is obvious, but I have looked around and tried to have a go at it myself, but I am unable to find a way to "auto" generate bonds from a pdb, based on proximity to other atoms etc. Is this implemented somewhere by any chance?

I am currently testing out & implementing atomium for use inside of blender which has enabled me to import pdb files which I couldn't do previously. Any extra information that I can squeeze out of this library would be a great help with not having to reimplement it myself!

samirelanduk commented 2 years ago

Hi - currently no, and it's probably the biggest gap in functionality. I'm currently close to finishing a 2.0 version of atomium, and overhauling the bond generation is vaguely planned for 2.1, though it's hard to say when that will be. I would like to implement something like the PyMol system where the user can generate bonds based on a variety of rules and cutoffs.

Glad it's useful nonetheless - let me know if there's anything else I can help with, and I'll leave this issue open to be closed during the planned 2.1 release.

(Those animations look incredible by the way.)

BradyAJohnston commented 2 years ago

Thanks! Atomium is crucial to making it all work smoothly, so thank you for developing the package!

The addon is now released into beta and so far atomium has been working flawlessly for importing and handling the structures: https://github.com/BradyAJohnston/MolecularNodes

In terms of a list of features that I would want after working with it so far (I am working with the pip release so unaware of any updated features in the dev branch):

BradyAJohnston commented 2 years ago

I implemented my own bond-generation code. I had to include my own dictionary for looking up Van der Waal's radii. If you have any idea for improvements I'd be interested in the meantime as well!

radii_dict = {
    "H"  : 1.10, 
    "He" : 1.40, 
    "Li" : 1.82, 
    "Be" : 1.53, 
    "B"  : 1.92, 
    "C"  : 1.70, 
    "N"  : 1.55, 
    "O"  : 1.52, 
    "F"  : 1.47, 
    "Ne" : 1.54, 
    "Na" : 2.27, 
    "Mg" : 1.73, 
    "Al" : 1.84, 
    "Si" : 2.10, 
    "P"  : 1.80, 
    "S"  : 1.80, 
    "Cl" : 1.75, 
    "Ar" : 1.88, 
    "K"  : 2.75, 
    "Ca" : 2.31, 
    "Sc" : 2.11, 

    # break in the elements, no longer in direct numerical order
    "Ni" : 1.63, 
    "Cu" : 1.40, 
    "Zn" : 1.39
}

def create_bonds(model, connect_cutoff = 0.35, search_distance = 3):
    """
    # from the pymol wiki: https://pymolwiki.org/index.php/Connect_cutoff
    # connect_cutoff = 0.35
    # Two atoms with VDW radii vdw1 and vdw2 are connected with a bond, if the following inequality is true:
    # distance <= connect_cutoff + (vdw1 + vdw2)/2
    """
    for atom in model.atoms():
        primary_atom = atom
        primary_radius = radii_dict[atom.element]
        nearby_atoms = atom.nearby_atoms(search_distance)

        for secondary_atom in nearby_atoms:
            if atom.element == "H":
                connect_adjust = -0.2
            elif secondary_atom.element == "S" & atom.element == "S":
                connect_adjust = 0.2
            else:
                connect_adjust = 0

            secondary_radius = radii_dict[secondary_atom.element]
            distance = atom.distance_to(secondary_atom)
            if distance <= ((connect_cutoff + connect_adjust) + (primary_radius + secondary_radius) / 2):
                primary_atom.bond(secondary_atom)
samirelanduk commented 2 years ago

Of your suggestions, 1 and 4 are already implemented in 2.0, so hopefully that should be available on PyPI soon. 2 would definitely be useful, and easy to implement. 3 as we discussed is definitely planned. 5 is interesting - I comitted to have them be sets a long time ago because I thought 'well they're just in 3D space, they don't really have an order, unlike residues in a chain' - but to be honest, them being unordered annoys me a lot too when I use atomium. Need to think about this a bit more.

Code looks great. I think it would be worth me adding an option to distance_to to just return the square of the distance, which is computationally easier to calculate and if you're just checking if the distance is below some threshold, you can just compare the square of the distance to the threshold.

BradyAJohnston commented 2 years ago

Yeah I can understand the approach, and it I initially though it wouldn't be an issue, but in a multi-state / multi-model PDB file, the same atoms across two different structures end up at different positions in the set. An option to return an ordered list instead of a set, but still have the set as a default maybe would be a good approach.

BradyAJohnston commented 2 years ago

Another item for my wishlist would be access to the space group / crystal symmetry / unit cell from a .pdb file when opened.

samirelanduk commented 2 years ago

Also planned for 2.0! In the mean time you can do:

import atomium
pdb = atomium.open("myfile.pdb", file_dict=True)
crystal_record = pdb["CRYST1"]

...to get the relevant record. Still needs splitting etc though.

BradyAJohnston commented 2 years ago

Ah excellent, thank you!

BradyAJohnston commented 1 year ago

Is there a planned ETA for atomium 2.0? I'm thinking about making some changes to Molecular Nodes internals that would be made easier by the changes planned for 2.0.

samirelanduk commented 1 year ago

Unfortunately it's likely to be several months - I am completely consumed by my main work at the moment. :/

BradyAJohnston commented 1 year ago

Completely understand! Might take a look at some of it in a few months myself but I think everyone is pretty busy these days