Acellera / htmd

HTMD: Programming Environment for Molecular Discovery
https://software.acellera.com/docs/latest/htmd/index.html
Other
261 stars 59 forks source link

Imaging issues #995

Closed shozebhaider closed 3 years ago

shozebhaider commented 3 years ago

Hi ,

I have just run an adaptive simulation of a multimeric protein. I see that out of my 300 simulations over half have imaging issues with one subunit. I was wondering what is the most efficient way to do imaging using HTMD? Is it possible to do it on the final filtered trajectories? or would you advise to do it on the trajectories in the data folder and then reprocess them into filtered folders.

Many thanks.

stefdoerr commented 3 years ago

Hi Shozeb. What exactly do you mean by imaging issues? I am not familiar with the term. You mean that the wrapping seems weird? Can you post an image? Thanks!

shozebhaider commented 3 years ago

@stefdoerr Yes, the wrapping is weird. Attaching two figures.

traj1

traj2

stefdoerr commented 3 years ago

These are NPT simulations I guess? I think this only happens if you use a barostat in ACEMD3. You can always re-wrap your simulations with moleculekit

from moleculekit.molecule import Molecule
mol = Molecule("./structure.prmtop")
mol.read("output.xtc")
mol.wrap("protein")

This will re-wrap the simulation around the protein.

shozebhaider commented 3 years ago

so I write a loop that does it for all my trajectory files (in data) and then re-filter it to output it in my filtered folders ? Is this the efficient way to do this ?

stefdoerr commented 3 years ago

I think you can re-wrap the filtered trajectories directly without bothering with the unfiltered ones. Yes a simple loop should be ok

from moleculekit.molecule import Molecule
from glob import glob
import os

for ff in glob("./filtered/*/"):
    folder_name = os.path.basename(os.path.abspath(ff))
    mol = Molecule("./filtered/filtered.pdb")
    mol.read(os.path.join(ff, "output.filtered.xtc"))
    mol.wrap("protein")
    outdir = os.path.join("filtered_wrapped", folder_name)
    os.makedirs(outdir, exist_ok=True)
    mol.write(os.path.join(outdir, "output.filtered.xtc"))

This ought to work (I didn't test it though)

shozebhaider commented 3 years ago

Thanks. Let me test the code and then I shall close the ticket once done.

shozebhaider commented 3 years ago

Tested the script. The script works fine. The re-wrapping doesn't work for some strange reason. I can re-wrap using other methods but not using Moleculekit. If you want we can explore this further or just close this ticket. Upto you @stefdoerr

stefdoerr commented 3 years ago

So it writes out the files in a new folder called filtered_wrapped but the simulations are still not wrapped? Can you check what this gives? If there is a box defined it ought to wrap correctly.

    mol = Molecule("./filtered/filtered.pdb")
    mol.read(os.path.join(ff, "output.filtered.xtc"))
    print(mol.box)
shozebhaider commented 3 years ago

It has all the box information but still does not wrap. The output for 2 trajectories I tested it on is

[[100.928604 100.928604 100.928604 ... 100.928604 100.928604 100.928604] [101.0954 101.0954 101.0954 ... 101.0954 101.0954 101.0954 ] [ 96.140076 96.140076 96.140076 ... 96.140076 96.140076 96.140076]] [[100.928604 100.928604 100.928604 ... 100.928604 100.928604 100.928604] [101.0954 101.0954 101.0954 ... 101.0954 101.0954 101.0954 ] [ 96.140076 96.140076 96.140076 ... 96.140076 96.140076 96.140076]]

stefdoerr commented 3 years ago

Could you send me maybe by email a filtered.pdb and a single output.filtered.xtc so I can take a look?

stefdoerr commented 3 years ago

If you have a filtered.psf file as well it would be perfect

stefdoerr commented 3 years ago

Ok, the issue is that you have 4 proteins which I didn't originally realize. That means that if you wrap with selection "protein" it will calculate the center of mass of all 4 proteins which lies in the middle between them all and will not wrap them as you expected. There is a second issue that your proteins all have the same segid and chain number for some reason which is a bit confusing to me.

But in any case the solution is to just wrap around the first of the four proteins with mol.wrap("protein and resid 1 to 266")

shozebhaider commented 3 years ago

works well now. Thanks @stefdoerr