pegerto / mdakit_sasa

Toolkit to calculate the surface accessible area of a molecular dynamic trajectory
https://mdakit-sasa.readthedocs.io/en/latest/index.html
GNU General Public License v2.0
4 stars 2 forks source link

Unable to calculate the SASA of a molecule #25

Open wisecashew opened 6 days ago

wisecashew commented 6 days ago

I am running a molecular dynamics simulation of a single polymer chain in water. I want to calculate the solvent accessible surface area of the polymer over the course of the simulation.

Per the example given here, I tried computing the SASA of the polymer using the following piece of code:

import MDAnalysis as mda
from mdakit_sasa.analysis.sasaanalysis import SASAAnalysis

if __name__=="__main__":

    topo = "system.data" # this is my topology
    traj = "coords.lammpstrj"                           # this is my trajectory
    dt   = 2                                                          # this is my timestep

    # generate the universe object
    u = mda.Universe(topo, traj, format="LAMMPSDUMP", lengthunit="angstrom", timeunit="fs", dt=dt)
    print(f"Dimensions of the universe: {u.dimensions}", flush=True)

    # time to play with molecules in the universe
    print("Begin selecting some atoms for the polymer...", end=' ', flush=True)
    polymer = u.select_atoms('resid 1')
    print("done!")

    analysis = SASAAnalysis(u)
    analysis.run(polymer)
         print(analysis.results.total_area)

I expected the program to print out a list of SASA values of the polymer.

Instead, I see the following error:

Dimensions of the universe: [84.75651 84.75651 84.75651 90.      90.      90.     ]
Begin selecting some atoms for the polymer... done!
Traceback (most recent call last):
  File "/scratch/gpfs/user/MC_POLYMER/polymer_lattice/lattice_md/py_analysis/LAMMPS/MDA_reader.py", line 22, in <module>
    analysis.run()
  File "/home/user/.conda/envs/phase/lib/python3.9/site-packages/MDAnalysis/analysis/base.py", line 448, in run
    self._single_frame()
  File "/home/user/.conda/envs/phase/lib/python3.9/site-packages/mdakit_sasa/analysis/sasaanalysis.py", line 93, in _single_frame
    structure.addAtom(a.type.rjust(2), a.resname, a.resnum.item(), a.segid, x, y, z)
  File "/home/user/.conda/envs/phase/lib/python3.9/site-packages/MDAnalysis/core/groups.py", line 4190, in __getattr__
    return super(Atom, self).__getattr__(attr)
  File "/home/user/.conda/envs/phase/lib/python3.9/site-packages/MDAnalysis/core/groups.py", line 4065, in __getattr__
    raise NoDataError(err.format(singular=cls.singular))
MDAnalysis.exceptions.NoDataError: This Universe does not contain resname information

How do I obtain the SASA of the polymer from a trajectory? Further, how does mdakit_sasa obtain the VdW radii to calculate SASA?

I am using Python 3.9.20, MDAnalysis 2.7.0, and the latest version of mdakit-sasa 0.2.6 (it was pip'd today, Sep 24, 2024) on Linux. The conda environment (conda list) is in conda.txt, the pip environment is in pip.txt.

I have attached the trajectory file (coords.txt) and the datafile (system.data.txt) to this post. I had to change the extensions to these files so that Github would accept them.

conda.txt coords.txt system.data.txt pip.txt

pegerto commented 6 days ago

Hi @wisecashew,

Thank you very much for reporting this issue. I'm not very familiar with the LAMMPSDUMP format myself, so let me investigate and get back to you shortly.

This kit builds on the FreeSASA calculation. As the error indicates, the file might not contain the specific residue names. FreeSASA uses these names to extract the specific van der Waals radii. An example of the classification can be found here:

NACCESS

I'll run some tests and get back to you with an update on my findings.

pegerto commented 6 days ago

Hi @wisecashew

I added a new patch version, that safely access the resname and will allow you to run the computation, but the results are not to be trusted.

Unfortunately, after reading a bit more about your file format it will not compute correctly the vdW radii, and will asume 0, that is due the fact that the atom type do not contain a specific name for the guesser. I still looking into the issue and can possible guess following the atom mass, but will double check this with @alepandini .

Regards.

wisecashew commented 5 days ago

Hi @pegerto, thanks for looking into this!

I don't know if this helps but I wrote a calculator using the freesasa module. Here is my script for it:

import lmp_settings
import MDAnalysis as mda
import freesasa
import numpy as np

if __name__=="__main__":

    settings = "pnipam_tip4p-ice-water.settings"
    topo     = "pnipam_tip4p-ice-water.r1.data"
    traj     = "coords.lammpstrj"
    dt       = 2

    # read the settingsfile and keep all the forcefield information in one object
    my_settings = lmp_settings.Settings(settings)
    my_settings.read_settings()

    # generate the universe object
    u = mda.Universe(topo, traj, format="LAMMPSDUMP", lengthunit="angstrom", timeunit="fs", dt=dt)
    print(f"Dimensions of the universe: {u.dimensions}", flush=True)

    print("Begin selecting some atoms for the polymer...", end=' ', flush=True)
    polymer = u.select_atoms('resid 1')
    print("done!")

    # positions in MDA are shifted so that xlo, ylo, zlo = 0
    # shift = np.array([1.5620737849713620e-01, 1.5620737849713620e-01, 1.5620737849713620e-01])
    vdw_radii = list()
    for atom in polymer.atoms:
        # pair_coeff is a dictionary that holds the vdw radius and epsilon of each atom type
        vdw_radii.append(my_settings.pair_coeff[int(atom.type)][1])

    parameters = freesasa.Parameters()

    print(f"Algorith is {parameters.algorithm()}.")
    print(f"Probe radius is {parameters.probeRadius()}Å.")

    sasa_container = list()

    for idx, ts in enumerate(u.trajectory):
        polymer.unwrap()
        coords = polymer.positions 
        coords = coords.flatten().tolist()
        result = freesasa.calcCoord(coords, vdw_radii)
        sasa_container.append(result.totalArea()) # this is an array of result objects

This script prints out the SASA for the trajectory. If you can create a parameter like pair_coeff or vdw_radii for your function which the user has to manually feed into the program, I think that could solve the problem?

pegerto commented 5 days ago

Hi @wisecashew

This code makes sense you mapping each atom type from system to specific VdW raddi, I can easily add this mapping as optional to the SASAAnalysis class

May I ask you how you create pair_coeff is it a manual mapping in your settings file, is it a manual mapping ?

Regards.

wisecashew commented 5 days ago

It is a manual mapping, yes. the "Settings" object has the method "read_settings" which goes into the settingsfile (file with force field information).

pair_coeff is a simple dictionary that takes the atom type as a key and returns a tuple (epsilon, sigma) i.e. the LJ parameters of the atoms.