Acellera / htmd

HTMD: Programming Environment for Molecular Discovery
https://software.acellera.com/docs/latest/htmd/index.html
Other
253 stars 58 forks source link

Request for Generating TPR and NDX Files for Protein system Using HTMD #1053

Closed vas2201 closed 1 year ago

vas2201 commented 1 year ago

I am currently using HTMD for a project and need to generate TPR and NDX files for a system that I have built (Protein). I am not familiar with the process of generating these files using HTMD and was wondering if you could provide some guidance.

Specifically, I am looking for information on how to write TPR file and NDX files (all atoms ) for a system using HTMD. Any code snippets or documentation you can provide on this topic would be greatly appreciated.

Thank you for your time and assistance with this matter. Please let me know if you require any additional information from me.

Best regards, Vas

stefdoerr commented 1 year ago

I assume you are talking about Gromacs? I don't have any experience with Gromacs file formats unfortunately although moleculekit can read .gro files IIRC. For generating tpr files you probably need to use gmx grompp but I cannot help you much further with that.

vas2201 commented 1 year ago

Thank you for your response and assistance. Gromacs is capable of creating the files I need. I performed adaptive MD simulations using htmd and produced xtc files, which I believe are in the Gromacs format (please correct me if I'm mistaken). These files were used for analyzing conformational changes, and the results were satisfactory.

Now, I want to apply the methyl relaxation analysis to the same files (as described in this reference: https://github.com/fahoffmann/MethylRelax). However, this analysis requires the use of the ndx and tpr files that were utilized during the simulations. To ensure consistency, I plan to attempt to generate the TPR and NDX files from htmd. I will also try using gmx to generate the necessary files, and hopefully, they will be compatible with the htmd xtc files.

stefdoerr commented 1 year ago

Good luck! I will close this as it's not something we actively support but if you reach any conclusions or want to share something feel free to add to the issue. I personally have no experience with GROMACS so I cannot help you much with it.

vas2201 commented 1 year ago

As I anticipated, the topology files and atom numbers generated by GROMACS and HTMD do not match. However, I plan to attempt a workaround by conducting simulations using output files from HTMD with GROMACS, as I have already achieved sufficient conformational sampling. In principle, this approach should work. Thanks for your help.

vas2201 commented 1 year ago

I have an alternative approach for methyls of AILV using HTMD for performing tICA analysis, in which I define a network of methyl groups within 10 angstroms, which are undergoing significant conformational changes. Ultimately, I plan to include dihedrals for these methyl groups to improve the construction of the tICA analysis. As I am new to coding, could you please review the following code and share your thoughts or suggestions for improvement?

import numpy as np
from htmd.ui import Molecule
from htmd.projections.metricdistance import MetricDistance
from htmd.simlist import simlist
from htmd.projections.metric import Metric

# Function to calculate distance matrix
def calculate_distance_matrix(coords):
    dists = np.sqrt(np.sum((coords[:, None] - coords) ** 2, axis=-1))
    return dists

# Load the molecule
mol = Molecule('./filtered/filtered.pdb') # Replace with your molecule file

# Apply periodic boundary conditions by wrapping the molecule
#mol.wrap()

# Select the methyl carbon atoms in A, I, L, V residues
sel1 = mol.atomselect('((resname ALA and name CB) or ' \
                                '(resname ILE and (name CD1 or name CG2)) or ' \
                                '(resname LEU and name CD1) or ' \
                                '(resname VAL and name CG1))')
sel2 = mol.atomselect('((resname ALA and name CB) or ' \
                                '(resname ILE and (name CD1 or name CG2)) or ' \
                                '(resname LEU and name CD1) or ' \
                                '(resname VAL and name CG1))')

# Calculate the distance matrix for the selected atoms
coords = mol.coords[sel1, :, 0]
dists = calculate_distance_matrix(coords)

# Set the distance threshold to 10 angstroms and find atom pairs
threshold = 10
atom_pairs = np.argwhere(np.triu(dists < threshold, k=1))

# Create a MetricDistance object with the selected atom pairs and periodic distance calculations between selections
metric = MetricDistance(sel1=sel1, sel2=sel2, periodic='selections')
metr = Metric(fsims)
metr.set( MetricDistance(sel1=sel1, sel2=sel2, periodic='selections'))
data = metr.project()
data.fstep = 0.1
data.plotTrajSizes()
stefdoerr commented 1 year ago

You can also do the distance matrix with cdist from scipy https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html which does all-against-all distances

The only "mistake" I see here (beyond the fact that you define the metric twice at the end) is that you are not using the atom_pairs array. You can use the pairs like this as an example. That way it will only calculate distances between those pairs which are under 10A in the first frame of your reference molecule. Keep in mind that if the reference molecule is not indicative of what happens in the rest of the simulations you might be missing out on other interactions.

        sel1 = np.array([0, 1, 2]).reshape(-1, 1)
        sel2 = np.array([1, 2, 3]).reshape(-1, 1)
        res = MetricDistance(sel1, sel2, periodic=None, pairs=True)