Acellera / htmd

HTMD: Programming Environment for Molecular Discovery
https://software.acellera.com/docs/latest/htmd/index.html
Other
253 stars 58 forks source link

RuntimeError (adaptive MD for RNA): when using a dihedral with an atom name containing a single quote #1056

Open vas2201 opened 1 year ago

vas2201 commented 1 year ago

I encountered a RuntimeError when running the ad.run() function using a dihedral with an atom name containing a single quote (e.g., "O3'"). Here's the code I used:

atom1_alpha_Dihang_residue_6 = {'name': "O3'", 'resid': 5, 'segid': 'RNAA', 'chain': ' '}
atom2_alpha_Dihang_residue_6 = {'name': 'P', 'resid': 6, 'segid': 'RNAA', 'chain': ' '}
atom3_alpha_Dihang_residue_6 = {'name': 'O5', 'resid': 6, 'segid': 'RNAA', 'chain': ' '}
atom4_alpha_Dihang_residue_6 = {'name': 'C5', 'resid': 6, 'segid': 'RNAA', 'chain': ' '}
dih_alpha_6 = [Dihedral(atom1_alpha_Dihang_residue_6, atom2_alpha_Dihang_residue_6, atom3_alpha_Dihang_residue_6, atom4_alpha_Dihang_residue_6)]

all_dih = dih_alpha_6
print(all_dih)

# The output is as follows:
# name  resid   insertion   chain   segid
# O3'   5               RNAA
# P 6               RNAA
# O5    6               RNAA
# C5    6               RNAA

**************************************************************************
When I proceeded to run ad.run(), I got the following error:
RuntimeError: Expected one atom from atomselection {'name': "O3'", 'resid': 5, 'segid': 'RNAA', 'chain': ' ', 'insertion': ''}. Got 0 instead.

I believe the issue might be related to the atom name containing a single quote.

Please advise on this matter. note : I uploaded notebook for quick look. https://drive.google.com/file/d/148kbylnU8BTyF7sTz4xXRNbbQnNBVWrs/view?usp=share_link Thank you

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[88], line 1
----> 1 ad.run()

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptive.py:196, in AdaptiveBase.run(self)
    194 # If currently running simulations are lower than nmin start new ones to reach nmax number of sims
    195 if self._running <= self.nmin and epoch < self.nepochs:
--> 196     flag = self._algorithm()
    197     if flag is False:
    198         self._unsetLock()

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptiverun.py:208, in AdaptiveMD._algorithm(self)
    207 def _algorithm(self):
--> 208     data = self._getData(self._getSimlist())
    209     if not self._checkNFrames(data):
    210         return False

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptiverun.py:246, in AdaptiveMD._getData(self, sims)
    241 # if self.contactsym is not None:
    242 #    contactSymmetry(data, self.contactsym)
    244 if self.ticadim > 0:
    245     # tica = TICA(metr, int(max(2, np.ceil(self.ticalag))))  # gianni: without project it was tooooo slow
--> 246     data = metr.project()
    247     data.dropTraj()  # Drop before TICA to avoid broken trajectories
    248     # 1 < ticalag < (trajLen / 2)

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/projections/metric.py:159, in Metric.project(self, njobs)
    157     for proj in self.projectionlist:
    158         if isinstance(proj, Projection):
--> 159             proj._setCache(uqMol)
    160 else:
    161     logger.warning(
    162         "Cannot calculate description of dimensions due to different topology files for each trajectory."
    163     )

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/projection.py:31, in Projection._setCache(self, mol)
     30 def _setCache(self, mol):
---> 31     resdict = self._calculateMolProp(mol)
     32     self._cache.update(resdict)

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/metricdihedral.py:839, in MetricDihedral._calculateMolProp(self, mol, props)
    836 else:
    837     dihedrals = ensurelist(self._dihedrals)
--> 839 res["dihedrals"] = Dihedral.dihedralsToIndexes(mol, dihedrals, protsel)
    840 return res

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/metricdihedral.py:136, in Dihedral.dihedralsToIndexes(mol, dihedrals, sel)
    134     atomsel = atomsel & selatoms
    135     if np.sum(atomsel) != 1:
--> 136         raise RuntimeError(
    137             "Expected one atom from atomselection {}. Got {} instead.".format(
    138                 a, np.sum(atomsel)
    139             )
    140         )
    141     idx.append(np.where(atomsel)[0][0])
    142 indexes.append(idx)

RuntimeError: Expected one atom from atomselection {'name': "O3'", 'resid': 5, 'segid': 'RNAA', 'chain': ' ', 'insertion': ''}. Got 0 instead.
vas2201 commented 1 year ago

FYI ... If I use single quote around O3' atom as follows .. atom1_alpha_Dihang_residue_6 = {'name': 'O3'', 'resid': 5, 'segid': 'RNAA', 'chain': ' '}

it is causing syntax error , that pointing out at 'resid' Cell In[30], line 1 atom1_alpha_Dihang_residue_6 = {'name': 'O3'', 'resid': 5, 'segid': 'RNAA', 'chain': ' '}

SyntaxError: invalid syntax

stefdoerr commented 1 year ago

I think the issue is simply that you put chain = " " meaning that the chain should be an empty space instead of an empty string. Try with:

atom1_alpha_Dihang_residue_6 = {'name': "O3'", 'resid': 5, 'segid': 'RNAA', 'chain': ''}

Also make sure the segid is RNAA by doing print(mol.segid)

vas2201 commented 1 year ago

The above issue is resolved in the Charmm force field. However, when I switched to amber forcefield for RNA. I am encountering the following issues. Looks like the default amber force field not recognizing the RNA. Is there any way that I can specify the amber force field for RNA?.

I define the default amber force fields as per documentation as follows. topos_amber = amber.defaultTopo() frcmods_amber = amber.defaultParam() bmol_amber = amber.build(rna, topo=topos_amber, param=frcmods_amber, outdir='./build_amber')

Attached files below for your reference https://drive.google.com/file/d/1HGo6KUalPOYgSrtD9vsBXov5FkvUNHIn/view?usp=share_link


BuildError Traceback (most recent call last) Cell In[13], line 3 1 topos_amber = amber.defaultTopo() 2 frcmods_amber = amber.defaultParam() ----> 3 bmol_amber = amber.build(rna, topo=topos_amber, param=frcmods_amber, outdir='./build_amber')

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/builder/amber.py:469, in build(mol, ff, topo, param, prefix, outdir, caps, ionize, saltconc, saltanion, saltcation, disulfide, teleap, teleapimports, execute, atomtypes, offlibraries, gbsa, igb) 466 _detect_cofactors_ncaa_ptm(mol, param, topo) 468 if ionize: --> 469 molbuilt = _build( 470 mol, 471 ff=ff, 472 topo=topo, 473 param=param, 474 prefix=prefix, 475 outdir=outdir, 476 disulfide=disulfide, 477 teleap=teleap, 478 teleapimports=teleapimports, 479 execute=execute, 480 atomtypes=atomtypes, 481 offlibraries=offlibraries, 482 gbsa=gbsa, 483 igb=igb, 484 ) 485 shutil.move( 486 os.path.join(outdir, "structure.crd"), 487 os.path.join(outdir, "structure.noions.crd"), 488 ) 489 shutil.move( 490 os.path.join(outdir, "structure.prmtop"), 491 os.path.join(outdir, "structure.noions.prmtop"), 492 )

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/builder/amber.py:728, in _build(mol, ff, topo, param, prefix, outdir, disulfide, teleap, teleapimports, execute, atomtypes, offlibraries, gbsa, igb) 726 os.chdir(currdir) 727 if errors: --> 728 raise BuildError( 729 errors 730 + [f"Check {logpath} for further information on errors in building."] 731 ) 732 logger.info("Finished building.") 734 if ( 735 os.path.exists(os.path.join(outdir, "structure.crd")) 736 and os.path.getsize(os.path.join(outdir, "structure.crd")) != 0 737 and os.path.getsize(os.path.join(outdir, "structure.prmtop")) != 0 738 ):

BuildError: UnknownResidueError("Unknown residue(s) ['URA'] found in the input structure. You are either missing a topology definition for the residue or you need to rename it to the correct residue name") MissingAtomTypeError('Missing atom type for [".R<DG5 1>.A<O2\' 32>", ".R<DG5 1>.A<HO2\' 33>", ".R<DG 2>.A<O2\' 34>", ".R<DG 2>.A<HO2\' 35>", ".R<DA 3>.A<O2\' 33>", ".R<DA 3>.A<HO2\' 34>", \'.R<URA 4>.A<P 1>\', \'.R<URA 4>.A<O1P 2>\', \'.R<URA 4>.A<O2P 3>\', ".R<URA 4>.A<O5\' 4>", ".R<URA 4>.A<C5\' 5>", ".R<URA 4>.A<C4\' 6>", ".R<URA 4>.A<O4\' 7>", ".R<URA 4>.A<C3\' 8>", ".R<URA 4>.A<O3\' 9>", ".R<URA 4>.A<C2\' 10>", ".R<URA 4>.A<O2\' 11>", ".R<URA 4>.A<C1\' 12>", \'.R<URA 4>.A<N1 13>\', \'.R<URA 4>.A<C2 14>\', \'.R<URA 4>.A<O2 15>\', \'.R<URA 4>.A<N3 16>\', \'.R<URA 4>.A<C4 17>\', \'.R<URA 4>.A<O4 18>\', \'.R<URA 4>.A<C5 19>\', \'.R<URA 4>.A<C6 20>\', ".R<URA 4>.A<H5\' 21>", ".R<URA 4>.A<H5\'\' 22>", ".R<URA 4>.A<H4\' 23>", ".R<URA 4>.A<H3\' 24>", ".R<URA 4>.A<H2\' 25>", ".R<URA 4>.A<HO2\' 26>", ".R<URA 4>.A<H1\' 27>", \'.R<URA 4>.A<H3 28>\', \'.R<URA 4>.A<H5 29>\', \'.R<URA 4>.A<H6 30>\', ".R<DC 5>.A<O2\' 31>", ".R<DC 5>.A<HO2\' 32>", ".R<DA 6>.A<O2\' 33>", ".R<DA 6>.A<HO2\' 34>", ".R<DA 7>.A<O2\' 33>", ".R<DA 7>.A<HO2\' 34>", \'.R<URA 8>.A<P 1>\', \'.R<URA 8>.A<O1P 2>\', \'.R<URA 8>.A<O2P 3>\', ".R<URA 8>.A<O5\' 4>", ".R<URA 8>.A<C5\' 5>", ".R<URA 8>.A<C4\' 6>", ".R<URA 8>.A<O4\' 7>", ".R<URA 8>.A<C3\' 8>", ".R<URA 8>.A<O3\' 9>", ".R<URA 8>.A<C2\' 10>", ".R<URA 8>.A<O2\' 11>", ".R<URA 8>.A<C1\' 12>", \'.R<URA 8>.A<N1 13>\', \'.R<URA 8>.A<C2 14>\', \'.R<URA 8>.A<O2 15>\', \'.R<URA 8>.A<N3 16>\', \'.R<URA 8>.A<C4 17>\', \'.R<URA 8>.A<O4 18>\', \'.R<URA 8>.A<C5 19>\', \'.R<URA 8>.A<C6 20>\', ".R<URA 8>.A<H5\' 21>", ".R<URA 8>.A<H5\'\' 22>", ".R<URA 8>.A<H4\' 23>", ".R<URA 8>.A<H3\' 24>", ".R<URA 8>.A<H2\' 25>", ".R<URA 8>.A<HO2\' 26>", ".R<URA 8>.A<H1\' 27>", \'.R<URA 8>.A<H3 28>\', \'.R<URA 8>.A<H5 29>\', \'.R<URA 8>.A<H6 30>\', ".R<DA 9>.A<O2\' 33>", ".R<DA 9>.A<HO2\' 34>", ".R<DG 10>.A<O2\' 34>", ".R<DG 10>.A<HO2\' 35>", ".R<DC 11>.A<O2\' 31>", ".R<DC 11>.A<HO2\' 32>", ".R<DA 12>.A<O2\' 33>", ".R<DA 12>.A<HO2\' 34>", ".R<DG 13>.A<O2\' 34>", ".R<DG 13>.A<HO2\' 35>", ".R<DG 14>.A<O2\' 34>", ".R<DG 14>.A<HO2\' 35>", \'.R<URA 15>.A<P 1>\', \'.R<URA 15>.A<O1P 2>\', \'.R<URA 15>.A<O2P 3>\', ".R<URA 15>.A<O5\' 4>", ".R<URA 15>.A<C5\' 5>", ".R<URA 15>.A<C4\' 6>", ".R<URA 15>.A<O4\' 7>", ".R<URA 15>.A<C3\' 8>", ".R<URA 15>.A<O3\' 9>", ".R<URA 15>.A<C2\' 10>", ".R<URA 15>.A<O2\' 11>", ".R<URA 15>.A<C1\' 12>", \'.R<URA 15>.A<N1 13>\', \'.R<URA 15>.A<C2 14>\', \'.R<URA 15>.A<O2 15>\', \'.R<URA 15>.A<N3 16>\', \'.R<URA 15>.A<C4 17>\', \'.R<URA 15>.A<O4 18>\', \'.R<URA 15>.A<C5 19>\', \'.R<URA 15>.A<C6 20>\', ".R<URA 15>.A<H5\' 21>", ".R<URA 15>.A<H5\'\' 22>", ".R<URA 15>.A<H4\' 23>", ".R<URA 15>.A<H3\' 24>", ".R<URA 15>.A<H2\' 25>", ".R<URA 15>.A<HO2\' 26>", ".R<URA 15>.A<H1\' 27>", \'.R<URA 15>.A<H3 28>\', \'.R<URA 15>.A<H5 29>\', \'.R<URA 15>.A<H6 30>\', ".R<DG 16>.A<O2\' 34>", ".R<DG 16>.A<HO2\' 35>", \'.R<URA 17>.A<P 1>\', \'.R<URA 17>.A<O1P 2>\', \'.R<URA 17>.A<O2P 3>\', ".R<URA 17>.A<O5\' 4>", ".R<URA 17>.A<C5\' 5>", ".R<URA 17>.A<C4\' 6>", ".R<URA 17>.A<O4\' 7>", ".R<URA 17>.A<C3\' 8>", ".R<URA 17>.A<O3\' 9>", ".R<URA 17>.A<C2\' 10>", ".R<URA 17>.A<O2\' 11>", ".R<URA 17>.A<C1\' 12>", \'.R<URA 17>.A<N1 13>\', \'.R<URA 17>.A<C2 14>\', \'.R<URA 17>.A<O2 15>\', \'.R<URA 17>.A<N3 16>\', \'.R<URA 17>.A<C4 17>\', \'.R<URA 17>.A<O4 18>\', \'.R<URA 17>.A<C5 19>\', \'.R<URA 17>.A<C6 20>\', ".R<URA 17>.A<H5\' 21>", ".R<URA 17>.A<H5\'\' 22>", ".R<URA 17>.A<H4\' 23>", ".R<URA 17>.A<H3\' 24>", ".R<URA 17>.A<H2\' 25>", ".R<URA 17>.A<HO2\' 26>", ".R<URA 17>.A<H1\' 27>", \'.R<URA 17>.A<H3 28>\', \'.R<URA 17>.A<H5 29>\', \'.R<URA 17>.A<H6 30>\', ".R<DG 18>.A<O2\' 34>", ".R<DG 18>.A<HO2\' 35>", ".R<DG 19>.A<O2\' 34>", ".R<DG 19>.A<HO2\' 35>", ".R<DC 20>.A<O2\' 31>", ".R<DC 20>.A<HO2\' 32>", ".R<DA 21>.A<O2\' 33>", ".R<DA 21>.A<HO2\' 34>", ".R<DC 22>.A<O2\' 31>", ".R<DC 22>.A<HO2\' 32>", ".R<DA 23>.A<O2\' 33>", ".R<DA 23>.A<HO2\' 34>", ".R<DC 24>.A<O2\' 31>", ".R<DC 24>.A<HO2\' 32>", ".R<DC 25>.A<O2\' 31>", ".R<DC 25>.A<HO2\' 32>", ".R<DA 26>.A<O2\' 33>", ".R<DA 26>.A<HO2\' 34>", ".R<DG 27>.A<O2\' 34>", ".R<DG 27>.A<HO2\' 35>", \'.R<URA 28>.A<P 1>\', \'.R<URA 28>.A<O1P 2>\', \'.R<URA 28>.A<O2P 3>\', ".R<URA 28>.A<O5\' 4>", ".R<URA 28>.A<C5\' 5>", ".R<URA 28>.A<C4\' 6>", ".R<URA 28>.A<O4\' 7>", ".R<URA 28>.A<C3\' 8>", ".R<URA 28>.A<O3\' 9>", ".R<URA 28>.A<C2\' 10>", ".R<URA 28>.A<O2\' 11>", ".R<URA 28>.A<C1\' 12>", \'.R<URA 28>.A<N1 13>\', \'.R<URA 28>.A<C2 14>\', \'.R<URA 28>.A<O2 15>\', \'.R<URA 28>.A<N3 16>\', \'.R<URA 28>.A<C4 17>\', \'.R<URA 28>.A<O4 18>\', \'.R<URA 28>.A<C5 19>\', \'.R<URA 28>.A<C6 20>\', ".R<URA 28>.A<H5\' 21>", ".R<URA 28>.A<H5\'\' 22>", ".R<URA 28>.A<H4\' 23>", ".R<URA 28>.A<H3\' 24>", ".R<URA 28>.A<H2\' 25>", ".R<URA 28>.A<HO2\' 26>", ".R<URA 28>.A<H1\' 27>", \'.R<URA 28>.A<H3 28>\', \'.R<URA 28>.A<H5 29>\', \'.R<URA 28>.A<H6 30>\', ".R<DC 29>.A<O2\' 31>", ".R<DC 29>.A<HO2\' 32>", ".R<DA 30>.A<O2\' 33>", ".R<DA 30>.A<HO2\' 34>", \'.R<URA 31>.A<P 1>\', \'.R<URA 31>.A<O1P 2>\', \'.R<URA 31>.A<O2P 3>\', ".R<URA 31>.A<O5\' 4>", ".R<URA 31>.A<C5\' 5>", ".R<URA 31>.A<C4\' 6>", ".R<URA 31>.A<O4\' 7>", ".R<URA 31>.A<C3\' 8>", ".R<URA 31>.A<O3\' 9>", ".R<URA 31>.A<C2\' 10>", ".R<URA 31>.A<O2\' 11>", ".R<URA 31>.A<C1\' 12>", \'.R<URA 31>.A<N1 13>\', \'.R<URA 31>.A<C2 14>\', \'.R<URA 31>.A<O2 15>\', \'.R<URA 31>.A<N3 16>\', \'.R<URA 31>.A<C4 17>\', \'.R<URA 31>.A<O4 18>\', \'.R<URA 31>.A<C5 19>\', \'.R<URA 31>.A<C6 20>\', ".R<URA 31>.A<H5\' 21>", ".R<URA 31>.A<H5\'\' 22>", ".R<URA 31>.A<H4\' 23>", ".R<URA 31>.A<H3\' 24>", ".R<URA 31>.A<H2\' 25>", ".R<URA 31>.A<HO2\' 26>", ".R<URA 31>.A<H1\' 27>", \'.R<URA 31>.A<H3 28>\', \'.R<URA 31>.A<H5 29>\', \'.R<URA 31>.A<H6 30>\', ".R<DA 32>.A<O2\' 33>", ".R<DA 32>.A<HO2\' 34>", ".R<DC 33>.A<O2\' 31>", ".R<DC 33>.A<HO2\' 32>", ".R<DC 34>.A<O2\' 31>", ".R<DC 34>.A<HO2\' 32>", \'.R<URA 35>.A<P 1>\', \'.R<URA 35>.A<O1P 2>\', \'.R<URA 35>.A<O2P 3>\', ".R<URA 35>.A<O5\' 4>", ".R<URA 35>.A<C5\' 5>", ".R<URA 35>.A<C4\' 6>", ".R<URA 35>.A<O4\' 7>", ".R<URA 35>.A<C3\' 8>", ".R<URA 35>.A<O3\' 9>", ".R<URA 35>.A<C2\' 10>", ".R<URA 35>.A<O2\' 11>", ".R<URA 35>.A<C1\' 12>", \'.R<URA 35>.A<N1 13>\', \'.R<URA 35>.A<C2 14>\', \'.R<URA 35>.A<O2 15>\', \'.R<URA 35>.A<N3 16>\', \'.R<URA 35>.A<C4 17>\', \'.R<URA 35>.A<O4 18>\', \'.R<URA 35>.A<C5 19>\', \'.R<URA 35>.A<C6 20>\', ".R<URA 35>.A<H5\' 21>", ".R<URA 35>.A<H5\'\' 22>", ".R<URA 35>.A<H4\' 23>", ".R<URA 35>.A<H3\' 24>", ".R<URA 35>.A<H2\' 25>", ".R<URA 35>.A<HO2\' 26>", ".R<URA 35>.A<H1\' 27>", \'.R<URA 35>.A<H3 28>\', \'.R<URA 35>.A<H5 29>\', \'.R<URA 35>.A<H6 30>\', \'.R<URA 36>.A<P 1>\', \'.R<URA 36>.A<O1P 2>\', \'.R<URA 36>.A<O2P 3>\', ".R<URA 36>.A<O5\' 4>", ".R<URA 36>.A<C5\' 5>", ".R<URA 36>.A<C4\' 6>", ".R<URA 36>.A<O4\' 7>", ".R<URA 36>.A<C3\' 8>", ".R<URA 36>.A<O3\' 9>", ".R<URA 36>.A<C2\' 10>", ".R<URA 36>.A<O2\' 11>", ".R<URA 36>.A<C1\' 12>", \'.R<URA 36>.A<N1 13>\', \'.R<URA 36>.A<C2 14>\', \'.R<URA 36>.A<O2 15>\', \'.R<URA 36>.A<N3 16>\', \'.R<URA 36>.A<C4 17>\', \'.R<URA 36>.A<O4 18>\', \'.R<URA 36>.A<C5 19>\', \'.R<URA 36>.A<C6 20>\', ".R<URA 36>.A<H5\' 21>", ".R<URA 36>.A<H5\'\' 22>", ".R<URA 36>.A<H4\' 23>", ".R<URA 36>.A<H3\' 24>", ".R<URA 36>.A<H2\' 25>", ".R<URA 36>.A<HO2\' 26>", ".R<URA 36>.A<H1\' 27>", \'.R<URA 36>.A<H3 28>\', \'.R<URA 36>.A<H5 29>\', \'.R<URA 36>.A<H6 30>\', ".R<DG 37>.A<O2\' 34>", ".R<DG 37>.A<HO2\' 35>", ".R<DA 38>.A<O2\' 33>", ".R<DA 38>.A<HO2\' 34>", \'.R<URA 39>.A<P 1>\', \'.R<URA 39>.A<O1P 2>\', \'.R<URA 39>.A<O2P 3>\', ".R<URA 39>.A<O5\' 4>", ".R<URA 39>.A<C5\' 5>", ".R<URA 39>.A<C4\' 6>", ".R<URA 39>.A<O4\' 7>", ".R<URA 39>.A<C3\' 8>", ".R<URA 39>.A<O3\' 9>", ".R<URA 39>.A<C2\' 10>", ".R<URA 39>.A<O2\' 11>", ".R<URA 39>.A<C1\' 12>", \'.R<URA 39>.A<N1 13>\', \'.R<URA 39>.A<C2 14>\', \'.R<URA 39>.A<O2 15>\', \'.R<URA 39>.A<N3 16>\', \'.R<URA 39>.A<C4 17>\', \'.R<URA 39>.A<O4 18>\', \'.R<URA 39>.A<C5 19>\', \'.R<URA 39>.A<C6 20>\', ".R<URA 39>.A<H5\' 21>", ".R<URA 39>.A<H5\'\' 22>", ".R<URA 39>.A<H4\' 23>", ".R<URA 39>.A<H3\' 24>", ".R<URA 39>.A<H2\' 25>", ".R<URA 39>.A<HO2\' 26>", ".R<URA 39>.A<H1\' 27>", \'.R<URA 39>.A<H3 28>\', \'.R<URA 39>.A<H5 29>\', \'.R<URA 39>.A<H6 30>\', ".R<DC 40>.A<O2\' 31>", ".R<DC 40>.A<HO2\' 32>", ".R<DC3 41>.A<O2\' 32>", ".R<DC3 41>.A<HO2\' 33>"]') Check /home/nmrbox/spenumutchu/Desktop/Projects/aufr12_project/SL2_drug/build_amber/log.txt for further information on errors in building.

from htmd.ui import *

stefdoerr commented 1 year ago

Can you send me the initial file you used? Normally you should be able to build RNA fine in AMBER.

vas2201 commented 1 year ago

I uploaded the file that i used with charmm force field for RNA. Running fine without issues.

https://drive.google.com/file/d/1R3DajOODBxT621giw9RVkaNBgyMOtSWV/view?usp=share_link

stefdoerr commented 1 year ago

No sorry I meant the starting PDB file. Because I want to test the pipeline from the start

vas2201 commented 1 year ago

attached pdb file here ! https://drive.google.com/file/d/1YWgrH4AyTHMP323EzXn5Ld2Gb0vxh5fP/view?usp=share_link

stefdoerr commented 1 year ago

There is something non-standard about your atom or residue names that pdb2pqr does not like and it doesn't recognize the nucleic acids as such. If I use your file I get:

ValueError: No biomolecule heavy atoms found and no ligand present. Unable to proceed.  You may also see this message if PDB2PQR does not have parameters for any residue in your biomolecule.

If I used the 5v16 directly from the RCSB database it works correctly

from htmd.builder.amber import build
from htmd.builder.solvate import solvate
from moleculekit.tools.preparation import systemPrepare
from moleculekit.molecule import Molecule

mol = Molecule("5v16")
pmol = systemPrepare(mol)
smol = solvate(pmol, pad=10)
bmol = build(smol, ionize=True, outdir="/tmp/test")

I don't recommend solvating like that though (you should solvate as a cubic box) but just wanted to demonstrate that it works fine if you start from the RCSB file

vas2201 commented 1 year ago

I tried using adaptive sampling as per your suggestions with the Amber force field and user-defined dihedrals for RNA. While I did not encounter any issues with the Charmm force field and adaptive sampling was successful, I am having issues with adaptive sampling using the Amber force field for RNA (after first epoch). I tried tweaking the code for segid and chain in multiple ways to resolve the issues, but encountered issues such as atoms not being selected or detected. If you have time, it would be great if you could review the code. The Jupyter notebook can be found at this link: https://drive.google.com/file/d/1WEOASnxw62U44T-aSfD1SH70aoR1ISCE/view?usp=share_link

Error : 
2023-03-30 01:28:08,531 - jobqueues.localqueue - INFO - Running /home/nmrbox/spenumutchu/Desktop/Projects/aufr12_project/SL2_WT/htmd_amber/input/e1s9_md_50ns_1 on device 1
2023-03-30 01:28:30,000 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2023-03-30 01:28:30,000 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2023-03-30 01:28:30,001 - htmd.adaptive.adaptive - INFO - 2 simulations in progress
2023-03-30 01:28:30,001 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data
Creating simlist: 100%|███████████████████████████| 9/9 [00:00<00:00, 63.93it/s]
2023-03-30 01:28:32,094 - htmd.simlist - WARNING - Filtering was not able to write ./filtered/filtered.prmtop due to error: Molecule cannot write files with "prmtop" extension yet. If you need such support please notify us on the github moleculekit issue tracker.
Filtering trajectories: 100%|█████████████████████| 8/8 [00:09<00:00,  1.25s/it]

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[20], line 2
      1 get_ipython().run_line_magic('cd', '/home/nmrbox/spenumutchu/Desktop/Projects/aufr12_project/SL2_WT/htmd_amber')
----> 2 ad.run()

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptive.py:196, in AdaptiveBase.run(self)
    194 # If currently running simulations are lower than nmin start new ones to reach nmax number of sims
    195 if self._running <= self.nmin and epoch < self.nepochs:
--> 196     flag = self._algorithm()
    197     if flag is False:
    198         self._unsetLock()

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptiverun.py:208, in AdaptiveMD._algorithm(self)
    207 def _algorithm(self):
--> 208     data = self._getData(self._getSimlist())
    209     if not self._checkNFrames(data):
    210         return False

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/adaptive/adaptiverun.py:246, in AdaptiveMD._getData(self, sims)
    241 # if self.contactsym is not None:
    242 #    contactSymmetry(data, self.contactsym)
    244 if self.ticadim > 0:
    245     # tica = TICA(metr, int(max(2, np.ceil(self.ticalag))))  # gianni: without project it was tooooo slow
--> 246     data = metr.project()
    247     data.dropTraj()  # Drop before TICA to avoid broken trajectories
    248     # 1 < ticalag < (trajLen / 2)

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/htmd/projections/metric.py:159, in Metric.project(self, njobs)
    157     for proj in self.projectionlist:
    158         if isinstance(proj, Projection):
--> 159             proj._setCache(uqMol)
    160 else:
    161     logger.warning(
    162         "Cannot calculate description of dimensions due to different topology files for each trajectory."
    163     )

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/projection.py:31, in Projection._setCache(self, mol)
     30 def _setCache(self, mol):
---> 31     resdict = self._calculateMolProp(mol)
     32     self._cache.update(resdict)

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/metricdihedral.py:839, in MetricDihedral._calculateMolProp(self, mol, props)
    836 else:
    837     dihedrals = ensurelist(self._dihedrals)
--> 839 res["dihedrals"] = Dihedral.dihedralsToIndexes(mol, dihedrals, protsel)
    840 return res

File ~/anaconda3/envs/100htmd/lib/python3.9/site-packages/moleculekit/projections/metricdihedral.py:136, in Dihedral.dihedralsToIndexes(mol, dihedrals, sel)
    134     atomsel = atomsel & selatoms
    135     if np.sum(atomsel) != 1:
--> 136         raise RuntimeError(
    137             "Expected one atom from atomselection {}. Got {} instead.".format(
    138                 a, np.sum(atomsel)
    139             )
    140         )
    141     idx.append(np.where(atomsel)[0][0])
    142 indexes.append(idx)

RuntimeError: Expected one atom from atomselection {'name': "O3'", 'resid': 5, 'chain': '', 'insertion': '', 'segid': ''}. Got 0 instead.
stefdoerr commented 1 year ago

Can you share with me these folders please so I can take a look?

ad.generatorspath = './generators'
ad.inputpath = './input'
ad.datapath = './data'
ad.filteredpath = './filtered'
vas2201 commented 1 year ago

Thanks for your reply. I attached zip folder as drive link and contain requested folders. https://drive.google.com/file/d/10qc_C5LH-ZkUXjA9WHbXkLK0knQv7S_G/view?usp=share_link Regards Vas

stefdoerr commented 1 year ago

After filtering the Molecule has a segid "0"

In [16]: mol.segid
Out[16]: array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

So the error it gives

RuntimeError: Expected one atom from atomselection {'name': "O3'", 'resid': 5, 'chain': '', 'insertion': '', 'segid': ''}. Got 0 instead.

is correct because segid is not "". Fix your dihedral selections to have "segid": "0"

vas2201 commented 1 year ago

I have identified the problem. Upon building the build_amber, the segid was no longer present . That is causing conflict while defining the dihedrals in metric.

To address this, I utilized the following command to define the segid , which successfully resolved the issue.

bmol.set('segid', '0', sel='nucleic')

Thanks for your help on this matter. Regards Vas