Closed Foly93 closed 1 year ago
Hi @Foly93, thanks for the suggestion. I think historically the trajectories we read fall under two categories:
u.atoms.fragments
to get individually bonded moleculesu.atoms.guess_bonds()
)Either way, once you do have bonds (and it sounds like your universe does) -- you could use .fragments
to get molecules, I think. This topology attribute is computed by associating molecules together by bonds. e.g.
>>> import MDAnalysis as mda
>>> from MDAnalysis.tests.datafiles import TPR, XTC
>>> u = mda.Universe(TPR, XTC) # protein and solvent molecules
>>> u.atoms.fragments
(<AtomGroup with 3341 atoms>,
<AtomGroup with 3 atoms>,
<AtomGroup with 1 atom>,
<AtomGroup with 3 atoms>,
<AtomGroup with 1 atom>,
<AtomGroup with 3 atoms>,
<AtomGroup with 1 atom>,
...
)
I think you could use the number of atoms here to guess which molecules are solvent vs ions.
An even better alternative would be a function called guess_chains(), that would assign water (n_atoms > 1 and n_atoms < 10) and ions (n_atoms == 1) chain X and macromolecules or ligands chain A,B,C etc.
I can see how that could be very valuable for trajectories within the biological domain, but it sounds like it makes many assumptions that would make it difficult to generalise outside of it. If .fragments
is sufficiently convenient, it may be easier to work with that and assign solvent/ions/ligands etc individually for each system.
@Foly93 were you able to use fragments
?
Hi @orbeckst, yes it worked! thank you very much.
Thanks for the feeback @Foly93 ! Given that fragments works, I am closing this issue. Please feel free to request it to be re-opened if necessary.
Hello everybody,
I am working with amber and I wanted to create a reduced system, partially with the help of MDAnalysis. The problem that I ran into, is that
ag.write('test.pdb')
would not identify the termini of the molecules correctly which led to problems later on. Ofc, one could manually pick the molecules viau.select_atoms('..')
, however, that is not very robust for various systems. Therefore, imo a get_Molecules() method would make sense, that identifies molecules in the system. The idea is to assign the atoms of these molecules to individual chains that would then also be printed to the PDB file. It was shown, that the initial problem got solved that way.I came up with some lines to take care of that. However, for my system (50000 atoms) it takes about 11 minutes, which is somewhat lenghty.
but it works.
An even better alternative would be a function called guess_chains(), that would assign water (
n_atoms > 1 and n_atoms < 10
) and ions (n_atoms == 1
) chain X and macromolecules or ligands chain A,B,C etc. Thanks for reading and thanks for any help.Cheers!