Open lunamorrow opened 1 month ago
@lunamorrow @cbouy what you have here is OBMol to an AtomGroup.
What you probably want is OBMol to a Universe, no? See the example in the RDKit reader here: https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/converters/RDKit.py#L35C1-L47C53
Going directly to an AtomGroup is probably not what you want.
Also important here: RDKitReader is a subclass of MemoryReader.
Just to make sure we all are on the same page in case there's any misunderstanding on the goal of the different classes that are set up for converters:
Parser creates a topology from the "foreign" object (here an openbabel mol). The class should inherit from TopologyReaderBase and define a parse method that returns a Topology with all the atom-level and residue-level attributes. For historical reasons, the attribute under which the foreign object is available is self.filename.
Reader reads a trajectory. In the case of an openbabel mol, that means parsing the coordinates from each conformer. Because OBabel is not really meant to process huge files, it's fine to assume everything will fit in memory, hence the use of the MemoryReader as a base class.
Reader and Parser combined will automagically allow you to create a Universe with u = mda.Universe(obmol)
Converter does the opposite step from the above, i.e. converts an AtomGroup or Universe to a foreign object. You can directly inherit from ConverterBase and define a convert method, which can then be automagically used with obmol = my_atomgroup.convert_to.openbabel(<optional parameters>)
Hope this helps!
Ahhh ok, thanks @hmacdope and @exs-cbouy. I was planning to have the Parser make a Universe and the Reader an AtomGroup, but I see the redundancy now. What you've said makes sense @exs-cbouy, as I need to have the topology and the positions/trajectory to create a Universe. I just had a quick look at the documentation and it appears that MemoryReader is for topologies with a trajectory, while SingleFrameReaderBase is for topologies with just one position set. The only trajectory format accepted by OpenBabel seems to be xtc, which MDAnalysis already reads. I assume it is best practice to inherit from MemoryReader though, so that the converter can capture all possible info? I'll change that over now.
I suspect it would be best for me to start on the Parser before the Reader too. What would you suggest @exs-cbouy, seeing as you have done it before?
I haven't used openbabel much, but I'm guessing it can store coordinates for each conformer on the same molecule object (like the RDKit does), in which case the MemoryReader makes sense (since you won't always have a single set of coordinates for a given molecule).
Yes, I would suggest doing the Parser before. I don't remember if you really need the Reader to start playing around and constructing a Universe from an openbabel mol, but worst case scenario you could just use dummy coordinates in the Reader to begin with.
To clarify this further, @lunamorrow: by trajectory here we just mean "any set of coordinate data", which may be present in ANY format, not just data with more than one frame or a traditional MD format like xtc. For example, using the MemoryReader you can make a trajectory from a raw numpy array. Conceptually, at least, you will do the same, but after extracting the data from Obabel.
> I'm guessing it can store coordinates for each conformer on the same molecule object
Yes, it appears so. I will double check their API to be safe.
> Yes I would suggest doing the Parser before,
Great, I'll get going on that first then.
> To clarify this further, @lunamorrow: by trajectory here we just mean "any set of coordinate data", which may be present in ANY format, not just data with more than one frame or a traditional MD format like xtc. For example, using the MemoryReader you can make a trajectory from a raw numpy array. Conceptually, at least, you will do the same, but after extracting the data from Obabel.
Thanks for the clarification @hmacdope! I didn't know you could just feed in a numpy array too, that is really cool.
The first step of this OpenBabel converter will be to convert OpenBabel OBMols to MDAnalysis AtomGroups. This will enable the indirect parsing of over 100 file types into a format that MDAnalysis tools can analyse.
The OpenBabelReader will take an OBMol and correctly convert it to an AtomGroup. This class will need to account for different attributes in OBMol objects formed from different file types, and will exploit the OpenBabel python wrappers for easy access to attributes. The resulting AtomGroup can be analysed as is, or assigned to a Topology or a Residue/Segment.
During the creation of this converter class, I will be reaching out to active OpenBabel contributors to gain advice and input about how best to develop it.
For more information and suggested implementation please see GSoC Project.