mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org

Conversion from `pybel.Molecule` to `mb.Compound`? #554

Closed ahy3nz closed 5 years ago

ahy3nz commented 5 years ago

Describe the behavior you would like added to mBuild

Openbabel has a huge variety of supported file formats (#430 uses pybel to convert a SMILES string into an mBuild compound), but could we also use openbabel in a general enough manner to convert any pybel Molecule into an mb Compound?

In particular, this references #376, but I'm not sure whether openbabel correctly parses all relevant lattice information (at least it looks like we can get lattice vectors and space groups).

Describe the solution you'd like

Similar to from_trajectory or from_parmed, could we have from_pybel/from_openbabel? A user would use pybel to read any of the openbabel-supported formats, then call from_pybel to generate the associated mb.Compound.

We could also add something like to_pybel to open up a lot of output file formats. For example, #553 goes compound -> pdb -> pybel -> smiles, but maybe we could shorten the pipeline to compound -> pybel -> smiles/cif/something.

Our only job then would be to handle pybel <-> compound
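To make the pybel -> compound half a bit more concrete, here is a rough sketch of the data such a converter would need to pull out. It relies only on the `atoms`, `atomicnum`, and `coords` attributes that pybel exposes; the helper name `extract_particles` and its plain-tuple output are hypothetical, and actually building the `mb.Compound` is left out. Open Babel reports coordinates in angstroms while mBuild works in nanometers, so the sketch rescales positions.

```python
from types import SimpleNamespace

ANGSTROM_TO_NM = 0.1  # Open Babel uses angstroms, mBuild uses nanometers


def extract_particles(pybel_mol):
    """Return [(atomic_number, (x, y, z) in nm), ...] for a pybel-like molecule.

    Hypothetical helper: it only reads `mol.atoms`, `atom.atomicnum`, and
    `atom.coords`, so any object with those attributes works.
    """
    particles = []
    for atom in pybel_mol.atoms:
        xyz_nm = tuple(c * ANGSTROM_TO_NM for c in atom.coords)
        particles.append((atom.atomicnum, xyz_nm))
    return particles


# Stand-in for a pybel.Molecule so the sketch runs without Open Babel installed.
fake_mol = SimpleNamespace(atoms=[
    SimpleNamespace(atomicnum=6, coords=(0.0, 10.0, 20.0)),
    SimpleNamespace(atomicnum=1, coords=(10.0, 0.0, 0.0)),
])
particles = extract_particles(fake_mol)
```

A real from_pybel would also need to carry over bonds and possibly lattice information, which this sketch does not attempt.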

Describe alternatives you've considered

I think this could open up a lot of file formats, but it might be a superfluous addition (or a rabbit hole) to try to accommodate pybel.Molecule and however much information is stored in it.

Additional context

openbabel is already in requirements-dev, so we're not adding to the dependencies. Or rather, openbabel and the mbuild-openbabel functionality would probably just be optional, available only if the user has openbabel installed?
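One common way to keep such a dependency optional is to check for the package at call time and fail with a clear message if it is missing. The sketch below uses only the standard library; `has_module` and `import_optional` are hypothetical names for illustration, not existing mBuild helpers.

```python
import importlib
import importlib.util


def has_module(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def import_optional(module_name):
    """Import an optional dependency, or raise ImportError with a clear hint."""
    if not has_module(module_name):
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed."
        )
    return importlib.import_module(module_name)
```

A from_pybel or to_pybel function could then call `import_optional("pybel")` as its first step, so users without openbabel only hit an error when they actually ask for that functionality.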

mikemhenry commented 5 years ago

I like this! openbabel has a ton of utility that we could use: adding and removing H, SMILES matching, lots of good stuff. I haven't poked into the pybel.Molecule data structure to see what we would need to do to create one.

mattwthompson commented 5 years ago

I think this would be a good addition to the new backend. It would probably be a bit more work than just implementing it here, but it would be a more robust solution. I'm assuming babel has a lot more detail in their molecule class than we need here. It would also funnel all the "should this be included in mbuild?" questions through a single filter, and anything that would not get passed to mbuild would still be stored in that top object.

mikemhenry commented 5 years ago

They have an example that looks like a good starting point for creating a pybel/openbabel molecule object:

import openbabel, pybel

mol = openbabel.OBMol()
a = mol.NewAtom()
a.SetAtomicNum(6)   # carbon atom
a.SetVector(0.0, 1.0, 2.0) # coordinates
b = mol.NewAtom()
mol.AddBond(1, 2, 1)   # atoms indexed from 1

pybelmol = pybel.Molecule(mol)
pybelmol.write("sdf", "outputfile.sdf")

ahy3nz commented 5 years ago

Yeah I think we can split up the work into to_pybel and from_pybel or something like that? Did you have a preference on which function you'd like to write? I have no idea how much more work lies ahead, but we can start drafting some code