Closed schneiderfelipe closed 5 years ago
Originally I was going to try and have two separate configs: one that writes using pybel.Molecule
attributes, and one that writes using cclib attributes. Each config would contain a "prototypical" boilerplate for that parser. The (current) ones for pybel are simple, because it has an input generator already, though not customizable. The logic to go from a ccdata
to a template will have to live somewhere no matter what path is chosen. You could take every attribute present in the resulting ccdata
and inject it into a pybel.Molecule
, but that would be confusing compared to a subclass.
Another tricky thing is that pybel can read multiple molecules/fragments from a file, and cclib cannot yet. Do you ever use this capability?
For reference, I took all the attributes of a pybel molecule and a ccdata instance from a geometry optimization and
In [42]: attrs_ccdata.difference(attrs_pbmol)
Out[42]:
{'OPT_DONE',
'OPT_NEW',
'OPT_UNCONVERGED',
'OPT_UNKNOWN',
'_attributes',
'_attrlist',
'_dictsofarrays',
'_intarrays',
'_listsofarrays',
'arrayify',
'atomcharges',
'atomcoords',
'atomnos',
'closed_shell',
'converged_geometries',
'coreelectrons',
'geotargets',
'geovalues',
'getattributes',
'grads',
'homos',
'listify',
'metadata',
'moenergies',
'moments',
'mult',
'natom',
'nbasis',
'nelectrons',
'new_geometries',
'nmo',
'optdone',
'scfenergies',
'scftargets',
'scfvalues',
'setattributes',
'typecheck',
'unconverged_geometries',
'unknown_geometries',
'writecml',
'writejson',
'writexyz'}
In [43]: attrs_pbmol.difference(attrs_ccdata)
Out[43]:
{'OBMol',
'__iter__',
'_cinfony',
'_exchange',
'_gettitle',
'_repr_html_',
'_repr_svg_',
'_settitle',
'addh',
'atoms',
'calccharges',
'calcdesc',
'calcfp',
'clone',
'conformers',
'convertdbonds',
'data',
'dim',
'draw',
'energy',
'exactmass',
'formula',
'localopt',
'make3D',
'molwt',
'removeh',
'residues',
'spin',
'sssr',
'title',
'unitcell'}
In [45]: attrs_pbmol.intersection(attrs_ccdata)
Out[45]:
{'__class__',
'__delattr__',
'__dict__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__module__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'charge',
'write'}
Even though the attributes are mostly orthogonal, I wonder if the solution is as simple as below. Extra logic would be needed to handle charge
and mult
. It isn't quite right because Molecule
isn't a subclass of anything. Some of these choices might depend on if you want to read a file with both packages or just one, if you want to present a unified interface, etc.
class Molecule:
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.ccdata = None
self.pbmol = None
def update(self):
attrs_ccdata = set()
attrs_pbmol = set()
if self.ccdata:
attrs_ccdata = set([attr for attr in dir(self.ccdata)
if attr[0] != '_'])
for attr_ccdata in attrs_ccdata:
# set: obj, name, value
# get: obj, name
setattr(self, attr_ccdata,
getattr(self.ccdata, attr_ccdata))
if self.pbmol:
attrs_pbmol = set([attr for attr in dir(self.pbmol)
if attr[0] != '_'])
for attr_pbmol in attrs_pbmol:
setattr(self, attr_pbmol,
getattr(self.pbmol, attr_pbmol))
Hi Eric,
I've been working on a related project where I needed to created a layer of abstraction similar to the one needed here (through the Atoms
class). I made pnictogen depend on it.
This solves the cclib support by integrating cclib and pybel in a unified interface. I never wanted different templates to work with different backends, so the current approach makes me happy. Furthermore, new backends can be added in no time.
The approach relies somewhat on the cclib-to-pybel bridge, which brings some minor drawbacks:
molecule.mult
instead of molecule.spin
). I believe that's no problem (changes are straighforward), but it makes the current interface not backward compatible.This effectively delegates all the code around molecule manipulation and such to the pyrrole package. I'd be happy to accept contributions there as well!
I'm closing this, feel free to reopen it if needed.
Initially, I thought about sub-classing
pybel.Molecule
but eventually ended up passing the object to the template engine as is. Since now we want to add support for other parsers (such as cclib), I believe this extra layer of abstraction is unavoidable.The point is, should we sub-class
pybel.Molecule
?