Open tovrstra opened 3 years ago
I've opened a few PRs adding support for bond information to a few more file formats, just to get a better feel for what is needed; see #248 and #250. Different formats need different things. Splitting up the current `bonds` array into three attributes seems most appropriate:

- `bonds` just becomes an array with two columns holding pairs of atoms (the first two columns of the current `bonds` attribute).
- `bondtypes` is a vector with an integer kind for each bond. The integers can be mapped to various string representations of bond types, to deal with incompatible nomenclatures.
- `bondorders` is a floating-point vector with just the bond orders of the bonds listed in `bonds`. This is needed e.g. by QCSchema.

The usual validation mechanisms can enforce consistent sizes of these arrays, and a read-only `nbond` property can be added in line with `natom`.
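To make the proposed split concrete, here is a minimal sketch of the three attributes with a size check and a read-only `nbond` property. The class name `BondData` and its constructor are hypothetical and only serve as illustration; the real implementation would hook into the library's existing validation mechanisms instead.

```python
import numpy as np


class BondData:
    """Hypothetical container illustrating the proposed attributes."""

    def __init__(self, bonds, bondtypes=None, bondorders=None):
        # Two integer columns: one pair of atom indices per bond.
        self.bonds = np.asarray(bonds, dtype=int).reshape(-1, 2)
        # Optional integer kind for each bond.
        self.bondtypes = None if bondtypes is None else np.asarray(bondtypes, dtype=int)
        # Optional floating-point order for each bond.
        self.bondorders = None if bondorders is None else np.asarray(bondorders, dtype=float)
        # Enforce consistent sizes across the three arrays.
        for name in ("bondtypes", "bondorders"):
            array = getattr(self, name)
            if array is not None and array.shape != (self.nbond,):
                raise ValueError(f"{name} must have shape ({self.nbond},), got {array.shape}")

    @property
    def nbond(self):
        """Number of bonds, derived from the bonds array (read-only)."""
        return self.bonds.shape[0]


data = BondData(bonds=[[0, 1], [1, 2]], bondtypes=[1, 2], bondorders=[1.0, 1.5])
```

Because `nbond` is derived from `bonds`, it cannot get out of sync with the arrays, and fractional orders such as 1.5 are no longer a problem.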
In the long run, we can add machinery to deduce a decent bondorders vector from the bondtypes (and vice versa), but that is a secondary concern. (This would make more sense after addressing #191.) Just getting the right attributes in place should be done first because it affects API and blocks a 1.0 release.
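As a rough idea of what such machinery could look like, the sketch below derives default bond orders from integer bond types through a lookup table. The integer codes, the names `BOND_TYPE_NAMES` and `DEFAULT_ORDERS`, and the function `guess_bondorders` are all assumptions made up for this example; no numbering scheme has been decided.

```python
import numpy as np

# Hypothetical integer codes and their string representations.
BOND_TYPE_NAMES = {1: "single", 2: "double", 3: "triple", 4: "aromatic"}
# Hypothetical default bond order for each code.
DEFAULT_ORDERS = {1: 1.0, 2: 2.0, 3: 3.0, 4: 1.5}


def guess_bondorders(bondtypes):
    """Return a float vector of default orders; unknown types become NaN."""
    return np.array(
        [DEFAULT_ORDERS.get(int(kind), np.nan) for kind in bondtypes], dtype=float
    )


orders = guess_bondorders([1, 4, 2])
```

Keeping the string names in a separate table means each file format can supply its own nomenclature without touching the stored integers.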
At the moment, bond orders are forced to be integers because they are stored in the same array that contains the atomic indices. This can be fixed easily with a structured data type: `np.dtype("int, int, float")`.
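For illustration, a structured array along these lines could look as follows. The field names `iatom0`, `iatom1` and `order` are assumptions added for readability; the comma-separated form above would give automatically named fields instead.

```python
import numpy as np

# Structured dtype: two integer atom indices and one float bond order.
bond_dtype = np.dtype([("iatom0", int), ("iatom1", int), ("order", float)])

# Each record is one bond; the order is no longer restricted to integers.
bonds = np.array([(0, 1, 1.0), (1, 2, 1.5)], dtype=bond_dtype)

# Recover a plain two-column index array and a float vector when needed.
pairs = np.stack([bonds["iatom0"], bonds["iatom1"]], axis=1)
orders = bonds["order"]
```

This keeps indices and orders in a single array while still allowing fractional orders, at the cost of field-based indexing instead of plain column slicing.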