Closed: khoran closed this 10 years ago
Implemented through the "canonical" option of Open Babel.
Two things are actually needed here: one function to convert a whole compound to a canonicalized compound, and another to return the mapping of atom order between the original and the canonicalized compounds. The first is done through the convertFormat function with the "canonical" option, and the second is done by the canonicalNumbering function.
One thing to look into is the Morgan algorithm for uniquely labeling/numbering atoms in molecules. This will be important for various basic cheminformatics routines, and it is entirely missing. The general problem is to always assign the same numbers to the atoms, no matter in what order they are presented in the input.
OpenBabel has some utility for it, but I am not sure which will be best: implementing our own (shouldn't be very difficult) or using OpenBabel's instead?
Here is the OpenBabel entry: http://openbabel.org/dev-api/canonical_code_algorithm.shtml
And some pseudocode on JoeLib: http://www.ra.cs.uni-tuebingen.de/software/joelib/tutorial/algorithms/Morgan.html
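
For reference, here is a rough Python sketch of the core idea behind the Morgan algorithm mentioned above (the extended-connectivity refinement step). It is only an illustration under simplifying assumptions, not Open Babel's canonical labeling code and not what convertFormat or canonicalNumbering do internally. The function name `morgan_ec` and the input representation (atom count plus a list of index pairs for bonds) are made up for this example, and atom types and bond orders are ignored.

```python
from typing import List, Tuple


def morgan_ec(num_atoms: int, bonds: List[Tuple[int, int]]) -> List[int]:
    """Return one Morgan extended-connectivity (EC) value per atom.

    Hypothetical helper for illustration: atoms are 0-based indices and
    bonds are pairs of atom indices. Element and bond order are ignored.
    """
    # Build an adjacency list from the bond list.
    neighbors: List[List[int]] = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # The initial EC value of each atom is its number of neighbors.
    ec = [len(n) for n in neighbors]
    num_classes = len(set(ec))

    while True:
        # New EC value = sum of the current EC values of the neighbors.
        new_ec = [sum(ec[j] for j in neighbors[i]) for i in range(num_atoms)]
        new_classes = len(set(new_ec))
        # Stop once an iteration no longer increases the number of
        # distinct values; keep the values from the previous iteration.
        if new_classes <= num_classes:
            break
        ec, num_classes = new_ec, new_classes
    return ec


# 2-methylpentane skeleton, given in two different atom orders.
bonds_a = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)]
perm = [3, 0, 5, 1, 4, 2]  # old index -> new index
bonds_b = [(perm[a], perm[b]) for a, b in bonds_a]

ec_a = morgan_ec(6, bonds_a)
ec_b = morgan_ec(6, bonds_b)
# Each atom gets the same EC value regardless of the input order.
print(all(ec_b[perm[i]] == ec_a[i] for i in range(6)))
```

The EC values depend only on the connectivity, so they come out the same for any input atom order. They are not yet a unique numbering: atoms can share a value, and a full canonical numbering needs further tie-breaking (for example by element and bond order), which is the part a library implementation such as Open Babel's takes care of.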