iwatobipen / align-it-ob3

GNU Lesser General Public License v3.0
7 stars 1 forks source link

RDKit port #2

Open OliverBScott opened 2 years ago

OliverBScott commented 2 years ago

Hi there, thanks for sharing this code and also the code for shape-it, it is really great to see people supporting the open-source community! I have been meaning to try to learn/use rdkit in c++ for a while and this gave me an excuse. My attempt can be found here https://github.com/OliverBScott/align-it , and also has Python wrappers (pyalignit). I am sure there are still a few bugs so I would be grateful for any contributions!

When working on this I also noticed that the ported code produces different pharmacophores for molecules with and without hydrogens. With the original code, this doesn't happen (hydrogen agnostic). I think the issue is due to a couple of functions that have been changed incorrectly. For example in hDonFuncCalc.cpp the original code (ob2) at line 43 is:

if (a->GetFormalCharge() >= 0 && ((a->GetImplicitValence() - a->GetHvyValence()) !=0)) 

This line seems to be checking whether there are any hydrogen atoms connected. The port replaces GetImplicitValence() with GetImplicitHCount(). While this makes sense logically it seems that the GetImplicitValence() function does not do what you would expect, from the documentation:

Returns: The implicit valence of this atom type (i.e. maximum number of connections expected) 

while GetImplicitHCount() seems to do what you would expect , from the documentation:

Returns: The number of non-hydrogens connected to this atom

i.e. if the expected number of connections of a nitrogen atom is 3 and its heavy degree is 2 then we expect it to have 1 hydrogen atom.

I think that to get the equivalent behaviour with openbabel3 one could use the following to get the number of hydrogens while remaining agnostic to whether hydrogens are explicitly included:

a->GetImplicitHCount() + a->ExplicitHydrogenCount()

and thus the line becomes:

if (a->GetFormalCharge() >= 0 && ((a->GetImplicitHCount() + a->ExplicitHydrogenCount()) !=0)) 

I think that openbabel2 had very confusing function names regarding valence and H counting!

There are a few more cases of this particular function in the files:

I have attempted to fix these here but it is possible that there may still be issues.

Thanks again!

Oliver

iwatobipen commented 2 years ago

Hi @OliverBScott Thanks for your issue! It makes sense. I'll check code. Best, @iwatobipen