gruenewald-lab / CGsmiles

Coarse-Grained Smiles (CGsmiles) for representing arbitrarily complex molecules using a compact line notation

To Do: Hydrogen accounting via bond orders #14

Open fgrunewald opened 1 month ago

fgrunewald commented 1 month ago

At the moment the code applies several pysmiles functions to add the right number of hydrogen atoms. In principle we should be able to get that from the correct bond orders alone; however, aromaticity is kind of preventing this. The current code works, but we should clean this up at some point. Related to PR #12.

pckroon commented 1 month ago

I think the correct solution is to count aromatic bond orders as 1, and deduct 1 for the valency from each aromatic atom.
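A minimal sketch of that rule, not the actual CGsmiles or pysmiles code. The function name `implicit_hydrogens` is hypothetical, and marking aromatic bonds with order 1.5 is an assumption of this sketch:

```python
def implicit_hydrogens(valence, bond_orders, aromatic):
    """Hydrogen count for one atom from its bond orders.

    bond_orders: orders of the bonds to heavy-atom neighbours;
                 aromatic bonds are marked with 1.5 (assumption of this sketch).
    aromatic:    True if the atom itself is aromatic.
    """
    # Count every aromatic bond as a single bond.
    used = sum(1 if order == 1.5 else order for order in bond_orders)
    # Deduct one from the valency for an aromatic atom.
    if aromatic:
        used += 1
    return max(valence - used, 0)


# Unsubstituted ring carbon (as in benzene): 4 - 2 - 1 = 1
benzene_ch = implicit_hydrogens(4, [1.5, 1.5], aromatic=True)
# Ring carbon carrying one substituent: 4 - 3 - 1 = 0
substituted = implicit_hydrogens(4, [1.5, 1.5, 1], aromatic=True)
# Non-aromatic carbon with a single bond: 4 - 1 = 3
methyl = implicit_hydrogens(4, [1], aromatic=False)
```

This only works if the full valency and all bonds of the atom are known at the point where the function is called.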

fgrunewald commented 1 month ago

Unfortunately not. Take p-cresol as an example: {[#SN3a]1[#TC5]2[#SN2a][#TC5]12}.{#SN3a=[$][$]c[N+](=O)[O-],#TC5=[$]cc[$],#SN2a=[$][$]cOC}

The carbon in the SN3a/SN2a fragment has an hcount of 3, and the two aromatic bonds then subtract 2. It ends up as CH, which coincidentally is valid for this molecule.

[Image: pcresol]
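The bookkeeping in this comment can be sketched as follows. This is an illustration only: `hcount_after_resolve` is a hypothetical name, and it assumes that the hcount of 3 is what the ring carbon carries inside its fragment before the ring bonds are added, and that aromatic bonds are marked with order 1.5 and counted as 1.

```python
def hcount_after_resolve(fragment_hcount, new_bond_orders):
    """Update a fragment-level hcount when new bonds are added.

    fragment_hcount: hydrogens assigned to the atom inside its fragment.
    new_bond_orders: orders of the bonds added while resolving;
                     aromatic bonds (1.5 here) are counted as 1.
    """
    removed = sum(1 if order == 1.5 else order for order in new_bond_orders)
    return fragment_hcount - removed


# Ring carbon of the SN3a/SN2a fragment: hcount 3, two new aromatic bonds.
ring_carbon = hcount_after_resolve(3, [1.5, 1.5])  # 3 - 2 = 1, i.e. CH
```

Note that nothing is deducted for the aromaticity of the atom itself in this step; only the newly formed bonds are counted.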

pckroon commented 1 month ago

But, those carbons have valency 4. Subtract 3 bonds (of which 2 aromatic), subtract 1 because it's aromatic, leaves 0, right?

fgrunewald commented 1 month ago

Yes, but not in the place where you're thinking, because during the resolve process the valency is not known.

pckroon commented 1 month ago

Fixing the hydrogens should be the very very last thing right? Or is this a pysmiles issue?