gruenewald-lab / CGsmiles

Coarse-Grained Smiles (CGsmiles) for representing arbitrarily complex molecules using a compact line notation

To Do: Hydrogen accounting via bond orders #14

Closed fgrunewald closed 3 days ago

fgrunewald commented 4 months ago

At the moment the code applies several pysmiles functions to add the right number of hydrogen atoms. In principle we should be able to get that from the correct bond orders alone; however, aromaticity is kind of preventing this. The current code works, but we should clean this up at some point. Related to PR #12.

pckroon commented 4 months ago

I think the correct solution is to count aromatic bond orders as 1, and deduct 1 for the valency from each aromatic atom.

fgrunewald commented 4 months ago

Unfortunately not. Take p-cresol as an example: {[#SN3a]1[#TC5]2[#SN2a][#TC5]12}.{#SN3a=[$][$]c[N+](=O)[O-],#TC5=[$]cc[$],#SN2a=[$][$]cOC}

The carbon in the SN3a/SN2a fragment has an hcount of 3, and then the two aromatic bonds give -2. It ends up as CH, which coincidentally is valid for this molecule.

[Image: pcresol]

pckroon commented 4 months ago

But, those carbons have valency 4. Subtract 3 bonds (of which 2 aromatic), subtract 1 because it's aromatic, leaves 0, right?
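The arithmetic above can be written out as a minimal sketch. The helper `implicit_hydrogens` is hypothetical and is not part of CGsmiles or pysmiles; it assumes aromatic bonds are marked with order 1.5 and that the full set of bonds on the atom is already known.

```python
# Hypothetical helper for illustration only; not CGsmiles or pysmiles API.
AROMATIC_ORDER = 1.5  # assumed marker for aromatic bonds


def implicit_hydrogens(valence, bond_orders, aromatic):
    """Return the implicit hydrogen count for a single atom.

    Aromatic bonds are counted as order 1, and one extra unit of
    valency is deducted if the atom itself is aromatic.
    """
    used = sum(1 if order == AROMATIC_ORDER else order for order in bond_orders)
    if aromatic:
        used += 1
    # Clamp at zero so over-bonded input does not give a negative count.
    return max(valence - used, 0)


# Substituted aromatic carbon: 4 - 3 bonds (2 aromatic) - 1 = 0
substituted = implicit_hydrogens(4, [1.5, 1.5, 1], aromatic=True)
# Unsubstituted aromatic carbon: 4 - 2 aromatic bonds - 1 = 1
unsubstituted = implicit_hydrogens(4, [1.5, 1.5], aromatic=True)
print(substituted, unsubstituted)
```

This only gives the expected counts when every bond of the atom is present, so it would have to run after all fragments are connected.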

fgrunewald commented 4 months ago

Yes, but not in the place where you're thinking, because the valency is not known during the resolve process.

pckroon commented 4 months ago

Fixing the hydrogens should be the very, very last thing, right? Or is this a pysmiles issue?

fgrunewald commented 3 days ago

This has been solved with #29.