chembl / ChEMBL_Structure_Pipeline

ChEMBL database structure pipelines
MIT License

Salt remove issue #28

Closed aqiph closed 2 years ago

aqiph commented 3 years ago

Hi, thanks for implementing this useful pipeline!

We ran into an issue with salt removal. For the following two molecules, calling the 'standardize_mol' function followed by the 'get_parent_mol' function does not remove the salt component:

SMILES one: CN1C[C@H]2CC@@HC[C@H]2C1.Cc1cc(C(=O)O)ncn1
SMILES two: Cc1ccc(-c2nc3ccc(C)cn3c2CC(=O)N(C)C)cc1.O=C(O)C(O)(O)C(O)(O)C(=O)O

Another question: For a dimer/polymer molecule, will the two functions 'standardize_mol' and 'get_parent_mol' change it to a monomer?

Thanks! @greglandrum Could you please help us to solve this problem?

eloyfelix commented 2 years ago

The salt components in your molecules are not on the ChEMBL standardiser's list of salts (https://github.com/chembl/ChEMBL_Structure_Pipeline/blob/master/chembl_structure_pipeline/data/salts.smi), so the standardiser won't strip them.

get_parent_mol won't return the monomer for a polymer.
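
For the salt case above, one possible workaround when a counterion is not on the salts list is to keep only the largest fragment using RDKit directly. This is a minimal sketch and not what the ChEMBL pipeline does: the pipeline strips only components on its curated lists, whereas RDKit's LargestFragmentChooser keeps the biggest component regardless of what the others are, so it can discard something you wanted to keep. The helper name largest_fragment_smiles is made up for illustration.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def largest_fragment_smiles(smiles):
    """Return the SMILES of the largest fragment of a multi-component SMILES.

    Hypothetical helper, not part of chembl_structure_pipeline. It uses
    RDKit's size-based fragment chooser rather than a curated salt list.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # Keep only the largest fragment; every other component is dropped.
    chooser = rdMolStandardize.LargestFragmentChooser()
    parent = chooser.choose(mol)
    return Chem.MolToSmiles(parent)


# "SMILES two" from the question: the second component is the counterion.
smiles_two = (
    "Cc1ccc(-c2nc3ccc(C)cn3c2CC(=O)N(C)C)cc1"
    ".O=C(O)C(O)(O)C(O)(O)C(=O)O"
)
print(largest_fragment_smiles(smiles_two))
```

If you would rather stay within the pipeline's list-based behaviour, the alternative is to get the missing counterions added to salts.smi instead of stripping by size.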