mychem / mychem-code

Mychem is an extension for MySQL that makes it possible to use cheminformatics functions within SQL queries.
GNU General Public License v2.0

Added molecule_to_query #12

Closed fredrikw closed 8 years ago

fredrikw commented 8 years ago

Added molecule_to_query, a function that creates SMARTS without adding implicit hydrogens, similar to what opisomorph in Open Babel does.

Calling molecule_to_smiles on e.g. an indole adds a hydrogen on the nitrogen, whether or not the H was stated explicitly in the input. This causes problems in match_substruct: substituted compounds are not found when searching with a bare indole given in e.g. MOL-file format. molecule_to_query uses an Open Babel (OB) feature that turns off this behavior when the output option "h" is passed together with a parameter (this is what opisomorph does).
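To illustrate why the extra hydrogen matters for substructure search, here is a toy Python model of the SMARTS hydrogen-count rule. It is not Open Babel's or Mychem's matcher, and the names QueryAtom, TargetAtom and atom_matches are made up for this sketch. It relies only on the SMARTS convention that [nH] requires exactly one attached hydrogen, while a bare n places no constraint on the hydrogen count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryAtom:
    """Hypothetical query atom: a symbol plus an optional hydrogen constraint."""
    symbol: str                    # "n" stands for an aromatic nitrogen
    h_count: Optional[int] = None  # None means "any number of hydrogens"


@dataclass(frozen=True)
class TargetAtom:
    """Hypothetical target atom with its actual hydrogen count."""
    symbol: str
    h_count: int


def atom_matches(query: QueryAtom, target: TargetAtom) -> bool:
    """Return True if the target atom satisfies the query atom."""
    if query.symbol != target.symbol:
        return False
    # A query without a hydrogen constraint accepts any hydrogen count.
    return query.h_count is None or query.h_count == target.h_count


# Ring nitrogen of indole (one H) and of an N-substituted indole (no H).
indole_n = TargetAtom("n", 1)
substituted_indole_n = TargetAtom("n", 0)

bare_query = QueryAtom("n")           # like "n" in the query string
explicit_h_query = QueryAtom("n", 1)  # like "[nH]" in the query string

for label, query in (("n", bare_query), ("[nH]", explicit_h_query)):
    print(label,
          "indole:", atom_matches(query, indole_n),
          "substituted:", atom_matches(query, substituted_indole_n))
```

In this model the bare query matches both nitrogens, while the query with the explicit hydrogen misses the substituted one, which mirrors the missing hits described above.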

With this change added, calling the SQL statement

SELECT molecule_to_query(smiles_to_molecule("n1ccc2c1cccc2")), molecule_to_smiles(smiles_to_molecule("n1ccc2c1cccc2"))
, molecule_to_query(smiles_to_molecule("[nH]1ccc2c1cccc2")), molecule_to_smiles(smiles_to_molecule("[nH]1ccc2c1cccc2"));

will return

| Expression | Result |
| --- | --- |
| molecule_to_query(smiles_to_molecule("n1ccc2c1cccc2")) | n1ccc2c1cccc2 |
| molecule_to_smiles(smiles_to_molecule("n1ccc2c1cccc2")) | [nH]1ccc2c1cccc2 |
| molecule_to_query(smiles_to_molecule("[nH]1ccc2c1cccc2")) | [nH]1ccc2c1cccc2 |
| molecule_to_smiles(smiles_to_molecule("[nH]1ccc2c1cccc2")) | [nH]1ccc2c1cccc2 |