mychem / mychem-code

Mychem is an extension for MySQL that makes it possible to use cheminformatics functions within SQL queries.
GNU General Public License v2.0

Incorrect substructure match #11

Closed by studur 8 years ago

studur commented 8 years ago

Hi,

First, thank you for your work on this MySQL cartridge.

I installed the Mychem cartridge on Ubuntu 15.04 using MySQL 5, openbabel 2.3.2, mysqldb...

I use JSME from Peter Ertl to input SMILES into a MySQL query of the type: SELECT compounds.name FROM compounds,bin_structures WHERE compounds.id=bin_structures.compound_id AND MATCH_SUBSTRUCT('smiles inserted here',obserialized);

I queried both a test database taken from Chemical Structures (http://chem-file.sourceforge.net/), converted to an SDF file with openbabel, and another database containing over 3000 compounds.

When I search for the pattern 'NCC(=O)O' corresponding to glycine or 'C(=O)O' for a carboxylic acid, the query does not return the expected molecules.

Is the problem originating from my source SDF file? Or from the fingerprints?

Thank you for helping me

fredrikw commented 8 years ago

Hi,

It is really difficult to say anything about the problem from the information you give in the email. The query looks OK, but it is not the exact query you are running (I suppose that the text ’smiles inserted here’ isn’t part of your query), so I cannot rule out that you typed something wrong. Since the test database isn’t included, we cannot test against that one either. Could you provide a small Docker image, or at least an SQL script to recreate your database?

At least the problem is not with the fingerprints, since they aren’t used at all in your query.

Kind regards, Fredrik

studur commented 8 years ago

Hi,

Sorry, I solved the problem yesterday. The culprit was a sanitizing PHP routine that I run on my web page inputs to strip certain symbols used in SQL injection attacks. By echoing the submitted SQL query, I saw that it lacked the = symbol in C(=O)O. I had to rewrite the routine to allow the symbols used in SMILES.

Thanks again for this nice piece of software. It works like a charm now.
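
For readers who hit the same symptom, here is a minimal sketch in Python (not the PHP routine from this thread, which was never posted) of why the stripped `=` went unnoticed and one possible way around it. Everything in it is an assumption for illustration: `naive_sanitize` is a hypothetical stand-in for the original routine, the allow-list in `SMILES_ALLOWED` is a rough guess at the characters SMILES input needs, and `build_query` only prepares a parameterized query using the `%s` placeholder style of MySQLdb. Only the table, column, and function names in the SQL text come from the query above.

```python
import re

# Assumed allow-list of characters that may appear in a SMILES string:
# atoms, ring-closure digits, brackets, branches, bonds, charges, stereo marks.
SMILES_ALLOWED = re.compile(r"[A-Za-z0-9@+\-\[\]\(\)\\/%=#$:.*]+")


def naive_sanitize(text):
    """Hypothetical stand-in for the original routine: it simply deletes
    symbols that often show up in SQL injection attempts."""
    return re.sub(r"[=;'\"#\-]", "", text)


def validate_smiles(text):
    """Return the input unchanged if every character is allow-listed,
    otherwise reject it instead of silently rewriting it."""
    if not SMILES_ALLOWED.fullmatch(text):
        raise ValueError("unexpected character in SMILES input")
    return text


def build_query(smiles):
    """Return (sql, params) for a parameterized query; the SMILES string is
    passed as a bound parameter and never pasted into the SQL text."""
    sql = (
        "SELECT compounds.name FROM compounds, bin_structures "
        "WHERE compounds.id = bin_structures.compound_id "
        "AND MATCH_SUBSTRUCT(%s, obserialized)"
    )
    return sql, (validate_smiles(smiles),)


# Stripping "=" leaves a string that is still valid SMILES but describes a
# different pattern, so the query runs without error and returns other hits.
print(naive_sanitize("C(=O)O"))
print(build_query("NCC(=O)O"))
```

The pair returned by `build_query` would be passed to a database cursor as `cursor.execute(sql, params)`. Rejecting unexpected input and binding the value as a parameter avoids altering the SMILES at all, whereas deleting characters turns `C(=O)O` into `C(O)O`, a different substructure pattern.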