Open bdeadman opened 3 months ago
Running the following on my (out of date) clone of ord-data, I get 411 records.
SELECT compound.smiles AS smiles, compound.reaction_role FROM ord.compound WHERE smiles LIKE 'OCCS' ;
Running the search in the online interface returns no results for 'Reactants & Reagents' = "OCCS" with the exact, similarity or substructure search options.
The same search with the SMARTS option initially returned 100 entries (the query limit), but when I repeated the query it returned no results.
Running the following on my (out of date) clone of ord-data, I get 181 records.
SELECT compound.smiles AS smiles, compound.reaction_role FROM ord.compound WHERE smiles LIKE 'O=C1C=CC(=O)C=C1' ;
Running the search in the online interface returns no results for 'Reactants & Reagents' = "O=C1C=CC(=O)C=C1" with the exact, similarity or substructure search options. SMARTS search also failed to return results. This was on the production and staging instances of ord.
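For anyone reproducing the counts above, here is a minimal sketch of the same kind of lookup, grouped by `reaction_role` so the totals can be compared against the "Reactants & Reagents" and "Products" fields separately. It uses an in-memory SQLite database as a stand-in for the real ord-data clone (an assumption), the toy rows and role names are hypothetical, and `count_by_role` and `build_toy_db` are illustrative helpers, not part of ord-interface. A `LIKE` pattern with no wildcards matches only the whole string, so the sketch uses `=`.

```python
import sqlite3


def build_toy_db():
    """Create a tiny stand-in for ord.compound (hypothetical rows, SQLite instead of the real DB)."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database named "ord" so the table can be
    # addressed as ord.compound, as in the queries above.
    conn.execute("ATTACH DATABASE ':memory:' AS ord")
    conn.execute("CREATE TABLE ord.compound (smiles TEXT, reaction_role TEXT)")
    conn.executemany(
        "INSERT INTO ord.compound VALUES (?, ?)",
        [
            ("OCCS", "REACTANT"),
            ("OCCS", "REAGENT"),
            ("OCCS", "REACTANT"),
            ("SCCO", "REACTANT"),
            ("O=C1C=CC(=O)C=C1", "REAGENT"),
        ],
    )
    return conn


def count_by_role(conn, smiles):
    """Count rows whose SMILES string equals `smiles` exactly, per reaction role."""
    rows = conn.execute(
        "SELECT reaction_role, COUNT(*) FROM ord.compound "
        "WHERE smiles = ? GROUP BY reaction_role ORDER BY reaction_role",
        (smiles,),
    ).fetchall()
    return dict(rows)


conn = build_toy_db()
print(count_by_role(conn, "OCCS"))  # {'REACTANT': 2, 'REAGENT': 1}
print(count_by_role(conn, "SCCO"))  # {'REACTANT': 1}
```

Note that this is a plain string comparison: "OCCS" and "SCCO" describe the same molecule but are counted separately here, so the numbers depend on how the SMILES were written into the table.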
@skearnes @miori-nd who wants this one?
I'll look into this
(Reply sent by email in response to Ben Deadman's message of Tuesday, August 6, 2024, 1:12:12 PM. Subject: Re: [open-reaction-database/ord-interface] Problems searching products by SMILES or SMARTS (Issue #125))
I looked at the product SMILES search; this is a timeout on the backend. I'll dig into the SQL query and see if I can optimize it.
I can definitely speed up the "exact" queries. Will push up soon.
@bdeadman I pushed the exact query fix to prod. The SMARTS patterns listed are invalid; put them in https://smarts.plus/smartsview for testing.
Thanks @skearnes. I was working off this resource: https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html. For my future reference, the correct SMARTS pattern would be "[c,C]=[c,C]", and this validates in the tester you linked. The problem was that I hadn't ended the SMARTS with a node (an atom).
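As a rough illustration of the "must end with a node" point, the sketch below flags patterns whose last character is a bond symbol. It is only a crude heuristic, not a SMARTS validator (the tester linked above remains the real check), and `ends_with_dangling_bond` is a hypothetical helper. The pattern "[c,C]=" is a made-up example of a dangling bond, not necessarily the pattern originally submitted.

```python
# SMARTS bond primitives that may appear outside brackets. A pattern whose
# final character is one of these has a bond with no atom after it.
BOND_SYMBOLS = set("-=#:~@/\\")


def ends_with_dangling_bond(smarts: str) -> bool:
    """Return True if the pattern's last non-space character is a bond symbol."""
    s = smarts.strip()
    return bool(s) and s[-1] in BOND_SYMBOLS


print(ends_with_dangling_bond("[c,C]=[c,C]"))  # False: ends with an atom
print(ends_with_dangling_bond("[c,C]="))       # True: bond with no closing atom
```

Charges and chirality marks such as "[O-]" or "[C@H]" sit inside brackets, so those patterns end with "]" and are not flagged.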
I'll run some tests on prod tomorrow.
Testing on production interface today. Values in table represent the number of reactions returned. Results limited to 100.
Search Term | Exact | Similar(0.5) | Substructure | SMARTS |
---|---|---|---|---|
"Reactants & Reagents" = | | | | |
"SCCO" | 100 | 0 | 0 | 0 |
"OCCS" | 100 | 0 | 0 | 0 |
"O=C1C=CC(=O)C=C1" | 100 | 0 | 0 | 0 |
"C1=CC(=O)C=CC1=O" | 100 | 0 | 0 | 0 |
"c1ccncc1" | 0 | 0 | 100 | 100 |
"[c,C]=[c,C]" | NA | NA | NA | 100 |
"[c,C]#[c,C]" | NA | NA | NA | 0 |
"C#C" | 100 | 0 | 0 | 100 |
"Products" = | | | | |
"SCCO" | 0 | 0 | 0 | 0 |
"OCCS" | 0 | 0 | 0 | 0 |
"CC=C1C=C(OC)N=C(N)N1" | 1 | 0 | 0 | 0 |
"c1ccncc1" | 7 | 0 | 100 | 100 |
"[c,C]=[c,C]" | NA | NA | NA | 100 |
"[c,C]#[c,C]" | NA | NA | NA | 0 |
"C#C" | 6 | 0 | 100 | 100 |
Outcomes from above testing:
As reported by a user, and observed by me in #122, the chemical searcher is not finding the expected results.