Open YANGJJ93MS opened 1 year ago
Hi, thanks for your report. I am less familiar with SMARTS, but here's what I've found:
- rcdk::matches uses SMARTSQueryTool under the hood.
- rcdk should either fix or deprecate this function (@rajarshi would need to weigh in there).

If you try it out at https://www.simolecule.com/cdkdepict/depict.html, the supplied SMARTS pattern does match the molecule.
However, I agree that we should fix the function to use SmartsPattern.
On Sun, Feb 12, 2023 at 12:19 AM zachcp wrote:
Hi, thanks for your report. I am less familiar with SMARTS, but here's what I've found:
- rcdk::matches uses SMARTSQueryTool https://github.com/CDK-R/cdkr/blob/master/rcdk/R/matching.R#L16 under the hood.
- SMARTSQueryTool is deprecated https://cdk.github.io/cdk/latest/docs/api/org/openscience/cdk/smiles/smarts/SMARTSQueryTool.html
- SmartsPattern https://cdk.github.io/cdk/latest/docs/api/org/openscience/cdk/smarts/SmartsPattern.html is preferred.
- I am not familiar enough with SMARTS to know if your example is truly expected to be negative. You should probably confirm on the CDK users mailing list https://cdk.github.io/cdk/latest/docs/api/org/openscience/cdk/smarts/SmartsPattern.html with your specific pattern.
- rcdk should either fix or deprecate this function (@rajarshi https://github.com/rajarshi would need to weigh in there).
-- Rajarshi Guha | http://blog.rguha.net | @rguha https://twitter.com/rguha
Dear Rajarshi,
Thank you for your reply!
I found that the RDKit substructure matching function makes the same mistake. Please kindly find the picture below:
As a matter of fact, the substructure that I am looking for is a benzene structure with an ammonia side chain, which is totally different from the naphthalene structure. I believe the reason is that the algorithm treats the naphthalene structure as an alkyl structure.
Best regards, Junjie
There is an issue with the substructure match function: it returns TRUE even when the substructure is not in the query molecule.
Please include a minimal reproducible example (AKA a reprex). If you've never heard of a reprex before, start by reading https://www.tidyverse.org/help/#reprex.
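A reprex for this report might look like the sketch below. It assumes rcdk is installed and uses the target SMILES that appears in the reported output. The report does not show the SMARTS behind `query2`, so the pattern used here is a hypothetical stand-in and should be replaced with the actual query.

```r
library(rcdk)

# Target molecule: SMILES copied from the reported output.
target_smiles <- "CN(C)c1cccc2c(S(=O)(=O)Oc3ccc4c5c3OC3C(=O)CC(O)C6(O)C(C4)N(CC4CC4)CCC536)cccc12"
mol1 <- parse.smiles(target_smiles)

# HYPOTHETICAL stand-in: the original report does not include the SMARTS
# behind query2. Replace this string with the pattern actually used.
query2 <- "c1ccccc1N"

# The call from the report: does the SMARTS query match the target?
result <- rcdk::matches(query2, mol1)
print(result)

# Package and Java versions help maintainers reproduce the behaviour.
sessionInfo()
```

Posting the exact query string together with this output lets others run the same pattern through CDK Depict or SmartsPattern and compare results.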
Screenshots
"rcdk::matches(query2,mol1) CN(C)c1cccc2c(S(=O)(=O)Oc3ccc4c5c3OC3C(=O)CC(O)C6(O)C(C4)N(CC4CC4)CCC536)cccc12.match TRUE"
System (please complete the following information):