epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

SMARTS query: Recursive SMARTS fail to find match in the reaction substructure #1088

Open alkorolyov opened 1 year ago

alkorolyov commented 1 year ago

Indigo sometimes fails to find a substructure match in a reaction when the query uses recursive SMARTS.

**Steps to Reproduce**

  1. I was able to reproduce the error using the Python Indigo wrapper on Win 10 x64 and Ubuntu 22.04 x64.

  2. Script to reproduce:

    from indigo import Indigo

    indigo = Indigo()

    # Every reaction below has a C-F fragment in one of its reactants.
    reactions = []
    reactions.append(indigo.loadReaction("CF.CC>>"))
    reactions.append(indigo.loadReaction("FC.CC>>"))
    reactions.append(indigo.loadReaction("CC.FC>>"))
    reactions.append(indigo.loadReaction("CC.CF>>"))
    reactions.append(indigo.loadReaction("C.CF>>"))
    reactions.append(indigo.loadReaction("C.FC>>"))
    reactions.append(indigo.loadReaction("FC>>"))
    reactions.append(indigo.loadReaction("CF>>"))

    # Recursive SMARTS: an aliphatic carbon bonded to a fluorine.
    query = indigo.loadReactionSmarts("[$(CF)]>>")

    for rxn in reactions:
        match = indigo.substructureMatcher(rxn).match(query)
        if match:
            print(rxn.smiles(), "matched", query.smarts())
        else:
            print(rxn.smiles(), "not matched", query.smarts())


**Expected behavior**
All of these reactions should match the query, since each one contains the substructure.
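
To make the expectation concrete without relying on Indigo, here is a minimal reference check. It is a toy, not a SMILES parser: it assumes the inputs are unbranched chains of single-letter atoms (true for the eight test reactions, not in general), so characters that are adjacent in the string are bonded atoms. The helper name `has_cf_bond` is made up for this sketch.

```python
def has_cf_bond(reaction_smiles):
    """Return True if any reactant has a carbon directly bonded to fluorine.

    Assumption: unbranched chains of single-letter atoms only, so
    neighbouring characters are bonded and '.' separates molecules.
    """
    reactants = reaction_smiles.split(">")[0]
    for molecule in reactants.split("."):
        # Walk over every pair of neighbouring atoms in the chain.
        for left, right in zip(molecule, molecule[1:]):
            if {left, right} == {"C", "F"}:
                return True
    return False


REACTIONS = [
    "CF.CC>>", "FC.CC>>", "CC.FC>>", "CC.CF>>",
    "C.CF>>", "C.FC>>", "FC>>", "CF>>",
]

expected = {rxn: has_cf_bond(rxn) for rxn in REACTIONS}
for rxn, should_match in expected.items():
    print(rxn, "should match" if should_match else "should not match")
```

Under that assumption, every one of the eight inputs contains a C-F pair, which is why a match is expected for all of them.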

**Actual behavior**
The query matches five of the eight reactions and fails on the other three, even though every input contains the substructure.
Output:

    CF.CC>> matched [$(CF)]>>
    FC.CC>> matched [$(CF)]>>
    CC.FC>> not matched [$(CF)]>>
    CC.CF>> not matched [$(CF)]>>
    C.CF>> not matched [$(CF)]>>
    C.FC>> matched [$(CF)]>>
    FC>> matched [$(CF)]>>
    CF>> matched [$(CF)]>>
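
A small stdlib-only sketch for picking the failures out of the output above. The `OBSERVED` text is the reported output with the repeated query string trimmed from each line; `failing_reactions` is a hypothetical helper for this sketch, not part of Indigo.

```python
from typing import List

# Reported output, with the trailing "[$(CF)]>>" query dropped from each line.
OBSERVED = """\
CF.CC>> matched
FC.CC>> matched
CC.FC>> not matched
CC.CF>> not matched
C.CF>> not matched
C.FC>> matched
FC>> matched
CF>> matched
"""


def failing_reactions(report: str) -> List[str]:
    """Collect the reaction SMILES whose status is 'not matched'."""
    failures = []
    for line in report.splitlines():
        smiles, _, status = line.partition(" ")
        if status == "not matched":
            failures.append(smiles)
    return failures


failures = failing_reactions(OBSERVED)
for smiles in failures:
    reactants = smiles.split(">")[0].split(".")
    # Zero-based position of the reactant that carries the fluorine.
    position = next(i for i, mol in enumerate(reactants) if "F" in mol)
    print(smiles, "fluorinated reactant at index", position)
```

In all three failures the fluorinated molecule is the second reactant. `C.FC>>` has the same layout and still matches, so reactant order alone does not seem to explain the behavior.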



**Indigo version**  

Tested on Win10 and Ubuntu 22.04

`1.10.0.0-ga65114f36-x86_64-win-msvc-1934`
`1.10.0.0-ga65114f36-x86_64-linux-gnu-11.2.1`

Python versions:
`Python 3.10.9 | packaged by conda-forge | (main, Jan 11 2023, 15:15:40) [MSC v.1916 64 bit (AMD64)] on win32`
`Python 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]`

AlexeyGirin commented 2 months ago

Moved to Refined Backlog, since the issue has gone unfixed for 5 versions.