epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Error: Wrong Result showing on Reaction exact search #700

Open sitanshubhunia opened 2 years ago

sitanshubhunia commented 2 years ago

Database: "PostgreSQL 12.7, compiled by Visual C++ build 1914, 64-bit"
Bingo version: "1.9.1.r0-g353401f win64"

Reaction exact search returns a wrong result (the index was generated on reaction SMILES).

Query: QRY_reaction.cml.txt

Target: ontarget_reaction_result_1.cml.txt ontarget_reaction_result_2.cml.txt ontarget_wrong_reaction_result_3.cml.txt

ontarget_wrong_reaction_result_3.cml.txt is a wrong result: it should not match the query, but the reaction exact search based on rsmiles returns it. wrong_reaction_result

bingo_config_reaction

What is wrong on my end?

AlexanderSavelyev commented 2 years ago

Hi @sitanshubhunia, could you please explain what the difference is? In the screenshot I can see a superatom ^Boc (is it a superatom?), but the CML files contain no superatom, only carbon. If that is the case, note that SMILES does not support superatoms. Please tell us what the original structures are, or what the difference between the CML files is. Thanks, Aleksandr

Chandrim commented 2 years ago

Screenshot_2022-04-10-21-49-53-005_com.android.chrome.jpg This is a protecting group.

sitanshubhunia commented 2 years ago

@AlexanderSavelyev I am sorry. The reaction structure was wrong, which is why "Boc" is not included in the CML file; it was actually treated as text. wrong_mrv.mrv.txt Right_mrv.mrv.txt

In wrong_mrv.mrv.txt I found "^Boc" as:

i_was_wrong

AlexanderSavelyev commented 2 years ago

Hi @sitanshubhunia

Indigo does not support expanding alias atoms. Use S-group superatoms instead (the full abbreviation must be included). In your case, as I understand it, the alias is assigned to a carbon atom, so the atom is matched as a plain carbon.

image

If my understanding is incorrect, could you please provide steps to reproduce (actual and expected results)? Which structure did you index, and which structure did you use as the query?

Thanks Aleksandr
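
The alias-versus-superatom distinction described above can be checked in the file itself before indexing. What follows is only a minimal sketch, assuming the structure is available as MDL V2000 molfile text, where an alias is an `A  nnn` line followed by the alias text and a superatom is an S-group of type `SUP`. The attachments in this thread are CML/MRV, which store the same information differently. The names `find_alias_atoms`, `find_superatom_labels`, and `SAMPLE` are hypothetical and are not part of Indigo or Bingo.

```python
import re
from typing import Dict

# V2000 alias entry: "A  nnn" (atom number), with the alias text on the next line.
ALIAS_RE = re.compile(r"^A\s+(\d{1,3})\s*$")


def find_alias_atoms(molblock: str) -> Dict[int, str]:
    """Return {atom_number: alias_text} for every alias entry in a V2000 molblock."""
    lines = molblock.splitlines()
    aliases: Dict[int, str] = {}
    i = 4  # skip the three header lines and the counts line
    while i < len(lines) - 1:
        match = ALIAS_RE.match(lines[i])
        if match:
            aliases[int(match.group(1))] = lines[i + 1].strip()
            i += 2  # the alias text line has been consumed
        else:
            i += 1
    return aliases


def find_superatom_labels(molblock: str) -> Dict[int, str]:
    """Return {sgroup_index: label} for every S-group declared with type SUP."""
    sup_ids = set()
    labels: Dict[int, str] = {}
    for line in molblock.splitlines():
        if line.startswith("M  STY"):
            # Layout: entry count, then (sgroup index, type) pairs.
            fields = line[6:].split()[1:]
            for idx, sg_type in zip(fields[0::2], fields[1::2]):
                if sg_type == "SUP":
                    sup_ids.add(int(idx))
        elif line.startswith("M  SMT"):
            parts = line[6:].split(maxsplit=1)
            if len(parts) == 2:
                labels[int(parts[0])] = parts[1].strip()
    return {i: labels.get(i, "") for i in sorted(sup_ids)}


# Illustrative molblock (made up for this sketch): atom 2 is a carbon
# that carries the alias text "^Boc" and no superatom S-group.
SAMPLE = """\

  sketch

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
A    2
^Boc
M  END
"""

if __name__ == "__main__":
    print("alias atoms:", find_alias_atoms(SAMPLE))
    print("superatom S-groups:", find_superatom_labels(SAMPLE))
```

A structure that reports alias atoms but no superatom S-groups is the case discussed here: the label is only text attached to a carbon, so the atom is treated as carbon.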

sitanshubhunia commented 2 years ago

Hi @AlexanderSavelyev

I understand what you mean.

FYI

I indexed the structure below: Indexed_structure.mrv.txt

Indexed_structure mrv

The same structure is used as the query.

sitanshubhunia commented 2 years ago

Hi @AlexanderSavelyev

Is it possible to extract text from a mol/sdf/rxn file when loading it into a PostgreSQL database? In other words, is there a pgsql function for this?

If so, how?

If not, I kindly request that you consider adding one.