epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Substructure Reaction Bingo/PostgreSQL (Explicit H issue) #134

Open pmagique opened 5 years ago

pmagique commented 5 years ago

I am currently running Bingo on a PostgreSQL database and developing substructure reaction queries. Substructure reaction searches return results without problems, but explicit hydrogens and atom mapping in queries do not seem to give appropriate results. For example, the following query returns the following result (it shouldn't).

QUERY:

SELECT * FROM reactions WHERE reaction_mdl_rxn_format @ ('$RXN

1 1 0
$MOL

Ketcher 02211910492D 1 1.00000 0.00000 0

9 9 0 0 0 999 V2000
4.7000 -4.5993 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.5090 -5.1871 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2000 -6.1382 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
4.2000 -6.1382 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.8910 -5.1871 0.0000 C 0 0 0 0 0 0 0 0 0 1 0 0
3.6322 -4.2212 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
2.9251 -5.4459 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
4.4412 -3.6334 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
5.2000 -3.7333 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0
2 3 1 0 0 0
3 4 1 0 0 0
4 5 1 0 0 0
5 1 1 0 0 0
5 6 1 0 0 0
5 7 1 0 0 0
1 8 1 0 0 0
1 9 1 0 0 0
M END
$MOL

Ketcher 02211910492D 1 1.00000 0.00000 0

6 6 0 0 0 999 V2000
12.6867 -4.7747 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.5822 -5.7692 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.6040 -5.9771 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.1040 -5.1111 0.0000 C 0 0 0 0 0 0 0 0 0 1 0 0
11.7732 -4.3680 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.3969 -4.4040 0.0000 F 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0
2 3 1 0 0 0
3 4 1 0 0 0
4 5 1 0 0 0
5 1 1 0 0 0
4 6 1 0 0 0
M END
','ALL')::bingo.rsub

RESULT:

$RXN

1 1
$MOL

Ketcher 04231811342D 1 1.00000 0.00000 0

8 7 0 0 0 999 V2000
5.2500 -5.0757 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
4.4410 -4.4879 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.7500 -3.5368 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.7500 -3.5368 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.0590 -4.4879 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.1622 -2.7278 0.0000 F 0 0 0 0 0 0 0 0 0 0 0 0
7.0100 -4.7969 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.3750 -5.6250 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0
2 3 1 0 0 0
3 4 1 0 0 0
4 5 1 0 0 0
5 1 1 0 0 0
3 6 1 6 0 0
5 7 1 1 0 0
M END
$MOL

Ketcher 04231811332D 1 1.00000 0.00000 0

14 14 0 0 0 999 V2000
5.2500 -5.0757 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
4.4410 -4.4879 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.7500 -3.5368 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.7500 -3.5368 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.0590 -4.4879 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.1622 -2.7278 0.0000 F 0 0 0 0 0 0 0 0 0 0 0 0
7.0100 -4.7969 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2500 -6.0757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.3840 -6.5757 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
6.1160 -6.5757 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
6.1160 -7.5757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.9821 -8.0757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2500 -8.0757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.1160 -8.5757 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0
2 3 1 0 0 0
3 4 1 0 0 0
4 5 1 0 0 0
5 1 1 0 0 0
3 6 1 6 0 0
5 7 1 1 0 0
1 8 1 0 0 0
8 9 2 0 0 0
8 10 1 0 0 0
10 11 1 0 0 0
11 12 1 0 0 0
11 13 1 0 0 0
11 14 1 0 0 0
M END

David

pmagique commented 5 years ago

In addition: explicit H, reactant/product atom mapping, and chirality do not work.

Strangely, I tried running queries on a set of 2200 reactions; replacing ALL with any other parameter gives exactly the same set of results:

select * from $table where $column @ ('$query', 'ALL')::bingo.rsub
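For anyone trying to reproduce this, the comparison could be scripted roughly as below. This is only a sketch: it assumes a Python DB-API cursor that uses `%s` placeholders (psycopg2 does), and an `id` primary-key column on the `reactions` table, which is a guess. The helper names `hits_by_parameter` and `parameters_change_nothing` are made up for illustration, and the parameter strings to test are left to the caller rather than taken from the Bingo documentation.

```python
from typing import Dict, FrozenSet, Iterable

# Table and column names are taken from the query in this issue.
TABLE = "reactions"
COLUMN = "reaction_mdl_rxn_format"
ID_COLUMN = "id"  # assumption: the table has a primary-key column named "id"

# %s placeholders are filled in by the DB-API driver (psycopg2 style).
SQL = f"SELECT {ID_COLUMN} FROM {TABLE} WHERE {COLUMN} @ (%s, %s)::bingo.rsub"


def hits_by_parameter(cursor, query_rxn: str,
                      parameters: Iterable[str]) -> Dict[str, FrozenSet]:
    """Run the same reaction substructure query once per parameter string
    and collect the set of matching row ids for each."""
    results = {}
    for params in parameters:
        cursor.execute(SQL, (query_rxn, params))
        results[params] = frozenset(row[0] for row in cursor.fetchall())
    return results


def parameters_change_nothing(results: Dict[str, FrozenSet]) -> bool:
    """True if every parameter string produced exactly the same hit set."""
    return len(set(results.values())) <= 1
```

With a real connection, the cursor would come from something like `psycopg2.connect(...).cursor()`, and `query_rxn` would be the `$RXN` text shown above. If `parameters_change_nothing` returns True for every parameter string tried, that matches the behaviour reported here.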