asad / ReactionDecoder

Reaction Decoder Tool (RDT) - Atom Atom Mapping Tool
GNU Lesser General Public License v3.0

Co-transport mapping error #21

Closed cfrainay closed 3 years ago

cfrainay commented 3 years ago

Hi!

I inadvertently ran RDT on a co-transport reaction and was surprised that the results didn't yield the seemingly trivial mapping between pairs of identical molecules. Here is what I ran (an obvious transport reaction and a less easy to spot one with non-canonical SMILES):

java -jar rdt-2.4.1-jar-with-dependencies.jar -g -c -j AAM -f TEXT -Q SMI -q "O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN>>O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN"
java -jar rdt-2.4.1-jar-with-dependencies.jar -g -c -j AAM -f TEXT -Q SMI -q "O=C(O)C(N)CC(=O)N.O=C(O)C(N)CS>>C(N)(CC(=O)N)C(=O)O.O=C(O)C(N)CS"

I looked a bit and it seems that Recon3D contains many co-transporters that involve compounds with high similarity, resulting in, I believe, erroneous mapping from RDT.
See other examples here, here, and here.

Is there any way to fix this or detect those cases? Thanks!
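On the detection side, a rough pre-filter can flag candidate transport reactions before they reach RDT. The sketch below is not part of RDT: the function names are hypothetical, it only handles reaction SMILES of the form "reactants>>products", and it compares heavy-atom compositions rather than structures. Isomerisations would therefore be flagged too, so a hit is a candidate for review, not a confirmed transport reaction.

```python
import re
from collections import Counter

# Bracket atoms first, then two-letter halogens, then one-letter
# organic-subset atoms (aliphatic or aromatic).
ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Element symbol inside a bracket atom, after an optional isotope number.
BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def heavy_atom_formula(smiles):
    """Return the heavy-atom composition of one molecule as a sorted tuple."""
    counts = Counter()
    for token in ATOM.findall(smiles):
        if token.startswith("["):
            match = BRACKET_SYMBOL.match(token)
            if match is None:  # e.g. a wildcard atom; ignored here
                continue
            symbol = match.group(1)
        else:
            symbol = token
        symbol = symbol.capitalize()  # aromatic 'c' counts as 'C'
        if symbol != "H":
            counts[symbol] += 1
    return tuple(sorted(counts.items()))


def side_formulas(side):
    """Multiset of per-molecule compositions for one side of a reaction."""
    return Counter(heavy_atom_formula(m) for m in side.split(".") if m)


def is_transport_candidate(rxn_smiles):
    """True when both sides contain molecules with matching compositions."""
    left, _, right = rxn_smiles.partition(">>")
    return side_formulas(left) == side_formulas(right)


if __name__ == "__main__":
    # The two reactions from this issue.
    print(is_transport_candidate(
        "O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN>>O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN"))
    print(is_transport_candidate(
        "O=C(O)C(N)CC(=O)N.O=C(O)C(N)CS>>C(N)(CC(=O)N)C(=O)O.O=C(O)C(N)CS"))
```

A stricter check would compare canonical SMILES from a cheminformatics toolkit instead of compositions. The version above stays dependency-free on purpose.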

asad commented 3 years ago

Thanks.

Let me check, do you get the same output as attached below?

java -jar rdt-2.4.1-jar-with-dependencies.jar -g -c -j AAM -f TEXT -Q SMI -q "O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN>>O=C(N)NCCCC(N)C(=O)O.O=C(O)C(N)CCCN"

[image: ECBLAST_smiles_AAM]

cfrainay commented 3 years ago

Yes, I got the same mapping (not exactly the same picture tho, cyan and purple are swapped)

asad commented 3 years ago

Is this mapping not as expected?

johnmay commented 3 years ago

It's a transport reaction: the same thing appears on the left as on the right, so no bonds should be broken.

johnmay commented 3 years ago

[NH2:9][CH:8]([CH2:7][CH2:6][CH2:5][NH:4][C:2]([NH2:3])=[O:1])[C:10]([OH:12])=[O:11].[NH2:21][CH2:20][CH2:19][CH2:18][CH:16]([NH2:17])[C:14]([OH:15])=[O:13]>>[NH2:9][CH:8]([CH2:7][CH2:6][CH2:5][NH:4][C:2]([NH2:3])=[O:1])[C:10]([OH:12])=[O:11].[NH2:21][CH2:20][CH2:19][CH2:18][CH:16]([NH2:17])[C:14]([OH:15])=[O:13]

asad commented 3 years ago

Agreed, if we know it's a transporter, one could accept a mapping without bond changes. However, there are also counterexamples in the pathways.

GLHolliday79 commented 3 years ago

Hi,

When we were developing the tool, we were focused on the problem of mapping chemical reactions, where the basic assumption is that if you're trying to map a reaction, then there will always be a reaction to map. Obviously, this doesn't consider the case of a transport mechanism where there isn't a reaction at all (as far as we know, anyway; if I've learnt anything working on the mechanisms of enzyme reactions, it's to assume nothing!).

There are several cases where the "correct" mapping (where correct is in relation to the biochemical reaction mechanism) isn't necessarily the lowest energy result. There were a couple of examples in the original paper for this tool (KEGG reaction R01148, Figure C in the supplementary material). Other examples include EC 2.1.4.1 and KEGG reaction R00021. The tool was originally optimised for such cases, hence the assumption I mentioned earlier.

Obviously, that doesn't solve the issue :-)

How to solve it depends on how complex you want to get -- you could add a transporter flag, or a "give me the lowest energy solution" flag (on the assumption that no bond changes is the lowest energy solution, of course), or even a "give me all the answers" flag if you're feeling greedy. All solutions will have some value (you might not know the biochemical reaction mechanism, so you might want all the possible mapping solutions, for example).

I hope that helps!

cfrainay commented 3 years ago

Hi!

Thanks everyone for the quick feedback, RDT is a great tool and it's nice to see that there are people watching over it!

I agree that mapping transport reactions is a bit silly, or let's say 'out of the scope' of RDT. I was wondering whether this unexpected behavior could be hiding something else, but your explanations clarified that for me, thanks again.

However, when using the output of RDT for atom tracking in a metabolic network, one will still have to deal with those transport reactions. Luckily they're easy to spot. Is there a way in the RDT API to label the reactants' atoms without going through the mapping step? For example, having

O=C(O)C=C(C(=O)O)CC(=O)O.O=C(O)C(=O)CCC(=O)O

and obtain

[O:1]=[C:2]([OH:3])[CH:4]=[C:5]([C:6](=[O:7])[OH:8])[CH2:9][C:10](=[O:11])[OH:12].[O:13]=[C:14]([OH:15])[C:16](=[O:17])[CH2:18][CH2:19][C:20](=[O:21])[OH:22]

This way I could just duplicate this to the other side of the transport reactions where the structure is kept unchanged, without performing the unnecessary mapping...

thanks again,
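Once one side carries atom labels, the duplication step itself is simple string handling. A minimal sketch, assuming the labelled side is already available as a SMILES string with atom-map numbers (the helper names are hypothetical and this does not call RDT):

```python
import re

# Atom-map numbers appear as ":<n>]" at the end of a bracket atom.
MAP_NUM = re.compile(r":(\d+)\]")


def map_numbers(mapped_smiles):
    """Return the atom-map numbers in the order they appear."""
    return [int(n) for n in MAP_NUM.findall(mapped_smiles)]


def mirror_mapped_side(mapped_side):
    """Build 'side>>side' so every atom maps onto itself.

    Raises ValueError if a map number is used twice, since the
    resulting mapping would be ambiguous.
    """
    numbers = map_numbers(mapped_side)
    if len(numbers) != len(set(numbers)):
        raise ValueError("duplicate atom-map numbers in labelled side")
    return f"{mapped_side}>>{mapped_side}"


if __name__ == "__main__":
    # One labelled molecule taken from the mapped reaction posted earlier in this thread.
    side = "[NH2:21][CH2:20][CH2:19][CH2:18][CH:16]([NH2:17])[C:14]([OH:15])=[O:13]"
    print(mirror_mapped_side(side))
```

Because both sides are the same string, no bond changes can appear in the result, which is the behaviour wanted for a pure transport step.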

asad commented 3 years ago

Here is another example: https://www.genome.jp/dbget-bin/www_bget?rn:R02996

asad commented 3 years ago

Thanks, we will discuss and propose a solution, perhaps a flag.

asad commented 3 years ago

Added a -b option to accept a solution with no bond changes.

a) compile: mvn -f pom-local.xml clean compile assembly:single install -Dmaven.test.skip=true
b) run: java -jar rdt-2.5.0-SNAPSHOT-jar-with-dependencies.jar -g -c -j AAM -f TEXT -Q SMI -q "O=C(O)C(N)CC(=O)N.O=C(O)C(N)CS>>C(N)(CC(=O)N)C(=O)O.O=C(O)C(N)CS" -b
c) PNG: ECBLAST_smiles_AAM

RDT v2.5.0 pre release: https://github.com/asad/ReactionDecoder/releases/tag/v2.5.0
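For batch use, the invocation above can be wrapped in a small helper that appends -b only for reactions already identified as transport steps. This is a sketch under assumptions: the function name and its parameters are hypothetical, the flags are copied verbatim from the command in this comment, and the jar is assumed to sit in the working directory.

```python
def rdt_aam_command(rxn_smiles,
                    jar="rdt-2.5.0-SNAPSHOT-jar-with-dependencies.jar",
                    accept_no_bond_change=False):
    """Build the argument list for an RDT atom-atom mapping run.

    The flags mirror the command shown in this thread; -b is appended
    only when a no-bond-change solution should be accepted.
    """
    cmd = ["java", "-jar", jar, "-g", "-c", "-j", "AAM",
           "-f", "TEXT", "-Q", "SMI", "-q", rxn_smiles]
    if accept_no_bond_change:
        cmd.append("-b")
    return cmd


if __name__ == "__main__":
    rxn = "O=C(O)C(N)CC(=O)N.O=C(O)C(N)CS>>C(N)(CC(=O)N)C(=O)O.O=C(O)C(N)CS"
    print(" ".join(rdt_aam_command(rxn, accept_no_bond_change=True)))
    # To actually run it (requires Java and the jar to be present):
    #   import subprocess
    #   subprocess.run(rdt_aam_command(rxn, accept_no_bond_change=True), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the parentheses and "=" characters in SMILES.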

cfrainay commented 3 years ago

Awesome! thank you very much for the prompt response