rxn4chemistry / biocatalysis-model

RXN for biochemical reactions
MIT License

Mapping files to convert between compound SMILES and source compound ID #7

Open jolespin opened 3 months ago

The rxn_smiles entries in https://github.com/rxn4chemistry/biocatalysis-model/blob/main/data/ecreact-1.0.csv look like this, for example:

NC(=O)c1ccc[n+]([C@@H]2O[C@H](COP(=O)(O)OP(=O)(O)OC[C@H]3O[C@@H](n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)[C@@H](O)[C@H]2O)c1.NCCC=O.O|1.2.1.8>>NCCC(=O)O

This splits into the following components of the reaction:

substrate="NC(=O)c1ccc[n+]([C@@H]2O[C@H](COP(=O)(O)OP(=O)(O)OC[C@H]3O[C@@H](n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)[C@@H](O)[C@H]2O)c1.NCCC=O.O"
ec="1.2.1.8"
product="NCCC(=O)O"
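
For reference, here is a minimal sketch of how I am doing that split, assuming every entry follows the `substrates|EC>>products` layout seen in the example above. The function name `split_ecreact_rxn` is my own and not part of the repository. It also assumes that `.` only separates distinct compounds and that `|` does not occur inside the SMILES themselves.

```python
def split_ecreact_rxn(rxn_smiles):
    """Split an entry of the form 'substrates|EC>>products' into its parts.

    Assumption: '|' separates the substrates from the EC number and '>>'
    separates that left-hand side from the products, as in the example.
    """
    left, product_part = rxn_smiles.split(">>", 1)
    # Split off the EC number first: it contains '.' characters itself,
    # so the compound split must only be applied to the SMILES part.
    substrate_part, ec = left.split("|", 1)
    return {
        "substrates": substrate_part.split("."),
        "ec": ec,
        "products": product_part.split("."),
    }


example = (
    "NC(=O)c1ccc[n+]([C@@H]2O[C@H](COP(=O)(O)OP(=O)(O)OC[C@H]3O"
    "[C@@H](n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)[C@@H](O)[C@H]2O)c1"
    ".NCCC=O.O|1.2.1.8>>NCCC(=O)O"
)
parts = split_ecreact_rxn(example)
print(parts["ec"])
print(len(parts["substrates"]), "substrates,", len(parts["products"]), "product(s)")
```

This yields the individual compound SMILES, but nothing that ties each one back to a database record, which is what the question below is about.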

The source listed for this entry is brenda_reaction_smiles.

Would it be possible to provide mapping files that link each compound SMILES to its original compound identifier in the source database?