theicechol / metamoles

UW-DIRECT Project on metabolite retrosynthetic analysis
MIT License

ML task on our project #3

Closed theicechol closed 5 years ago

theicechol commented 5 years ago

We need to restate our ML idea for our data/molecular fingerprints soon. Hopefully, we can shape it up by this weekend.

Let's collect the relevant information in this issue so that it is easy for us to follow.

theicechol commented 5 years ago

Let me try to spell it out. First, use the input compound to generate a set of compounds based on the similarity index. Then, look for enzymes that accept one of those compounds as their regular substrate. Finally, go back to the similarity index and rank the enzymatic reactions by similarity score.

Still kind of vague; there must be a better approach!
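To make the three steps a bit more concrete, here is a rough sketch. It assumes each fingerprint is represented as a set of "on" bit indices and uses Tanimoto similarity as the similarity index, which is an assumption on my part since we have not fixed the metric yet. The names `rank_enzymes`, `compound_fps`, `enzyme_substrates`, and the toy data are all hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_enzymes(query_fp, compound_fps, enzyme_substrates, threshold=0.3):
    """Rank enzymes by how similar their best-matching substrate is to the query.

    compound_fps:      hypothetical dict, compound name -> set of fingerprint bits
    enzyme_substrates: hypothetical dict, enzyme name -> list of substrate names
    threshold:         assumed cutoff; compounds scoring below it are ignored
    """
    # Step 1: collect compounds that are similar enough to the input compound.
    similar = {}
    for name, fp in compound_fps.items():
        score = tanimoto(query_fp, fp)
        if score >= threshold:
            similar[name] = score

    # Step 2: keep enzymes that accept at least one of those compounds.
    hits = []
    for enzyme, substrates in enzyme_substrates.items():
        scores = [similar[s] for s in substrates if s in similar]
        if scores:
            hits.append((enzyme, max(scores)))

    # Step 3: rank by similarity score, best first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data, purely illustrative.
query = {1, 2, 3, 4}
compound_fps = {
    "cpd_A": {1, 2, 3, 4, 5},
    "cpd_B": {1, 2, 9, 10},
    "cpd_C": {7, 8},
}
enzyme_substrates = {
    "enz_1": ["cpd_A"],
    "enz_2": ["cpd_B", "cpd_C"],
    "enz_3": ["cpd_C"],
}
ranking = rank_enzymes(query, compound_fps, enzyme_substrates)
```

Scoring each enzyme by its best-matching substrate is only one option; averaging over all matching substrates would be another.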

theicechol commented 5 years ago

1) Our input should be either a SMILES/SMARTS/InChI string, or a database entry such as a PubChem entry that we can use to look up that particular compound.

No serious problem with managing the input format.
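A minimal sketch of how the input could be sorted into those cases. It relies only on the fact that InChI strings start with `InChI=`, and it assumes a purely numeric input is meant as a PubChem ID. SMILES and SMARTS cannot be told apart reliably from the text alone, so they share one label here. The function name and the labels are hypothetical.

```python
def classify_input(text):
    """Guess which kind of identifier the user supplied.

    Returns one of the hypothetical labels
    'inchi', 'pubchem_id', or 'smiles_or_smarts'.
    """
    value = text.strip()
    if not value:
        raise ValueError("empty input")
    if value.startswith("InChI="):
        # Standard and non-standard InChI strings both carry this prefix.
        return "inchi"
    if value.isdigit():
        # Assumption: a bare integer is a PubChem entry number.
        return "pubchem_id"
    # Everything else is treated as a line notation; a cheminformatics
    # toolkit would be needed to validate it properly.
    return "smiles_or_smarts"
```

For example, `classify_input("InChI=1S/CH4/h1H4")` gives `'inchi'` and `classify_input("CCO")` gives `'smiles_or_smarts'`.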

theicechol commented 5 years ago

The KEGG database provides the PubChem ID, which we can follow to the corresponding PubChem entry to retrieve the desired string (SMILES or InChI).
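A sketch of that lookup chain, under a few assumptions worth double-checking: that the KEGG REST `conv` operation returns tab-separated lines like `cpd:C00031<TAB>pubchem:3333`, that the PubChem IDs KEGG exposes are substance IDs (SIDs) rather than compound IDs (CIDs), and that the PubChem PUG REST URL patterns below are right. The code only parses a response and builds URLs; it does no network calls. The sample line is illustrative, not real retrieved data.

```python
# URL templates (assumed; verify against the KEGG and PubChem docs).
KEGG_CONV_URL = "https://rest.kegg.jp/conv/pubchem/{kegg_entry}"
PUBCHEM_SID_TO_CID_URL = (
    "https://pubchem.ncbi.nlm.nih.gov/rest/pug/substance/sid/{sid}/cids/TXT"
)
PUBCHEM_INCHI_URL = (
    "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{cid}/property/InChI/TXT"
)


def parse_kegg_conv(response_text):
    """Parse a KEGG 'conv' response into {KEGG compound ID: PubChem ID}."""
    mapping = {}
    for line in response_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        kegg_entry, pubchem_entry = parts
        # Drop the database prefixes, e.g. 'cpd:' and 'pubchem:'.
        kegg_id = kegg_entry.split(":", 1)[-1]
        pubchem_id = pubchem_entry.split(":", 1)[-1]
        mapping[kegg_id] = pubchem_id
    return mapping


# Illustrative response line in the assumed format.
sample = "cpd:C00031\tpubchem:3333\n"
ids = parse_kegg_conv(sample)

# URLs we would request next: SID -> CID, then CID -> InChI.
kegg_url = KEGG_CONV_URL.format(kegg_entry="cpd:C00031")
sid_url = PUBCHEM_SID_TO_CID_URL.format(sid=ids["C00031"])
```

Once the CID is known, `PUBCHEM_INCHI_URL.format(cid=...)` would give the request for the InChI string; the SMILES could be requested the same way with a different property name.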

theicechol commented 5 years ago

2) From the known enzymes with substrate promiscuity in our database