Wang-Bioinformatics-Lab / ChemicalStructureWebService

Web Server for Chemical Structure pictures as well as other chemical structure things
https://structure.gnps2.org
MIT License
6 stars 2 forks source link

Suggestion - refactor #10

Closed jvansan closed 4 years ago

jvansan commented 5 years ago

I would like to propose a refactor of the API so that the RDKit logic is separated from the Flask logic.

I have a prototype of the required change implemented at https://github.com/jvansan/ChemicalStructureWebService/tree/refactor.

I believe that these changes will improve the maintainability, consistency, and extendability of this project in the future.

The only con that I can see is that there is a very minuscule overhead for object create. To expand on this a little, I have created a "speed test" at https://github.com/jvansan/ChemicalStructureWebService/blob/refactor/unit_test/speed_test.py

In this, I converted growing SMILES strings of linear alkanes, in triplicate.

From the results, you can see three methods listed in the comment:

1) mwang = 18.703308335971087,18.248935278039426,18.457531138090417 seconds to execute 500 calls
2) JvS - precalculated = 21.32231805101037,20.969998513930477,20.78585294005461 seconds to execute 500 calls
3) JvS - calc on fly (+cache) = 18.714338005986065,18.474315379047766,18.3609054380795 seconds to execute 500 calls

What I did was run this script three times for each method. 1) mwang - that is the repo in it's current state - you will see that this is marginally the fastest method 2) JvS precompute - in this approach I suggest pre-computing the SMILES, InChI, and InChIKey which slows the method down a little bit 3) JvS calc on fly - in this approach I suggest calculating properties as needed, and caching them in the object. This approach is very close in timing to the current codebase and therefore may be the better option.

I originally wrote method 2 because I wanted an API endpoint which sends all the chemical structure formats to the front end I am developing. I then realized that there would be some overhead for pre-computing, and so decided to try the on the fly method. Ultimately, I don't think the timing difference any difference, I simply wanted to present the facts.

If you think these changes are acceptable, we can decide on which approach works best for everyone (2 or 3), and then work on improving the system as need. I have already implemented all the required features to create both SVG/PNG images as desired, as well as similarity matching.

PS before merging, I would also suggest a code cleanup of the old Indigo infrastructure. I haven't read through this code, but I have left it for the time being to see if there are any features there that we may want to add back.

mwang87 commented 5 years ago

Hey Jeff,

I think thats a good abstraction. The speed is not that big of a concern since its a web service. Network latency will kill us there anyway.

I'm going to do a clean up of the indigo code likely won't need it, then I'll have you ship over a pull request and we can merge it in. We're almost there.

Ming

mwang87 commented 4 years ago

Hey @jvansan I think I'm going to pull in some of these changes as part of the refactor. I like the factory concept, especially for embedding images when we don't want to client to be smart to know if there is smiles or Inchi, when one could be absent.

jvansan commented 4 years ago

@mwang87 I like the idea as well. You could even image setting it up so that that server uses regex to automatically detect the data input type.

mwang87 commented 4 years ago

@jvansan Oooh, thats fancy, was probably going to do simple thing for now and put the infrastructure there to do fancy.

mwang87 commented 4 years ago

Note to self, small test case:

https://gnps-structure.ucsd.edu/structureimg?smiles=N%2FA&inchi=InChI%3D1S%2FC18H19NO4%2Fc1-23-17-12-14(4-8-16(17)21)5-9-18(22)19-11-10-13-2-6-15(20)7-3-13%2Fh2-9%2C12%2C20-21H%2C10-11H2%2C1H3%2C(H%2C19%2C22)%2Fb9-5%2B&width=350&height=250

mwang87 commented 4 years ago

@jvansan I did a major refactor and added on some fun bits of logic and added a UI. Throwing it up on a test server. Checkout the revisions here:

https://github.com/mwang87/ChemicalStructureWebService/pull/15

The UI will need improvement, but I think its a decent start.

jvansan commented 4 years ago

This is looking good. I think you're right that the UI is a decent start. I'm going to suggest a few small additions just to improve the UX.

jvansan commented 4 years ago

My latest commit to branch refactor-ux adds a copy to clipboard option and a clear button.

mwang87 commented 4 years ago

Looks great! Merging into refactor.

mwang87 commented 4 years ago

https://github.com/mwang87/ChemicalStructureWebService/pull/15 passes unit and integration tests. Will merge into rdkit-refactor and deploy into production.