nleroy917 / optipyzer

Multi-Species Codon Optimization Engine
https://optipyzer.com
Apache License 2.0

Benchmarking + User Interface In Docker #49

Closed Epannec closed 1 year ago

Epannec commented 1 year ago

Hi,

Thank you for providing a Docker image for the server; it makes it a bit easier to run. Are there any plans to also make the web server (with GUI) available?

Secondly, I was wondering whether any benchmarking has been done that "proves" that expression rates of transfected sequences are in line with the results from other established algorithms?

Thanks, Erwin

nleroy917 commented 1 year ago

Hi Erwin -

RE: the GUI

Including the UI in the Docker container is a great idea. The two are completely decoupled, so in theory you could just spin up the web server from the source code and point it at your local Docker server. I can probably write up some documentation for this, or even create a compose file that does it for you.

RE: benchmarks

This is a great question. It's tough to do any such benchmarking without getting into the lab, doing the transfections, and measuring expression levels. I have gone down this benchmarking road in silico and hit many roadblocks that made it hard to do any analysis at scale. However, as a mini experiment, I compared the GC content of some sequences produced by Optipyzer's algorithm, the Integrated DNA Technologies (IDT) codon optimization tool^1, and another web-based algorithm called JCat^2.

I used ten randomly generated proteins^3. These sequences were originally expressed in Escherichia coli, so I set up the experiment such that they would be optimized for expression in Homo sapiens. Strikingly, Optipyzer and IDT were pretty similar, although Optipyzer's sequences had consistently lower GC content. JCat, however, consistently produced sequences with much higher GC content than either of the other two.
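For reference, GC content is simply the fraction of G and C bases in a sequence. A minimal sketch of this kind of comparison might look like the following. The function name `gc_content` and the three short sequences are made up for illustration; they are not the sequences or outputs from the experiment described here.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy placeholder sequences (hypothetical, NOT real output from any tool).
optimized = {
    "Optipyzer": "ATGGCTAAAGAATTTCTG",
    "IDT": "ATGGCCAAAGAGTTTCTG",
    "JCat": "ATGGCCAAGGAGTTCCTG",
}

for tool, seq in optimized.items():
    print(f"{tool}: GC = {gc_content(seq):.1%}")
```

Running the same function over each tool's output for the same protein gives one number per tool, which is enough to see whether two tools land in a similar range.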

But what does this even show? Probably nothing. It's my understanding that GC content is not related to expression levels in mammals^4. I just needed a way to say "hey, these two sequences from two different sources appear similar". So really all this does is show that maybe Optipyzer's sequences look like the ones produced by IDT. We are confident that our implementation of the RCA index, which has been shown to predict expression^5, is correct.

At the end of the day, however, until you go to the bench and actually do the transfections, it truly is hard to say. I wish I had a better answer, but in vitro experiments were way outside the scope of this project. I hope this provides some more insight and I'm always open to suggestions and ideas!

Epannec commented 1 year ago

Hi Nathan,

Regarding the GUI

Honestly, that would be great. I know my way around Python and Docker quite well, but I'm very far from front-end development. Having access to the web server could be very handy. But please don't feel obliged or forced in any way!

Regarding the benchmarks

That is a great answer with great references, thank you. I absolutely understand that you cannot start doing those benchmarks now, and I realize that such an endeavor would be quite laborious. I was just wondering whether this had been done, but your answer already surpassed my expectations ;-) One side note: I heard somewhere that IDT does codon optimization for synthesis rather than for expression. But that would go against what they state on their site, so I'm not sure about that. Just something to keep in the back of your head when including them in your benchmarks, I guess.

Kind regards, Erwin

nleroy917 commented 1 year ago

Hi Erwin!

Thanks for being patient... I got started on the Dockerfile for the UI last week, but it didn't work right away and I got sidetracked. Anyway, I've implemented two things:

  1. A new Dockerfile inside web/ that builds the user interface for you and serves it at http://localhost:3000
  2. A docker-compose.yaml file that spins up both the server and the UI simultaneously.

The PR is here: #50

So, if you wish to start the server on http://localhost:8000 and the user interface on http://localhost:3000, you can do so by running the following at the root of the repository:

docker compose up --build

Note that building and starting the containers individually rather than through docker compose can lead to odd behavior. However, building and running one in a container while running the other natively should be OK.
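To confirm that both containers came up, a quick check along these lines may help. This is only a sketch: the helper name `is_up` is made up, and it assumes the two ports mentioned above and that each service answers a plain HTTP GET at its root path, which has not been verified against the actual server or UI.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, False if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status; it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, etc.
        return False


if __name__ == "__main__":
    # Ports taken from the comment above; root paths are an assumption.
    services = {
        "server": "http://localhost:8000",
        "user interface": "http://localhost:3000",
    }
    for name, url in services.items():
        status = "up" if is_up(url) else "not reachable"
        print(f"{name} ({url}): {status}")
```

An HTTP error status is treated as "up" here because the goal is only to see whether the container is listening, not whether a particular route exists.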

Let me know if you have questions or if there are other tweaks that would be useful!

Nathan