ga4gh / ga4gh-server

Reference implementation of the APIs defined in ga4gh-schemas. RETIRED 2018-01-24
http://ga4gh.org
Apache License 2.0

Only use flask for debug #1607

Open david4096 opened 7 years ago

david4096 commented 7 years ago

This PR provides an experimental "run under gunicorn" option to ga4gh_server.

When started via ga4gh_server, the server now runs behind the gunicorn WSGI server. This makes the software much easier to use and more practical for people who aren't familiar with Apache or nginx.

Gunicorn spawns multiple worker processes that each handle their own requests. It is a standalone WSGI HTTP server that is well tested with Flask; however, our implementation presents some challenges: TLS mode won't work directly, and logging needs to be handled differently.
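For reviewers, the general shape of running behind gunicorn programmatically is a small wrapper around gunicorn.app.base.BaseApplication that hands gunicorn the existing Flask app. A minimal sketch, assuming the Flask app is importable as ga4gh.server.frontend.app; the import path and option values here are illustrative, not the exact code in this PR:

    # Sketch: serve an existing Flask app under gunicorn in-process.
    # The import path and option values are assumptions, not this PR's code.
    import gunicorn.app.base

    from ga4gh.server.frontend import app  # assumed location of the Flask app


    class StandaloneApplication(gunicorn.app.base.BaseApplication):
        """Wraps an existing WSGI app so gunicorn can serve it in-process."""

        def __init__(self, application, options=None):
            self.options = options or {}
            self.application = application
            super(StandaloneApplication, self).__init__()

        def load_config(self):
            # Copy only options that gunicorn actually recognizes into its config.
            for key, value in self.options.items():
                if key in self.cfg.settings and value is not None:
                    self.cfg.set(key.lower(), value)

        def load(self):
            return self.application


    if __name__ == '__main__':
        options = {
            'bind': '127.0.0.1:8000',
            'workers': 9,  # illustrative; nine workers are booted in the log below
        }
        StandaloneApplication(app, options).run()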

I'd like to get this experimental feature out there for folks to test. I think that, by default, ga4gh_server should run behind gunicorn rather than the SimpleHTTPServer. That would let us simplify the Apache instructions and the Dockerfile: for Apache, one simply sets up a proxy and doesn't need to use mod_wsgi.

(env) ➜  ga4gh-server git:(1548_gunicorn_optional) ✗ python server_dev.py
--------------------------------------------------------------------------------
DEBUG in __init__ [ga4gh/server/network/__init__.py:54]:
Peer already in registry http://1kgenomes.ga4gh.org UNIQUE constraint failed: peer.url
--------------------------------------------------------------------------------
[2017-03-08 17:56:01 +0000] [16971] [INFO] Starting gunicorn 19.7.0
[2017-03-08 17:56:01 +0000] [16971] [INFO] Listening at: http://127.0.0.1:8000 (16971)
[2017-03-08 17:56:01 +0000] [16971] [INFO] Using worker: sync
[2017-03-08 17:56:01 +0000] [16976] [INFO] Booting worker with pid: 16976
[2017-03-08 17:56:01 +0000] [16977] [INFO] Booting worker with pid: 16977
[2017-03-08 17:56:01 +0000] [16978] [INFO] Booting worker with pid: 16978
[2017-03-08 17:56:01 +0000] [16979] [INFO] Booting worker with pid: 16979
[2017-03-08 17:56:01 +0000] [16980] [INFO] Booting worker with pid: 16980
[2017-03-08 17:56:01 +0000] [16981] [INFO] Booting worker with pid: 16981
[2017-03-08 17:56:01 +0000] [16982] [INFO] Booting worker with pid: 16982
[2017-03-08 17:56:01 +0000] [16983] [INFO] Booting worker with pid: 16983
[2017-03-08 17:56:01 +0000] [16984] [INFO] Booting worker with pid: 16984

Closes #1548

TODO

codecov-io commented 7 years ago

Codecov Report

Merging #1607 into master will decrease coverage by 0.07%. The diff coverage is 58.33%.

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
- Coverage   86.65%   86.57%   -0.08%     
==========================================
  Files          33       33              
  Lines        7471     7494      +23     
  Branches      926      928       +2     
==========================================
+ Hits         6474     6488      +14     
- Misses        805      813       +8     
- Partials      192      193       +1
Impacted Files Coverage Δ
ga4gh/server/cli/server.py 66.03% <58.33%> (-3.97%) :x:

Continue to review full report at Codecov.


david4096 commented 7 years ago

To better demonstrate what this gives us, I ran the Locust load testing suite with this locustfile against this PR in two modes. This test is better isolated than the previous one, which used curl. Locust uses gevent to spawn simulated clients at a lower cost to the system.
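The locustfile itself isn't reproduced here; a rough sketch in the pre-1.0 Locust style of the time, with endpoints and request bodies assumed from the results tables below rather than taken from the actual file, would look like:

    # Sketch of a locustfile (Locust 0.x API). Endpoints and payloads are
    # assumptions based on the results below, not the exact file used.
    from locust import HttpLocust, TaskSet, task


    class ServerTasks(TaskSet):
        @task(1)
        def landing_page(self):
            self.client.get("/")

        @task(5)
        def search_datasets(self):
            self.client.post("/datasets/search", json={})

        @task(50)
        def search_reads(self):
            # A real test would supply readGroupIds and a referenceId.
            self.client.post("/reads/search", json={"pageSize": 100})


    class ServerUser(HttpLocust):
        task_set = ServerTasks
        min_wait = 100   # milliseconds between tasks
        max_wait = 1000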

First, with the -d option, which uses the Flask debug server we already ship with: 68 requests/second

"Method" "Name" "# requests" "# failures" "Median response time" "Average response time" "Min response time" "Max response time" "Average Content Size" "Requests/s"
"GET" "/" 42 1 5000 14854 1254 67302 16007 0.32
"POST" "/datasets/search" 565 2 2200 5941 3 69022 158 4.36
"POST" "/features/search" 631 3 2100 6016 3 67749 2 4.87
"POST" "/featuresets/search" 631 0 2000 5738 5 67694 364 4.87
"POST" "/readgroupsets/search" 669 1 2000 6138 13 69053 23775 5.16
"POST" "/reads/search" 6355 19 2000 5938 3 69506 10048 49.02
"None" "Total" 8893 26 2000 5987 3 69506 9081 68.60

And then with the gunicorn server (running without -d): 130 requests/second

"Method" "Name" "# requests" "# failures" "Median response time" "Average response time" "Min response time" "Max response time" "Average Content Size" "Requests/s"
"GET" "/" 18 0 800 796 669 921 16007 0.22
"POST" "/datasets/search" 747 0 370 514 3 2683 158 9.04
"POST" "/features/search" 748 0 390 527 3 2630 2 9.05
"POST" "/featuresets/search" 800 0 390 521 3 2634 364 9.68
"POST" "/readgroupsets/search" 802 0 440 555 8 2492 23775 9.70
"POST" "/reads/search" 7645 0 430 557 3 2760 9718 92.51
"None" "Total" 10760 0 420 549 3 2760 8741 130.20

Note the failures with the built-in Flask server, which doesn't handle simultaneous requests well.

david4096 commented 7 years ago

If we merge this, we should follow it up with a PR that improves test coverage of gunicorn mode in test_gestalt.

Tests that involve ensuring an announce request is sent and received will be easier to write, I believe.

david4096 commented 7 years ago

I don't want to rush this into the release, but I would love to get feedback!