prasmussen / glot-run

API for running code inside docker containers
https://run.glot.io/
MIT License

Benchmarking and scaling up #13

Open bitsapien opened 7 years ago

bitsapien commented 7 years ago

I have one instance of glot-run running. I ran a benchmark with ApacheBench (ab) using the following parameters:
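Presumably an invocation along these lines, reconstructed from the output below; payload.json and the token are placeholders for the actual request body file and API token:

```sh
# 200 requests total, 10 concurrent, POSTing a run request to one glot-run node.
ab -n 200 -c 10 -p payload.json -T 'application/json' \
   -H 'Authorization: Token <api-token>' \
   http://127.0.0.1:8090/languages/python/latest
```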

These are the results -

```
This is ApacheBench, Version 2.3 <$Revision: 1748469 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests

Server Software:        Cowboy
Server Hostname:        127.0.0.1
Server Port:            8090

Document Path:          /languages/python/latest
Document Length:        42 bytes

Concurrency Level:      10
Time taken for tests:   165.044 seconds
Complete requests:      200
Failed requests:        170
   (Connect: 0, Receive: 0, Length: 170, Exceptions: 0)
Non-2xx responses:      170
Total transferred:      29120 bytes
Total body sent:        55800
HTML transferred:       1260 bytes
Requests per second:    1.21 [#/sec] (mean)
Time per request:       8252.200 [ms] (mean)
Time per request:       825.220 [ms] (mean, across all concurrent requests)
Transfer rate:          0.17 [Kbytes/sec] received
                        0.33 kb/s sent
                        0.50 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0       7
Processing: 4140 8146 4262.4   5039   18715
Waiting:    4139 8145 4262.3   5038   18715
Total:      4141 8146 4262.4   5039   18716

Percentage of the requests served within a certain time (ms)
  50%   5039
  66%   9413
  75%  11697
  80%  12977
  90%  15369
  95%  16340
  98%  17946
  99%  18164
 100%  18716 (longest request)
```

That is a 15% success rate. I'm planning to run a coding challenge for my college, and the load will be close to the parameters above. I've been considering various approaches, one of them being to push code run requests to Redis and have multiple glot-run nodes listen for requests and process them as they become available (sketched below). I'm unsure how to go about scaling this setup and am looking for suggestions. Thanks.
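A minimal sketch of that worker loop, assuming redis-py and requests, a `run-jobs` Redis list as the queue, and glot-run's run endpoint; the queue key, token, and URLs are placeholders:

```python
# Hypothetical queue worker: pops run requests from Redis and forwards
# them to a glot-run node, storing the result back in Redis.
import json

import redis
import requests

GLOT_RUN_URL = "http://localhost:8090"  # one glot-run node (placeholder)
API_TOKEN = "my-token"                  # glot-run API token (placeholder)

r = redis.Redis(host="localhost", port=6379)

while True:
    # Block until a job is pushed onto the "run-jobs" list.
    _, raw = r.blpop("run-jobs")
    job = json.loads(raw)  # e.g. {"id": "...", "language": "python", "files": [...]}

    resp = requests.post(
        "%s/languages/%s/latest" % (GLOT_RUN_URL, job["language"]),
        headers={"Authorization": "Token " + API_TOKEN},
        json={"files": job["files"]},
    )
    # Store the result so the frontend can poll for it.
    r.set("run-result:%s" % job["id"], resp.text)
```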

prasmussen commented 7 years ago

I'm guessing the bottleneck is the machine where the docker daemon is running, not glot-run itself. I can think of three ways to scale the Docker API:

  1. Vertical. Add more/faster CPUs and faster disks to the machine where the docker daemon is running.

  2. Horizontal. Set up a load balancer (haproxy/nginx/etc.) in front of multiple machines running the docker daemon, and point DOCKER_API_URL at the load balancer.

  3. Queue. Add a queue in front of the docker daemon. I don't have much experience with this, but it looks like nginx plus and haproxy have support for queueing requests when the max connection limit is reached; there's a rough haproxy sketch after this list. You would also have to configure a high DOCKER_RUN_TIMEOUT in glot-run in this case. The following haproxy configuration options seem relevant: https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4.2-maxconn https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#5.2-maxqueue https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#4-timeout%20queue
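Something along these lines, combining options 2 and 3; the addresses, ports, and limits below are placeholder values, not tested configuration:

```
# Hypothetical haproxy front for two docker daemons. Each daemon accepts
# at most 10 concurrent connections; excess requests wait in a queue.
frontend docker_api
    bind *:2375
    mode http
    timeout client 300s
    default_backend docker_daemons

backend docker_daemons
    mode http
    timeout connect 5s
    timeout server 300s
    timeout queue  300s   # how long a request may wait before being dropped
    server docker1 10.0.0.1:2375 check maxconn 10 maxqueue 50
    server docker2 10.0.0.2:2375 check maxconn 10 maxqueue 50
```

glot-run would then be pointed at the proxy, e.g. DOCKER_API_URL=http://10.0.0.5:2375, with DOCKER_RUN_TIMEOUT raised to cover the time spent waiting in the queue.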

javierprovecho commented 7 years ago

Swarm should be the best way to go; it is an official proxy for the docker daemon with the extended capabilities of an orchestrator. You should point DOCKER_API_URL at the swarm endpoint and register all your docker daemons with swarm.

more info here: https://www.docker.com/products/docker-swarm
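Roughly like this for classic (standalone) swarm with token discovery; the IPs and ports are placeholders:

```sh
# Create a cluster token (run once, anywhere with docker installed).
docker run --rm swarm create          # prints <cluster_id>

# On each machine running a docker daemon: join the cluster.
docker run -d swarm join --advertise=<node_ip>:2375 token://<cluster_id>

# On the manager machine: expose the swarm endpoint.
docker run -d -p 4000:2375 swarm manage token://<cluster_id>

# Point glot-run at the swarm manager instead of a single daemon.
export DOCKER_API_URL=http://<manager_ip>:4000
```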

prasmussen commented 7 years ago

Swarm definitely looks like the way to go 👍

bitsapien commented 7 years ago

Thanks, I'll try doing that.

rushi216 commented 6 years ago

@bitsapien any success with docker swarm? I am trying to do the same.

bitsapien commented 6 years ago

@rushi216 I have not tried it.

sanjayme97 commented 4 years ago

@bitsapien how did you set this up locally? Please guide me on this.