vmware-archive / runtimes

Kubeless function runtimes: https://kubeless.io/docs/runtimes/
Apache License 2.0

Python cherrypy server - number of threads too low? #93

Closed by varunbpatil 4 years ago

varunbpatil commented 4 years ago

Hi,

I noticed that the python runtime can only handle 10 requests in parallel, which is the default number of threads in the cherrypy server's thread pool:

class CherryPyWSGIServer(HTTPServer):

    """A subclass of HTTPServer which calls a WSGI application."""

    wsgi_version = (1, 0)
    """The version of WSGI to produce."""

    def __init__(self, bind_addr, wsgi_app, numthreads=10, server_name=None,
                 max=-1, request_queue_size=5, timeout=10, shutdown_timeout=5,
                 accepted_queue_size=-1, accepted_queue_timeout=10):
        self.requests = ThreadPool(self, min=numthreads or 1, max=max,
            accepted_queue_size=accepted_queue_size,
            accepted_queue_timeout=accepted_queue_timeout)

I thought cherrypy auto-scaled the thread pool, but apparently it doesn't (stackoverflow).

My proposal is to make the number of threads configurable via an environment variable.
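As a rough sketch of that proposal (the helper name here is hypothetical; `CHERRYPY_NUMTHREADS` is the variable name used in the POC below, and `int()` is needed because environment variables are strings):

```python
import os

def get_numthreads(default=10):
    """Read the cherrypy thread pool size from the CHERRYPY_NUMTHREADS
    environment variable, falling back to the current default of 10."""
    try:
        return int(os.getenv("CHERRYPY_NUMTHREADS", default))
    except ValueError:
        # Ignore a non-numeric value rather than crash the runtime at startup.
        return default
```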

andresmgot commented 4 years ago

Hi @varunbpatil,

This has not been reported before; I am not sure why. In any case, feel free to send a POC in which you increase the number of threads to check whether that solves your issue.

varunbpatil commented 4 years ago

Hi @andresmgot, here is the POC I did.

I made the following code change in kubeless.py:

    bottle.run(
        app,
        server="cherrypy",
        host="0.0.0.0",
        port=func_port,
        numthreads=int(os.getenv("CHERRYPY_NUMTHREADS", 10)),  # The only code change; int() is needed because environment variables are strings
    )


The following is the function handler (handler.py):

import time

def handler(event, context):
    time.sleep(10)  # Deliberately sleeping for 10 seconds so that the results are differentiable
    return "Hello World\n"

I'm using hey to test response times.

10 parallel requests with numthreads=10 (default).

Observation: The total response time is around 11 seconds, which is expected when all 10 requests run in parallel.

$ hey -n 10 -c 10 -m GET -t 0 -disable-keepalive http://localhost:8080

Summary:         
  Total:        11.0580 secs
  Slowest:      11.0579 secs
  Fastest:      10.0289 secs
  Average:      10.2362 secs
  Requests/sec: 0.9043

  Total data:   120 bytes
  Size/request: 12 bytes

Response time histogram:
  10.029 [1]    |______
  10.132 [7]    |________________________________________
  10.235 [0]    |  
  10.338 [0]    |
  10.440 [0]    |                   
  10.543 [0]    |                                      
  10.646 [0]    |                                    
  10.749 [0]    |                                    
  10.852 [0]    |                                       
  10.955 [0]    |                                    
  11.058 [2]    |___________

Latency distribution:
  10% in 10.0296 secs
  25% in 10.0298 secs
  50% in 10.0310 secs              
  75% in 11.0569 secs
  90% in 11.0579 secs
  0% in 0.0000 secs
  0% in 0.0000 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.2067 secs, 10.0289 secs, 11.0579 secs
  DNS-lookup:   0.0001 secs, 0.0000 secs, 0.0005 secs
  req write:    0.0001 secs, 0.0000 secs, 0.0003 secs
  resp wait:    10.0291 secs, 10.0243 secs, 10.0333 secs
  resp read:    0.0003 secs, 0.0000 secs, 0.0017 secs

Status code distribution:
  [200] 10 responses


11 parallel requests with numthreads=10 (default).

Observation: The total response time has now jumped to 20 seconds because the 11th request had to wait 10 seconds for one of the previous requests to complete. This shows that only 10 requests run in parallel.

$ hey -n 11 -c 11 -m GET -t 0 -disable-keepalive http://localhost:8080

Summary:         
  Total:        20.0475 secs
  Slowest:      20.0469 secs
  Fastest:      10.0232 secs
  Average:      11.1258 secs
  Requests/sec: 0.5487

  Total data:   132 bytes
  Size/request: 12 bytes

Response time histogram:
  10.023 [1]    |______
  11.026 [7]    |________________________________________
  12.028 [2]    |___________
  13.030 [0]    |
  14.033 [0]    |                   
  15.035 [0]    |                                      
  16.037 [0]    |                                    
  17.040 [0]    |                                    
  18.042 [0]    |                                       
  19.045 [0]    |                                    
  20.047 [1]    |______

Latency distribution:
  10% in 10.0301 secs
  25% in 10.0303 secs
  50% in 10.0391 secs              
  75% in 11.0396 secs
  90% in 20.0469 secs
  0% in 0.0000 secs
  0% in 0.0000 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.2775 secs, 10.0232 secs, 20.0469 secs
  DNS-lookup:   0.0007 secs, 0.0001 secs, 0.0026 secs
  req write:    0.0002 secs, 0.0001 secs, 0.0003 secs
  resp wait:    10.8477 secs, 10.0217 secs, 19.0315 secs
  resp read:    0.0004 secs, 0.0001 secs, 0.0022 secs

Status code distribution:
  [200] 11 responses
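The jump from ~10s to ~20s is plain fixed-pool queuing: with one more task than there are threads, the extra task waits a full sleep cycle. A scaled-down simulation using Python's own `concurrent.futures` (not cherrypy itself; short sleeps stand in for the 10-second handler) shows the same effect:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def timed_run(num_tasks, num_threads, sleep_s=0.2):
    """Run num_tasks sleeping tasks on a fixed-size pool and return
    the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Each task just sleeps, like the POC handler.
        list(pool.map(time.sleep, [sleep_s] * num_tasks))
    return time.monotonic() - start

# 10 tasks on 10 threads: one batch, roughly 1x sleep_s
# 11 tasks on 10 threads: the 11th task queues, roughly 2x sleep_s
```

The elapsed time roughly doubles as soon as the task count exceeds the pool size, mirroring the hey results above.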


11 parallel requests with numthreads=11 (via environment variable).

Observation: The total response time is now back to 11 seconds for 11 parallel requests.

$ hey -n 11 -c 11 -m GET -t 0 -disable-keepalive http://localhost:8080

Summary:         
  Total:        11.0368 secs
  Slowest:      11.0364 secs
  Fastest:      10.0315 secs                
  Average:      10.3984 secs
  Requests/sec: 0.9967

  Total data:   132 bytes
  Size/request: 12 bytes

Response time histogram:
  10.031 [1]    |_______
  10.132 [6]    |________________________________________
  10.232 [0]    |  
  10.333 [0]    |
  10.433 [0]    |                   
  10.534 [0]    |                                      
  10.634 [0]    |                                    
  10.735 [0]    |                                    
  10.835 [0]    |                                       
  10.936 [0]    |                                    
  11.036 [4]    |___________________________

Latency distribution:
  10% in 10.0352 secs
  25% in 10.0354 secs
  50% in 10.0377 secs
  75% in 11.0364 secs
  90% in 11.0364 secs
  0% in 0.0000 secs
  0% in 0.0000 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.3663 secs, 10.0315 secs, 11.0364 secs
  DNS-lookup:   0.0002 secs, 0.0000 secs, 0.0004 secs
  req write:    0.0001 secs, 0.0000 secs, 0.0002 secs
  resp wait:    10.0314 secs, 10.0240 secs, 10.0369 secs
  resp read:    0.0005 secs, 0.0000 secs, 0.0025 secs

Status code distribution:
  [200] 11 responses
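All three totals fit a simple batching model: a fixed pool of `numthreads` workers serves `ceil(requests / numthreads)` sequential batches, each costing one 10-second sleep. This is only a back-of-the-envelope check; it ignores the ~1s of connection overhead visible in the hey output.

```python
import math

def expected_total(requests, numthreads, sleep_s=10):
    """Minimum total time for `requests` concurrent requests against a
    fixed pool of `numthreads` workers, each request sleeping sleep_s."""
    return math.ceil(requests / numthreads) * sleep_s

# 10 requests / 10 threads -> 10s (observed ~11s)
# 11 requests / 10 threads -> 20s (observed ~20s)
# 11 requests / 11 threads -> 10s (observed ~11s)
```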

varunbpatil commented 4 years ago

This has not been reported before; I am not sure why.

I have no clue why this hasn't been discovered before. Maybe it's only observable with long-running functions, and we don't usually need that much parallelism with such functions.