vmware-archive / kubeless

Kubernetes Native Serverless Framework
https://kubeless.io
Apache License 2.0

Python runtime global interpreter lock implications #435

Open murali-reddy opened 6 years ago

murali-reddy commented 6 years ago

The current use of Bottle in the Python runtime's HTTP event triggers means parallel requests are serialized. Bottle with the standard wsgiref server cannot handle concurrent requests due to the global interpreter lock.

Any multithreading approach will also run into the same limitation: at any given time a Python app will use only one core. Python's multiprocessing library (https://docs.python.org/2/library/multiprocessing.html) is better suited to overcoming the GIL's limitations.
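For example, here is a minimal sketch (illustrative only, not kubeless code; the worker and task counts are arbitrary) of how multiprocessing sidesteps the GIL for CPU-bound work:

```python
# Illustrative sketch: a process pool runs CPU-bound tasks in parallel,
# since each worker process has its own interpreter and its own GIL.
# A thread pool running the same tasks would be serialized by the GIL.
import time
from multiprocessing import Pool

def cpu_bound(n):
    # Busy loop standing in for a CPU-intensive function body.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    start = time.time()
    with Pool(processes=4) as pool:
        # Four tasks run on up to four cores simultaneously.
        results = pool.map(cpu_bound, [10_000_000] * 4)
    print("took %.2fs" % (time.time() - start))
```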

Please see https://aws.amazon.com/blogs/compute/parallel-processing-in-python-with-aws-lambda/ as well for a discussion of similar limitations.

I am opening this issue to consider a design for the Python runtime that can achieve true parallel execution of Python functions.

anguslees commented 6 years ago

It can handle concurrent requests; it just can't use more than one core's worth of CPU. In particular, if a Function is making network calls to some other service, then python will happily work on other requests while waiting for the response.
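To illustrate (a minimal sketch, not the runtime's actual server setup): a threaded WSGI server interleaves I/O-bound requests even under the GIL, because the GIL is released whenever a thread blocks on I/O.

```python
# Illustrative sketch: a threaded WSGI server. Ten concurrent requests
# that each block on I/O for one second finish in about one second of
# wall-clock time, not ten, because the GIL is released during the wait.
import time
from socketserver import ThreadingMixIn
from wsgiref.simple_server import make_server, WSGIServer

class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    daemon_threads = True

def app(environ, start_response):
    time.sleep(1)  # stands in for a call to some downstream service
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done\n"]

make_server("", 8080, app, server_class=ThreadingWSGIServer).serve_forever()
```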

Also note that a naive multiprocessing setup will result in all request/response bodies being serialised/deserialised twice: http <-> server <-> multiprocessing worker. A more efficient multi-process model (imo) for python would be to pre-fork and have the separate processes fight over a single socket in accept().
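For reference, a minimal sketch of that pre-fork pattern (illustrative only; the port and worker count are arbitrary, and it is Unix-only since it uses fork):

```python
# Illustrative pre-fork sketch: the parent binds and listens once, then
# forks workers that all block in accept() on the shared socket. The
# kernel hands each connection to exactly one worker, so request and
# response bodies never cross a process boundary a second time.
import os
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", 8080))
sock.listen(128)

for _ in range(4):
    if os.fork() == 0:  # child: serve forever on the inherited socket
        while True:
            conn, _addr = sock.accept()  # workers contend here
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
            conn.close()

for _ in range(4):  # parent: just reap the workers
    os.wait()
```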

An alternative is just to allow kubernetes (HPA) to start multiple processes via multiple container instances if required. Once they're separate processes, there's very little reason to try to keep them in the same actual container instance. Fwiw, I changed to a multi-process bottle backend at some point, and then went back to single process once I looked at it from this pov.

Related to this, and just fyi: A bunch of the standard python prometheus metrics (that are exported currently) assume a single python process (memory usage, etc).
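For what it's worth, prometheus_client does ship a multiprocess mode for this situation; here is a hedged sketch (the env var spelling and details vary by client version, and process-local gauges like memory usage still need per-process handling):

```python
# Hedged sketch of prometheus_client's multiprocess mode (details vary by
# version; older releases spell the env var prometheus_multiproc_dir).
# Workers write metric files into a shared directory, and the collector
# aggregates them at scrape time instead of assuming a single process.
import os

os.makedirs("/tmp/metrics", exist_ok=True)
os.environ["PROMETHEUS_MULTIPROC_DIR"] = "/tmp/metrics"  # set before use

from prometheus_client import CollectorRegistry, generate_latest, multiprocess

registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)  # reads every worker's files
print(generate_latest(registry).decode())     # aggregated scrape output
```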

murali-reddy commented 6 years ago

> It can handle concurrent requests; it just can't use more than one core's worth of CPU. In particular, if a Function is making network calls to some other service, then python will happily work on other requests while waiting for the response.

Yes, there is concurrency, but no parallelism. With the GIL, even in a multithreaded Python program only one thread is running at any time, even on a multi-core processor. That makes CPU-intensive multithreading a bad fit, and for I/O-intensive apps much more lightweight approaches such as gevent are typically used.
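For example, a minimal gevent sketch (illustrative only, not kubeless code) that multiplexes many I/O-bound handlers on a single OS thread:

```python
# Illustrative gevent sketch: 100 handlers that each "wait on I/O" for one
# second complete in about one second total on a single OS thread, because
# monkey-patched blocking calls yield to other greenlets instead of blocking.
from gevent import monkey
monkey.patch_all()  # make blocking stdlib I/O (sockets, sleep) cooperative

import time
import gevent

def handler(i):
    time.sleep(1)  # patched: stands in for a blocking network call
    return i

start = time.time()
gevent.joinall([gevent.spawn(handler, i) for i in range(100)])
print("took %.2fs" % (time.time() - start))
```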

> A more efficient multi-process model (imo) for python would be to pre-fork and have the separate processes fight over a single socket in accept().

Agree.

The point of this issue is to be aware of the GIL when designing anything for the Python runtime. We have options other than the multiprocessing module, such as the solution you mentioned.

The nature of the workloads the functions run (CPU- or I/O-intensive) and their latency requirements also play a role. We can discuss and figure out the best approach.