lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0
320 stars, 105 forks

[Code Engine/Knative] Execution concurrency #448

Closed zhanggbj closed 4 years ago

zhanggbj commented 4 years ago

However, by default both benchmarks run with workers=100. As far as I can see, Knative accepted 100 req/s, so it only scaled up to around 100 pods.

image

image

I noticed that the max number of workers for Code Engine is 1200: https://github.com/lithops-cloud/lithops/blob/master/lithops/serverless/backends/code_engine/config.py#L28

and the default number of workers for Knative is 100: https://github.com/lithops-cloud/lithops/blob/master/lithops/serverless/backends/knative/config.py#L30

If workers=1000, we need the lithops client to send out 1000 requests concurrently.

lithops:
    #storage: ibm_cos
    storage_bucket: lithops-serving
    #executor: serverless
    #rabbitmq_monitor: <True/False>
    #data_cleaner: <True/False>
    workers: 1000 

From what I observed, the 1000 POST requests for the 1000 executions were sent out over 6m21s, with a maximum of 446 POSTs/s in parallel at the very beginning.

image

@gilv @JosepSampe Is all of this what you would expect? Thanks!

JosepSampe commented 4 years ago

@zhanggbj Yes, this is the expected behaviour in knative. By default, lithops uses a threadpool of 500 threads (max) to perform the invocations. At the same time, in knative invocations are synchronous, i.e., lithops has to wait for one invocation to finish before invoking another one. This is why you see a max level of concurrency at ~500 in the last plot. You can increase (or decrease) this number by using the invoke_pool_threads parameter in the map() call: https://github.com/lithops-cloud/lithops/blob/master/docs/api-details.md#executormap

It is important to note that this behaviour only happens with knative. In the rest of the available compute backends, invocations are asynchronous, so lithops would invoke the 1000 calls immediately without waiting for their completion.
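
To illustrate why a bounded pool of blocking calls caps the concurrency, here is a minimal sketch. It is not Lithops code: `peak_concurrency` and `blocking_invoke` are hypothetical names, and `time.sleep` is only a stand-in for a synchronous HTTP POST. It just shows that the number of calls in flight can never exceed the pool size, however many calls are queued.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(total_calls, pool_threads, call_seconds=0.05):
    """Run blocking calls through a bounded pool and return the peak number in flight."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def blocking_invoke(_):
        # Hypothetical stand-in for one synchronous invocation.
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(call_seconds)  # the thread is blocked until the "call" returns
        with lock:
            in_flight -= 1

    # The pool size plays the role of invoke_pool_threads in this sketch.
    with ThreadPoolExecutor(max_workers=pool_threads) as pool:
        list(pool.map(blocking_invoke, range(total_calls)))
    return peak


if __name__ == "__main__":
    # Same number of calls, two different pool sizes.
    print(peak_concurrency(total_calls=40, pool_threads=8))
    print(peak_concurrency(total_calls=40, pool_threads=40))
```

With an asynchronous backend the call would return as soon as the request is accepted, so each thread would be freed almost immediately and the pool size would matter much less.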

gilv commented 4 years ago

@zhanggbj @JosepSampe I guess it's resolved...closing this for now

zhanggbj commented 3 years ago

Hi @JosepSampe @gilv,

One more question about the threads: is there any way to configure the number of threads, raising it from 500 to 1000? I just got some time and would like to see the result for 1000 executions. The 500-thread cap is a limitation for us (Code Engine or Knative serving).

If there is no config option for now, I will need to hack it: https://github.com/lithops-cloud/lithops/blob/cdb692d99b4bfbfcf161d0c92640ced2dedd6ee8/lithops/executors.py#L206

Is it possible to add a configuration option? I think lithops should not impose such a limitation if users want to evaluate 1000 executions.

JosepSampe commented 3 years ago

@zhanggbj Yes, we can add this value as a parameter in the configuration file. Meanwhile, you can test different values of invoke_pool_threads by simply setting it in the map() call (no need to modify the lithops source code):

import lithops

fexec = lithops.FunctionExecutor()
fexec.map(my_map_function, iterdata, invoke_pool_threads=1000)

zhanggbj commented 3 years ago

@JosepSampe Thanks for your reply. I've opened a new issue: https://github.com/lithops-cloud/lithops/issues/521.