lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0

Restrict the number of live actions on a single PyWren execution #160

Closed omerb01 closed 5 years ago

omerb01 commented 5 years ago

Something I noticed that should be handled inside PyWren, also following Lachlan's suggestion: https://github.com/metaspace2020/pywren-annotation-pipeline/pull/29

Correct me if I'm wrong, but it currently seems that CF can't handle more than about 1200 actions in parallel, so if we exceed that, some invocations fail. To prevent this, we could maintain a container that holds pending invocations and make sure that no more than X actions are live at any time. We could also expose X as a configurable parameter.

What do you think about this suggestion?

JosepSampe commented 5 years ago

@omerb01 Are you referring to a hard limit? I mean, if I put something like invocation_limit=1000 in the configuration and then try to spawn 1001 functions, would PyWren raise an error and spawn no invocations at all?
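To make the hard-limit reading concrete, here is a minimal sketch of the all-or-nothing check described above. It is an illustration only, not PyWren code: `check_hard_limit` and `InvocationLimitExceeded` are hypothetical names, and `invocation_limit` is the example configuration key from this comment, not an existing option.

```python
class InvocationLimitExceeded(Exception):
    """Hypothetical error raised when a job asks for too many invocations."""


def check_hard_limit(num_invocations, invocation_limit=1000):
    """Reject the whole job up front if it exceeds the configured limit.

    Nothing is spawned when the check fails: this is the all-or-nothing
    behaviour being asked about, not a queueing strategy.
    """
    if num_invocations > invocation_limit:
        raise InvocationLimitExceeded(
            f"requested {num_invocations} invocations, "
            f"limit is {invocation_limit}"
        )


check_hard_limit(1000)      # exactly at the limit: allowed
try:
    check_hard_limit(1001)  # one over the limit: the whole job is rejected
except InvocationLimitExceeded as exc:
    print(exc)
```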

omerb01 commented 5 years ago

@JosepSampe No, I don't think that's a good approach, because as a PyWren user I can already make sure myself that no more than X invocations are spawned, regardless of any limit. I think we need logic similar to a thread pool: once the maximum number of live actions has been reached, further invocations are held back before being sent to CF, and whenever a single action finishes its job, we spawn a new one by taking one invocation out of the waiting queue.
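A rough sketch of the thread-pool idea, assuming each invocation can be represented by a blocking callable. `run_throttled`, `invoke`, and `max_live` are hypothetical names for illustration and are not part of PyWren. The standard library's `ThreadPoolExecutor` already behaves this way: at most `max_workers` calls run at once, and the rest wait in its internal queue until a worker frees up.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_throttled(invoke, payloads, max_live=1000):
    """Call invoke(payload) for every payload with at most max_live in flight.

    Returns the results in input order plus the peak number of
    simultaneously live calls, so the cap can be verified.
    """
    lock = threading.Lock()
    live = 0
    peak = 0

    def tracked(payload):
        nonlocal live, peak
        with lock:
            live += 1
            peak = max(peak, live)
        try:
            # Stand-in for "send to CF and wait for the action to finish".
            return invoke(payload)
        finally:
            with lock:
                live -= 1

    # Invocations beyond max_live wait in the executor's queue and start
    # only when a running one completes.
    with ThreadPoolExecutor(max_workers=max_live) as pool:
        results = list(pool.map(tracked, payloads))
    return results, peak


results, peak = run_throttled(lambda x: x * 2, range(20), max_live=4)
print(len(results), peak <= 4)
```

One thread per live action is only reasonable for a sketch. A real implementation would more likely track activation IDs asynchronously and refill the pool as completions are observed.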

gilv commented 5 years ago

@omerb01 I'm actually not sure what this is about. A user can submit any number of invocations. In PyWren we have a retry mechanism for invocations that failed to run. In the near future we will implement a queue to hold invocation details.

omerb01 commented 5 years ago

@gilv You can submit any number of invocations, but as you know, CF can handle only ~1200 actions in parallel. The queue you are talking about is exactly what I suggested :)

gilv commented 5 years ago

@omerb01 Thanks. I'm closing this one, as we are handling it as part of CloudButton.