TheNeikos opened this issue 4 years ago
This is not implemented yet, but it's a great idea. When I have some time I'll take a look and see whether there are any built-in features in the wasmtime engine that support this, or whether we'll have to code it ourselves.
In the same vein, I think it would be great to expose those stats as well, so that one can find out how much each actor is currently consuming.
I couldn't agree more. There's a bit of a hack for this where you could use the Prometheus middleware and a Grafana dashboard to watch the usage of each actor... but what we really should have is a nice, holistic way of tracking, and potentially limiting, the usage of actors and providers.
Thinking "out loud":
One approach might be to keep track of "actor time": an accumulation of the total number of milliseconds spent in execution by the actor since start time. By maintaining this value, we could expose it for query, and also return an `Err` when we attempt to invoke an actor that has exceeded some defined (env variable?) quota for actor time.
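A minimal sketch of what that accumulator could look like (all names here are illustrative, not waSCC's actual API):

```rust
use std::time::{Duration, Instant};

/// Sketch of the "actor time" idea: a running total of time spent in
/// execution, checked against an optional quota before each invocation.
struct ActorTime {
    consumed: Duration,
    quota: Option<Duration>, // None = unlimited
}

impl ActorTime {
    fn new(quota: Option<Duration>) -> Self {
        Self { consumed: Duration::ZERO, quota }
    }

    /// Err if the actor has already exhausted its quota.
    fn check(&self) -> Result<(), String> {
        match self.quota {
            Some(q) if self.consumed >= q => Err("actor time quota exceeded".into()),
            _ => Ok(()),
        }
    }

    /// Wrap an invocation, adding its elapsed time to the running total.
    fn invoke<T>(&mut self, f: impl FnOnce() -> T) -> Result<T, String> {
        self.check()?;
        let started = Instant::now();
        let out = f();
        self.consumed += started.elapsed();
        Ok(out)
    }
}

fn main() {
    let mut time = ActorTime::new(Some(Duration::from_millis(50)));
    assert_eq!(time.invoke(|| 2 + 2), Ok(4));
    // A zero quota is exhausted immediately.
    assert!(ActorTime::new(Some(Duration::ZERO)).invoke(|| 0).is_err());
}
```

Note this counts wall-clock milliseconds, matching the "no physical CPU accounting" constraint mentioned below.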
For completeness, if we maintained "actor time", we would also need a way, on `WasccHost`, to query the actor time for an individual actor, or maybe all actors in the host.
When declaring a quota for actor time, we could define a sliding window, where the actor can spend no more than *n* milliseconds in execution within a time period of *t* minutes. If *t* is 0, that would essentially be an absolute limit rather than a sliding-window limit and, once exceeded, the actor would be unusable.
Ancillary question: if an actor exceeds its actor-time budget under an absolute quota, should we just remove the actor from the host rather than continuing to return an `Err` for each invocation?
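The sliding-window idea could be sketched roughly like this (hypothetical names; a window of zero degenerates into the absolute limit described above):

```rust
use std::time::{Duration, Instant};

/// Hypothetical sliding-window quota: the actor may spend no more than
/// `budget` of execution time within any trailing `window`.
struct SlidingQuota {
    budget: Duration,
    window: Duration, // Duration::ZERO => absolute lifetime limit
    samples: Vec<(Instant, Duration)>,
    lifetime: Duration,
}

impl SlidingQuota {
    fn new(budget: Duration, window: Duration) -> Self {
        Self { budget, window, samples: Vec::new(), lifetime: Duration::ZERO }
    }

    /// Record time spent by one invocation.
    fn record(&mut self, spent: Duration) {
        self.lifetime += spent;
        self.samples.push((Instant::now(), spent));
    }

    /// Has the actor spent more than `budget` within the trailing `window`?
    fn over_budget(&mut self) -> bool {
        if self.window.is_zero() {
            // t == 0: absolute quota; once exceeded, the actor stays unusable.
            return self.lifetime > self.budget;
        }
        // Drop samples that have fallen out of the trailing window.
        if let Some(cutoff) = Instant::now().checked_sub(self.window) {
            self.samples.retain(|(at, _)| *at >= cutoff);
        }
        self.samples.iter().map(|(_, spent)| *spent).sum::<Duration>() > self.budget
    }
}

fn main() {
    // Absolute quota (window = 0) with a 10 ms budget.
    let mut quota = SlidingQuota::new(Duration::from_millis(10), Duration::ZERO);
    quota.record(Duration::from_millis(6));
    assert!(!quota.over_budget());
    quota.record(Duration::from_millis(6));
    assert!(quota.over_budget());
}
```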
NB: I'd like to avoid having to keep track of things like physical CPU usage, because that makes the host all the more platform/arch/OS-dependent (not to mention unreliable, depending on where the host is running), whereas counting milliseconds spent in execution can easily be done without pulling in dependency trees.
I don't think that removing the actor by default is a good idea. For example, if you are hosting an actor for which someone buys X amount of time, they can easily 'buy more', so to speak, by adding to the available time, so that the next time the actor is invoked it works again.
And since the error itself would be descriptive (I imagine something like `Error::NoRunningTimeLeft` or so), removing the actor from the host would always remain a possibility.
Excellent point. I hadn't considered the use case where an actor's execution quota could be modified live while actively deployed. I was thinking entirely from the microservices "deploy once and leave it until the next update" perspective.
As this feature gets implemented, then, I think there are some requirements:
- Quotas need to be settable and modifiable on a live `WasccHost` (the "modified while deployed" use case above).
- The same needs to work across a `lattice`.
- Actor time needs to be queryable through the `WasccHost` API and API calls on the lattice.
- Exceeding a quota needs to produce an `ExecutionQuotaExceeded` error, which will then allow the consumer to decide whether or not they want to remove the actor or, as you say, someone could add more quarters to the machine and keep the actor alive.

I will edit the subject of this issue to convert it from a question into an imperative to implement this functionality.
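A rough sketch of how a consumer might react to the proposed error (the error type and handler here are hypothetical, not waSCC's actual API):

```rust
/// Hypothetical error type; the variant name matches the discussion above.
#[derive(Debug)]
enum HostError {
    ExecutionQuotaExceeded { actor: String },
}

/// The host only reports the error; the consumer decides whether to
/// remove the actor or leave it in place for a quota top-up
/// ("add more quarters").
fn on_invoke_result(result: Result<Vec<u8>, HostError>, remove_on_exceed: bool) -> &'static str {
    match result {
        Ok(_) => "ok",
        Err(HostError::ExecutionQuotaExceeded { .. }) if remove_on_exceed => "remove actor from host",
        Err(HostError::ExecutionQuotaExceeded { .. }) => "keep actor, wait for quota top-up",
    }
}

fn main() {
    let err = || Err(HostError::ExecutionQuotaExceeded { actor: "example".into() });
    assert_eq!(on_invoke_result(err(), true), "remove actor from host");
    assert_eq!(on_invoke_result(err(), false), "keep actor, wait for quota top-up");
}
```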
Sounds great! One possible issue I can think of:
- How should actor time be counted for `cast` patterns, as enabled by #72?

I think a good mantra to ask as this gets implemented might be, "Would a FaaS bill for this time?" If a function wakes up and then makes a synchronous call to some other billable resource, then the function is accruing billable usage at the same time as the resource being consumed.
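One way to honor that mantra is a "billing clock" that pauses while the actor is blocked on something outside its control. A hypothetical sketch:

```rust
use std::time::{Duration, Instant};

/// Hypothetical billing clock: accrues time only while the actor's own
/// code runs, and is paused while the actor waits on an external
/// resource (e.g. a network call dispatched via `cast`).
struct BillingClock {
    billed: Duration,
    running_since: Option<Instant>,
}

impl BillingClock {
    fn new() -> Self {
        Self { billed: Duration::ZERO, running_since: None }
    }
    /// Actor code starts (or resumes) executing: start billing.
    fn resume(&mut self) {
        self.running_since = Some(Instant::now());
    }
    /// Actor blocks on an external resource: stop billing.
    fn pause(&mut self) {
        if let Some(started) = self.running_since.take() {
            self.billed += started.elapsed();
        }
    }
    fn billed(&self) -> Duration {
        self.billed
    }
}

fn main() {
    let mut clock = BillingClock::new();
    clock.resume(); // actor executing: billed
    clock.pause();  // actor awaiting a provider: not billed
    clock.pause();  // a second pause is a no-op
    assert!(clock.billed() < Duration::from_secs(1));
}
```

Under this scheme, time spent inside a synchronous host call is still billed unless the host explicitly pauses the clock around it, which matches the FaaS analogy above.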
Ah, the `cast` part would solve my question perfectly, as in my case a network connection is being made and I don't want to 'punish' any actors for things outside of their direct control.
I'm willing to help realize these requests, btw. I'll send you an email soon so we can coordinate, if you wish.
Cheers!
Is setting a hard cap on CPU time/RAM a possibility right now? I've looked through the API and didn't find anything pertaining to it. Or is this just not yet implemented?