Managing context between invocations

mstn commented 4 years ago

Hi, congrats on this great project!

I do not know much about custom runtimes, sorry if this is a newbie question!

I was wondering how you manage the AWS Lambda Execution Context.

In Node.js, background processes (e.g. HTTP requests, timeouts) that had not completed when the function returned are resumed on hot starts. How does that work with this runtime?

Thanks!

NickSeagull commented 4 years ago

Hello! Thanks for pointing this out :)

About the Execution Context, we just don't manage it. This is probably the main reason behind the lag in performance in hot starts. See this benchmarking post.

In the case of the AWS SDK (required for all calls), its initialization happens on every call to the Lambda. This is suboptimal, but we haven't seen a significant performance hit so far, which is why we deprioritized it.

Still, if you find this to be an issue, I can guide you towards fixing it :D
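To make the difference concrete, here is a minimal sketch (not this runtime's actual code) that contrasts initializing a client on every invocation with initializing it once per container. `Client`, `initClient` and `handle` are hypothetical stand-ins, and a finite list of events replaces the real invocation loop.

```haskell
import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for an expensive client (e.g. an SDK environment).
newtype Client = Client Int

-- Pretend initialization; bumps a counter so we can see how often it runs.
initClient :: IORef Int -> IO Client
initClient counter = do
  modifyIORef' counter (+ 1)
  Client <$> readIORef counter

-- Hypothetical handler that needs a client.
handle :: Client -> String -> IO String
handle (Client n) event = pure (event ++ " handled by client #" ++ show n)

-- Initialization inside the loop: one init per invocation.
perInvocation :: [String] -> IO Int
perInvocation events = do
  counter <- newIORef 0
  forM_ events $ \e -> do
    client <- initClient counter
    handle client e
  readIORef counter

-- Initialization hoisted out of the loop: one init per container.
oncePerContainer :: [String] -> IO Int
oncePerContainer events = do
  counter <- newIORef 0
  client <- initClient counter
  forM_ events (handle client)
  readIORef counter

main :: IO ()
main = do
  let events = ["a", "b", "c"]
  perInvocation events >>= print     -- prints 3
  oncePerContainer events >>= print  -- prints 1
```

Both functions return the number of times `initClient` ran, so the two printed numbers show the cost that hoisting removes on hot starts.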

mstn commented 4 years ago

It could be my first Haskell project. If you can guide me, I'll be very happy! :)

Even if there is no big difference between cold and hot starts, you can have some benefits if you reuse db connections across invocations.

dnikolovv commented 4 years ago

@NickSeagull I would like to tackle this. Could you provide some guidance on adding this capability?

mstn commented 4 years ago

@NickSeagull @dnikolovv I am still interested btw! I would be happy if you can keep me in the loop. ;)

NickSeagull commented 4 years ago

Hey! Thanks for pinging me, completely missed it.

Yes, implementing this would be very nice. The context is initialized in this function. My biggest question would be what to store from the context, and what to initialize with each invocation.

Also, I think that we should add the amazonka package and make the handlers run in the AWST monad, with a typeclass or something similar that still allows plain IO handlers like we have today. Something like

class MonadAWS m where
  inAWS :: m a -> AWST r m a

instance MonadAWS IO where
  inAWS = liftIO

instance MonadAWS (AWST r) where
  inAWS = id

In this way we'd maintain the simplicity of having IO for quick tests, and still have the possibility of caching the AWS context between invocations, which is what made this runtime much slower in the benchmarks.
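For illustration, a compiling variant of that idea might look like the sketch below. It is an assumption-laden stand-in rather than the proposed API: `ReaderT Env IO` replaces amazonka's AWST so the example only needs `base` and `transformers`, and `Env`, `App`, `HandlerMonad`, `inApp` and `runHandler` are hypothetical names.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical stand-in for the cached AWS environment.
newtype Env = Env { envRegion :: String }

-- Stand-in for the AWS monad: a reader over the cached environment.
type App = ReaderT Env IO

-- Anything that can be embedded into App can be used as a handler monad.
class HandlerMonad m where
  inApp :: m a -> App a

-- Plain IO handlers are lifted; they simply ignore the environment.
instance HandlerMonad IO where
  inApp = liftIO

-- Handlers already written in App need no conversion.
instance HandlerMonad (ReaderT Env IO) where
  inApp = id

-- The runtime would build Env once and reuse it for every invocation.
runHandler :: HandlerMonad m => Env -> m a -> IO a
runHandler env h = runReaderT (inApp h) env

plainHandler :: String -> IO String
plainHandler name = pure ("hello " ++ name)

envHandler :: String -> App String
envHandler name = do
  region <- asks envRegion
  pure ("hello " ++ name ++ " from " ++ region)

main :: IO ()
main = do
  let env = Env "eu-west-1"
  runHandler env (plainHandler "a") >>= putStrLn  -- hello a
  runHandler env (envHandler "b") >>= putStrLn    -- hello b from eu-west-1
```

The class goes from the handler's monad into one fixed application monad, which keeps the IO instance a plain `liftIO` and the environment-aware instance a plain `id`.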

Feel free to comment your ideas, or even the steps that you might be thinking of when implementing this.

And absolutely, go ahead, this is a great feature, and I'd be very grateful if it becomes a reality 😄

dnikolovv commented 4 years ago

One thing I cannot wrap my head around: the AWS docs explicitly mention that the execution context is a good place to store a DB connection, but how would you do that?

I understand storing static assets in /tmp, and I guess we could implement that relatively quickly, but I am struggling to understand where we would put the DB connections (you can't serialize those, I think).

On adding amazonka, I've also thought about it, but didn't really see an imminent need so I ignored it.

NickSeagull commented 4 years ago

Well, actually the AWS runtime host is a container running a forever loop that gets frozen after each invocation. In languages like JS, you can do context.foobar = quux and that object will be kept in memory while the Lambda is "hot".

On a cold start, the context object is lost and reinitialized. What I'm thinking is that perhaps the context should be an IORef?
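A minimal sketch of that IORef idea, under loud assumptions: `Connection`, `Context`, `openConnection`, `handler` and `runtimeLoop` are hypothetical names, the connection is faked, and a finite list of events stands in for the real forever loop. The point is only that the connection lives in process memory, so nothing has to be serialized.

```haskell
import Control.Monad (forM)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical non-serializable resource, e.g. a DB connection handle.
data Connection = Connection

-- Context that survives across invocations while the container is hot.
data Context = Context
  { ctxConnection  :: Maybe Connection
  , ctxInvocations :: Int
  }

-- Pretend to open a connection (a real one would do network IO).
openConnection :: IO Connection
openConnection = pure Connection

-- The handler reads the shared context, reuses or creates the
-- connection, and writes the updated context back.
handler :: IORef Context -> String -> IO String
handler ref event = do
  ctx <- readIORef ref
  (conn, status) <- case ctxConnection ctx of
    Just c  -> pure (c, "reused")
    Nothing -> do
      c <- openConnection
      pure (c, "opened")
  let n = ctxInvocations ctx + 1
  writeIORef ref (ctx { ctxConnection = Just conn, ctxInvocations = n })
  pure (event ++ ": " ++ status ++ " connection (invocation " ++ show n ++ ")")

-- The IORef is created once per container (cold start) and shared by
-- every invocation handled by that process.
runtimeLoop :: [String] -> IO [String]
runtimeLoop events = do
  ref <- newIORef (Context Nothing 0)
  forM events (handler ref)

main :: IO ()
main = runtimeLoop ["a", "b"] >>= mapM_ putStrLn
-- a: opened connection (invocation 1)
-- b: reused connection (invocation 2)
```

A new container would run `newIORef` again, which matches the cold-start behaviour described above: the context is lost and reinitialized.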

dnikolovv commented 4 years ago

I'll be experimenting with that a bit.

dnikolovv commented 4 years ago

Well, I've basically implemented it. It needs more testing and some cleanup, but I can confirm that it works (or at least it appears to).

~~You can check it out here - https://github.com/dnikolovv/aws-lambda-haskell-runtime/tree/persistent-execution-context~~

I've submitted a PR - #72. It needs documentation and perhaps some example projects.

[Attached image: Peek 2020-06-15 17-42]

theam / aws-lambda-haskell-runtime

Managing context between invocations #60