lrworth commented 4 years ago

The original description of this PR related to handling JSON payloads, but as of 3.0.0 this change was not needed; however I would still like to be able to call runLambda without using generateLambdaDispatcher, and the only thing stopping this is that LambdaError and ToLambdaResponseBody are not exported.

This PR exports those types.

Old description

See https://github.com/theam/aws-lambda-haskell-runtime/issues/77

This PR is a draft as it is primarily for discussion.

In order to handle non-JSON payloads to the lambda, this PR:

adds RawApiGatewayRequest and RawApiGatewayResponse which are newtype wrappers over ApiGatewayRequest and ApiGatewayResponse but with FromJSON instances that don't parse the body;
publicly exposes the types needed to write a custom main routine so that we can avoid the JSON decoding baked into generateLambdaDispatcher; and
removes base64 support as it is not handled correctly.

The second point I would like to discuss further. I did this by necessity to get my project working, but now that I have it I don't want to go back to using generateLambdaDispatcher even if it did support non-JSON payloads, because of the following benefits:

It is much simpler to relate my code to the AWS documentation, because there's a main I can point to and understand.
An explicit main allows for initialisation setup, like reading environment variables and setting up amazonka, before the main loop commences.
We can write higher-level wrappers for runLambda with more specific types, such as runSqsLambda :: AWS.Env -> (SqsPayload -> ExceptT String AWS.AWS String) -> IO () which removes any doubt about what type the handler function has to be.

Moreover, with the addition of abstractions like runSqsLambda the number of keystrokes saved by using generateLambdaDispatcher is negligible at best; compare

main = do
  awsEnv <- AWS.newEnv AWS.Discover
  runSqsLambda awsEnv myHandler

versus

generateLambdaDispatcher StandaloneLambda defaultDispatcherOptions
-- and in another file
handler = do
  awsEnv <- AWS.newEnv AWS.Discover -- happens on every execution
  withSqsPayload awsEnv myHandler

(I also added a shell.nix so I could develop this library - take it or leave it 😊)

dnikolovv commented 4 years ago

Since version 3.0.0, support for custom initialization has been added using the initializeContext function inside Main.hs. It runs on cold starts only and preserves your custom context between lambda calls (in an IORef, so you can also mutate it). You can see how that's been done in the example project.

One thing I don't like about this Text approach is that you don't get any type safety.

In my opinion, this feature is a great use case for the DispatcherOptions. We could have some ContentType field based on which we could force the request body type to implement a specific typeclass. You're forced to have a FromJSON now, we could just as easily generate code that requires you to have FromFormURLEncoded (or whatever).

About having a custom main, it'd be nice to have as some very advanced opt-int feature, but since the introduction of initializeContext, I can't think of any use for it. (I agree that generateLambdaDispatcher is a bit magical, though)

Edit: You could've avoided the Raw.. stuff altogether. If you just use ApiGatewayRequest Text, the runtime is going to skip parsing anyhow. There are overlapping FromJSON instances for Text and String which do exactly what you did with the Raw wrappers.

endgame commented 4 years ago

There are overlapping FromJSON instances for Text and String which do exactly what you did with the Raw wrappers.

You mean these ones? We didn't realise that they existed, and personally I think they're a bit spooky and magical. It might be better to do some kind of content-type dispatching instead?

lrworth commented 4 years ago

@dnikolovv I see these things were added just after we started development — great minds think alike. In that case, I believe all that is missing from aws-lambda-haskell-runtime to enable custom main functions is these 2 exports.

dnikolovv commented 4 years ago

@endgame Yeah, I thought that it's a bit shady when adding those, but it seemed to make perfect sense that someone wanting the raw request body would just do ApiGatewayRequest Text or ApiGatewayRequest String.

Anyhow, as @lrworth mentioned, those are very new and it's likely you didn't have them at the time of investigating the issue.

About the content dispatching, this is exactly what I meant by having it in the DispatcherOptions. The overlapping instances must stay anyways, because otherwise we lose support for ApiGatewayRequest Text when using JSON, which is even weirder imo.

About the main thing, @NickSeagull is the one that takes the decisions. The only concern I have about it is that people will use it to implement interesting features instead of submitting an issue and getting them into the runtime itself, but we can't have the best of all worlds :(

endgame commented 4 years ago

The only concern I have about it is that people will use it to implement interesting features instead of submitting an issue and getting them into the runtime itself

What if you exposed thing in a public .Internal module, with a haddock of "if you need to use this, please talk to us because we'd like the public API to support your usecase"?

dnikolovv commented 4 years ago

The only concern I have about it is that people will use it to implement interesting features instead of submitting an issue and getting them into the runtime itself

What if you exposed thing in a public .Internal module, with a haddock of "if you need to use this, please talk to us because we'd like the public API to support your usecase"?

Yeah I guess that'd be the best thing to do.

lrworth commented 4 years ago

I've generated a few thoughts via sleep and a shower.

About a month ago AWS SAM gained the ability to build and deploy [Haskell] projects with this command:

sam build && sam deploy

only requiring you to write a Makefile containing a build target for each AWS::Serverless::Function you wish to deploy. In this configuration, each Serverless Function is built independently, and if you choose a separate Main module per Function, the Handler value becomes redundant. Using generateLambdaDispatcher introduces overhead of aligning the Handler name in template.yaml with a source file name, and of validating the Handler name at runtime.

generateLambdaDispatcher is inherently non-composable. If someone needs a slightly different use-case from this function, they have no choice but to fork aws-lambda-haskell-runtime and add their additional functionality. I get where you're coming from @dnikolovv regarding people submitting features back — but I think forcing use of generateLambdaDispatcher makes it harder to contribute features back. For example, I've written a function providing everything you need to write a Lambda taking events from SQS, and once I've proven that it works in production I am happy to submit a PR back to this repo containing the additional function; I don't know how I'd achieve this by modifying generateLambdaDispatcher. (Perhaps by defining handler = mySQSWrapper realHandler? But it's hard to predict what type signature you'll get...)

I suppose in my ideal world, aws-lambda-haskell-runtime would provide combinators for building your own dispatcher, as well as a growing collection of commonly-needed dispatchers (such as one for SQS, one for API Gateway Proxying, one for Alexa Skills, one for DynamoDB events, etc). None of this precludes generateLambdaDispatcher but it may become redundant over time.

(I know text can come across as sterile and harsh but I do really appreciate the work you guys have put into this package already; I want as much as you to see this package flourish and become the go-to for dead-simple Haskell deployments to Lambda.)

lrworth commented 4 years ago

@dnikolovv I've updated this PR to allow users to call runLambda and nothing else.

dnikolovv commented 4 years ago

@lrworth I'm all in for a custom main. The concerns I have are superficial.

@NickSeagull can merge this, I'm just a contributor.

NickSeagull commented 4 years ago

First of all, thanks everyone for taking part in the discussion. Things like this make the project grow in a good way 😃

Regarding the generateLambdaDispatcher, this approach was taken so one could deploy full pledged apps without writing the same handling code again and again, and generating a project each time. In a typical serverless architecture, you'd have many different functions that call each other.

As an example, think that you are making an app that consumes a huge JSON from some legacy service that essentially is an array of customers that you want to register in your new system and finally notify them.

First, you got a lambda that consumes this JSON of, say 1k of customers. This "entry point" lambda will split the data into chunks, and invoke another lambda (the processor one) for each of those.
Each of the processor lambdas will receive the customer data, and process it somehow, by storing them into a DB or whatever. In the end, it will post a message to SQS
Last lambda is invoked as a reaction to the SQS messages, and it notifies the user.

If we were to use runLambda in this scenario, we'd have first to create a project for each of the lambdas (only 3 for this example, but the number quickly grows) and then write the same boilerplate code for each of them.

On the other hand, with the generateLambdaDispatcher, we are able to just write handlers in modules and let AWS call the appropriate one.

Of course, this approach is not perfect, and the minimum would be that amazonka is initialized and one can use it from within the handlers with no overhead, and the input parameters are not that restricted with a FromJSON instance.

Let's do one thing, what do you think of renaming runLambda to unsafeRunLambda to tell the users that they are on their own here, and work towards some combinators like you described? Could you introduce the change in this PR please?

lrworth commented 4 years ago

Thanks for your response @NickSeagull 😄

There is no need to create a new project for each Lambda, only a separate executable target in the cabal file, and with common stanzas this becomes quite manageable (about 3 lines per target). In fact what you described (3 interacting lambdas connected with SQS) is exactly what we are doing and this design works well so far.

In any case, there's probably a better solution for us than just using runLambda (or unsafeRunLambda) as-is, so we'll keep experimenting in our fork until we have something worth contributing back. I'll close this for now.

NickSeagull commented 4 years ago

There is no need to create a new project for each Lambda, only a separate executable target in the cabal file, and with common stanzas this becomes quite manageable (about 3 lines per target). In fact what you described (3 interacting lambdas connected with SQS) is exactly what we are doing and this design works well so far.

But how do you do that, if the executable needs to be called bootstrap? If it is possible to do what you describe, I'd actually would be very happy to remove the Template Haskell dispatching, and fallback to the Cabal way, as it is easier to maintain, and with some documentation, easy to use aswell.

With this change it'd be very easy to implement the amazonka initialization, etc..

endgame commented 4 years ago

AWS' sam CLI can build runtime: provided targets by invoking a Makefile: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/building-custom-runtimes.html

Each function invokes a distinct target, and you build the relevant component and then copy it across to $(ARTIFACTS_DIR)/bootstrap (sam sets a different $ARTIFACTS_DIR for each function to build).

theam / aws-lambda-haskell-runtime

Enable users to call `runLambda` #78

Old description