middyjs / middy

🛵 The stylish Node.js middleware engine for AWS Lambda 🛵
https://middy.js.org
MIT License
3.73k stars 377 forks source link

Consider adding a `finally` phase. #622

Closed nfantone closed 2 years ago

nfantone commented 3 years ago

I've been using middy and writing custom middlewares for it for some time now and I always find myself facing this problem.

The "onion-like" scheme that middy follows when invoking middlewares makes it so before and after hooks are called in inverted order. However, when an error is thrown, all after callbacks are ignored and onError hooks are chained using the original declaration order. This would be fine if you consider your onError flow to be completely isolated from your after chain - but, in my experience, most of the time, this is not the case. Logic like body serialization, adding custom headers, normalizing the response, etc. you'd still need to run regardless of whether an exception was raised or not. Getting that to work with middy and its current API is not straightforward at all. If you have ever written a middleware that supports both after and onError phases, you have probably struggled with this by now.

Here's a real life example. Consider the following middlewares:

On a successful lambda invocation (i.e.: no errors are thrown), the expected middleware execution order should be: normalizeResponse -> json (error is not called). Which leads to the following setup:

// No errors thrown, reversed order: normalizeResponse -> json ✔️ 
// Error is thrown, normal order: error -> json -> normalizeResponse ✖️ 
middy(handler)
  .use(error())
  .use(json())
  .use(normalizeResponse());

In case of an error, the correct order should now be error -> normalizeResponse -> json. But because json and normalizeRespose calls are swapped, this is incompatible with the declaration shown above.

AFAIK, the only way to honour the middleware pattern and work around this situation is to create different middlewares that do the same thing on different phases and .use them in different orders.

// No errors thrown, reversed order: normalizeResponse -> json ✔️ 
// Error is thrown, normal order: error -> normalizeErrorResponse -> jsonError ✔️
middy(handler)
  .use(error()) // defines `onError`
  .use(normalizeErrorResponse()) // defines `onError`
  .use(jsonError()) // defines `onError`
  .use(json()) // defines `after`
  .use(normalizeResponse()); // defines `after`

This completely defeats the purpose of having reusable logic encapsulated in middleware as units and quickly multiplies the number of middlewares needed to accomplish the same thing.

With all of the above said, I feel like middy would greatly benefit from having an extra phase that would always execute last, regardless of whether an error was thrown or not. This would clearly emulate the classic try/catch/finally control flow.

try {
  // before phase
  await handler();
  // after phase
} catch (err) {
  // onError phase
} finally {
  // cleanup phase
  // Either we swallow potential errors thrown here or we expose/emit a global error event
}

A setup like that could simplify building middlewares, help decoupling logic and encourage reusability. More often than not, when a middleware declares both after and onError hooks pointing to the same function, what it's really trying to achieve is to execute that piece of logic no matter what the exit conditions are. Unfortunately, middy turns that into an irreconcilable issue.

Moreover, many of the middlewares currently being published under @middy/*, such as @middy/http-cors and @middy/http-security-headers to mention a couple, suffer from the same symptoms outlined before. All of them could be remedied by moving their after and onError calls to a single cleanup.

Any thoughts?

willfarrell commented 3 years ago

Thank you for the detailed use case.

I ran into this years ago when I first started using middy. Proper ordering of ones' middlewares can resolve most after/onError confits. However when you have multiple middlewares that are needed in a certain order for encoding and formatting the responses fall apart as you described. I built https://github.com/willfarrell/middy-jsonapi to accomplish exactly this. It handles errors, stringifying and formatting in one middleware. I haven't run into any issues since. However, this approach may not make sense for all.

I thought about adding in finally for v2 but decided against it due to the reason you outlined above. Swallowing errors is unlikely something we'll introduce, and a global error event just adds unneeded complexity.

Have you tried inline middlewares? If not checkout the docs for more details. This might work:

 normalizeResponse = normalizeResponse()
 jsonToString = jsonToString()

 middy(handler)
  .after(normalizeResponse.after)
  .after(jsonToString.after)
  .use(errorMessage())
  .onError(jsonToString.after)
  .onError(normalizeResponse.after)
  .onError(handler => {
    return handler.response
  })
// after: jsonToString -> normalizeResponse
// onError: errorMessage ->  jsonToString -> normalizeResponse -> inlineMiddleware

This should give you the control you need. What do you think of this approach? or do we need to think of a way to group middlewares the need to be run in the same order for after and onError?

nfantone commented 3 years ago

Hi @willfarrell. Thank you for your response! Let me address your points inline.

Proper ordering of ones' middlewares can resolve most after/onError conflicts.

Most is the keyword here. Although I would argue it's not even that many. The example I presented in my OT cannot be resolved by reordering. Any time you have a middleware implementing both after and onError hooks expecting them to run on each request, you introduce a problem where ordering cannot be a solution.

I built willfarrell/middy-jsonapi to accomplish exactly this. It handles errors, stringifying and formatting in one middleware.

I was taking a peek at that (thanks for sharing, BTW). I thought of a couple of comments.

  1. IMHO, the coupling of different responsibilities in a single middleware kind of goes against the idea of the design pattern and re-introduces the type of complexity we are looking to avoid in the first place. You could write all of that middy-jsonapi straight in your handler and it would still work. You wouldn't need middlewares or middy, for that matter.
  2. If I'm not mistaken, I believe your implementation still suffers from the same issues. Let me point you to a concrete example. I noticed in the README.md that the jsonapi middleware is paired with others, including cors.
    .use(cors())
    .use(jsonapi({ response }))

Since jsonapi takes care of formatting the response and cors modifies that response, the order in which these operations happen is different depending on whether an error was thrown. In case of an error, cors would add headers first; in case of success, cors would do it afterwards. Which is, arguably, not what you need or want. You would require the response to be formatted and serialized properly, every time, only once, after all other middleware are done extending or manipulating the response. Otherwise, unless you take care of formatting/serializing/etc. on each middleware that handles the response, you'd have no way of ensuring that your response is correct before your lambda finishes the execution.

I thought about adding in finally for v2 but decided against it due to the reason you outlined above. Swallowing errors is unlikely something we'll introduce, and a global error event just adds unneeded complexity.

  1. "Swallowing errors is unlikely something we'll introduce". I worded that poorly - my bad. I meant users should take care of the "swallowing of errors". middy would just expect all finally/cleanup functions to not throw. This is exactly how the semantics of try/catch/finally work in the language. It's the developer's responsibility to ensure that the finally clause runs properly.
  2. "A global error event just adds unneeded complexity". How come? At its simplest form, all that it'd be required is to set a single optional, callback on the middy instance.
middy(handler)
  .use(cors())
  .use(httpError())
   // Naming could be improved - maybe onCleanupError?
  .onUncaughtError(err => log.error('Something unexpected happened', err));

That's it. I don't see the need for further complexity. Unless you were thinking of something else?

Have you tried inline middlewares? If not checkout the docs for more details.

Thanks for pointing this out. I have, yes. I frankly don't see them as an alternative to the predicaments I'm facing. Just take a hard look at your proposed solution. Would you honestly say that's easy to read? Or maintain? Or put together in the first place? IMO, it looks more like forcing an awkward solution to navigating around the difficulties of the API, rather than the intended way of working with middy.

Here's how the same would look by having support for a cleanup phase.

middy(handler)
 .use(normalizeResponse()) // implements `cleanup`
 .use(json()) // implements `cleanup`
 .use(error()); // implements `onError`

Much simpler, nicer looking, does exactly what it's supposed to do every time - and, more importantly, it's more resilient to future changes and addition/removal of other middlewares in the chain.

nfantone commented 3 years ago

@willfarrell I went ahead and implemented a PoC for this in #623. It was actually pretty straightforward. I wasn't very thorough about adding new tests or documentation as I wanted to run this by you first and hear your comments.

willfarrell commented 3 years ago

Thanks for the detailed response. All great points. My favourite quote:

It's the developer's responsibility to ensure that the finally clause runs properly.

Wouldn't that be nice.

I'll take a look at the PR and discuss with @lmammino if / when this is something we want to incorporate and what the cascading implications are throughout the codebase. I cannot guarantee it will be included in v2.

willfarrell commented 3 years ago

So, we had a good discussion on this and came up with some early options.

Change onError order

Create a breaking change by reversing the order of the onError stack, and maybe remove the return in the error handler middleware. We think this is the best option but will need some testing to ensure how big of a breaking change it will be. I can test after v2 is released next week. There will be a release/3.x branch then.

Add in a finally middleware stack

Allow middleware to define a finally handler. This adds complexity and requires extra care and testing by the developer.

Add useGroup

Add in a new function to middy that allows an array of middleware that would preserve the order for onError to match after. Feels more like a band aid and would add confusion to the docs.

Thanks for your patience on this. There will be more discussions for sure.

@lmammino let me know if I miss spoke on any of these.

nfantone commented 3 years ago

@willfarrell The first option sounds reasonable to me. And more inline with how middlewares should work within middy, intuitively. Although it wouldn’t allow the same kind of semantics a cleanup (a.k.a. finally) hook would provide, it'd also cover and solve my original issues.

andrew0 commented 3 years ago

I agree that it makes more sense to have the onError handlers execute in the same order as after. Is there a reason why the middleware handlers need to be separated by phase at all, though? Why not make middlewares just a function with a next parameter like Koa, and rely on regular control flow?

const errorHandlerMiddleware = async (handler, next) => {
  try {
    await next();
  } catch (e) {
    handler.response = {
      statusCode: e.statusCode || 500,
      body: e.message
    };
  }
};

const loggerMiddleware = async (handler, next) => {
  const startTime = performance.now();
  try {
    await next();
  } finally {
    const elapsed = performance.now() - startTime;
    console.log(`request took ${elapsed} ms`);
  }
};
nfantone commented 3 years ago

@andrew0 As a Koa member and contributor, I'd like that very much. However, it'd mean a restructuring of middy core - which has recently dropped their own next callbacks in v2.

nponeccop commented 3 years ago

it'd mean a restructuring of middy

Personally, I think that middy is still in its infancy, and 3.0 would break things anyway. We all could make a 3.0 alpha and try it in our own middy stacks, to see if it works in the wild or not. I use my own stack with very little of upstream middy due to #641 performance issues.

nfantone commented 3 years ago

I don't use any of the @middy/* middlewares myself, either. I always write my own versions of those, to be frank. In fact, the ajv validator is, IMHO, pretty useless considering AWS already supports the exact same set of features through API Gateway with minimal setup.

With regards to breaking middy for a 3.0.0-alpha, I'm all for it. I would love to see @middy/core made slimmer, more performant and support natural try { catch } flow. I could try putting together a quick PR as a first pass towards that. If nothing else, to gather ideas from the community.

nponeccop commented 3 years ago

the ajv validator is, IMHO, pretty useless considering AWS already supports the exact same set of features

Only in APIGatewayV1, http events in serverless parlance. Validation doesn't work for APIGatewayV2, httpApi in serverless parlance. httpApi doesn't support validation at all and ignores all the OpenAPI extensions except the integrations setup.

I always write my own versions of those, to be frank

I do it too. The reason for that is that the stock versions consume ~250 ms of cold start time to do things I don't need. With zero impact and zero useless features more people would be able to use stock @middy middlewares as is.

It would be nice if you could share the middlewares you use. Looking at real life middlewares could help develop better stock middlewares. I have some at https://github.com/nponeccop/serverless-openapi-templates/tree/master/validation-middy

nfantone commented 3 years ago

Only in APIGatewayV1, http events in serverless parlance. Validation doesn't work for APIGatewayV2, httpApi in serverless parlance. httpApi doesn't support validation at all and ignores all the OpenAPI extensions except the integrations setup.

True. But still, API Gateway v1 is widely used and the added benefit of not consuming lambda runtime when validating makes the option a no-brainer. Users on v1 would/should see no benefit from middy own validation middlewares.

With zero impact and zero useless features more people would be able to use stock @middy middlewares as is.

Would be nice to define and measure "zero impact". Obviously, that's just a gold standard or aspiration to strive for more than achievable goal. Something similar could be said about "useless features". What's useless to me might not be for others.

It would be nice if you could share the middlewares you use.

I don't have any public repos with examples, unfortunately. But here's a Gist of a yup event validation middleware that I often use (uses minimal ramda and some helpers as well).

nponeccop commented 3 years ago

Users on v1 would/should see no benefit from middy own validation middlewares.

Yes. As long as they set up the Body property in CloudFormation. Which is currently hard to do with Serverless. I'm working on getting that addressed: https://github.com/serverless/serverless/issues/8442

Also, some users of v1 might not be satisfied with the stock error messages, or need custom async validators which aren't portable and thus cannot be sensibly supported by AWS.

Would be nice to define and measure "zero impact".

It isn't very hard to quantify. We just need to agree on something arbitrary. :-)

I think as long as you can't do 2x better with handwritten code it's zero impact for many people. The problem is that current approach is 10x worse in serverless-offline. If it translates to a similar difference on AWS it's quite terrible. I think we can define zero impact as less than 10% difference on your monthly AWS bill, or something like that.

I don't have any public repos with examples

Yeah, a quick and dirty gist is fine.

nfantone commented 3 years ago

@nponeccop @andrew0 @willfarrell I took the liberty of opening #651 for us to start the conversation on how v3 should look like. Let me know what you think.

willfarrell commented 2 years ago

The reverse ordering of onError is now included in the v3 branch and will be release later this month! Closing for now.