Thanks for reporting. You mentioned "We're also not sure why the issue has suddenly started happening as the issues didn't seem to coincide with a particular upgrade/code change.", which makes me think there was a change on the AWS infra side, but I don't see anything in the docs that jumps out as changed. I wonder if there was a change in the SDK (or one of its deps)?
If you reached out to AWS support, can you share their response here?
I'll do some digging as well.
Thanks for your reply!
We've discovered a significant problem in our Observability lambda layer that seems to slow down boot-up times due to instrumenting the aws-sdk. Initially, we thought this was a middy issue because it only occurred with middy, but we've now determined that it's related to aws-sdk being in the dependency tree. We're addressing this with our Observability partner and hope it resolves our problem. If others are also facing this issue, it could remain open, but for now, we're focusing on fixing our lambda layer to see if it resolves the problem. We can close this unless others are experiencing the same issue.
From what you described, that sounds like it could easily cause this issue. I'll close for now; if you need to reopen, please do so. Others are welcome to comment if they're also running into this.
Hello, I've hit the same error on Lambda@Edge recently. The setup looks like this:
import middy from '@middy/core'
import doNotWaitForEmptyEventLoop from '@middy/do-not-wait-for-empty-event-loop'
import ssm from '@middy/ssm'

export default middy(handler)
  .use(doNotWaitForEmptyEventLoop())
  .use(ssm({
    fetchData: {
      paramA: "path"
    },
    setToContext: true,
    awsClientOptions: {
      region: process.env.REGION || 'us-east-1'
    }
  }))
Environment
INFO InvalidSignatureException: Signature expired: 20231119T162823Z is now earlier than 20231119T163124Z (20231119T163624Z - 5 min.)
at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22)
at /var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:18:39
at de_GetParametersByPathCommandError (/var/runtime/node_modules/@aws-sdk/client-ssm/dist-cjs/protocols/Aws_json1_1.js:4242:20)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24
at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/awsAuthMiddleware.js:14:20
at async /var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/retryMiddleware.js:27:46
at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:7:26
at async Promise.allSettled (index 0)
at async to (/var/task/src/edgeGate/handler.js:63:43001) {
'$fault': 'client',
'$metadata': {
httpStatusCode: 400,
requestId: '55d1f065-3954-4531-9ffe-af7e2e2a8a29',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
__type: 'InvalidSignatureException'
}
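(Decoding the timestamps in that message: the request was signed at 16:28:23 UTC but evaluated at 16:36:24 UTC, and SigV4 only accepts signatures newer than the server time minus 5 minutes, i.e. 16:31:24, so a signature roughly 8 minutes old gets rejected.)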
It happens randomly to a small number of requests. I couldn't figure out a pattern yet.
We managed to fix our observability lambda layer issue, but we are still experiencing the middy issue. @willfarrell Can we please reopen this, since others in the thread above appear to be running into the same thing?
My theory on what is happening in our case:
After some digging, I think I have a theory.
How to fix: all middleware that fetch from AWS services will need to catch InvalidSignatureException and force a retry during the request. I'll have to think about how best to implement this; a rough sketch of the idea is below.
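Not the actual fix in the PR, just an illustration of the catch-and-retry idea, assuming the AWS SDK v3 SSM client; the helper name is made up:

const { SSMClient, GetParametersByPathCommand } = require('@aws-sdk/client-ssm');

// Hypothetical helper: if the signature has already expired by the time the
// request is evaluated, re-send it so it gets re-signed with a fresh timestamp.
const sendWithSignatureRetry = async (client, input, attempts = 2) => {
  try {
    return await client.send(new GetParametersByPathCommand(input));
  } catch (error) {
    if (error.name === 'InvalidSignatureException' && attempts > 1) {
      return sendWithSignatureRetry(client, input, attempts - 1);
    }
    throw error;
  }
};

// usage: await sendWithSignatureRetry(new SSMClient({}), { Path: '/my/app/' });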
Would love to hear if the above steps make sense to those running into this issue.
Ref: https://repost.aws/knowledge-center/lambda-sdk-signature
I pushed a PR; if someone could test it in a real environment, that would be great.
I'll try to copy-paste your changes (we are still on v4) and run it for a few days.
@HumbleBeck Any feedback on this?
Hi @willfarrell. While this bug rarely happens to us, I can confirm that the fix works, and it started recovering expired signature calls.
Awesome, I'll update the PR to cover all AWS service middleware (just in case) and merge in. Thanks a lot for testing it out.
Hi, if you are still on version 4, one workaround is overriding the retry strategy and passing it to middy:
const middy = require('@middy/core');
const ssm = require('@middy/ssm');
const { ConfiguredRetryStrategy } = require('@smithy/util-retry');

class ClockSkewRetryStrategy extends ConfiguredRetryStrategy {
  constructor(maxAttempts, computeNextBackoffDelay) {
    super(maxAttempts, computeNextBackoffDelay);
  }

  isRetryableError(errorType) {
    // Also treat client errors (e.g. InvalidSignatureException) as retryable,
    // in addition to the throttling/transient errors retried by default.
    return errorType === 'CLIENT_ERROR' || super.isRetryableError(errorType);
  }
}

...

middy()
  .use(
    ssm({
      ...
      awsClientOptions: {
        retryStrategy: new ClockSkewRetryStrategy(3, 500),
      },
      ...
    })
  )
  .before(async (request) => {
    ...
  });
Describe the bug
Since the end of October, we have seen our Lambda functions intermittently fail due to SSM parameters not being fetched. The error we are seeing looks like the following:
We have managed to bring our errors down by disabling prefetch and reducing the cacheExpiry - however, we would prefer to keep these options as they were. We're also not sure why the issue has suddenly started happening as the issues didn't seem to coincide with a particular upgrade/code change.
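For context, the mitigations described above look roughly like this with the @middy/ssm options (parameter path and values are illustrative, not our exact config):

const middy = require('@middy/core');
const ssm = require('@middy/ssm');

const baseHandler = async (event, context) => {
  // ... context.paramA is populated by the middleware
};

module.exports.handler = middy(baseHandler)
  .use(ssm({
    fetchData: { paramA: '/my/app/param' }, // illustrative parameter path
    setToContext: true,
    disablePrefetch: true,   // fetch lazily instead of at init time
    cacheExpiry: 60 * 1000   // re-fetch (and re-sign) more often; value in ms
  }));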
To Reproduce
We notice it only fails in around 1-2% of cases in our production environments.
Environments
Additional context
We've noticed this across a range of our services, with different versions of @middy/ssm.