slackapi / bolt-js

A framework to build Slack apps using JavaScript
https://tools.slack.dev/bolt-js/
MIT License

Lambda Custom Function triggered by workflow gets token_revoked error when API calls take longer than 3 seconds #2288

Open mihirkothari25 opened 1 day ago

mihirkothari25 commented 1 day ago

I created a sample custom function to demonstrate the issue we are seeing: https://github.com/mihirkothari25/bolt-js-getting-started-app. It is adapted from the sample starter app. I run the app locally with ngrok and serverless-offline and point the event subscription URL in the Slack app to the ngrok URL. Even after the function responds immediately, if the next interaction with Slack takes more than 3 seconds, a token_revoked error is thrown. In other words, if an API call takes longer than 3 seconds after the function's first response, the next call to complete (or whichever Slack function is called next) fails with token_revoked.

In the real-world scenario:

  1. Slack user invokes the workflow.
  2. Workflow sends an event to the /slack/events endpoint configured in API Gateway.
  3. That invokes the Lambda function.
  4. The Lambda function starts execution, but the API calls take over 3 seconds.
  5. The Lambda function calls complete, which results in a token_revoked error.
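
The steps above can be sketched as a handler with the same argument shape a Bolt custom-function listener receives (inputs, complete, fail). This is a minimal sketch under stated assumptions, not the code from the linked repository: callBackendApi, the delayMs input, and the result output name are hypothetical, and the registration line in the comment assumes a callback ID of sample_function.

```javascript
// Hypothetical stand-in for the backend API call made in step 4.
// In the scenario described above this takes more than 3 seconds.
function callBackendApi(delayMs) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ status: 'done' }), delayMs);
  });
}

// Same argument shape as a Bolt custom-function listener.
// Assumed registration: app.function('sample_function', runWorkflowStep);
async function runWorkflowStep({ inputs, complete, fail }) {
  try {
    // Step 4: the slow work happens before anything is reported back.
    const result = await callBackendApi(inputs.delayMs);
    // Step 5: report success. In this report, this is the call that
    // comes back with token_revoked when step 4 was slow.
    await complete({ outputs: { result: result.status } });
  } catch (error) {
    // Report the failure to the workflow. Note that fail() is also an
    // API call, so it may reject for the same reason.
    await fail({ error: String(error.message || error) });
  }
}
```

Passing stub complete and fail functions makes it possible to exercise the handler locally without a Slack workspace.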

To mitigate this issue, we tried responding immediately when the Lambda function starts executing: as soon as it starts, it posts an ephemeral message back to the user. However, when the Lambda cold starts, that takes too long and the same error is encountered. When the Lambda is warm, the error is less frequent.

My sample app simulates the API call with a sleep timeout. It also simulates the AWS infrastructure by using serverless-offline.
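
A sleep-based simulation of this kind could look roughly like the following. It is a sketch only: sleep and measure are hypothetical helper names rather than functions from the sample repository, and the 3000 ms default reflects the threshold observed in this report, not a documented limit.

```javascript
// Resolve after `ms` milliseconds; stands in for a slow API call.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Run an async task and report how long it took, so the duration can be
// compared against the 3-second threshold observed in this report.
async function measure(task, thresholdMs = 3000) {
  const startedAt = Date.now();
  const value = await task();
  const elapsedMs = Date.now() - startedAt;
  return { value, elapsedMs, overThreshold: elapsedMs > thresholdMs };
}
```

Logging the result of measure(() => sleep(4000)) just before calling complete would show whether a failing run crossed the threshold.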

@slack/bolt version

3.22.0

Your App and Receiver Configuration

Link to Application code - https://github.com/mihirkothari25/bolt-js-getting-started-app/blob/main/app.js

Using the AwsLambdaReceiver

Node.js runtime version

v21.7.3

Steps to reproduce:

https://github.com/mihirkothari25/bolt-js-getting-started-app/tree/main - forked and adapted starter app. Run npm install to install dependencies.

  1. Run ngrok on port 3000: ngrok http 3000
  2. Run serverless-offline with the command npm run serverless
  3. Update the Slack app event subscription URL with the URL generated by the ngrok command in step 1.
  4. Run the workflow from Slack.
  5. The Slack app will send the event to the ngrok URL from step 1.
  6. Ngrok will tunnel that event to localhost:3000 where serverless is running.
  7. Serverless will simulate invoking the Lambda, execute the Lambda function, and respond to Slack.

Expected result:

Slack should respond initially that it is executing the request and then respond with the result once it is done.

Actual result:

If the sleep is longer than 3 seconds, the function ends with the token_revoked error. This is despite responding immediately from the function and then sleeping for longer than 3 seconds.

Is the expectation that the function should keep sending heartbeats to Slack while the underlying API calls finish?

These are the errors:

[ERROR]  bolt-app Error: An API error occurred: token_revoked
    at Jae (/var/task/index.js:13:32210)
    at t.apiCall (/var/task/index.js:15:6190)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /var/task/index.js:239:426

Unhandled Promise Rejection
{
    "errorType": "Runtime.UnhandledPromiseRejection",
    "errorMessage": "Error: An API error occurred: token_revoked",
    "reason":
    {
        "errorType": "Error",
        "errorMessage": "An API error occurred: token_revoked",
        "code": "slack_webapi_platform_error",
        "data":
        {
            "ok": false,
            "error": "token_revoked",
            "response_metadata":
            {}
        },
        "stack":
        [
            "Error: An API error occurred: token_revoked",
            "    at Jae (/var/task/index.js:13:32210)",
            "    at t.apiCall (/var/task/index.js:15:6190)",
            "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
        ]
    },
    "promise":
    {},
    "stack":
    [
        "Runtime.UnhandledPromiseRejection: Error: An API error occurred: token_revoked",
        "    at process.<anonymous> (file:///var/runtime/index.mjs:1276:17)",
        "    at process.emit (node:events:517:28)",
        "    at emit (node:internal/process/promises:149:20)",
        "    at processPromiseRejections (node:internal/process/promises:283:27)",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:96:32)"
    ]
}
[ERROR]  bolt-app Error: An API error occurred: token_revoked
    at Jae (/var/task/index.js:13:32210)
    at t.apiCall (/var/task/index.js:15:6190)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /var/task/index.js:239:426
    at async RM (/var/task/index.js:83:23240)
    at async Array.<anonymous> (/var/task/index.js:80:445651)
    at async Array.<anonymous> (/var/task/index.js:80:441542)
    at async z9.processEvent (/var/task/index.js:83:35901)
    at async /var/task/index.js:121:12324 {
  code: 'slack_webapi_platform_error',
  data: { ok: false, error: 'token_revoked', response_metadata: {} }
}
Error:{"code":"slack_webapi_platform_error","data":{"ok":false,"error":"token_revoked","response_metadata":{}}}
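
The log above includes a Runtime.UnhandledPromiseRejection, which suggests that the rejected API call was not caught. The sketch below shows one way to catch it and recognise the error shape from the log (code slack_webapi_platform_error, data.error token_revoked). It does not address why the token is revoked; it only keeps the rejection from escaping. The names isTokenRevoked and completeSafely are hypothetical helpers, not part of Bolt.

```javascript
// True when the error has the shape shown in the log above.
function isTokenRevoked(error) {
  return Boolean(
    error &&
      error.code === 'slack_webapi_platform_error' &&
      error.data &&
      error.data.error === 'token_revoked'
  );
}

// Await complete() and turn a rejection into a return value instead of
// letting it surface as an unhandled promise rejection.
async function completeSafely(complete, outputs, log = console.error) {
  try {
    await complete({ outputs });
    return { ok: true };
  } catch (error) {
    const tokenRevoked = isTokenRevoked(error);
    log(tokenRevoked ? 'complete() rejected: token_revoked' : 'complete() rejected', error);
    return { ok: false, tokenRevoked };
  }
}
```

The caller can then decide what to do with a tokenRevoked result, for example recording it together with how long the function ran.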

misscoded commented 7 hours ago

Hi @mihirkothari25! Thanks for raising this. A quick internal search reveals that this is not the first time we've heard about this issue from folks, but it's not immediately clear to me what the resolution is (if one even exists). I'll do a bit more digging and get back to you with more information (and hopefully a solution) next week.