slackapi / java-slack-sdk

Slack Developer Kit (including Bolt for Java) for any JVM language
https://slack.dev/java-slack-sdk/
MIT License

Certain team ids getting 401 HTTP response from event subscription URL while others work #1334

Closed jshao-brex closed 2 weeks ago

jshao-brex commented 2 months ago

Problem:

Our Slack app uses "Event Subscriptions" with the Slack Java SDK to listen for events triggered in the Slack app. While it works well for the majority of our customers, some customers have reported that the app is unresponsive for all of their users and shows the following error:

image

Troubleshooting done so far

Our logs show that our event subscription URL is not getting any events for these workspaces/Slack teams.

We opened a help request. One Slack team member checked one of the impacted workspaces, and told us that Slack events for this workspace were sent, but all of them got a 401 HTTP response.

We reviewed the Slack Java SDK source code and our own code. It seems the only place that could return a 401 response is the request verification middleware. So, we disabled the built-in request verification middleware in our app config and created a custom middleware with the same logic plus additional logging. However, we are not seeing any requests being rejected with a 401.
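For readers following along, a custom verifier with extra logging might look roughly like the sketch below. It follows Slack's documented request signing scheme (HMAC-SHA256 over `v0:{timestamp}:{body}`, compared against the `X-Slack-Signature` header, with a five-minute timestamp tolerance). The class name `LoggingSignatureVerifier` and its methods are hypothetical; this is not the SDK's built-in middleware, nor the code used in the app described above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.logging.Logger;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: verifies a Slack request signature and logs every rejection reason.
final class LoggingSignatureVerifier {
    private static final Logger LOG = Logger.getLogger(LoggingSignatureVerifier.class.getName());
    private static final long MAX_SKEW_SECONDS = 60 * 5; // Slack's documented replay window

    private final String signingSecret;

    LoggingSignatureVerifier(String signingSecret) {
        this.signingSecret = signingSecret;
    }

    // Computes "v0=" + hex(HMAC-SHA256(secret, "v0:" + timestamp + ":" + rawBody)).
    String sign(String timestamp, String rawBody) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(signingSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(("v0:" + timestamp + ":" + rawBody).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder("v0=");
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    // timestamp: X-Slack-Request-Timestamp header; signature: X-Slack-Signature header.
    boolean isValid(String timestamp, String rawBody, String signature, long nowEpochSeconds) {
        if (timestamp == null || signature == null) {
            LOG.warning("Rejected: missing timestamp or signature header");
            return false;
        }
        long ts;
        try {
            ts = Long.parseLong(timestamp);
        } catch (NumberFormatException e) {
            LOG.warning("Rejected: non-numeric timestamp " + timestamp);
            return false;
        }
        if (Math.abs(nowEpochSeconds - ts) > MAX_SKEW_SECONDS) {
            LOG.warning("Rejected: timestamp outside the allowed window: " + timestamp);
            return false;
        }
        String expected = sign(timestamp, rawBody);
        // Constant-time comparison to avoid leaking how many characters matched.
        boolean ok = MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
        if (!ok) {
            LOG.warning("Rejected: signature mismatch for request with timestamp " + timestamp);
        }
        return ok;
    }
}
```

A middleware would call `isValid` with the raw request body (before any JSON parsing) and return 401 only when it yields `false`, so every 401 from this layer leaves a log line behind.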

We also considered the possibility that the requests were rejected at the network level by our ingress before they reached our application code. However, this does not seem possible, since the problem consistently affects only a subset of workspaces/team IDs. Whatever component rejected the requests would have to understand the payload and know how to extract the team ID from it, and neither our network-level components nor the ingress inspects the payload of the Slack event HTTP request.

Our Slack app is not using socket mode.

Could you provide some guidance on the next step in troubleshooting this issue? Are there any other middleware/components that might reject the requests?

Reproducible in:

The Slack SDK version

1.38.2

Java Runtime version

openjdk version "11.0.22" 2024-01-16

OS info

N/A

misscoded commented 2 months ago

Hi @jshao-brex! I spoke with @seratch, the creator and primary maintainer of this library, to get his thoughts on the issue you've described and to ensure we got you the best answer out the gate. 🙂

One likely cause came to mind: a token resolution issue occurring within the authorize function. Code sites to do some additional debugging include here and here. Though it could be something else, one reason this might happen is a grid migration, where the app is still available in the original/migrated workspace, but the enterprise_id/team_id is no longer the same.

To avoid this happening in the future, he recommended:

  1. Periodically (i.e., daily) running an auth.test API call for all stored tokens and updating the metadata (such as team_id) as necessary
  2. Subscribing to grid migration events in the Events API and updating the metadata in the database once a migration has finished, if and when that occurs
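The first recommendation could be sketched as below. This is a minimal illustration, not the SDK's API: `Installation`, `AuthTestResult`, and `InstallationRefresher` are hypothetical types, and the `authTest` function is a stand-in for a real auth.test call made with each stored token. In a real app the changed records would then be written back to the installation store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical daily job: re-check each stored token and refresh stale workspace IDs.
final class InstallationRefresher {

    // Hypothetical stored installation record (one per installed workspace).
    static final class Installation {
        final String botToken;
        String enterpriseId; // null when the workspace is not part of an org
        String teamId;

        Installation(String botToken, String enterpriseId, String teamId) {
            this.botToken = botToken;
            this.enterpriseId = enterpriseId;
            this.teamId = teamId;
        }
    }

    // The subset of an auth.test response this sketch cares about.
    static final class AuthTestResult {
        final boolean ok;
        final String enterpriseId;
        final String teamId;

        AuthTestResult(boolean ok, String enterpriseId, String teamId) {
            this.ok = ok;
            this.enterpriseId = enterpriseId;
            this.teamId = teamId;
        }
    }

    // Updates records whose IDs no longer match what auth.test reports; returns the changed ones.
    static List<Installation> refresh(List<Installation> stored,
                                      Function<String, AuthTestResult> authTest) {
        List<Installation> changed = new ArrayList<>();
        for (Installation inst : stored) {
            AuthTestResult result = authTest.apply(inst.botToken);
            if (!result.ok) {
                continue; // revoked or invalid token: left for separate handling
            }
            boolean stale = !Objects.equals(inst.enterpriseId, result.enterpriseId)
                    || !Objects.equals(inst.teamId, result.teamId);
            if (stale) {
                inst.enterpriseId = result.enterpriseId;
                inst.teamId = result.teamId;
                changed.add(inst);
            }
        }
        return changed;
    }
}
```

Passing the API call in as a function keeps the comparison logic testable without network access. A workspace that has moved into an org since installation would show up in the returned list with its new enterprise_id.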

We hope this is helpful. Let us know if this ends up being the cause of the issue you're seeing!

jshao-brex commented 2 months ago

Thank you both @misscoded and @seratch!! These are very helpful. We are looking at these and will let you know how it goes.

github-actions[bot] commented 1 month ago

👋 It looks like this issue has been open for 30 days with no activity. We'll mark this as stale for now, and wait 10 days for an update or for further comment before closing this issue out. If you think this issue needs to be prioritized, please comment to get the thread going again! Maintainers also review issues marked as stale on a regular basis and comment or adjust status if the issue needs to be reprioritized.

github-actions[bot] commented 2 weeks ago

As this issue has been inactive for more than one month, we will be closing it. Thank you to all the participants! If you would like to raise a related issue, please create a new issue which includes your specific details and references this issue number.