discord / discord-api-docs

Official Discord API Documentation
https://discord.com/developers/docs/intro

Random misleading Unknown Interaction errors #5558

Open ImRodry opened 1 year ago

ImRodry commented 1 year ago

Description

I've seen this issue reported by many people, but so far no one has been able to gather enough information to reliably explain what's going on. An example can be seen at https://github.com/discordjs/discord.js/issues/7005.

In summary, every now and then, seemingly at random, a bot's reply to an interaction fails with an Unknown Interaction error when, in reality, the reply succeeded and was shown to the user (by reply I mean a regular reply, a deferred reply or an update). I know this because I've been investigating this issue on a bot I manage for around a week now, and I asked some of the users who were impacted by it.

In the following screenshots I'm logging the time it took me to reply, by subtracting the interaction's created_timestamp from the current timestamp, and then the time it took the bot to receive the error, by subtracting the timestamp taken just before the request was submitted from the timestamp at which the error was received. You can see that the reply is sent pretty fast and in time for Discord to accept it; however, the error comes 5 seconds later, indicating some sort of issue on Discord's end.

[image]

Of course I could be faking those numbers, but it would make no sense for me to do that, so I'm going to have to ask you to trust me on that. I later asked the user impacted by this issue what the bot had responded with, and they showed that the reply was indeed deferred, which means that the error was a false positive and everything worked fine on our end.
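For anyone who wants to collect the same two numbers, here is a minimal sketch of the measurement described above. It is not the code used for the screenshots: `timedReply` and its parameters are hypothetical names, `interaction` only needs a numeric `createdTimestamp` (which discord.js interactions expose), and `send` stands for whatever reply call is being measured.

```javascript
// Hypothetical helper, not part of discord.js.
// interaction: any object with a numeric `createdTimestamp` (ms since Unix epoch).
// send: a function that performs the reply and returns a promise.
// now: injectable clock, defaults to Date.now.
async function timedReply(interaction, send, now = Date.now) {
  const startedAt = now();
  // Age of the interaction at the moment the reply request is submitted.
  const ageAtSendMs = startedAt - interaction.createdTimestamp;
  try {
    const result = await send();
    return { ok: true, ageAtSendMs, roundTripMs: now() - startedAt, result };
  } catch (error) {
    // Time between submitting the request and receiving the error.
    return { ok: false, ageAtSendMs, roundTripMs: now() - startedAt, error };
  }
}
```

Called as `timedReply(interaction, () => interaction.deferReply())`, a small `ageAtSendMs` together with a `roundTripMs` of several seconds on failure would match the pattern reported here.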

Steps to Reproduce

There are no steps to consistently reproduce this issue as it only happens randomly. What I can tell is that the error comes when the API takes too long to send the response back but actually acknowledges and processes it.

Expected Behavior

The reply is sent correctly (happening) and a success message is returned

Current Behavior

The reply is sent correctly but an "Unknown Interaction" error is thrown

Screenshots/Videos

Can only attach what I've shown above already. [image] [image] ("Bot is thinking", but in Portuguese)

Client and System Information

discord.js v14.6.0 on Node v18.11.0 running on Debian 11 (bullseye)

AlecM33 commented 3 months ago

@marcustyphoon I still think there are probably legitimate explanations for the scenarios where this occurs, but I of course won't claim that for sure since everyone's situation is different. Since I had a 100% reliable way to reproduce the problem, I thought it would be useful to provide my explanation and more or less challenge those here, since so much of the info here is anecdotal and difficult to act on from discord's perspective. It does sound like your scenario is simple and somewhat consistent, so perhaps you could provide a minimally reproducible code example. That would probably help this gain traction in the event a maintainer checks in on this.

Just personally, whenever I've run into this it's had a client-side explanation. In any case - I'm not just looking to dismiss people's troubles. Rather I hoped to facilitate since this has been open for some time.

timotejroiko commented 2 months ago

I have been seeing this issue ever since I first implemented slash commands over a year ago. I had largely dismissed it as being caused by network lag and interactions that arrive too late, but now I'm convinced there is an actual issue going on, so I decided to investigate further. Here are my two cents.

My setup is as follows:

I have a website that receives interactions via webhook URL, hosted on a Hetzner VPS located in Ashburn, US, running nginx 1.25.4 with an upstream proxy to Node.js 22.4, which runs my own custom code.

Here is an example of the timings I observe multiple times per day:

Sample 1
interaction ID: 1260125566783193129
snowflake timestamp: 2024-07-09T06:49:07.122Z
timeline:

  1. ? - received by nginx (logs at the end of the request)
  2. 2024-07-09T06:49:07.173Z - received by node, defer response sent back to nginx
  3. 09/Jul/2024:06:49:07 - nginx logs request completed: 09/Jul/2024:06:49:07 +0000 client=35.196.132.85 host=redacted path=/ request=POST / HTTP/1.1 status=200 request_length=2031 bytes_sent=183 body_bytes_sent=20 user_agent=Discord-Interactions/1.0 (+https://discord.com) upstream_status=200 request_time=0.001 upstream_response_time=0.001 upstream_connect_time=0.000 upstream_header_time=0.001
  4. 2024-07-09T06:49:07.173Z - command code runs
  5. 2024-07-09T06:49:07.729Z - command code ends and response initiates
  6. 2024-07-09T06:49:07.729Z - node http request created (http.request())
  7. 2024-07-09T06:49:07.729Z - node http stream write started (request.write())
  8. 2024-07-09T06:49:07.773Z - node http stream write ended (request.end())
  9. 2024-07-09T06:49:07.773Z - node http request emitted finish event
  10. 2024-07-09T06:49:10.199Z - node http emitted response event
  11. 2024-07-09T06:49:10.199Z - node http response emitted end event
  12. 2024-07-09T06:49:10.202Z - bot emitted error event status 404 "Unknown Webhook" "code: 10015"

Sample 2
interaction ID: 1260117594833293343
snowflake timestamp: 2024-07-09T06:17:26.461Z
timeline:

  1. ? - received by nginx (logs at the end of the request)
  2. 2024-07-09T06:17:26.484Z - received by node, defer response sent back to nginx
  3. 09/Jul/2024:06:17:26 - nginx logs request completed: 09/Jul/2024:06:17:26 +0000 client=35.237.4.214 host=redacted path=/ request=POST / HTTP/1.1 status=200 request_length=1951 bytes_sent=183 body_bytes_sent=20 user_agent=Discord-Interactions/1.0 (+https://discord.com) upstream_status=200 request_time=0.001 upstream_response_time=0.001 upstream_connect_time=0.000 upstream_header_time=0.001
  4. 2024-07-09T06:17:26.484Z - command code runs
  5. 2024-07-09T06:17:26.485Z - command code ends and response initiates
  6. 2024-07-09T06:17:26.485Z - node http request created (http.request())
  7. 2024-07-09T06:17:26.485Z - node http stream write started (request.write())
  8. 2024-07-09T06:17:26.485Z - node http stream write ended (request.end())
  9. 2024-07-09T06:17:26.485Z - node http request emitted finish event
  10. 2024-07-09T06:17:29.601Z - node http emitted response event
  11. 2024-07-09T06:17:29.601Z - node http response emitted end event
  12. 2024-07-09T06:17:29.603Z - bot emitted error event status 404 "Unknown Webhook" "code: 10015"
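The snowflake timestamps in both samples can be re-derived from the interaction IDs, since the bits above the low 22 of a Discord snowflake hold milliseconds since the Discord epoch (2015-01-01T00:00:00.000Z). The sketch below does that and computes how long after creation the 404 response arrived. The function names are illustrative only; the IDs and timestamps are the ones listed in the timelines.

```javascript
// Discord epoch (2015-01-01T00:00:00.000Z) in Unix milliseconds.
const DISCORD_EPOCH_MS = 1420070400000n;

// Everything above the low 22 bits of a snowflake is milliseconds since the
// Discord epoch. BigInt is required: IDs exceed Number.MAX_SAFE_INTEGER.
function snowflakeToDate(id) {
  return new Date(Number((BigInt(id) >> 22n) + DISCORD_EPOCH_MS));
}

// Milliseconds between the interaction's creation and a later ISO 8601 timestamp.
function msSinceCreation(id, isoTimestamp) {
  return Date.parse(isoTimestamp) - snowflakeToDate(id).getTime();
}

// Sample 1: creation time, then delay until the "response" event (step 10).
console.log(snowflakeToDate("1260125566783193129").toISOString()); // 2024-07-09T06:49:07.122Z
console.log(msSinceCreation("1260125566783193129", "2024-07-09T06:49:10.199Z")); // 3077

// Sample 2: delay until the "response" event (step 10).
console.log(msSinceCreation("1260117594833293343", "2024-07-09T06:17:29.601Z")); // 3140
```

In both samples the 404 comes back a little over 3 seconds after the interaction was created, even though the follow-up request itself finished sending well under a second after creation.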

My conclusion:

Discord is somehow not acknowledging the defer from the interaction webhook response and then delaying the follow up request until it expires. I thought about the possibility that the follow up is sent too fast, before the response from nginx is received by discord, but it doesn't seem to be the case as the issue persists even when the command takes 500ms+ to run.

I hope this is useful in getting this resolved. I can provide more information and more tests if needed.

(edit: formatting + typos)

SomeBoringNerd commented 1 month ago

Using JDA, I get this error with commands that are otherwise instant (a /ping command).

The weird part is that it's really random, but once it happens, it just won't go away. Neither the command itself nor my code seems to be the problem, since when it works, the interaction responds instantly. The device the code runs on is good, and the network is fast.

I was not able to identify a pattern, unfortunately. Hope it gets fixed soon.

edit: the commands fail instantly, without waiting 3 seconds, and when it happens, JDA doesn't get a SlashCommandInteractionEvent

edit (8th of June, 12:00): removing the bot from a guild and re-adding it seems to have fixed it for now? For the record, the interactions would fail no matter the guild, or if sent through DMs, and the failure was persistent across restarts of the app

edit (10th of June, early morning): correcting what I said two days ago: once the bug gets triggered, it only occurs in existing guilds. Adding the bot to a new guild makes commands work, but only in that guild and in DMs (for members of that guild)

update:

After more experimentation, it turned out to be a client-side issue. For a reason I don't understand, reloading Discord fixed it. Since this error seems to be generic, this won't be useful for everyone, but it could help people who get this error in the same way I did.

edit (11th of August): some of my users reported the exact same issue to me. The issue does not seem to appear on mobile, only on desktop

milenakos commented 1 week ago

observation: you can solve the issue by deferring the response, this results in an unnecessary request (note: both defer and response happen in under a second, this is a workaround and not the intended use of defers)
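A minimal sketch of that workaround, assuming a discord.js v14 command interaction (which provides `deferReply()` and `editReply()`). `deferThenReply` and `buildContent` are hypothetical names used only for illustration.

```javascript
// interaction: assumed to be a discord.js v14 command interaction.
// buildContent: hypothetical function that produces the reply text.
async function deferThenReply(interaction, buildContent) {
  // First request: acknowledge the interaction right away.
  await interaction.deferReply();
  // Second request: replace the "thinking" placeholder with the real reply.
  return interaction.editReply({ content: await buildContent() });
}
```

This costs two requests per command where a direct reply would need one, which is the extra request mentioned above.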

mayeradelman commented 1 week ago

> observation: you can solve the issue by deferring the response, this results in an unnecessary request (note: both defer and response happen in under a second, this is a workaround and not the intended use of defers)

I've been experiencing this issue exclusively with commands that were already deferred.

timotejroiko commented 1 week ago

update:

I was able to greatly reduce the number of errors by not responding to the webhook itself.

My setup now is as follows:

  1. [nginx] interaction webhook received via http, proxied to node
  2. [node] interaction webhook received via http
  3. [node] initial response sent via rest api callback
  4. [nginx] request terminated with code 499, meaning discord acknowledged the rest api callback and terminated the webhook on their side
  5. [node] reply/followup sent via rest api

This solution is only applicable when receiving interactions via webhook, but it seems to work well for now.
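For reference, steps 3 and 5 could look roughly like the sketch below. This is not the actual code behind the setup above: it assumes the initial response is a deferred one (callback type 5, DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE) and that the later reply edits the original response, and the function names and the injectable `fetchImpl` parameter are illustrative. The endpoints are Discord's interaction callback and "edit original interaction response" routes.

```javascript
const API_BASE = "https://discord.com/api/v10";

// Step 3 (assumed to be a defer): acknowledge through the REST callback
// endpoint instead of the body of the webhook's HTTP response.
// `interaction` is the parsed webhook payload (id, token, application_id).
function deferViaCallback(interaction, fetchImpl = fetch) {
  const url = `${API_BASE}/interactions/${interaction.id}/${interaction.token}/callback`;
  return fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: 5 }), // 5 = DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE
  });
}

// Step 5 (assumed to be an edit of the deferred response): send the real reply.
function editOriginalResponse(interaction, content, fetchImpl = fetch) {
  const url = `${API_BASE}/webhooks/${interaction.application_id}/${interaction.token}/messages/@original`;
  return fetchImpl(url, {
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ content }),
  });
}
```

Both calls authenticate with the interaction token in the URL, so no bot token header is needed. The default `fetchImpl` relies on the global `fetch` available in recent Node.js versions.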