cronokirby / alchemy

A discord library for Elixir
MIT License
152 stars 34 forks source link

Gateway termiantes occasionally #119

Open Zarathustra2 opened 3 years ago

Zarathustra2 commented 3 years ago
Stacktrace

13:00:18.716 [error] ** Websocket client Alchemy.Discord.Gateway terminating in :websocket_handle/3
for the reason :error::function_clause
** Websocket message was {:text, "{\"t\":null,\"s\":null,\"op\":7,\"d\":null}"}
** Handler state was %Alchemy.Discord.Gateway.State{}
** Stacktrace: [{:websocket_client, :handle_response, [:ok, Alchemy.Discord.Gateway, {
                :websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, {
                :sslsocket, {:gen_tcp, #Port<0.107>, :tls_connection, :undefined}, [#PID<0.9973.0>, #PID<0.9972.0>]}, {
                :transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, "SOME_CHAR_SEQUENCE_DELETED", :undefined, 1, 1, :undefined, :undefined}], 
                [file: 'FULL_PATH_DELETED/deps/websocket_client/src/websocket_client.erl', line: 545]}, 
                {:websocket_client, :handle_websocket_frame, 2, [file: 'FULL_PATH_DELETED/deps/websocket_client/src/websocket_client.erl', line: 500]}, 
                {:gen_statem, :loop_state_callback, 11, [file: 'gen_statem.erl', line: 1118]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]

13:00:18.718 mfa=:gen_statem.error_info/7 [error] GenStateMachine #PID<0.9971.0> terminating
** (exit) :function_clause
(stdlib 3.12.1) gen_statem.erl:1312: :gen_statem.loop_state_callback_result/11

14:49:25.992 mfa=:gen_statem.error_info/7 [error] GenStateMachine #PID<0.7238.0> terminating
    (websocket_client 1.4.2) :websocket_client.disconnected(:cast, :connect, {
        :context, {:websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, 
        {:sslsocket, {:gen_tcp, #Port<0.63>, :tls_connection, :undefined}, [#PID<0.7240.0>, #PID<0.7239.0>]}, 
        {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, 
        "SOME_CHAR_SEQUENCE_DELETED", :undefined, 1, 1, :undefined, :undefined}, 
        {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, [], 
        {:wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json'}, {Alchemy.Discord.Gateway, %Alchemy.Discord.Gateway.State{}}, "", false, 0})

Hey @cronokirby,

I have seen this error a lot lately. I am not sure what is causing this maybe you can help me there. Nothing important as the genserver just restarts and resumes to work but would be nice if this can be avoided! :)

Also unsure how to reproduce it... :(

Anyways, thanks for the awesome library!

cronokirby commented 3 years ago

I think that the websocket message is malformed, given that all its fields are null.

Without having looked at the websocket gateway in a while, my hunch would be:

  1. Our parser is too strict here
  2. Our handling of the protocol is wrong or outdated, confusing Discord
zLupa commented 3 years ago

This error is appearing here too but in Voice gateway.

Here's the error

15:50:27.789 [debug] Voice Gateway for ID_REDACTED terminated, reason: :function_clause

15:50:28.462 [debug] Voice Gateway for ID_REDACTED connected

15:50:28.620 [error] ** Websocket client Alchemy.Voice.Gateway terminating in :websocket_info/3
   for the reason :error::function_clause
** Last message was {:ssl_closed,
 {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
  [#PID<0.841.0>, #PID<0.840.0>]}}
** Handler state was %Alchemy.Voice.Gateway.State{
  channel: ID_REDACTED,
  discord_ip: nil,
  discord_port: nil,
  guild_id: "ID_REDACTED",
  key: nil,
  my_ip: nil,
  my_port: nil,
  session: "SESSION_REDACTED",
  ssrc: nil,
  token: "TOKEN_REDACTED",
  udp: nil,
  url: "brazil1001.discord.media:443",
  user_id: "ID_REDACTED"
}
** Stacktrace: [
  {Alchemy.Voice.Gateway, :websocket_info,
   [
     {:ssl_closed,
      {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
       [#PID<0.841.0>, #PID<0.840.0>]}},
     {:websocket_req, :wss, 'brazil1001.discord.media', 443, '/', :infinity,
      :undefined, 1,
      {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
       [#PID<0.841.0>, #PID<0.840.0>]},
      {:transport, :ssl, :ssl, :ssl_closed, :ssl_error,
       [mode: :binary, active: true, verify: :verify_none, packet: 0]},
      "__REDACTED__", :undefined, 1, :undefined, :undefined,
      :undefined},
     %Alchemy.Voice.Gateway.State{
       channel: ID_REDACTED,
       discord_ip: nil,
       discord_port: nil,
       guild_id: "ID_REDACTED",
       key: nil,
       my_ip: nil,
       my_port: nil,
       session: "SESSION_REDACTED",
       ssrc: nil,
       token: "TOKEN_REDACTED",
       udp: nil,
       url: "brazil1001.discord.media:443",
       user_id: "ID_REDACTED"
     }
   ], [file: 'lib/Voice/gateway.ex', line: 150]},
  {:websocket_client, :handle_info, 2,
   [file: '/bot/deps/websocket_client/src/websocket_client.erl', line: 448]},
  {:gen_statem, :loop_state_callback, 11, [file: 'gen_statem.erl', line: 1166]},
  {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}
]

15:50:28.621 [debug] Voice Gateway for ID_REDACTED terminated, reason: :function_clause

18:22:35.140 [debug] Shard [0, 1] Disconnected from the Gateway; restarting the Gateway

After this, the bot just stops the Voice and after a while, it disconnects from the main gateway and can't reconnect again.

OvermindDL1 commented 3 years ago

I get this on occasion as well. Considering my bot is hooked up to a busy discord server I just have it watch last messages and if no message in 10 minutes then the bot entirely restarts alchemy and dependency application, it's a bit brute force and it does mean messages are lost since that time, but it's working. Would be nice to be fixed though. :-)

Zarathustra2 commented 3 years ago

@OvermindDL1 Why are you watching the messages for 10 minutes and then force a restart? How is this issue impacting your bot? It restarts the bot as the gen server crashes which is only an issue if the bot is playing a media file as @zLupa stated.

I will try to look into it this week, can't promise though! :)

OvermindDL1 commented 3 years ago

Why are you watching the messages for 10 minutes and then force a restart? How is this issue impacting your bot?

That indicates to the bot that its lost connection to Discord.

It restarts the bot as the gen server crashes which is only an issue if the bot is playing a media file as @zLupa stated.

I don't play media files and I still get gateway termination issues on occasion. Alchemy reconnects fine most of the time but occasionally discord messages just seem to stop, don't see a crash SASL log, etc... just nothing even though plenty is happening on Discord. Not much choice at that point but just to restart alchemy.

Zarathustra2 commented 3 years ago

Hmmm, that is weird. Are you on the latest version?

OvermindDL1 commented 3 years ago

Hmmm, that is weird. Are you on the latest version?

Great question, possibly not. Let me see if I can update with minimal breakages, it's a slow time on discord so it should be a good time to do so...

OvermindDL1 commented 3 years ago

I was a couple minor versions behind, 0.6.4, just updated to 0.6.10, it seems to be working so no breakages, I'll keep watch. It only seems to stop getting data after being up anywhere from a few days to a few weeks (memory, process, and CPU usage do not grow as a note) so I'll try to remember to watch for it, it's easy to miss now since I have it auto-restart (preferred so people don't yell at me, ^.^; ).

Speaking of, slightly OT, does alchemy support Discord's Reply information now? I have some functionality in mind that would be useful to add around that functionality and it didn't seem to exist in 0.6.4 anywhere I could see.

Zarathustra2 commented 3 years ago

@OvermindDL1 I just checked the docs it is possible:

You may create a message as a reply to another message. To do so, include a message_reference with a message_id. The channel_id and guild_id in the message_reference are optional, but will be validated if provided

I don't think send_message supports that yet but looks fairly easy to add, I will take a look at that this week as well! Ping me if I forget it

Zarathustra2 commented 3 years ago

@OvermindDL1 I have opened a PR: https://github.com/cronokirby/alchemy/pull/123, waiting for merge now

Zarathustra2 commented 3 years ago

@OvermindDL1 has been merged, if you set alchemy to use master instead of a version you should be able to try it out! :)

cronokirby commented 3 years ago

Or, use the recently published version 0.7.0

OvermindDL1 commented 3 years ago

Oh awesome, that was fast! You're amazing!

OvermindDL1 commented 3 years ago

For note, I'm still getting a lot of these (snipped anything that looks like it might be private data):

13:12:05.575 [error] GenStateMachine #PID<0.15707.1> terminating
** (FunctionClauseError) no function clause matching in :websocket_client.disconnected/3    
    (websocket_client) ./deps/websocket_client/src/websocket_client.erl:291: :websocket_client.disconnected(:cast, :connect, {:context, {:websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, {:sslsocket, {:gen_tcp, #Port<0.7840>, :tls_connection, :undefined}, [#PID<0.15709.1>, #PID<0.15708.1>]}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, "snip", :undefined, 1, 2, :undefined, :undefined}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, [], {:wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json'}, {Alchemy.Discord.Gateway, %Alchemy.Discord.Gateway.State{seq: 970, session_id: "snip", shard: [0, 1], token: "snip", trace: ["[\"gateway-prd-main-snip\",{\"micros\":179429,\"calls\":[\"discord-sessions-snip\",{\"micros\":177864,\"calls\":[\"start_session\",{\"micros\":snip,\"calls\":[\"api-prd-main-snip\",{\"micros\":116880,\"calls\":[\"get_user\",{\"micros\":7471},\"add_authorized_ip\",{\"micros\":6594},\"get_guilds\",{\"micros\":7667},\"coros_wait\",{\"micros\":1}]}]},\"guilds_connect\",{\"micros\":2,\"calls\":[]},\"presence_connect\",{\"micros\":26694,\"calls\":[]}]}]}]"], user_id: "snip"}}, "", false, 0})
    (stdlib) gen_statem.erl:1145: :gen_statem.loop_state_callback/11   
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
OvermindDL1 commented 3 years ago

It looks like I'm able to send reply messages now, but I'm not able to receive such a message with the reply information. It looks like the Alchemy.Message struct doesn't have something that supplies a Alchemy.MessageReference in its struct.

OvermindDL1 commented 3 years ago

In addition, the Alchemy.Message struct is missing the optional guild_id on it, need that to be able to reference as well. :-)