42wim / matterircd

Connect to your mattermost or slack using your IRC-client of choice.
MIT License
294 stars 60 forks source link

Insane amount of replays, with 13/14 second frequency #302

Closed guilhermepiccoli closed 4 years ago

guilhermepiccoli commented 4 years ago

Hi @42wim , I'm running matterircd built yesterday, at commit ed39f4072ff588c. I've seen an insane amount of replays this morning (I have DisableAutoView set), like hundreds or more. In #town-square, the only message was some user leaving..so it kept repeating the following at every 13/14 seconds:

Replaying since 2020-08-17 20:54:14 [20:54] some_user removed from the team. Replaying since 2020-08-17 20:54:14 [20:54] some_user removed from the team. Replaying since 2020-08-17 20:54:14 Thanks in advance for any help with this!
42wim commented 4 years ago

The replaying is only triggered on connect, is this something you can repeat every time when you start matterircd and connect?

If yes, then please add debug and tracing to your matterircd.toml and put the logs somewhere.

debug=true
trace=true

feel free to join also the #matterircd channel on irc.freenode.net

guilhermepiccoli commented 4 years ago

Just for the sake of the registry , I talked with Wim on freenode, it seems in my case we have no login/logout messages, but I could see 413 "Found version 5" strings, one every 13/14 sec, until I restart matterircd.

Tnx again for the great support Wim =)

guilhermepiccoli commented 4 years ago

Hey @42wim , thanks for adding more debug in matterircd! Today the issue reproduced, and I was able to see the following, in a repeating manner:

time="2020-08-19T16:54:25+02:00" level=debug msg="WS PING" prefix=matterclient time="2020-08-19T16:54:25+02:00" level=debug msg="sendWsRequest &model.WebSocketRequest{Seq:1, Action:"ping", Data:map[string]interface {}(nil), Session:model.Session{Id:"", Token:"", CreateAt:0, ExpiresAt:0, LastActivityAt:0, UserId:"", DeviceId:"", Roles:"", IsOAuth:false, Props:model.StringMap(nil), TeamMembers:[]*model.TeamMember(nil), Local:false}, T:(i18n.TranslateFunc)(nil), Locale:""}" prefix=matterclient time="2020-08-19T16:54:30+02:00" level=debug msg="StatusLoop() timeout" prefix=matterclient time="2020-08-19T16:54:30+02:00" level=debug msg="logout as gpiccoli (team: teamname) on chat.teamname.org" prefix=matterclient time="2020-08-19T16:54:30+02:00" level=info msg="Found version 5.23.0.5.23.1.2604dfb92139cce46aebb9d34323f6a7.true" prefix=matterclient time="2020-08-19T16:54:30+02:00" level=debug msg="trying login teamname gpiccoli chat.teamname.org" prefix=matterclient time="2020-08-19T16:54:30+02:00" level=debug msg="trying login with personal token" prefix=matterclient time="2020-08-19T16:54:31+02:00" level=info msg="found 665 users in team teamname" prefix=matterclient time="2020-08-19T16:54:31+02:00" level=debug msg="initUser(): found our team teamname (id: sqmc4sz45prypmkfctwynm5yjr)" prefix=matterclient time="2020-08-19T16:54:31+02:00" level=debug msg="WsClient: making connection: wss://chat.teamname.org/api/v4/websocket" prefix=matterclient

Notice that, after some minutes, I just killed matterircd and restarted..problem vanished! So, seems a bad condition that is transient but triggers a kind of loop of reconnects... Is it enough logging to understand the issue? Wanna add more logs? Or maybe I should run with trace? Thanks again!

42wim commented 4 years ago

No, the StatusLoop() timeout says enough, it means the server stopped responding to pings. Could be a flaky connection, but I'm not sure yet why the reconnects keeps reconnecting after being connected ;-)

guilhermepiccoli commented 4 years ago

OK, to be sure I understood that properly: the mattermost server stopped responding to pings? Or my ZNC stopped responding pings?

If the mattermost server, I'll talk to the company infrastructure team, this is odd...or could be my vps also. One interesting thing is that I'm not seeing the flood of replays anymore; it seems to be replaying only once.

guilhermepiccoli commented 4 years ago

Just a small addition: I just faced another StatusLoop() timeout, but this time I noticed the following right before:

"time="2020-08-21T20:58:34+02:00" level=error msg="error:websocket: unexpected reserved bits 0x60" prefix=matterclient"

Not sure if it helps, hopefully it does. Thanks again!

42wim commented 4 years ago

should be fixed in master.

hloeung commented 4 years ago

Tested with master and confirmed fixed. I can still see my client reconnecting, but no flood of replays nor missed messages so far.

Thanks @42wim !

hloeung commented 4 years ago

Gah, spoke too soon.

Even with the reconnecting issue fixed, it seems with the latest matterircd there are periods where no messages are relayed / received.

guilhermepiccoli commented 4 years ago

should be fixed in master.

@42wim thanks! Do you have a specific commit to point me? Just so I can understand how the fix works ?

42wim commented 4 years ago

@guilhermepiccoli https://github.com/42wim/matterircd/commit/15b76125b88ac517fcc5bf057996e7e43d6c74e8 rewrite of matterclient connection

guilhermepiccoli commented 4 years ago

Thanks, gonna test now =)

42wim commented 4 years ago

@hloeung is this still happening on master for you?

hloeung commented 4 years ago

@42wim fixed with master. Thank you, much appreciated.