42wim / matterircd

Connect to your Mattermost or Slack using your IRC client of choice.
MIT License

panic in antiIdle while reconnecting in adverse network conditions #327

Closed: vmpjdc closed this issue 4 years ago

vmpjdc commented 4 years ago

I had some network problems recently, and matterircd crashed while it was trying to reconnect:

time="2020-10-13T13:18:16+13:00" level=info msg="Found version 5.27.0.5.27.0.3bd4fc983a4aa622183c3bd1e77029e0.true" prefix=matterclient
time="2020-10-13T13:18:26+13:00" level=error msg="reconnect: login failed: https://chat.example.net/api/v4/users/me: model.client.connecting.app_error, Get "https://chat.example.net/api/v4/users/me": context deadline exceeded (Client.Timeout exceeded while awaiting headers), retrying in 10 seconds" prefix=matterclient
time="2020-10-13T13:18:27+13:00" level=error msg="ChannelView update for yryoj7ka6pr4ukbtojimq8ck8c failed: https://chat.example.net/api/v4/channels/members/ke9yfckxsbrkuyaa9hx4xd3use/view: model.client.connecting.app_error, Post "https://chat.example.net/api/v4/channels/members/ke9yfckxsbrkuyaa9hx4xd3use/view": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" prefix=matterclient
time="2020-10-13T13:18:36+13:00" level=info msg="reconnect: login" prefix=matterclient
time="2020-10-13T13:18:46+13:00" level=error msg="reconnect: login failed: "https://chat.example.net/api/v4/users/logout: model.client.connecting.app_error, Post \"https://chat.example.net/api/v4/users/logout\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)", retrying in 10 seconds" prefix=matterclient
time="2020-10-13T13:18:56+13:00" level=info msg="reconnect: login" prefix=matterclient
time="2020-10-13T13:19:06+13:00" level=error msg="reconnect: login failed: "https://chat.example.net/api/v4/users/logout: model.client.connecting.app_error, Post \"https://chat.example.net/api/v4/users/logout\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)", retrying in 10 seconds" prefix=matterclient
time="2020-10-13T13:19:16+13:00" level=info msg="reconnect: login" prefix=matterclient
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x809cbf]

goroutine 70 [running]:
github.com/42wim/matterircd/pkg/matterclient.(*Client).UpdateLastViewed(0xc0002c0000, 0xc0002d12a0, 0x1a, 0xb, 0xc000e67f30)
    /home/paul/go/src/github.com/42wim/matterircd/pkg/matterclient/channels.go:278 +0x11f
github.com/42wim/matterircd/bridge/mattermost.(*Mattermost).antiIdle(0xc0002be090, 0xc0002d12a0, 0x1a, 0xc0001f40c0)
    /home/paul/go/src/github.com/42wim/matterircd/bridge/mattermost/mattermost.go:188 +0x107
created by github.com/42wim/matterircd/bridge/mattermost.(*Mattermost).loginToMattermost
    /home/paul/go/src/github.com/42wim/matterircd/bridge/mattermost/mattermost.go:104 +0x4d9

Here's where the crash happens:

func (m *Client) UpdateLastViewed(channelID string) error {
    m.logger.Debugf("posting lastview %#v", channelID)

    view := &model.ChannelView{ChannelId: channelID}

    for {
        _, resp := m.Client.ViewChannel(m.User.Id, view)    // <-- HERE

        // [...]
    }
}

I've added some code locally to check which of m.Client or m.User is nil, and I tried blackholing the server IPs, but so far I have been unable to reproduce the crash.
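For illustration, a check of this kind might look roughly like the sketch below. It is not the code actually added locally: the `User`, `APIClient` and `Client` types are simplified, hypothetical stand-ins for the real matterclient and Mattermost model types, and `checkReady` is an invented helper name.

```go
package main

import (
	"errors"
	"fmt"
)

// User and APIClient are hypothetical stand-ins for the Mattermost model
// types; only what the illustration needs is included.
type User struct{ Id string }

type APIClient struct{}

// Client only loosely mirrors the shape of matterclient.Client.
type Client struct {
	Client *APIClient
	User   *User
}

// checkReady (hypothetical helper) reports which pointer is nil instead of
// letting a later dereference such as m.User.Id panic.
func (m *Client) checkReady() error {
	switch {
	case m == nil:
		return errors.New("client is nil")
	case m.Client == nil:
		return errors.New("m.Client is nil")
	case m.User == nil:
		return errors.New("m.User is nil")
	}
	return nil
}

func main() {
	// User is left nil here to imitate a state where re-login has not completed.
	m := &Client{Client: &APIClient{}}
	if err := m.checkReady(); err != nil {
		fmt.Println("skipping UpdateLastViewed:", err)
	}
}
```

Returning an error rather than panicking means the log would show which field was nil the next time the condition occurs, even if it cannot be reproduced on demand.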

Possibly it would be safest to shut down antiIdle while reconnecting, but it's part of the top-level generic code, and the reconnect loop is handled in the bridge code, so this is not entirely trivial to do and might still be racy. So probably the first step is still to work out what is nil in the first place.
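One way to make antiIdle tolerate a reconnect without stopping it is to guard the shared user pointer with a lock and skip a tick whenever the pointer is unset. The sketch below only illustrates that idea and assumes the nil value is the user; the `session` type, `setUser`, `userID` and `antiIdleTick` are hypothetical names, not matterircd's actual structure.

```go
package main

import (
	"fmt"
	"sync"
)

// User is a hypothetical stand-in for the Mattermost user model.
type User struct{ Id string }

// session is a hypothetical wrapper that serializes access to the user
// pointer between a reconnect loop and an antiIdle loop.
type session struct {
	mu   sync.RWMutex
	user *User
}

// setUser is what a reconnect loop would call: nil while logging in again,
// then the fresh user once login succeeds.
func (s *session) setUser(u *User) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.user = u
}

// userID returns the current user ID, or false if no user is set.
func (s *session) userID() (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.user == nil {
		return "", false
	}
	return s.user.Id, true
}

// antiIdleTick stands in for one iteration of the antiIdle loop.
func antiIdleTick(s *session, channelID string) string {
	id, ok := s.userID()
	if !ok {
		return "skipped: reconnect in progress"
	}
	return fmt.Sprintf("view channel %s as user %s", channelID, id)
}

func main() {
	s := &session{}
	s.setUser(&User{Id: "u1"})
	fmt.Println(antiIdleTick(s, "town-square"))
	s.setUser(nil) // reconnect starts
	fmt.Println(antiIdleTick(s, "town-square"))
	s.setUser(&User{Id: "u1"}) // reconnect finished
	fmt.Println(antiIdleTick(s, "town-square"))
}
```

Copying the ID out while holding the read lock avoids the window in which the pointer is checked and then cleared before it is dereferenced, which is the race a plain nil check would still leave open.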

42wim commented 4 years ago

Thanks for debugging! It's going to be User that was nil.