mautrix / imessage

A Matrix-iMessage puppeting bridge
https://go.mau.fi/mautrix-imessage/
GNU Affero General Public License v3.0
332 stars 36 forks source link

adding retry timer #188

Closed dltacube closed 4 months ago

dltacube commented 4 months ago

isolated websocket into its own function added a retry var that increases exponentially for each unexpected failure Sleeping log message could be removed, it's in there for debugging at the moment.

Questions

https://github.com/mautrix/imessage/pull/187#discussion_r1492992780

Testing plan

dltacube commented 4 months ago

Not getting a clean break. Need to handle these goroutines.

panic: runtime error: invalid memory address or nil pointer dereference
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1006dd5d9]

goroutine 72 [running]:
github.com/gorilla/websocket.(*Conn).Close(...)
    /Users/username/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:345
go.mau.fi/mautrix-imessage/imessage/bluebubbles.(*blueBubbles).PollForWebsocketMessages.func1()
    /Users/username/src/imessage/imessage/bluebubbles/api.go:128 +0x19
panic({0x100909b20?, 0x100fe40b0?})
    /usr/local/Cellar/go/1.21.6/libexec/src/runtime/panic.go:914 +0x21f
github.com/gorilla/websocket.(*Conn).NextReader(0x0)
    /Users/username/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:1000 +0x17
github.com/gorilla/websocket.(*Conn).ReadMessage(0xc000660fc0?)
    /Users/username/go/pkg/mod/github.com/gorilla/websocket@v1.5.0/conn.go:1093 +0x13
go.mau.fi/mautrix-imessage/imessage/bluebubbles.(*blueBubbles).PollForWebsocketMessages(0xc000246500)
    /Users/username/src/imessage/imessage/bluebubbles/api.go:141 +0x127
created by go.mau.fi/mautrix-imessage/imessage/bluebubbles.(*blueBubbles).Start in goroutine 68
    /Users/username/src/imessage/imessage/bluebubbles/api.go:98 +0x145
Bridge exited
exit status 2
trek-boldly-go commented 4 months ago

@dltacube this looks really good. I like the max retries in there, but I haven't done the math. Ideally it would spend several minutes retrying, thinking of how long it could take for a router or separate computer to boot up.

I know it would be ideal to allow the max retries to be a config, but that would require an edit to the template in the bridge manager (bbctl) repo. For now, a hardcore is probably good enough.

Please do use your new function inside the Start function, to reduce the duplicated logic.

Your testing plan also looks good, as long as everything comes back successful I'll approve the merge (but @tulir is the only one that can merge to master).

dltacube commented 4 months ago

Thanks, I'd rather not mess with the config object for now as I'm already testing the limits of my go abilities (and AI of course).

I'll merge the functionality into the Start function, that's no problem at all.

Testing so far is looking good. My only concern is the way it breaks, I pasted my output above. I'm wondering if there's something wrong with my logic flow and it's continuing where it should be breaking or if it's a goroutine thing I need to handle manually. I'm leading towards the former since that's how it's currently implement (it just breaks).

dltacube commented 4 months ago

As for backoff durations, they're exponential so after a few tries it gets quite long. I don't think exponential is even ideal but that can be adjusted. Right now it's: 200ms, 400ms, 800ms, 1.6s, 3.2s, etc...

I think ~12~ 6 attempts puts us at around 6 ~minute~ seconds backoff? I'll double check that but feel free to suggest another base number from what I've got. I have the max low right now so I can let it expire without waiting too long.

dltacube commented 4 months ago

I figured out how to get a clean break. I need to return from the function and break from PollForWebsocketMessages!

dltacube commented 4 months ago

Is it kosher if I skip the defer function since the websocket doesn't exist anymore? Let me know if you can think of a better way to handle that.

dltacube commented 4 months ago

I'm going to hold off on refactoring and wait until I can figure that part out. I think breaking cleanly might not matter but I'd like to hear from you guys as well since it's otherwise working.

Side note: skipping the deferred function (websocket close) causes some strange behavior where bb won't close but seemingly has no websocket connections!

/edit i must have introduced a bug while reverting something...

trek-boldly-go commented 4 months ago

@cnuss can you speak to the defer close websocket? I'm not sure the best way to handle the close if it is retrying.

dltacube commented 4 months ago

I just want to say the current iteration works but if you're testing with bb gui you have to force quit or else it stays open until retries expire. I don't know if that's a bad thing?