ergochat / irc-go

Libraries to help with IRC development in Go.
ISC License
41 stars 6 forks source link

expose clientHasQuit error #99

Closed frrad closed 1 year ago

frrad commented 1 year ago

In https://github.com/autobrr/autobrr/pull/1239 we would like to check if the error returned is a clientHasQuit error and handle it differently if so.

slingamn commented 1 year ago

I'm open to this, but internally to the library we are just checking whether Quit() was called explicitly, so it sounds like autobrr could check its own manuallyDisconnected flag here:

https://github.com/autobrr/autobrr/blob/8c89481d880c2bcd6d3907c16af807ec0a86a7a0/internal/irc/handler.go#L324

frrad commented 1 year ago

Interesting! Thanks for pointing this out. I honestly haven't really gotten very deep into understanding how things are actually wired up.

To me, it feels less brittle to just explicitly check the error instead of inferring what it must be based on this other state. I think that personally I would prefer even the current string based check. I'm mostly just trying to make this drive-by one line logging change though. Happy to do it however @zze0s prefers.

zze0s commented 1 year ago

Like pointed out we do keep track of manuallyDisconnected so we can use that, but exported errors are pretty nice so we can easily use errors.Is :smile:

I don't have any strong opinions on this, but exported errors are useful.

slingamn commented 1 year ago

I'm open to merging this, but the more I look at this code, the more I think it's unlikely to solve your problems. The only site where this error can be returned is:

https://github.com/ergochat/irc-go/blob/8d1f09a1b8d46e018ef369327057211860565e0e/ircevent/irc.go#L615-L617

This is intended to cover a specific case (client code asynchronously called (*Connection).Quit while we were sleeping in between reconnect attempts), and moreover, it should never be reachable from a manual call to (*Connection).Connect() --- it should only be reachable internally from Loop() during its attempts to reconnect:

https://github.com/ergochat/irc-go/blob/8d1f09a1b8d46e018ef369327057211860565e0e/ircevent/irc.go#L304

If you're calling Connect() directly and seeing this error, my understanding is that it indicates a buggy attempt to reuse a Connection object after calling Quit on it, which can never succeed. But from my reading of autobrr's code, it does not do this; instead, it manually creates a new Connection object each time, which should be OK. @frrad , were you actually seeing this error in log output before you started filtering it out?

frrad commented 1 year ago

I did see it, though not consistently. I was toggling the state of a network, trying to get errors. My hypothesis would be that

  1. This connect with retry https://github.com/autobrr/autobrr/blob/8c89481d880c2bcd6d3907c16af807ec0a86a7a0/internal/irc/handler.go#L218 wrapper tried once and failed.
  2. Then, as the retry delay was counting down, I toggled the network off which calls (I'm guessing) this function https://github.com/autobrr/autobrr/blob/8c89481d880c2bcd6d3907c16af807ec0a86a7a0/internal/irc/handler.go#L321 including a call to h.client.Quit()
  3. When the retry wakes back up it calls connect again https://github.com/autobrr/autobrr/blob/8c89481d880c2bcd6d3907c16af807ec0a86a7a0/internal/irc/handler.go#L222 and gets that error.
slingamn commented 1 year ago

Thanks, that explanation makes sense (it's similar to the "Quit() in between reconnection attempts" scenario described above that occurs internally within the library, and which is solved by returning clientHasQuit). It also matches my understanding from before: if you're seeing Connect() return this error, it's technically a bug in the client code since the Connection object is in an invalid state and Connect() cannot succeed in any case.

Reading the autobrr code, I think there are probably a number of race conditions associated with toggling networks on and off (for example, you can probably end up with two different Handler objects attempting to connect to the same network). I wasn't able to get a comprehensive solution, but here's a band-aid:

https://github.com/frrad/autobrr/pull/1

frrad commented 1 year ago

I replied to this on the downstream PR here:
https://github.com/autobrr/autobrr/pull/1239#issuecomment-1799346340

The part that's relevant to this PR is my suggestion that we

Merge expose clientHasQuit error ergochat/irc-go#99 because this is an error that might be returned to clients and so it would be good to export it for clients to be able to interact with, regardless of the actual resolution of the real issue here. (I do think this is the Right Thing To Do™️, but don't feel super strongly about this. Whatever you think of course @slingamn )

I agree that we can sidestep being able to check the value of this error in this case, but I think as a general principle it's nice to export errors which we plan to return to clients to help the client handle errors more easily.

slingamn commented 1 year ago

I think in principle, exported errors should correspond to expected conditions that the caller might want to check and respond to with specific behaviors. Given that clientIsQuit is never expected, and indicates that the caller is buggy, I'm not convinced that exporting it will help people write more correct code --- it seems like the correct strategy is to log it and then fix the caller so that it doesn't produce the error, not to add a case that handles it specially.

That said, there's no very compelling reason not to merge this (it doesn't really affect API stability to have an extra exported error, even one that eventually becomes obsolete), so I'll just merge this. I'm not going to prioritize including it in a tagged release, though.

frrad commented 1 year ago

That seems like a good balance. Thanks for merging :)