meejah / txtorcon

Twisted-based asynchronous Tor control protocol implementation. Includes unit-tests, examples, state-tracking code and configuration abstraction.
http://fjblvrw2jrxnhtg67qpbzi45r7ofojaoo3orzykesly2j3c2m3htapid.onion/
MIT License
251 stars 72 forks source link

When tor process exits unexpectedly tor.protocol.get_info hangs #389

Closed hellais closed 1 year ago

hellais commented 1 year ago

Here is a minimal reproduction for this issue:

from twisted.internet.task import react
from twisted.internet.defer import ensureDeferred

import txtorcon

EXIT_RELAY_FP = [
'130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9', # this triggers the getinfo fail

'127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7',
]

async def main(reactor):
        try:
            tor = await txtorcon.launch(
                reactor,
                kill_on_stderr=False,
                progress_updates=lambda x, y, z: print(f"{x}%: {y} - {z}"),
            )
        except Exception as exc:
            print(f"FAILED to start tor {exc}")
            return

        state = await tor.create_state()
        for exit_fp in EXIT_RELAY_FP:
            print(f"doing {exit_fp}")
            try:
                print("calling GETINFO")
                info = await tor.protocol.get_info("md/id/" + exit_fp)
                print(f"got {info}")
            except Exception as exc:
                print(f"FAILED to get info for {exit_fp} {exc}")

@react
def _main(reactor):
    return ensureDeferred(main(reactor))

You can see that the last log lines are:

doing 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
calling GETINFO
FAILED to get info for 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9 Tor unexpectedly disconnected while running: GETINFO md/id/130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
doing 127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7
calling GETINFO
Unhandled Error
Traceback (most recent call last):
Failure: builtins.RuntimeError: Tor exited with error-code 0

Where we are waiting on the tor.protocol.get_info("md/id/" + exit_fp) call.

I would expect to either get an errback, because the process is dead or have some other way to tell that I should not be calling it.

It would also be nice to be able to somehow know that tor has exited this way, as it's currently not possible to listen for the "tor exited" event.

This probably is also a tor bug, as it should not happen that the tor process exits when issuing this specific get_info command.

meejah commented 1 year ago

Okay, so it does errback() on the getinfo (properly) --- but still "hangs". However, if you change the test-case to "raise" right after the "FAILED" print() then it'll exit.

Although it's not quite the same, there is TorControlProtocol.when_disconnected() which will effectively tell you if tor exited (although it's a bit more general, because there may be other ways to lose the connection to Tor besides it exiting).

That said, adding a when_exited() or similar to TorProcessProtocol probably makes sense.

hellais commented 1 year ago

Okay, so it does errback() on the getinfo (properly) --- but still "hangs".

It calls the errback on the first failure, that is this log line

doing 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9
calling GETINFO
FAILED to get info for 130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9 Tor unexpectedly disconnected while running: GETINFO md/id/130CFCF38BA3327E3001A1DB2A4B5ACBDAE248D9

But then when the second get_info call is made, it hangs without any errback, see these log lines:

doing 127F6358F68FFB7E437DBA51D6D4DAC47B9F78A7
calling GETINFO
Unhandled Error
Traceback (most recent call last):
Failure: builtins.RuntimeError: Tor exited with error-code 0

However, if you change the test-case to "raise" right after the "FAILED" print() then it'll exit.

I am not sure this is the correct thing to do in this case though, since the first failure could potentially be a non-final failure (ex. I passed an invalid descriptor) in which case I just would like to skip that specific relay.

meejah commented 1 year ago

Yes, definitely at least one issue here, thanks for the detailed bug report!

meejah commented 1 year ago

By default, Twisted's LineOnlyReceiver limits to 16384 bytes maximum for "a line" and this particular microdescriptor exceeds that.

So, we either need to "not limit" line-length (not really possible with Twisted right now, but we could set MAX_INT or something I suppose) and / or find out what the maximum length of a microdescriptor can be .....

meejah commented 1 year ago

To test if this is the only thing holding your case back, you could set tor.protocol.MAX_LENGTH = 16384000 before issuing any of the GETINFOs