transmission / transmission

Official Transmission BitTorrent client repository
https://transmissionbt.com
Other
11.67k stars 1.18k forks source link

fix: process BT messages that immediately follows handshake #6913

Open tearfur opened 2 weeks ago

tearfur commented 2 weeks ago

Fixes #6909.

Notes: Fixed a bug where BT messages that immediately follows a handshake might be discarded.

tearfur commented 2 weeks ago

CC @reardonia

reardonia commented 2 weeks ago

Should this also get a 4.0.x backport?

tearfur commented 2 weeks ago

There was a discussion about how long do we keep supporting 4.0.x. If that still holds, then the next release should be 4.1.0-beta1, and there will be no more 4.0.x releases.

tearfur commented 1 week ago

You might be worried because notify_bandwidth_consumed() is a recursive function, but the depth will at most be 2-3 levels in practice, so it is still cheap.

Same goes for clamp().

reardonia commented 6 days ago

I don't think this patch works. The change to READ_NOW from READ_LATER in handshake done() doesn't yield anything. The only way I've been able to work around this is the following in peerIo::try_read :

    if (error)
    {
        if (!can_retry_from_error(error.code()))
        {
            tr_logAddTraceIo(this, fmt::format("try_read err: n_read:{} errno:{} ({})", n_read, error.code(), error.message()));
// TRR
            if (!std::empty(buf))
            {
                tr_logAddWarnIo(this, fmt::format("try_read err: n_read:{} errno:{} ({}) have:{}",
                    n_read, error.code(), error.message(), read_buffer_size()));
                can_read_wrapper();
            }
            call_error_callback(error);
        }
    }
    else if (!std::empty(buf))
    {
        can_read_wrapper();
    }

If we don't force a call to can_read_wrapper here then it just sits there unconsumed. Maybe I'm misreading code again, but the handoff from handshake back to peerIo doesn't follow the READ_NOW / READ_LATER codes further. peerIo basically starts from scratch and ignores prior return values.

I'll verify again later, as I may have messed up the test, but I'm fairly confident.

tearfur commented 6 days ago

Did you also apply the refactor to remove the loop in tr_handshake::can_read()? It's not just a code cleanup, it is needed for this fix.

reardonia commented 5 days ago

Did you also apply the refactor to remove the loop in tr_handshake::can_read()? It's not just a code cleanup, it is needed for this fix.

Yes, applied the full PR

tearfur commented 5 days ago

Well, from this log I got, looks like it works just fine though... This is from https://github.com/tearfur/transmission/tree/tmp, where I rebased 6891, this PR and 6917 on top of eacher other.

[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 handling can_read; state is [awaiting vc] (handshake.cc:566)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 found ENCRYPT(VC)! (handshake.cc:160)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 handling can_read; state is [awaiting crypto select] (handshake.cc:566)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 crypto select is 2 (handshake.cc:184)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 len(PadD) is 352 (handshake.cc:193)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 handling can_read; state is [awaiting pad d] (handshake.cc:566)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 PadD: need 352, got 352 (handshake.cc:206)
[2024-06-19 03:15:20.465] trc REDACTED.22:36217 48 overhead bytes via utp (peer-io.cc:780)
[2024-06-19 03:15:20.465] trc REDACTED.22:36217 48 overhead bytes via utp (peer-io.cc:780)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 handling can_read; state is [awaiting handshake] (handshake.cc:566)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 read_handshake: need 48, got 280 (handshake.cc:230)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 handling can_read; state is [awaiting peer id] (handshake.cc:566)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 read_peer_id: need 20, got 232 (handshake.cc:320)
[2024-06-19 03:15:20.465] trc handshake REDACTED.22:36217 peer-id is 'qBittorrent 4.6.4' ... isIncoming is false (handshake.cc:330)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: sending an ltep handshake (peer-msgs.cc:1077)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: sending 'ltep' 0 [] (peer-msgs.cc:846)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: sending 'fext-have-all' (peer-msgs.cc:846)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: sending 'port' 51413 (peer-msgs.cc:846)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: read 195 payload bytes; 0 left to go (peer-msgs.cc:1761)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: got peer msg 'ltep' (20) with payload len 195 (peer-msgs.cc:1353)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: Got a BtPeerMsgs::Ltep (peer-msgs.cc:1574)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: got ltep handshake (peer-msgs.cc:898)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: here is the base64-encoded handshake: [REDACTED] (peer-msgs.cc:1184)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: msgs->ut_pex is 1 (peer-msgs.cc:1203)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent 4.6.4]: msgs->ut_metadata_id_ is 2 (peer-msgs.cc:1210)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: pex: old IPv4 peer count 0, new peer count 1, added 1, dropped 0 (peer-msgs.cc:1021)
[2024-06-19 03:15:20.465] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: pex: old IPv6 peer count 0, new peer count 0, added 0, dropped 0 (peer-msgs.cc:1021)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: sending 'ltep' 1 [] (peer-msgs.cc:846)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: read 0 payload bytes; 0 left to go (peer-msgs.cc:1761)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: got peer msg 'fext-have-none' (15) with payload len 0 (peer-msgs.cc:1353)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: Got a BtPeerMsgs::FextHaveNone (peer-msgs.cc:1537)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: read 2 payload bytes; 0 left to go (peer-msgs.cc:1761)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: got peer msg 'port' (9) with payload len 2 (peer-msgs.cc:1353)
[2024-06-19 03:15:20.466] trc ubuntu-24.04-desktop-amd64.iso REDACTED.22:36217 [qBittorrent/4.6.4]: Got a BtPeerMsgs::Port (peer-msgs.cc:1476)
reardonia commented 2 days ago

Well, from this log I got, looks like it works just fine though... This is from https://github.com/tearfur/transmission/tree/tmp, where I rebased 6891, this PR and 6917 on top of eacher other.

The easiest way to diagnose, I think, is to trace unread bytes (aka "have") in try_read on error. That's how I stumbled onto this bug in the first place. After this PR, my log is still full of unread messages unless I force can_read_wrapper in the error path within try_read

EDIT: sorry, wasn't testing with #6917 as well. Will retry the combo now.

tearfur commented 2 days ago

EDIT: sorry, wasn't testing with #6917 as well. Will retry the combo now.

I don't think 6917 is needed for this fix... But anyway, let me try do some more testing by tracing unread bytes in try_read() this time.

reardonia commented 1 day ago

I'm now convinced this PR is correct; #6942 tricked me.