ergochat / ergo

A modern IRC server (daemon/ircd) written in Go.
https://ergo.chat/
MIT License

ergo.chat/abbreviated-handshake #1420

Open slingamn opened 3 years ago

slingamn commented 3 years ago

The intent here is to speed up reattaches for mobile clients. Basically:

  1. The client will reconnect, do CAP negotiation and SASL as usual
  2. The client will additionally request a vendor cap oragono.io/abbreviated-handshake
  3. Upon CAP END, the server will send 001 but none of the other conventional registration numerics; in particular 005 and the MOTD will be skipped
  4. The server will probably send JOIN lines for all joined channels (as usual), but will not send RPL_NAMREPLY
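
As a rough illustration of steps 1 and 2, the client side could be sketched as below. This is a minimal sketch, not an implementation: `oragono.io/abbreviated-handshake` is only the capability name proposed here, the helper names and default capability list are made up for the example, and a real client would wait for `CAP ACK` and `AUTHENTICATE +` before sending the next line.

```python
import base64

# Capability name proposed in this thread; no server is assumed to support it yet.
ABBREVIATED_CAP = "oragono.io/abbreviated-handshake"


def sasl_plain_payload(account, password):
    """Base64-encode a SASL PLAIN response: empty authzid, then authcid and password."""
    raw = f"\x00{account}\x00{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def abbreviated_handshake_lines(nick, account, password, caps=("sasl", "server-time")):
    """Return the client lines for steps 1 and 2, in the order they are sent."""
    requested = " ".join([*caps, ABBREVIATED_CAP])
    return [
        "CAP LS 302",
        f"NICK {nick}",
        f"USER {nick} 0 * {nick}",
        f"CAP REQ :{requested}",  # the usual caps plus the proposed vendor cap
        "AUTHENTICATE PLAIN",
        f"AUTHENTICATE {sasl_plain_payload(account, password)}",
        "CAP END",  # the server would then answer with 001 only (step 3)
    ]
```

Payloads longer than 400 bytes need to be split across several `AUTHENTICATE` lines; the sketch leaves that out.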

@kylef is going to take a look at the potential impact of this.

DanielOaks commented 3 years ago

Regarding the 005 and all, sometimes those do change for legitimate reasons. Could we send some hashed value that represents those, which the client can store? If the hashed value they get from us differs from what they have stored, the client knows it should run VERSION and get the updated isupport tokens.
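
One way such a hashed value could work, as a sketch only: the choice of SHA-256, the sorting step, and both function names are assumptions for illustration, not anything the server does today.

```python
import hashlib


def isupport_digest(tokens):
    """Order-independent SHA-256 digest of a list of ISUPPORT (005) tokens."""
    # Sorting makes the digest stable even if the server reorders its tokens.
    canonical = "\n".join(sorted(tokens)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def needs_isupport_refresh(stored_digest, advertised_digest):
    """True when the client's cached tokens no longer match the server's."""
    return stored_digest != advertised_digest
```

A client would store the digest alongside its cached tokens and only re-fetch them when `needs_isupport_refresh` returns true.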

DanielOaks commented 3 years ago

Note from koragg: clients use the endofmotd/nomotd numerics to basically see that they are connected. We'll need to either send one of those, or send our own bespoke verb so clients can realise they're connected.
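
A sketch of the first option, sending one of those numerics, might look like this. The list of numerics in the full burst is representative rather than complete, and pairing 001 with 422 (ERR_NOMOTD) in the abbreviated case is an assumption about how the server could behave, not a decision made in this thread.

```python
# Representative conventional burst: 001-005, then MOTD start, body, and end.
FULL_BURST = ["001", "002", "003", "004", "005", "375", "372", "376"]

# Numerics that clients commonly treat as "registration is complete":
# 376 is RPL_ENDOFMOTD and 422 is ERR_NOMOTD.
CONNECTED_MARKERS = {"376", "422"}


def registration_numerics(abbreviated):
    """Numerics the server would send after CAP END."""
    if abbreviated:
        # Assumption: keep 001 and add ERR_NOMOTD so existing clients still
        # see a marker they recognise.
        return ["001", "422"]
    return list(FULL_BURST)


def client_sees_connected(numerics):
    """True if a client waiting for endofmotd/nomotd would consider itself connected."""
    return any(numeric in CONNECTED_MARKERS for numeric in numerics)
```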

wrt ^, maybe the value of oragono.io/abbreviated-handshake could be the isupport timestamp/value/etc, since that can also get updated by the server anytime with cap-notify? idk.
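
If the capability value did carry an isupport version, a client could compare it against its cache roughly as follows. This is a sketch under that assumption; the value format is undecided here, and the function names and example value are hypothetical.

```python
# Capability name proposed in this thread.
ABBREVIATED_CAP = "oragono.io/abbreviated-handshake"


def parse_cap_tokens(cap_list):
    """Map capability names to their values ('' when a cap has no value)."""
    caps = {}
    for token in cap_list.split():
        # Split on the first '=' only, since values may contain '=' themselves.
        name, _, value = token.partition("=")
        caps[name] = value
    return caps


def isupport_is_stale(cap_list, cached_value):
    """True when the cap is advertised with a value that differs from the cached one."""
    advertised = parse_cap_tokens(cap_list).get(ABBREVIATED_CAP)
    return advertised is not None and advertised != cached_value
```

The same check could run on a `CAP NEW` line delivered through cap-notify. For example, `isupport_is_stale("sasl=PLAIN,EXTERNAL oragono.io/abbreviated-handshake=abc123", "abc123")` returns `False`.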

slingamn commented 3 years ago

From some initial benchmarking, it seems to me that the cost of the handshake is dominated by establishing the TLS connection and then the back-and-forth of conventional CAP negotiation, not by the registration numerics. Here's the test I ran (contains a non-sensitive password):

https://gist.github.com/slingamn/c84b95452153e208b5d78f240930d664

script.txt is what the client side of the handshake would look like under oragono.io/abbreviated-handshake, plus a final QUIT. bench.go sends this entire handshake optimistically at once, without waiting for server responses. In my testing, once the TLS connection is established, the entire handshake can be done in approximately 1 RTT to the server. (This includes password checking, which is about a millisecond with the recommended Oragono config.) Increasing the size of the MOTD didn't seem to change much, which suggests that optimizing the length of the server-sent registration numerics won't have that much impact.

Caveat: all of this is on wired connections; it could be that on mobile it's more important to get the server's response under the MTU, or into fewer send calls.
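
For readers who want to try the optimistic approach, a minimal sketch of sending the whole handshake in one write is below. The `SCRIPT` lines are a hypothetical stand-in, not the contents of script.txt, and the function names are made up for the example.

```python
import socket


def pipeline(lines):
    """Join IRC lines into one CRLF-terminated byte string."""
    return b"".join(line.encode("utf-8") + b"\r\n" for line in lines)


def send_pipelined(sock, lines):
    """Send every line with a single sendall call, without reading any replies."""
    payload = pipeline(lines)
    sock.sendall(payload)
    return len(payload)


# Hypothetical script: placeholder credentials, proposed cap name from this thread.
SCRIPT = [
    "CAP LS 302",
    "NICK kyle",
    "USER kyle 0 * Kyle",
    "CAP REQ :sasl oragono.io/abbreviated-handshake",
    "AUTHENTICATE PLAIN",
    "AUTHENTICATE <base64 payload>",
    "CAP END",
    "QUIT",
]
```

In practice `sock` would be a TLS-wrapped socket, for instance one returned by `ssl.create_default_context().wrap_socket(...)`. Whether servers other than Oragono accept the SASL payload before they have sent `AUTHENTICATE +` is not something this sketch establishes.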

cc @ProgVal

kylef commented 3 years ago

> @kylef is going to take a look at the potential impact of this.

These figures are best effort and may be slightly off: I didn't try to make this data accurate to the millisecond. Values are rounded by eye, and I did not compute milliseconds when the times are over 1 second. This is more to get a feel for the difference; if there is a particular aspect that we want to look at in more detail, I can get more specific and accurate data.

Oragono

Connecting to a very small Oragono instance, with a small number of channels and practically empty NAMES lists. The reconnect is completed with znc-playback, which contains no new messages from the prior session.

This generally results in receiving 72 messages totalling 8577 bytes (counting just the IRC messages, not their representation in TLS/TCP/IP). The client sends 15 messages, coming in at 584 bytes.

Capability Negotiation looks like the following, including a PLAIN SASL login.

2021-02-28 15:43:38.343491+0000 C: CAP LS 302
2021-02-28 15:43:38.344397+0000 C: NICK kyle
2021-02-28 15:43:38.345008+0000 C: USER kyle 0 * Kyle
2021-02-28 15:43:39.224672+0000 S: :irc.example.com CAP * LS * :account-notify account-tag away-notify batch cap-notify chghost draft/channel-rename draft/chathistory draft/event-playback draft/languages=16,en,~bs,~de,~el,~en-AU,~es,~fi,~fr-FR,~it,~nl,~no,~pl,~pt-BR,~ro,~tr-TR,~zh-CN draft/multiline=max-bytes=4096,max-lines=100 draft/register=before-connect draft/relaymsg=/ draft/resume-0.5 echo-message extended-join invite-notify labeled-response message-tags multi-prefix oragono.io/nope sasl=PLAIN,EXTERNAL server-time setname
2021-02-28 15:43:39.238604+0000 S: :irc.example.com CAP * LS :sts=duration=2592000,port=6697 userhost-in-names znc.in/playback znc.in/self-message
2021-02-28 15:43:39.244438+0000 C: CAP REQ :account-tag batch chghost echo-message invite-notify labeled-response message-tags multi-prefix sasl server-time userhost-in-names znc.in/playback znc.in/self-message
2021-02-28 15:43:40.993935+0000 S: @time=2021-02-28T15:43:40.596Z :irc.example.com CAP * ACK :account-tag batch chghost echo-message invite-notify labeled-response message-tags multi-prefix sasl server-time userhost-in-names znc.in/playback znc.in/self-message
2021-02-28 15:43:40.996027+0000 C: AUTHENTICATE ******************
2021-02-28 15:43:41.854455+0000 S: AUTHENTICATE +
2021-02-28 15:43:41.864019+0000 C: AUTHENTICATE ******************
2021-02-28 15:43:42.820749+0000 S: @time=2021-02-28T15:43:42.422Z :irc.example.com 900 * * kylef :You are now logged in as kylef
2021-02-28 15:43:42.823244+0000 S: @time=2021-02-28T15:43:42.422Z :irc.example.com 903 * :Authentication successful
2021-02-28 15:43:42.824470+0000 C: CAP END

| Connection | Connect Time | Capability Negotiation | Registration to fully synced |
|---|---|---|---|
| Edge | 7s | 4s | 4s |
| 3G | 1s | 1s | 1s |
| LTE | 790ms | 700ms | 550ms |
| DSL | 250ms | 500ms | 500ms |
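
The time the client spends waiting on each server reply can be read off a log in the format above. The following is a small sketch for doing that; the function names are made up, and `LOG_EXCERPT` copies four lines from the Oragono log.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"

LOG_EXCERPT = [
    "2021-02-28 15:43:40.996027+0000 C: AUTHENTICATE ******************",
    "2021-02-28 15:43:41.854455+0000 S: AUTHENTICATE +",
    "2021-02-28 15:43:41.864019+0000 C: AUTHENTICATE ******************",
    "2021-02-28 15:43:42.820749+0000 S: @time=2021-02-28T15:43:42.422Z :irc.example.com 900 * * kylef :You are now logged in as kylef",
]


def parse_log_line(line):
    """Split a log line into (timestamp, direction, message)."""
    date, clock, rest = line.split(" ", 2)
    direction, _, message = rest.partition(" ")
    stamp = datetime.strptime(f"{date} {clock}", LOG_TIME_FORMAT)
    return stamp, direction.rstrip(":"), message


def server_waits(lines):
    """Return (client message, seconds waited) for each client-to-server turnaround."""
    waits = []
    last_client = None
    for line in lines:
        stamp, direction, message = parse_log_line(line)
        if direction == "C":
            last_client = (stamp, message)
        elif direction == "S" and last_client is not None:
            sent_at, sent = last_client
            waits.append((sent, (stamp - sent_at).total_seconds()))
            last_client = None  # later server lines belong to the same reply
    return waits
```

On the excerpt, `server_waits(LOG_EXCERPT)` reports roughly 0.86 s and 0.96 s for the two AUTHENTICATE steps. Lines that are neither `C:` nor `S:` (such as a `Connecting` marker) are ignored.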

ZNC + Freenode + znc-playback

ZNC generally skips a back and forth with SASL because it uses a server password which the client is able to send without negotiating sasl, asking for plain, etc. This test also includes a cap-notify update after CAP END, because ZNC updates the client with some upstream server features once it knows which network it is connected to. It also includes the client sending a push notification identification for Palaver Push.

The ZNC test contains a lot more data because it is more "real world": I have joined larger channels with large NAMES lists for this test. Across the full session, we received 201 messages totalling 54277 bytes. We sent 20 messages totalling 767 bytes.

The handshake (prior to 001) looks something like the following:

2021-02-28 16:36:25.945505+0000 Connecting
2021-02-28 16:36:29.904985+0000 C: CAP LS 302
2021-02-28 16:36:29.905952+0000 C: PASS *************
2021-02-28 16:36:29.906527+0000 C: NICK kyle
2021-02-28 16:36:29.907093+0000 C: USER kyle 0 * Kyle
2021-02-28 16:36:30.799897+0000 S: :irc.znc.in CAP unknown-nick LS :batch cap-notify echo-message multi-prefix palaverapp.com server-time userhost-in-names znc.in/batch znc.in/playback znc.in/self-message znc.in/server-time-iso
2021-02-28 16:36:30.803899+0000 C: CAP REQ :batch multi-prefix palaverapp.com server-time userhost-in-names znc.in/playback znc.in/self-message
2021-02-28 16:36:32.934564+0000 S: @time=2021-02-28T16:36:32.526Z :irc.znc.in CAP kyle ACK :batch multi-prefix palaverapp.com server-time userhost-in-names znc.in/playback znc.in/self-message
2021-02-28 16:36:32.937400+0000 C: PALAVER IDENTIFY ***** **** *****
2021-02-28 16:36:32.938299+0000 C: CAP END
2021-02-28 16:36:34.753761+0000 S: @time=2021-02-28T16:36:34.345Z :irc.znc.in CAP _kylef_ NEW :account-notify away-notify extended-join
2021-02-28 16:36:34.803777+0000 S: @time=2021-02-28T16:07:30.014Z :moon.freenode.net 001 kyle :Welcome to the freenode Internet Relay Chat Network jjjjj

| Connection | Connect Time | Capability Negotiation | Registration to fully synced |
|---|---|---|---|
| Edge | 4s | 5s | 5s |
| 3G | 1s | 1s | 2.5s |
| LTE | 700ms | 1s | 1s |
| DSL | 500ms | 1s | 1s |

kylef commented 3 years ago

I'd also point out that this is for a single server and one connection. In the real world you may be connected to a few networks at once, which compete for networking resources.

grawity commented 3 years ago

> ZNC generally skips a back and forth with SASL because it uses a server password which the client is able to send without negotiating sasl, asking for plain, etc.

I kind of want to make a "RFC 4959 but for IRC" to cut out this one round-trip...
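
To make the saved round-trip concrete, here is a rough model that counts client "flights", meaning groups of lines sent before the client has to wait for the server. The one-line `AUTHENTICATE PLAIN <payload>` form is purely hypothetical syntax for an initial-response extension, not an existing IRC feature, and the model ignores TLS setup and server processing time.

```python
# Conventional SASL PLAIN: the client waits for a reply after each flight.
CONVENTIONAL_SASL = [
    ["CAP LS 302", "NICK kyle", "USER kyle 0 * Kyle"],
    ["CAP REQ :sasl"],
    ["AUTHENTICATE PLAIN"],             # server answers "AUTHENTICATE +"
    ["AUTHENTICATE <base64 payload>"],  # server answers 900 and 903
    ["CAP END"],
]

# Hypothetical initial-response variant: mechanism and payload in one line.
INITIAL_RESPONSE_SASL = [
    ["CAP LS 302", "NICK kyle", "USER kyle 0 * Kyle"],
    ["CAP REQ :sasl"],
    ["AUTHENTICATE PLAIN <base64 payload>"],  # hypothetical syntax
    ["CAP END"],
]


def round_trips(flights):
    """Number of times the client waits on the server."""
    return len(flights)


def estimated_seconds(flights, rtt_seconds):
    """Lower bound on handshake time once the connection is established."""
    return round_trips(flights) * rtt_seconds
```

With a 1 second round-trip time, the model gives 5 seconds for the conventional exchange and 4 seconds for the hypothetical variant.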