Implement QUIC server address validation

nhorman commented 4 months ago

Issue https://github.com/openssl/project/issues/302 depends on other issues, but contains no statement of work itself. Investigation is required to determine what, if any work is needed to complete that task.

This issue will potentially result in the creation of additional tracking isssues

### Tasks

Sashan commented 3 months ago

Client address validation is covered by section 8. In a nutshell: to validate client IP address server must either:

complete TLS handhake with client
or ask client to present a validation token generated by server (this is covered by section 8)

whenever client sends its first UDP packet to server, the server may reply with 'RETRY' packet. The retry packet contains a validation token. Client is supposed to send token back to server as a reply to retry packet. This way can server verofu c;oemt's IP address before TLS handshake completes.

The branch here is a proof of concept which generates retry packet with token. The remaining piece to finish is to add implementation of ossl_quic_verify_retry_integrity_token() (if I remember correct). Then the whole piece needs to be debugged.

My complexity estimate to finish it is 8.

nhorman commented 3 months ago

Outstanding task here is to to turn @Sashan poc branch into a pr, fininsh implementation of ossl_quic_verify_retry_integrity_token and get it reviewed and merged

Sashan commented 1 week ago

The current code handles QUIC initial packet inch_rx()` function. To simplify testing the function currently sends QUIC retry packet to every QUIC initial packet which is received without validation token. What I can currently see is the retry packet with validation token gets sent to client. Client receives the retry packet and sends intiial packet again. The new initial packet contains the token.

Unfortunately things are still not quire right at server. What happens there is the server accepts the connection prematurely: the blocking call to SSL_accept_connection() bails out upon reception of the first initial packet (the packet without validation token), this is wrong in my opinion. What should be happening instead is to let finish quic server validation of client's address (receive validation token back) and then let unblock waiting SSL_accept_connection().

The current story as I understand goes as follows:

server application calls to SSL_accept_connection()
the list of incoming channels (port::incoming_channel_list) we end up in ossl_quic_reactor_block_until_pred() where we wait for data from network.
once I/O subsystems signals data is ready things we call ossl_quic_reactor_tick() which calls down to qeng_tick().
the qeng_tick() further calls down to ossl_quic_port_subtick() which in turn reads data from network in port_rx_pre()
the execution will eventually end up in port_default_packet_handler() where new clients are handled.

the story above can be illustrated by stack as follows:

#0  port_on_new_conn (port=0x4b4b2fb5000, peer=0x4b4b2fa55a0, scid=0x702570bde9f8, dcid=0x4b4b2fa55a0, new_ch=0x0) at ssl/quic/quic_port.c:546
#1  0x000004b48d02afab in port_default_packet_handler (e=0x4b4b2fb5000, arg=0x4b4b2fa55a0, dcid=0x702570bde9f8) at ssl/quic/quic_port.c:695
#2  0x000004b48d0179fe in demux_process_pending_urxe (demux=0x4b4dcb64600, e=0x4b4b2fb5000) at ssl/quic/quic_demux.c:352
#3  0x000004b48d016d81 in demux_process_pending_urxl (demux=0x4b4dcb64600) at ssl/quic/quic_demux.c:370
#4  0x000004b48d016755 in ossl_quic_demux_pump (demux=0x4b4dcb64600) at ssl/quic/quic_demux.c:399
#5  0x000004b48d02aa51 in port_rx_pre (port=0x4b4b2fa55a0) at ssl/quic/quic_port.c:526
#6  0x000004b48d02a91b in ossl_quic_port_subtick (port=0x4b4b2fa55a0, res=0x702570bdeb48, flags=0) at ssl/quic/quic_port.c:486
#7  0x000004b48d01879f in qeng_tick (res=0x702570bdeba8, arg=0x4b4dcb57240, flags=0) at ssl/quic/quic_engine.c:191
#8  0x000004b48d02dbc9 in ossl_quic_reactor_tick (rtor=0x4b4dcb57268, flags=0) at ssl/quic/quic_reactor.c:146
#9  0x000004b48d02de10 in ossl_quic_reactor_block_until_pred (rtor=0x4b4dcb57268, pred=0x4b48d023430 <quic_accept_connection_wait>, pred_arg=0x4b4b2fa55a0, flags=0) at ssl/quic/quic_reactor.c:477
#10 0x000004b48d01da3f in block_until_pred (ctx=0x702570bdecd0, pred=0x4b48d023430 <quic_accept_connection_wait>, pred_arg=0x4b4b2fa55a0, flags=0) at ssl/quic/quic_impl.c:536
#11 0x000004b48d0232ec in ossl_quic_accept_connection (ssl=0x4b4cb680000, flags=0) at ssl/quic/quic_impl.c:4391
#12 0x000004b48cfdaafa in SSL_accept_connection (ssl=0x4b4cb680000, flags=0) at ssl/ssl_lib.c:7817
#13 0x000004b234fcb6ff in run_quic_server (ctx=0x4b4cb660700, fd=3) at server.c:165
#14 0x000004b234fcb3a5 in main (argc=4, argv=0x702570bdee48) at server.c:227

The function port_on_new_conn() creates a new channel for remote client. Chanel gets inserted to port::incoming_channel_list (the list which is consumed by accept).

Once code execution unwinds back to ossl_quic_port_subtick(), where we call ossl_quic_channel_subtick() on all channels bound to port including the newly created channel in port_on_new_conn(). The call to ossl_quic_channel_subtick() let each channel to process data from network and also puts data from local application to network.

Once all channels and ports are handled by tick, the call stack starts to unwind back to SSL_accept_connection(), where new channel is ready making SSL_accept_connection() ready to return to caller. This is not desired.

I think port_default_packet_handler() needs to be extended such it makes a decision whether packet should create a new channel or if we need to validate client first. If channel gets created after we successfully validate client here, then we get desired behavior with no further changes to existing code.

Another option is to let channel itself to decide if client should be validated. But it also needs some more thought:

currently we create channel for every unknown client (goes against idea of validation where we don't want to create any resources before we are sure we are not acting on behalf of spoofed packet)
channels created must be marked as validation pending, those channels must not be accepted by application

Perhaps port can get dedicated channel to perform client validation, there will be just one such channel. As soon as valid token is seen by on the channel, the channel for client gets created. The newly created channel is inserted to incoming_channel_list so it can get consumed by accept.

Sashan commented 6 days ago

code below comes from port_default_packet_handler(). It implements the client validation done by server:

    /*
     * We only care about Initial packets which might be trying to establish a
     * connection.
     */
    if (hdr.type != QUIC_PKT_TYPE_INITIAL)
        goto undesirable;

    /*
     * TODO: there should be some logic similar to accounting half-open
     * states in TCP. If we reach certain threshold, then we want to
     * validate clients.
     */
#define VALIDATE_CLIENT 1
    if (VALIDATE_CLIENT) {
        if (hdr.token == NULL) {
            port_send_retry(port, &e->peer, &hdr);
            goto undesirable;
        } else if (port_validate_token(&hdr) == 0) {
            goto undesirable;
        } else {
            port_bind_channel(port, &e->peer, &hdr.src_conn_id, &hdr.dst_conn_id,
                              &new_ch);
        }
    } else if (hdr.token == NULL || port_validate_token(&hdr) == 1) {
        /*
         * client validation is optional. However if client presents
         * token, then the token must be valid.
         */
        if (hdr.token != NULL) {
            port_bind_channel(port, &e->peer, &hdr.src_conn_id, &hdr.dst_conn_id,
                              &new_ch);
        } else {
            /*
             * Try to process this as a valid attempt to initiate a connection.
             */
            port_on_new_conn(port, &e->peer, &hdr.src_conn_id, &hdr.dst_conn_id,
                             &new_ch);
        }
    }

    /*
     * The channel will do all the LCID registration needed, but as an
     * optimization inject this packet directly into the channel's QRX for
     * processing without going through the DEMUX again.
     */
    if (new_ch != NULL) {
        ossl_qrx_inject_urxe(new_ch->qrx, e);
        return;
    }

undesirable:
    ossl_quic_demux_release_urxe(port->demux, e);

It is desirable that port_send_retry() does not keep any state/resource on server. It does two things:

generates validation token which client must send back with next initial packet
must generate unused local connection ID (lcid) the client will use
perhaps the token should contain DCID we got from client (odcid)

as soon as server successfully validates token from client we must create a new channel and bind connection id to it. Unlike existing case where server calls to port_on_new_conn() where channel and server's connection ID get created at once here we use a port_bind_channel() which creates channel and binds it to pre-existing connection id which got generated with validation token. in this step server also uses connection id (odcid) retreived from token.

if connection id we hope to use for the channel becomes unaviable, then we can not proceed. server must discard such initial packet. server also may send a connection reset as a response. The reset makes client to fail faster instead of waiting for time out to elapse.

Sashan commented 4 days ago

There is a PR https://github.com/openssl/openssl/pull/25842 with changes as found in branch.

nhorman commented 54 minutes ago

Starting to take a look at this while @Sashan is off:

Current notes: 1) updated the quic feature branch and this pr locally to pull in the sslkeylog file updates from master 2) Took some tcpdumps and decoded them to visualize an exchange a) initial packet is sent properly without retry token b) server sends back retry packet with retry token and new scid c) client sends new initial packet with same retry token and dcid set to new scid as per rfc 9000 17.2.5.2 d) server validates the token and accepts the connection e) note: retry token is listed in tcpdump as 21 bytes long, but token should just be 12 bytes long (need to investigate) f ) server, after accepting connection, blocks in quic_do_handshake, waiting on predicate to complete, but it never does

openssl / project

Implement QUIC server address validation #715