ngtcp2 / ngtcp2

ngtcp2 project is an effort to implement IETF QUIC protocol
https://nghttp2.org/ngtcp2/
MIT License

Low Transfer Speed with ngtcp2 QUIC Client Compared to quic-go Proxy Server #1029

Closed RuiSaka closed 9 months ago

RuiSaka commented 10 months ago

Issue Description

I'm facing a transfer speed problem while using ngtcp2 to develop a QUIC client. Specifically, I have set up a quic-go proxy server in my local LAN, and it seems that my ngtcp2 client is experiencing performance issues during transmission.

Environment Information

Configuration of ngtcp2

    ngtcp2_settings_default(&settings);
    settings.handshake_timeout = 5 * NGTCP2_SECONDS;
    settings.initial_rtt = NGTCP2_DEFAULT_INITIAL_RTT;
    settings.max_window = 1024 * 1024 * 1024;
    settings.max_stream_window = 1024 * 1024 * 1024;
    settings.cc_algo = NGTCP2_CC_ALGO_BBR;

    ngtcp2_transport_params_default(&params);
    params.initial_max_stream_data_bidi_local = 1024 * 128;
    params.initial_max_stream_data_bidi_remote = 1024 * 128;
    params.initial_max_stream_data_uni = 1024 * 128;
    params.initial_max_data = 1024 * 1024;
    params.initial_max_streams_uni = 1024;
    params.initial_max_streams_bidi = 1024;
    params.max_datagram_frame_size = 1200;
    params.max_idle_timeout = 30 * NGTCP2_SECONDS;
    params.active_connection_id_limit = 7;
    params.grease_quic_bit = 1;

    // Read path. The socket receive buffer is 65535 bytes.
    int on_read() {
        ngtcp2_pkt_info pi = {0};
        for (;;) {
            struct sockaddr_in sockaddr4;
            socklen_t sockaddr4len = sizeof(sockaddr4);

            // Byte buffer large enough for one UDP datagram.
            uint8_t buf[1024 * 64];
            ssize_t res = recvfrom(socket4FD, buf, sizeof(buf), 0, (struct sockaddr *)&sockaddr4, &sockaddr4len);

            if (res == -1) {
                if (errno != EAGAIN && errno != EWOULDBLOCK) {
                    return -1;
                }
                // No more datagrams are queued on the socket.
                break;
            }

            // Feed the received datagram to ngtcp2.
            if (feedData(buf, (size_t)res, &pi) != 0) {
                return -1;
            }
        }
        updateTimer();
        return 0;
    }

    int feedData(...) {
        int rv = ngtcp2_conn_read_pkt(conn, &path, pi, data, dataLen, lnev_ngtcp2_timestamp());
        if (rv != 0) {
            return -1;
        }
        return 0;
    }

    void updateTimer()
    {
        ngtcp2_tstamp expiry = ngtcp2_conn_get_expiry(conn);
        uint64_t now = lnev_ngtcp2_timestamp();

        // The timer is already due: handle it right away.
        if (expiry < now) {
            ngtcp2Timeout();
            return;
        }

        // Convert the remaining time from nanoseconds to seconds.
        double offset = (double)(expiry - now) / NGTCP2_SECONDS;
        restartNgtcp2Timeout(offset);
    }

    void ngtcp2Timeout()
    {
        int rv = handleExpiry();
        if (rv != 0) {
            return;
        }

        on_write();
    }

    int handleExpiry()
    {
        uint64_t now = lnev_ngtcp2_timestamp();
        int rv = ngtcp2_conn_handle_expiry(conn, now);
        if (rv != 0) {
            ngtcp2_ccerr_set_liberr(&last_error, rv, NULL, 0);
            closeQuic(err);
            return -1;
        }
        return 0;
    }

    // Write path.
    int on_write() {
        // deal with write block
        ...

        // write stream
        int rv = write_stream();
        if (rv != 0) {
            return rv;
        }

        updateTimer();
        return 0;
    }

    int write_stream() {
        for (;;) {
            // Similar to the sending method in client.cc
        }
    }
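
One detail of the updateTimer logic above is the conversion from ngtcp2's timestamps to a timer delay. The sketch below isolates that arithmetic so it can be checked on its own. The helper name `delay_until_expiry` and the constant `NS_PER_SECOND` are hypothetical. It assumes timestamps are in nanoseconds (matching `NGTCP2_SECONDS`) and that an expiry of `UINT64_MAX` means no timer is armed.

```c
#include <stdint.h>

/* Assumed to match NGTCP2_SECONDS: timestamps are nanosecond ticks. */
#define NS_PER_SECOND 1000000000ULL

/* Hypothetical helper: convert an expiry timestamp into a delay in seconds.
 * Returns a negative value when no timer is armed (expiry == UINT64_MAX,
 * an assumption about the library's sentinel), 0.0 when the timer is
 * already due, and the remaining time otherwise. */
double delay_until_expiry(uint64_t expiry_ns, uint64_t now_ns) {
    if (expiry_ns == UINT64_MAX) {
        return -1.0; /* nothing to schedule */
    }
    if (expiry_ns <= now_ns) {
        return 0.0; /* due now; avoids unsigned underflow in the subtraction */
    }
    return (double)(expiry_ns - now_ns) / (double)NS_PER_SECOND;
}
```

When the delay is 0.0, scheduling the timer to fire immediately from the event loop, rather than calling the timeout handler directly, avoids re-entering on_write from inside updateTimer, since on_write itself ends by calling updateTimer.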

Expected Behavior

The download speed should reach the link bandwidth (1000 Mb/s).

Actual Behavior

The download speed is only 100 Mb/s.

Attempted Solutions

I tried calling on_write after every call to feedData so that ACKs are sent more frequently. However, this also prevents ACKs from being consolidated, which makes the transfer slower.

I also attempted to call on_write before the updateTimer function within the on_read function, but it did not have the intended effect.

I observed a significant number of lost packets in quic-go's qlog, and I believe this might be what prevents the receive window from continuing to grow. However, the specific reason for the high number of lost packets is still unknown.

Additional Information

The qlog files are quite large. Let me take some screenshots to show you.

[Screenshots: 2023-12-04 16.09.31, 16.09.59, 16.10.17, 16.09.01]

I don't understand why the ACKs are sent in fragments. I feed data to ngtcp2 in real time, so shouldn't each ACK cover a complete range? Also, I'm testing on a local network, where UDP packet loss should be minimal. When, and how often, should I send ACKs to achieve the best speed?
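
One common pattern for the ACK question is to read every datagram that is already queued on the socket and only then run the write path once, so that a single ACK can cover the whole batch. The sketch below shows only the control flow. `drain_then_flush`, `feed_fn`, `flush_fn`, and the counting callbacks are hypothetical names; `feed` and `flush` stand in for the application's own feedData and on_write. It is not taken from ngtcp2 and makes no claim about the optimal ACK frequency.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical callback types standing in for feedData() and on_write(). */
typedef int (*feed_fn)(const uint8_t *data, size_t len, void *user);
typedef int (*flush_fn)(void *user);

/* Read every datagram currently queued on fd without blocking, hand each
 * one to feed, then call flush exactly once.
 * Returns the number of datagrams processed, or -1 on error. */
int drain_then_flush(int fd, feed_fn feed, flush_fn flush, void *user) {
    uint8_t buf[64 * 1024];
    int count = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) break; /* queue empty */
            if (errno == EINTR) continue;                       /* retry */
            return -1;
        }
        if (feed(buf, (size_t)n, user) != 0) return -1;
        ++count;
    }
    /* One write pass for the whole batch. */
    if (flush(user) != 0) return -1;
    return count;
}

/* Example callbacks that only count how often they are called. */
struct counters { int fed; int flushed; };

int count_feed(const uint8_t *data, size_t len, void *user) {
    (void)data; (void)len;
    ((struct counters *)user)->fed++;
    return 0;
}

int count_flush(void *user) {
    ((struct counters *)user)->flushed++;
    return 0;
}
```

With three datagrams queued, `drain_then_flush` calls `feed` three times and `flush` once, in contrast to calling on_write after every feedData.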

Thanks for your help!

tatsuhiro-t commented 10 months ago

Packet losses happen because of UDP buffer overflow. You need to at least set net.core.rmem_default and net.core.rmem_max to higher values. quic-go might be sending packets too quickly or too aggressively; I don't know.
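
The two sysctls above are system-wide Linux settings. An application can also ask for a larger buffer on its own socket with `SO_RCVBUF`. The following is a minimal sketch of that per-socket request; the helper name `request_rcvbuf` is hypothetical. On Linux the request is capped by net.core.rmem_max, and the value read back is double the effective request because the kernel adds room for bookkeeping overhead. Limits and behavior on other platforms differ.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: ask the kernel for a receive buffer of
 * requested_bytes on socket fd and return the size the kernel reports
 * afterwards, or -1 on error. The reported size can differ from the
 * request (capped by system limits, doubled on Linux). */
int request_rcvbuf(int fd, int requested_bytes) {
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested_bytes, sizeof(requested_bytes)) != 0) {
        return -1;
    }
    int granted = 0;
    socklen_t len = sizeof(granted);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0) {
        return -1;
    }
    return granted;
}
```

Comparing the returned value with the requested one shows whether the system limit silently reduced the buffer.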

tatsuhiro-t commented 10 months ago

Ah, so you are using iOS. I have no idea how to tune UDP buffers on those Apple proprietary platforms.

RuiSaka commented 10 months ago

Thank you for your reply. I increased the UDP receive buffer, and the speed rose to 120 Mb/s with no lost packets. However, the speed still doesn't reach the expected level. quic-go does not seem to be the issue, as I tested it with other QUIC clients and didn't encounter any problems. I'm not sure whether there's an issue with my program. According to the congestion control graph, ACK responses drop suddenly at a certain point. Could you provide some optimization insights?

One more point, my program is asynchronous. The receive socket is on one thread, and processing data using ngtcp2 is on another thread.

[Screenshot: 2023-12-04 22.26.38]

Thanks for your help!

tatsuhiro-t commented 10 months ago

ngtcp2_conn is not thread safe. You must use locks if you are manipulating ngtcp2_conn from more than one thread at the same time.

RuiSaka commented 10 months ago

Thank you for your reply. I am aware that ngtcp2 is not thread-safe, so all operations related to ngtcp2 are performed within the same thread.

tatsuhiro-t commented 10 months ago

What happens if the unmodified ngtcp2 examples/client is used against the proxy server?

RuiSaka commented 10 months ago

I also tried using libev directly as the event-driven mechanism for the socket. All operations (socket and ngtcp2) are guaranteed to run in the same thread. However, no matter how much I enlarge the receive buffer, packets are still lost. I have carefully reviewed and verified all read, write, and update timer operations, which are the same as in the client.

RuiSaka commented 10 months ago

Is there a potential issue with my timers? Every time I invoke updateTimer, timeouts occur, and this triggers the invocation of on_write.

    static uint64_t ln_ngtcp2_timestamp(void) {
        struct timespec tp;
        if (clock_gettime(CLOCK_MONOTONIC, &tp) != 0) {
            DDLogWarn(@"clock_gettime: %s", strerror(errno));
            return 0;
        }
        return (uint64_t)tp.tv_sec * NGTCP2_SECONDS + (uint64_t)tp.tv_nsec;
    }

tatsuhiro-t commented 10 months ago

So what happens if the unmodified ngtcp2 examples/client is used against the proxy server?

RuiSaka commented 10 months ago

Because the proxy server does not involve HTTP/3, I cannot use examples/client directly and have to use a modified version. I am trying to keep the test program as close as possible to the client's code, and then I will run the test again.

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.