Closed: AmarOk1412 closed this 1 year ago
First, I have to note that there are some differences between your codebase and ours: https://github.com/savoirfairelinux/pjproject/blob/master/pjlib/src/pj/sock_bsd.c#L591 and https://github.com/pjsip/pjproject/blob/master/pjlib/src/pj/sock_bsd.c#L577
And here's what ChatGPT says about TCP_NODELAY:
"
Implications of Using TCP_NODELAY:
Reduced Delay: The primary implication of using TCP_NODELAY is that it reduces the delay in sending data over a TCP connection. By disabling Nagle's algorithm, which is responsible for aggregating small outgoing packets into larger ones to improve network efficiency, TCP_NODELAY ensures that data is sent immediately without waiting for more data to accumulate. This can be beneficial for applications where low latency is crucial, such as real-time communication protocols (e.g., VoIP, online gaming) and interactive applications.
Predictable Timing: With TCP_NODELAY, you have more control over when data is sent. This predictability can be essential in scenarios where precise timing is required, such as multimedia streaming or control systems.
Potential Downsides of Using TCP_NODELAY:
Increased Network Traffic: One of the main downsides of disabling Nagle's algorithm is that it can lead to increased network traffic. By sending smaller packets more frequently, you may saturate the network with smaller messages, which can be less efficient than sending larger packets when appropriate. This can potentially impact overall network performance and consume more bandwidth.
Higher CPU Utilization: Disabling Nagle's algorithm may lead to increased CPU utilization, as it can result in more frequent packet processing and context switches. This additional processing overhead may become noticeable in high-throughput scenarios, especially on systems with limited CPU resources.
Potential for Packet Fragmentation: Sending small packets without aggregation can increase the chances of packet fragmentation, particularly when dealing with networks with a lower Maximum Transmission Unit (MTU). Fragmentation can lead to additional processing overhead and potentially introduce latency.
Inefficient for Bulk Data Transfer: For applications that primarily transfer large amounts of data in a streaming fashion, disabling Nagle's algorithm and using TCP_NODELAY may not be beneficial. In such cases, the efficiency gained from aggregating data into larger packets can outweigh the latency reduction.
Risk of Congestion: Sending data without considering network congestion control mechanisms can result in congestion-related problems. TCP congestion control algorithms are designed to prevent network congestion by adjusting the sending rate. By disabling Nagle's algorithm, you may bypass these congestion control mechanisms, potentially causing network congestion and packet loss.
In summary, using TCP_NODELAY can be advantageous when low latency is critical, but it should be employed judiciously and with a good understanding of the trade-offs. "
Since it only seems to affect a very specific case, i.e. TLS, and more specifically GnuTLS, I wonder if the fix would be better put in pj_ssl_sock_param instead (i.e. add pj_ssl_sock_param.sockopt_nodelay, which defaults to PJ_TRUE #if (PJ_SSL_SOCK_IMP == PJ_SSL_SOCK_IMP_GNUTLS)).
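For reference, here is a minimal sketch of what setting this option amounts to at the plain BSD socket level. It is not pjlib code and not the proposed patch: the helper names set_tcp_nodelay and get_tcp_nodelay are hypothetical, and it assumes a POSIX system where TCP_NODELAY is exposed through <netinet/tcp.h>.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable (non-zero) or disable (0) TCP_NODELAY,
 * i.e. turn Nagle's algorithm off or on for this TCP socket.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
static int set_tcp_nodelay(int fd, int enable)
{
    int flag = enable ? 1 : 0;
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Hypothetical helper: read the option back.
 * Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error.
 * The value is normalized because some systems report a non-zero
 * flag value other than 1. */
static int get_tcp_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);
    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0)
        return -1;
    return flag != 0;
}
```

A per-socket setting like the sockopt_nodelay field suggested above would presumably end up in a call of this kind right after the TCP socket is created, so only TLS sockets are affected rather than every TCP socket.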
Our stack is indeed really different (ICE over TCP is supported), but the minimal code is just ioqueue send/select. The other flags may or may not interest you; I didn't open any ticket about other flags ;) (https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/ may interest you for other flags, but it's not related to this ticket). The thing is, pjsip supports SSL sockets, and in this use case others may have the same issue without our stack.
And it's indeed the case only with GnuTLS (as far as we tested), so I definitely agree with your last point.
So will you create a patch for this or do you want us to do so?
Update: I have created a patch in #3708
Describe the bug
Build pjsip 2.13 with GnuTLS for SSL sockets and use any app to send/write data over TCP.
The polling gets random 43ms delays.
Steps to reproduce
Build pjsip 2.13 with GnuTLS for SSL sockets and use any app to send/write data over TCP.
The polling gets random 43ms delays.
PJSIP version
2.13
Context
Latest version of gnutls/pjsip. This is a log from a simple ping/pong:
Log, call stack, etc