Open fy2462 opened 2 years ago
Hi guys, I have a problem that is worrying me. I send protobuf bytes with s2n-quic, but I cannot parse the protobuf on the receiving side when the payload is too large. Is there a good solution for encoding and decoding the frame data?
Thanks, all of you.
Application-level framing isn't really in scope of s2n-quic.
If you're sending multiple protobuf messages on a single stream, I would recommend looking at tokio_util::codec
to implement a solution. All streams in s2n-quic implement the required traits to easily integrate with that crate.
You can also open a stream per message and buffer in the application until the stream is finished and decode the message at the end.
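For example, here is a minimal, untested sketch of length-delimited framing over an s2n-quic bidirectional stream. It assumes the tokio-util crate with its codec feature plus the futures crate; the protobuf decode step is left as a comment and the echo logic is only illustrative:

    use futures::{SinkExt, StreamExt};
    use s2n_quic::stream::BidirectionalStream;
    use tokio_util::codec::{Framed, LengthDelimitedCodec};

    // Wrap the stream so every message is prefixed with a u32 length,
    // and each read yields exactly one complete message.
    async fn echo_frames(stream: BidirectionalStream) -> Result<(), Box<dyn std::error::Error>> {
        let mut framed = Framed::new(stream, LengthDelimitedCodec::new());

        while let Some(frame) = framed.next().await {
            let bytes = frame?;
            // Decode your protobuf here, e.g. with prost:
            // let msg = MyMessage::decode(&bytes[..])?;
            framed.send(bytes.freeze()).await?;
        }
        Ok(())
    }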
Thanks, @camshaft. I have encoded and decoded with tokio_util now.
Another problem:
I call open_bidirectional_stream on the client side and accept_bidirectional_stream on the server side, then transfer the video stream from client to server over that single stream. But I get high latency, and the latency grows over time. How can I resolve this? Do I need to call open_bidirectional_stream for every frame? I want to keep the frames relatively ordered.
Do you have any good ideas for this case?
Do I need to set the builder limits on both sides? Do you have a recommended set of limit parameters for my case?
Without seeing the code I can really only speculate what the source of the slowdown is. Did you compile in release mode? What network are you testing on? Is the sender sending fast enough? Is the receiver reading fast enough? Did you try profiling the endpoints? Are you limited by CPU?
Thanks for your response @camshaft, and sorry for the lack of detail:
I have tried both debug and release builds, and the behavior is almost the same. The transmission is OK when the network latency is < 20ms, so I don't think it is a send or receive problem, and we are not limited by CPU.
But when I add latency with sudo tc qdisc add dev wlp0s20f3 root netem delay 40ms, I can see the latency to the server keeps going up. The data bitrate is 2.5Mbps.
I set a timestamp in the client data; the server returns the timestamp to the client, and the client prints the time difference.
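Roughly, the measurement looks like this (a simplified sketch; the function names are illustrative and it assumes the log crate):

    use std::time::{SystemTime, UNIX_EPOCH};

    // Client side: stamp each outgoing frame with the current wall-clock time.
    fn now_millis() -> u128 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before the unix epoch")
            .as_millis()
    }

    // Client side: when the server echoes the timestamp back, log the difference.
    fn log_round_trip(echoed_millis: u128) {
        let elapsed = now_millis().saturating_sub(echoed_millis);
        log::warn!("to_server_rtt: {}ms,", elapsed);
    }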
The client log output:
[WARN] 2022-06-24T11:32:32.491 to_server_rtt: 344ms,
[WARN] 2022-06-24T11:32:35.062 to_server_rtt: 903ms,
[WARN] 2022-06-24T11:32:37.647 to_server_rtt: 1486ms,
[WARN] 2022-06-24T11:32:40.174 to_server_rtt: 2014ms,
[WARN] 2022-06-24T11:32:42.790 to_server_rtt: 2610ms,
[WARN] 2022-06-24T11:32:45.322 to_server_rtt: 3148ms,
[WARN] 2022-06-24T11:32:48.090 to_server_rtt: 3894ms,
[WARN] 2022-06-24T11:32:50.841 to_server_rtt: 4649ms,
[WARN] 2022-06-24T11:32:53.431 to_server_rtt: 5245ms,
[WARN] 2022-06-24T11:32:56.034 to_server_rtt: 5829ms,
[WARN] 2022-06-24T11:32:58.809 to_server_rtt: 6595ms,
[WARN] 2022-06-24T11:33:01.468 to_server_rtt: 7247ms,
The ping output on the client-side:
64 bytes from 10.10.82.240: icmp_seq=5 ttl=63 time=44.8 ms
64 bytes from 10.10.82.240: icmp_seq=6 ttl=63 time=43.3 ms
64 bytes from 10.10.82.240: icmp_seq=7 ttl=63 time=49.6 ms
64 bytes from 10.10.82.240: icmp_seq=8 ttl=63 time=43.2 ms
64 bytes from 10.10.82.240: icmp_seq=9 ttl=63 time=46.5 ms
64 bytes from 10.10.82.240: icmp_seq=10 ttl=63 time=44.9 ms
64 bytes from 10.10.82.240: icmp_seq=11 ttl=63 time=42.9 ms
I set some limits on both sides, but it still doesn't work.
const ACK_LATENCY: u64 = 0;
const MAX_HANDSHAKE_DURATION: u64 = 3;
const MAX_ACK_RANGES: u8 = 100;
const CONN_TIMEOUT: u64 = 5;
const KEEP_ALIVE_PERIOD: u64 = 2;
Client code:
pub async fn new_for_client_conn(
    server_addr: SocketAddr,
    local_addr: SocketAddr,
) -> ResultType<BidirectionalStream> {
    let io = IoBuilder::default()
        .with_receive_address(local_addr)?
        .build()?;

    // Chain the with_* calls: each one returns the updated Limits.
    let limits = Limits::new()
        .with_max_ack_delay(Duration::from_millis(ACK_LATENCY))
        .expect("set max ack delay failed")
        .with_max_ack_ranges(MAX_ACK_RANGES)
        .expect("set max ack ranges failed")
        .with_max_handshake_duration(Duration::from_secs(MAX_HANDSHAKE_DURATION))
        .expect("set max handshake duration failed")
        .with_max_idle_timeout(Duration::from_secs(CONN_TIMEOUT))
        .expect("set max idle timeout failed")
        .with_max_keep_alive_period(Duration::from_secs(KEEP_ALIVE_PERIOD))
        .expect("set max keep alive period failed");

    let client = Client::builder()
        .with_tls(Path::new(CERT.cert_pom))?
        .with_limits(limits)?
        .with_io(io)?
        .start()
        .unwrap();

    let connect = Connect::new(server_addr).with_server_name("localhost");
    let mut connection = client.connect(connect).await?;
    connection.keep_alive(true)?;
    let stream = connection.open_bidirectional_stream().await?;
    Ok(stream)
}
Server code:
pub fn new_server(bind_addr: SocketAddr) -> ResultType<Server> {
    let io = IoBuilder::default()
        .with_receive_address(bind_addr)?
        .build()?;

    // Chain the with_* calls: each one returns the updated Limits.
    let limits = Limits::new()
        .with_max_ack_delay(Duration::from_millis(ACK_LATENCY))
        .expect("set max ack delay failed")
        .with_max_ack_ranges(MAX_ACK_RANGES)
        .expect("set max ack ranges failed")
        .with_max_handshake_duration(Duration::from_secs(MAX_HANDSHAKE_DURATION))
        .expect("set max handshake duration failed")
        .with_max_idle_timeout(Duration::from_secs(CONN_TIMEOUT))
        .expect("set max idle timeout failed")
        .with_max_keep_alive_period(Duration::from_secs(KEEP_ALIVE_PERIOD))
        .expect("set max keep alive period failed");

    let server = Server::builder()
        .with_tls((Path::new(CERT.cert_pom), Path::new(CERT.key_pom)))?
        .with_limits(limits)?
        .with_io(io)?
        .start()
        .unwrap();
    Ok(server)
}

// Accept loop (elsewhere, e.g. in a select! arm):
// Some(mut new_conn) = server.accept() => {
//     let client_addr = new_conn.remote_addr()?;
//     tokio::spawn(async move {
//         while let Ok(Some(mut stream)) = new_conn.accept_bidirectional_stream().await {
//             tokio::spawn(async move {
//                 while let Ok(Some(data)) = stream.receive().await {
//                     // .....
//                     stream.send(data).await.expect("stream should be open");
//                 }
//             });
//         }
//     });
// }
PS:
What am I missing in my code? Do I need to change my limits? Or is there a monitoring tool that can analyze s2n-quic latency data?
Any ideas about this case, guys? @camshaft @WesleyRosenblum @toidiu Thanks.
The code you shared isn't reproducible so I'm not sure what issue you're seeing. If you make a repo for the minimal amount of code it takes to reproduce this, we can take a look when we get time.
@camshaft Got it, I will create a test repo to reproduce it. Thanks.
@camshaft Sorry for the late response; I have been busy with work. I have written a demo to reproduce the case:
https://github.com/fy2462/s2n_quic_latency_reproducer
You can find the details by following the README.
I don't know why there is such high latency in the code; please tell me the root cause if you have time to test it. Thanks very much.
I am waiting for your response. Thanks again.
Thanks for creating the repo. We will add it to our list of things to do and follow up after investigating it.
In the meantime, I believe you're running into buffer bloat issues. Basically, if you're producing data faster than the network can carry it, your stream buffers are going to grow and grow until you hit the buffer limits, and then s2n-quic will start applying backpressure on the sender. If you want to avoid this behavior, you can call flush on the stream after you send a message; this makes sure the peer has received the data before you send the next one. It's possible that this isn't what is happening, but looking at the initial results, that's my hunch.
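Roughly, the pattern looks like this (a minimal sketch; the function is only illustrative and assumes the bytes crate and a bidirectional stream handle):

    use bytes::Bytes;
    use s2n_quic::stream::BidirectionalStream;

    // Send one message and wait for the peer to receive it before returning,
    // so frames don't pile up in the local send buffer.
    async fn send_frame(
        stream: &mut BidirectionalStream,
        frame: Bytes,
    ) -> Result<(), s2n_quic::stream::Error> {
        stream.send(frame).await?;
        // Per the suggestion above, flush waits for the data to be delivered,
        // which keeps the producer from running ahead of the network.
        stream.flush().await?;
        Ok(())
    }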
Thanks a lot, @colmmacc. I think the bitrate is only < 5MBps in my demo. Why is the latency low and stable when I use a TCP connection?
More questions:
Hi @camshaft, any progress on testing my code? Thanks.
Hi @fy2462,
Looking into this is still something we'd like to do, but we haven't gotten to it yet. After we investigate this further we'll let you know.
Thanks!
Got it, thanks @goatgoose. BTW, I have set some s2n-quic limits in my code; you can find them here, but it doesn't help.
Hi @goatgoose, have you had any time to look into this? I really want to know why this happens. Thank you very much again.
@goatgoose @camshaft Hi guys,
s2n-quic is a great project with an easy API to use, and I very much hope to use it in my production project. But issue response and triage in the community seems a little slow; I understand you may be busy with your own work, but it is not good for the project's growth.
Anyway, I hope s2n-quic will become the top QUIC project. Thanks for your work; looking forward to your response.
As path latency increases, your buffering limits also need to increase. You can try to apply the same limits as the perf server.
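For a rough sense of the sizing involved, here is a hedged sketch based on the bandwidth-delay product of the path; the bandwidth/RTT constants below are illustrative assumptions, not the perf server's actual values:

    use s2n_quic::provider::limits::Limits;

    // Illustrative assumption: ~10 Mbit/s of video over a path with ~400 ms RTT.
    const EXPECTED_BANDWIDTH_BYTES_PER_SEC: u64 = 10_000_000 / 8;
    const EXPECTED_RTT_MS: u64 = 400;
    // Bandwidth-delay product: roughly how much data can be in flight at once.
    const BDP_BYTES: u64 = EXPECTED_BANDWIDTH_BYTES_PER_SEC * EXPECTED_RTT_MS / 1000;

    fn window_limits() -> Limits {
        Limits::new()
            // Connection-wide flow control window.
            .with_data_window(2 * BDP_BYTES)
            .expect("set data window failed")
            // Per-stream windows for locally/remotely initiated bidirectional streams.
            // Keep these within the 32-bit range, or connection setup will panic
            // (see the "Receive window must not exceed 32bit range" panic below).
            .with_bidirectional_local_data_window(BDP_BYTES)
            .expect("set local bidirectional data window failed")
            .with_bidirectional_remote_data_window(BDP_BYTES)
            .expect("set remote bidirectional data window failed")
    }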
@camshaft Thanks very much, I will try it.
@camshaft I set the limits following the example, but a panic happened when the client connects to the server: thread 'tokio-runtime-worker' panicked at 'Receive window must not exceed 32bit range'.
So don't make the value so high, then. There's no way you're going to get 5000Gbps over a network with a 400ms RTT.
@camshaft Got it, thanks. So I think this comment should say Mbits/s: https://github.com/aws/s2n-quic/blob/e5ae19ff115a09d29a8b041728393df85f0b69ea/quic/s2n-quic-qns/src/perf.rs#L102
@camshaft Thanks for taking the time to respond. I tried changing the limits as shown here: https://github.com/fy2462/s2n_quic_latency_reproducer/blob/2dac5a513afdadf65ab0ea9ed740019f1abbbc07/hbb_common/src/quic.rs#L170
But the demo still performs badly after I mock the network latency. Could you help by testing the demo on your own PC and finding out why the performance is bad? Skip it if you don't have time; thanks again.
PS: I mock the latency with sudo tc qdisc add dev eth0 root netem delay 50ms on the demo client side.
@fy2462 - What is your bandwidth/latency as measured by iperf (https://github.com/esnet/iperf) when you have that netem delay set?
netem delay 50ms adds a delay to every packet going out of that interface.
I am going to assume that the path MTU of that interface is the default of 1500 bytes. This means it's going to take about 118 packets to deliver a 177k buffer (the size of image.png).
I'm not sure where the netem filter actually applies its delays, but it could be sending every packet serially, delaying each one by 50ms. If that is the case, it would take about 5.9 seconds to complete delivery of that 177k image.png, and your transfer rate would be roughly 240Kbps, assuming everything else was instantaneous.
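As a quick sanity check, here is the arithmetic above as a standalone sketch (the throughput figure assumes fully serialized 50ms delays):

    fn main() {
        let image_bytes = 177_000.0_f64; // size of image.png
        let mtu = 1_500.0_f64; // assumed path MTU
        let per_packet_delay_s = 0.050_f64; // netem delay, worst case fully serialized

        let packets = (image_bytes / mtu).ceil(); // ~118 packets
        let transfer_time_s = packets * per_packet_delay_s; // ~5.9 seconds
        let throughput_kbps = image_bytes * 8.0 / transfer_time_s / 1_000.0; // ~240 Kbps

        println!("{packets} packets, {transfer_time_s:.1} s, ~{throughput_kbps:.0} Kbps");
    }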
In short, I don't think you're actually hitting the latency of s2n-quic here. If you want to test the overall bandwidth of your server, I recommend spinning up a couple hundred clients in parallel; then you could get an idea of the overall throughput of the server, even with a laggy network.
@fy2462 - Have you run into any other troubles with performance?