sctplab / usrsctp

A portable SCTP userland stack
BSD 3-Clause "New" or "Revised" License

TTL partial reliability does not function correctly #193

Open stfl opened 6 years ago

stfl commented 6 years ago

I am investigating SCTP using one-way delay measurements.

When I assign TTL partial reliability with a deadline of 140 ms, I still receive packets with a delay of more than 800 ms.

[plot: ttd_time_cmt_040ms_5 0_1] (Here I am using CMT, but the same results occur when using regular SCTP with only a backup path.)

Should usrsctp not prevent the sending of packets that have already been in the queue for too long? Is the PR deadline only evaluated during SACK processing? When is the deadline calculated? Does that happen right after the send() call, or is it calculated when the chunk moves from the send queue to the sent queue?

The SCTP PR extension (https://tools.ietf.org/html/rfc3758#page-15) states:

TR4) Before transmitting or retransmitting a message for which a TSN is already assigned, the SCTP sender MUST evaluate the lifetime of the message. If the lifetime of the message is expired, the SCTP sender MUST "abandon" the message, ....
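
In code terms, TR4 amounts to a check like this before every (re)transmission (a minimal sketch with hypothetical types and names, not usrsctp internals):

#include <stdbool.h>
#include <stdint.h>

/* TR4 from RFC 3758: before transmitting or retransmitting a chunk that
 * already has a TSN assigned, re-evaluate its lifetime and abandon it if
 * expired (types and names here are hypothetical, not from usrsctp). */
struct queued_chunk {
    uint64_t enqueue_time_ms; /* timestamp taken at send() time */
    uint32_t lifetime_ms;     /* pr_value for SCTP_PR_SCTP_TTL */
};

static bool
must_abandon(const struct queued_chunk *c, uint64_t now_ms)
{
    /* Expired lifetime: abandon instead of putting the chunk on the wire. */
    return now_ms - c->enqueue_time_ms >= c->lifetime_ms;
}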

I use two GPS-time-synced Linux machines linked together with Gbit Ethernet links. On both links, in both directions, I use netem with a delay of 40 ms and a drop rate of 5% (simple Gilbert model with transition probabilities good->bad: 5% and bad->good: 75%; the stationary probability of the bad state is 0.05/(0.05+0.75) ≈ 6.25%, so actually a PDR of about 6.5%). There is an initial delay of ~42 ms, caused by gstreamer, between generating the timestamp and the send() call to the usrsctp socket. The minimum achievable delay is therefore 82 ms...

I use the following socket options:

usrsctp_sysctl_set_sctp_nrsack_enable(1);                /* non-renegable SACKs */
usrsctp_sysctl_set_sctp_max_burst_default(0);            /* disable the max-burst limit */
usrsctp_sysctl_set_sctp_use_cwnd_based_maxburst(0);      /* no cwnd-based burst limiting */
usrsctp_sysctl_set_sctp_fr_max_burst_default(0);         /* no burst limit after a fast retransmit */
usrsctp_sysctl_set_sctp_max_chunks_on_queue(8192);       /* allow a large send queue */
usrsctp_sysctl_set_sctp_cmt_on_off(1);                   /* enable CMT */
usrsctp_sysctl_set_sctp_buffer_splitting(1);             /* split buffer space across paths */
usrsctp_setsockopt(sctpsink->sock, IPPROTO_SCTP, SCTP_NODELAY, (const void *)&(int){1}, (socklen_t)sizeof(int)); /* disable Nagle-like bundling delay */
usrsctp_set_non_blocking(sctpsink->sock, 0);             /* blocking socket */

...
spa.sendv_sndinfo.snd_flags = SCTP_UNORDERED;            /* unordered delivery */
spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;           /* timed reliability */
spa.sendv_prinfo.pr_value = 140;                         /* lifetime in ms */
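
For completeness, the whole send call looks roughly like this (a minimal sketch; buf and len are placeholders for the payload):

/* Minimal sketch of the complete usrsctp_sendv() call (buf/len are
 * placeholders; error handling reduced to a perror()). */
struct sctp_sendv_spa spa;
memset(&spa, 0, sizeof(spa));
spa.sendv_flags = SCTP_SEND_SNDINFO_VALID | SCTP_SEND_PRINFO_VALID;
spa.sendv_sndinfo.snd_flags = SCTP_UNORDERED;
spa.sendv_prinfo.pr_policy = SCTP_PR_SCTP_TTL;
spa.sendv_prinfo.pr_value = 140; /* lifetime in ms */
if (usrsctp_sendv(sctpsink->sock, buf, len, NULL, 0,
                  &spa, (socklen_t)sizeof(spa),
                  SCTP_SENDV_SPA, 0) < 0) {
    perror("usrsctp_sendv");
}
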
tuexen commented 6 years ago

If I remember correctly, when using SCTP_PR_SCTP_TTL, each message is sent at least once. It should be abandoned if the lifetime has expired before it is sent, but this is not implemented yet. Can you verify that the messages with the long delay are only transmitted once? That would back up my above thinking. If they are retransmitted, then there is a different bug in addition to the above.

stfl commented 6 years ago

I am measuring the timestamps outside of usrsctp, so whatever I get is already only the payload. I can't directly see whether it's the first transmission or a retransmission. I don't have a pcap of such a test run at hand, but if further investigation is required, I will capture a pcap during another test run and analyze it.

btw: TTD stands for time to delivery, which is the term I use in my thesis. ;)

stfl commented 6 years ago

Is there a plan to fix the issue or any work in progress regarding first transmission abandonment?

My issue is that I can't really use the results obtained from the current implementation for my thesis. A fix for this would help me out a lot.

Are you aware of anybody who has implemented a fix for this, even if it is not a clean fix?

Thank you, best regards, Stefan

tuexen commented 6 years ago

Yepp, there is a plan. I'm not aware of anyone having an unofficial patch. I'll see if @msvoelker (who is working in my lab) can have a look into this... You could help us with testing that way... I'll drop him a note tomorrow.

msvoelker commented 6 years ago

Hi Stefan, I will have a look at this issue. I'm going to try to reproduce it with packetdrill first (probably next week). I'll get back to you once I've created a first patch.

Timo

nxrighthere commented 5 years ago

For the last couple of days, I've been observing the same problem with partial reliability, even with zero RTX or a TTL set to 1. Messages are transmitted too slowly in comparison to other network transports, where ordered unreliable messages are delivered up to 20-30x faster at 100-200 ms RTT with 5-10% packet loss respectively...

tuexen commented 5 years ago

Which congestion control is the "other" transport protocol using? Can you compare the throughput between reliable and unreliable transfer? Is that different? @msvoelker Can you test this in the lab?

nxrighthere commented 5 years ago

Which congestion control is the "other" transport protocol using?

One is a re-implementation of DCCP as described in RFC 4340, with almost the same reliability strategies based on an ACK vector. The other is similar to CUBIC TCP, but based on a more traditional sliding window. Both are encapsulated in UDP.

Can you compare the throughput between reliable and unreliable transfer? Is that different?

Reliable transmission is quite good and consistent compared to the transports I'm using, while the one similar to DCCP shows better latencies under bad network conditions.

The problem is that in SCTP unreliable messages are highly affected by congestion, and as a result, I'm getting very outdated, delayed data that the application no longer needs.

tuexen commented 5 years ago

And which CC are you using for DCCP? The one in RFC 4341? I just want to figure out which effect the CC you are using has, and which effect comes from the way SCTP (and possibly the stack) implements unreliability.

Right now the default CC for SCTP is New Reno, and it is known for sub-optimal performance in high-RTT, high-loss scenarios... One could add a CUBIC implementation...
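
For reference, you can already switch between the built-in CC modules per association via the pluggable CC socket option; a minimal sketch (with sock standing in for your usrsctp socket):

/* Select one of the built-in CC modules: SCTP_CC_RFC2581 (New Reno, the
 * default), SCTP_CC_HSTCP (High-Speed TCP) or SCTP_CC_HTCP. */
struct sctp_assoc_value av;
memset(&av, 0, sizeof(av));
av.assoc_id = SCTP_FUTURE_ASSOC; /* apply to future associations */
av.assoc_value = SCTP_CC_HSTCP;
usrsctp_setsockopt(sock, IPPROTO_SCTP, SCTP_PLUGGABLE_CC,
                   &av, (socklen_t)sizeof(av));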

nxrighthere commented 5 years ago

And which CC are you using for DCCP?

I'm not sure, but it seems that it's based on CCID 2.

Right now the default CC for SCTP is new reno and it is know for sub-optimal performance in high RTT, high loss scenarios...

Yes, I've tried HSTCP, and it shows better results than the default CC, but it's still not really suitable for real-time data transfer...

nxrighthere commented 5 years ago

@tuexen Dunno, is it possible to quickly disable any built-in CC without side effects, in order to implement a custom algorithm?

nxrighthere commented 5 years ago

@tuexen I've eliminated the CC from the source code and set cwnd to a constant value, but the problem is still there; messages are delayed for very long intervals for some reason.

tuexen commented 5 years ago

@nxrighthere For ordered or unordered messages? What is the message size?

nxrighthere commented 5 years ago

I've tried both ordered/unordered, and it doesn't make any difference.

Here's captured traffic between two machines connected over the wireless network with around 200-250 ms RTT: sctp_wifi.zip

This one is with HSTCP enabled.

msvoelker commented 5 years ago

For the last couple of days, I've been observing the same problem with partial reliability, even with zero RTX or a TTL set to 1. Messages are transmitted too slowly in comparison to other network transports, where ordered unreliable messages are delivered up to 20-30x faster at 100-200 ms RTT with 5-10% packet loss respectively...

I'm trying to reproduce this. Can you provide concrete numbers for a concrete environment? Let's say for 200 ms RTT and 5 % packet loss rate. What is the size of the messages you are sending? How long are you running a test?

I have only done small tests so far. My setup is:

RTT = 200 ms
Packet loss rate = 5 %
Message size = 20 bytes

In about 21 seconds, my test application was able to send 52354 messages reliably (about 2,500 messages/s) and 57541 messages unreliably (about 2,740 messages/s).

nxrighthere commented 5 years ago

The problem is not how many messages the transport is able to transmit; the problem is that messages are highly delayed, as explained by @stfl.

My issue is the same:

The problem is that in SCTP unreliable messages are highly affected by congestion, and as a result, I'm getting very outdated, delayed data that the application no longer needs.

nxrighthere commented 5 years ago

Partial reliability doesn't work properly. To get an idea of what's going wrong, you need to compare it to other popular semi-reliable transports, such as ENet, which transmits packets as efficiently as possible but without huge delays.

nxrighthere commented 5 years ago

Here are video files to give an idea of what's going wrong and how it looks visually. There is ~100 ms RTT and 10% packet loss. Both transports are doing the same amount of work, both filling packets up to full frames below the MTU (captured data available here).

nxrighthere commented 5 years ago

What I've tried, and it doesn't help:

msvoelker commented 5 years ago

I identified @stfl's problem as related to the bug that sends messages (for the first time) even if the TTL of the message has already expired. This is still an open bug. Since you are also using the RTX policy, there seems to be something else going on.

Have you measured the app-to-app message delay? If so, do you see a high average, or peaks for single messages? What is the size of the messages you send?

nxrighthere commented 5 years ago

Have you measured the app-to-app message delay?

Yes, it's 16 ms (the application's framerate is locked to 60 frames per second).

If so, do you see a high average, or peaks for single messages?

For a single message, delays between packets under congestion cause stalls and unresponsiveness (as you can observe in the video).

What is the size of the messages you send?

I'm enqueuing many small messages (<30 bytes) in tight loops, which are aggregated into a single packet below the MTU (SCTP_NODELAY is not used).
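
The enqueue pattern is roughly the following (a sketch; msgs, num_msgs, and the spa setup are placeholders, and with SCTP_NODELAY left off the stack may bundle these small DATA chunks into one packet):

/* Sketch of the enqueue loop (msgs/num_msgs are hypothetical; spa carries
 * the sndinfo/prinfo as shown earlier in this thread). */
for (int i = 0; i < num_msgs; i++) {
    usrsctp_sendv(sock, msgs[i].data, msgs[i].len, NULL, 0,
                  &spa, (socklen_t)sizeof(spa), SCTP_SENDV_SPA, 0);
}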

msvoelker commented 5 years ago

You mean 16 ms is the average before any packet loss; once a packet is lost, congestion control adds some delay, correct?

You wrote that you found a way to disable the congestion control for your test. Do you get a total average of 16 ms then?

nxrighthere commented 5 years ago

You mean 16 ms is the average before any packet loss; once a packet is lost, congestion control adds some delay, correct?

Yes, 16 ms without packet loss or any external latency. Under congestion, sstat_primary.spinfo_srtt rises up to 3,000 ms for some reason, while the actual RTT is ~100 ms.
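
(For reference, I read that value via the SCTP_STATUS socket option; a minimal sketch, with sock standing in for my usrsctp socket:)

/* Query the association status; sstat_primary.spinfo_srtt is the smoothed
 * RTT estimate for the primary path, in ms. */
struct sctp_status st;
socklen_t len = (socklen_t)sizeof(st);
memset(&st, 0, sizeof(st));
if (usrsctp_getsockopt(sock, IPPROTO_SCTP, SCTP_STATUS, &st, &len) == 0) {
    printf("primary path SRTT: %u ms\n",
           (unsigned int)st.sstat_primary.spinfo_srtt);
}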

You wrote, you found a way to disable the congestion control for your test. Do you have a total average of 16 ms then?

Nope, I didn't gather that, but it doesn't change much from what I see in my tests.

msvoelker commented 5 years ago

Is 16 ms a good number compared to ENet? When you have an RTT of 100 ms, I would assume an app-to-app delay of at least 50 ms. Do you add the link delay only on the way back to the sender?

Does ENet also use a congestion control? If so, which one? Do you see higher delays in case of packet loss with ENet as well?

nxrighthere commented 5 years ago

Is 16 ms a good number compared to ENet?

Yes, when congestion doesn't occur, both have pretty much the same, good-enough latencies.

When you have an RTT of 100 ms, I would assume an app-to-app delay of at least 50 ms. Do you add the link delay only on the way back to the sender?

Yes, the actual app-to-app delay rises up to ~60 ms. Lag is simulated on the server side, at sending and receiving, through a virtual device which controls the traffic.

Does ENet also use a congestion control? If so, which one?

Yes, it's CUBIC-like, I believe; flow control is a fixed sliding window.

Do you see higher delays in case of packet loss with ENet as well?

Yes, but they correspond to the expected numbers, unlike in the case with SCTP.

nxrighthere commented 5 years ago

Found the reason why sstat_primary.spinfo_srtt shows incorrect values like 3,000 ms while the actual RTT is 150 ms: explicit congestion notifications are enabled by default (it also looks like the client connects faster than before when this option is disabled).

I'm still investigating why there's such a huge delay between packet deliveries. The messages themselves arrive in the expected time, but they are held in the buffer for some reason rather than just being dispatched.

tuexen commented 5 years ago

@nxrighthere Can you elaborate on how ECN affects the computation of the RTT? Does your network actually do ECN marking? How are you reading ECN markings?

nxrighthere commented 5 years ago

Does your network actually do ECN marking? How are you reading ECN markings?

None of that, it just affects the RTT calculation for some reason. If I set usrsctp_sysctl_set_sctp_ecn_enable(0), then the RTT is calculated properly.

tuexen commented 5 years ago

I don't think ECN should affect the correctness of RTT calculations. If it does, it is a bug I would like to fix. That is why I'm asking... Can you double check?

nxrighthere commented 5 years ago

Never mind, it was just a coincidence; the problem is still there...