twilio / twilio-video-ios

Programmable Video SDK by Twilio
http://twilio.com/video

LocalDataTrack delivery guarantees and reliability #138

Closed DeTeam closed 3 years ago

DeTeam commented 3 years ago

Description

Hey there,

While working with the Twilio Video iOS SDK's data track in group rooms, we realized that message delivery is not guaranteed. The docs say that one can configure two options to control the reliability of data tracks: https://www.twilio.com/docs/video/ios-v3-using-the-datatrack-api#configuring-datatrack-reliability. In our case, commands sent from an event host (the moderator or organizer who is running a session) are lost or delivered with a significant delay.

While digging into the topic and also peeking at the JS implementation, I've found that:

  1. Parameters maxPacketLifeTime & maxRetransmits are mutually exclusive (there'll be an exception thrown if both are used)
  2. The WebRTC spec says that by default a data channel should be reliable (what does that mean?) and becomes unreliable only when one of the two parameters above is provided.

Based on this, I'm wondering:

  1. What are delivery guarantees for data track messages (both in raw WebRTC and Twilio Video SDK)?
  2. How should those parameters be used, in which cases, and what are the best practices?
  3. Is there anything different regarding the configuration of data tracks that Twilio is doing in the media servers?
  4. Should clients implement constant retransmission for data track messages?

Versions

All relevant version information for the issue.

Video iOS SDK

3.6.1

echamussy commented 3 years ago

Very interested in this answer as well. Our experience has been similar to what the original post describes. Most of the time the data channel is reliable, but sometimes it skips messages, which causes sync problems in our app data layer. Hopefully the folks at Twilio can provide some answers as to how to make the data channel more reliable.

ceaglest commented 3 years ago

Hey @DeTeam,

Sorry for the late response.

While digging into the topic and also peeking at the JS implementation, I've found that:

  1. Parameters maxPacketLifeTime & maxRetransmits are mutually exclusive (there'll be an exception thrown if both are used)

Correct. Think of maxPacketLifeTime as the amount of time the messages that you send remain relevant. If it isn't possible to send the message by the end of its life then the message expires. You can instead use maxRetransmits to enforce some guarantee of reliability that is independent of the connection round trip time between you and the other Participant(s).
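To make the mutual exclusivity concrete, here is a minimal sketch in plain Swift. The `Reliability` enum and `makeReliability` function are hypothetical names invented for illustration and are not part of the TwilioVideo SDK; they only model the rule that a track is either fully reliable, limited by lifetime, or limited by retransmit count, and never limited by both.

```swift
import Foundation

/// Hypothetical model of data track reliability (not a TwilioVideo type).
enum Reliability: Equatable {
    case reliable                              // neither limit set: retry until delivered
    case maxPacketLifeTime(milliseconds: Int)  // give up once the message is too old
    case maxRetransmits(count: Int)            // give up after a fixed number of retries
}

/// Thrown when both limits are supplied at once.
struct ReliabilityError: Error {}

/// Builds a Reliability from two optional limits, rejecting the case where both are set.
func makeReliability(maxPacketLifeTime: Int?, maxRetransmits: Int?) throws -> Reliability {
    switch (maxPacketLifeTime, maxRetransmits) {
    case (nil, nil):
        return .reliable
    case let (lifeTime?, nil):
        return .maxPacketLifeTime(milliseconds: lifeTime)
    case let (nil, retransmits?):
        return .maxRetransmits(count: retransmits)
    case (.some, .some):
        throw ReliabilityError()
    }
}
```

Modeling the three modes as one enum means the invalid combination cannot be represented after construction, so the check happens once at the boundary.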

  2. The WebRTC spec says that by default a data channel should be reliable (what does that mean?) and becomes unreliable only when one of the two parameters above is provided.

Reliable means that messages are retried until they are delivered. Some future messages might still get dropped if the send queue is full.

Based on this, I'm wondering:

  1. What are delivery guarantees for data track messages (both in raw WebRTC and Twilio Video SDK)?

In Twilio Video and WebRTC it's up to your application to decide if you want DataTracks / DataChannels to be fully reliable or only partly reliable (limited by retransmission time or number of attempts). Messages are ephemeral, and this is important because in Twilio Video messages are only sent to subscribers, so if a Participant subscribes "late" they will not get older messages that they missed.

  2. How should those parameters be used, in which cases, and what are the best practices?

If you want all subscribers to receive your messages, then you should create a reliable LocalDataTrack. If data becomes less relevant over time (maxPacketLifeTime) or is not critical (maxRetransmits), then you should create an unreliable LocalDataTrack.

  3. Is there anything different regarding the configuration of data tracks that Twilio is doing in the media servers?

Twilio's media servers sit between you and the subscribers in a star topology. This matters if you are setting maxPacketLifeTime (quoting from the guide):

In Group Rooms, DataTrack connections are established between Participants via the media server. Under the hood, there is one connection from the local Participant to the media server and a second connection from the media server to the remote Participant. Twilio's media server configures the same maxPacketLifeTime value on each remote Participant's connection. Therefore you should set maxPacketLifeTime to half the acceptable max lifetime for each message you send.
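The halving rule in the quoted guide is simple arithmetic: the lifetime applies separately to each of the two connections, so the worst-case end-to-end age is twice the configured value. A small sketch, where `perHopPacketLifeTime` is a hypothetical helper name and the two-hop count is taken from the guide's description of Group Rooms:

```swift
/// Returns the lifetime to configure on the track, given the oldest a message
/// may be when it reaches a subscriber. Assumes two hops: sender to media
/// server, then media server to subscriber, each with the same lifetime.
func perHopPacketLifeTime(endToEndMilliseconds: Int) -> Int {
    let hops = 2
    return endToEndMilliseconds / hops
}
```

For example, if a command is useless after one second, `perHopPacketLifeTime(endToEndMilliseconds: 1000)` gives 500, which is the value to set as maxPacketLifeTime.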

  4. Should clients implement constant retransmission for data track messages?

If you need a guarantee of delivery, then use a reliable DataTrack. If you have data that is not ephemeral, you might want persistent storage (for example, Twilio Sync or Firebase Cloud Firestore).

If you want to use DataTrack to exchange data that is not ephemeral, you need to write some sort of protocol to coordinate the exchange of the data with subscribers. There is an API improvement to be made in Twilio Video, which is surfacing the state of the DataTrack's sending queue and whether messages fail or succeed.
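One possible starting point for such a protocol is to number every message so that a subscriber can at least detect that something was skipped. The sketch below is an assumption about how an application might do this, not something the SDK provides; `Envelope`, `Sequencer`, and `GapDetector` are hypothetical names. What to do about a detected gap (request a resend, fetch a snapshot from persistent storage) is left to the application.

```swift
import Foundation

/// Hypothetical wire format: every message carries a per-sender sequence number.
/// Codable so it could be serialized (for example with JSONEncoder) before sending.
struct Envelope: Codable, Equatable {
    let sequence: Int
    let payload: String
}

/// Sender side: stamps outgoing payloads with increasing sequence numbers.
struct Sequencer {
    private var next = 0

    mutating func wrap(_ payload: String) -> Envelope {
        defer { next += 1 }
        return Envelope(sequence: next, payload: payload)
    }
}

/// Receiver side: tracks the next expected sequence number and reports gaps.
struct GapDetector {
    private(set) var expected = 0

    /// Returns the sequence numbers that were skipped before this envelope arrived.
    /// Duplicates and envelopes older than the expected number report no gap.
    mutating func receive(_ envelope: Envelope) -> [Int] {
        guard envelope.sequence >= expected else { return [] }
        let missing = Array(expected..<envelope.sequence)
        expected = envelope.sequence + 1
        return missing
    }
}
```

A subscriber that joins late would see its first envelope arrive with a sequence number above zero, which shows up as a gap and can be used as the trigger to load the current state from elsewhere.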

I hope this was helpful, but I would be happy to discuss it further.

Best, Chris

ceaglest commented 3 years ago

Hi @echamussy,

Very interested in this answer as well. Our experience has been similar to what the original post describes. Most of the time the data channel is reliable, but sometimes it skips messages, which causes sync problems in our app data layer. Hopefully the folks at Twilio can provide some answers as to how to make the data channel more reliable.

How large are the messages that you send? Do you add more reliable messages while the Room is in a reconnecting state? It's possible that the send buffer is becoming full and some future messages are being dropped.
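If messages are indeed being queued while the Room is reconnecting, one way an application could avoid overfilling the send buffer is to hold them itself and release them after reconnection. The sketch below is only an illustration under that assumption; `Outbox` is a hypothetical type, the capacity is arbitrary, and dropping the oldest message when full is a design choice that suits state updates better than one-off commands.

```swift
/// Hypothetical app-side outbox: buffers messages while the room is reconnecting
/// instead of handing them to the data track immediately.
struct Outbox {
    let capacity: Int
    private(set) var pending: [String] = []
    var isReconnecting = false

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        self.capacity = capacity
    }

    /// Returns the messages that should be handed to the data track right now.
    mutating func submit(_ message: String) -> [String] {
        guard isReconnecting else { return [message] }
        if pending.count == capacity {
            pending.removeFirst() // buffer full: drop the oldest message
        }
        pending.append(message)
        return []
    }

    /// Call once the room has reconnected; returns buffered messages in order
    /// and empties the buffer.
    mutating func reconnected() -> [String] {
        isReconnecting = false
        defer { pending.removeAll() }
        return pending
    }
}
```

The caller would set `isReconnecting` from the Room's reconnecting and reconnected callbacks and pass whatever `submit` or `reconnected` returns to the track's send method.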

Best, Chris

piyushtank commented 3 years ago

Closing this ticket as there is no activity in the last 2+ months. Feel free to reopen or open a new ticket if you run into any issues.