w3c / webrtc-pc

WebRTC 1.0 API
https://w3c.github.io/webrtc-pc/
Other
437 stars 115 forks source link

Inconsistent rules for rid in RTCRtpEncodingParameters #2732

Open docfaraday opened 2 years ago

docfaraday commented 2 years ago

Spec says that RFC 8851 is the source of truth:

'rid-id = 1*(alpha-numeric / "-" / "_")'

However, RFC 8852 has very different rules, which are actually more important than SDP BNF:

'As with all SDES items, RtpStreamId and RepairedRtpStreamId are limited to a total of 255 octets in length. RtpStreamId and RepairedRtpStreamId are constrained to contain only alphanumeric characters. For avoidance of doubt, the only allowed byte values for these IDs are decimal 48 through 57, 65 through 90, and 97 through 122'

Some examples:

rid: "" -> Invalid according to 8851, valid according to 8852 rid: "a-z" and rid: "a_z" -> Valid according to 8851, invalid according to 8852 rid: "\<insert the complete works of Shakespeare here>" -> Valid according to 8851, invalid according to 8852

I briefly corresponded with abr on this, and his opinion is that 8852 is correct and 8851 is wrong.

That said, the empty string is probably not something we should permit.

lgrahl commented 2 years ago

I'll expand this a bit because this is RFC-level of confusing. So, RFC8851 says rid-id = 1*(alpha-numeric) / "-" / "_"), pointing to RFC4566 for alpha-numeric rid-id = 1*(ALPHA / DIGIT) / "-" / "_"), pointing to RFC4234 for ALPHA and DIGIT :womanfacepalming: `rid-id = 1*(%x41-5A / %x61-7A / %x30-39 / "-" / ""), which is simply ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-, or in other words what we all understand as the regex [A-Za-z0-9-]`.

RFC8852 describes the ranges differently rid = 48 through 57, 65 through 90, and 97 through 122 which expands to almost the same, 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz, obviously a different order but whatever it's a set, right, so this is basically what we all understand as the regex [A-Za-z0-9].

I'm not seeing spaces in that set. It seems the only issue is - and _?

docfaraday commented 2 years ago

Some of the text in the description was not rendering the way I expected it to, but it is now fixed. Really, it boils down to whether '-' and '_' are allowed, and what lengths are allowed (0? 256+?). We also have the question of implementation-specific limitations on rid length (I doubt that 255 is supported in practice).

lgrahl commented 2 years ago

So, if I understand correctly, it is [A-Za-z0-9\-_]{1} from 8851 vs. [A-Za-z0-9]{1,16} for one-byte header and [A-Za-z0-9]{0,255} for two-byte header from 8852 Is that correct?

fippo commented 2 years ago

Chrome does limit mid to 32 bytes (wanted 16 but... this broke someone) since M98. Not sure if this also applies to rid, tagging @alvestrand

see also https://mailarchive.ietf.org/arch/msg/mmusic/_Kew1cU1-xK6mGfVebMi1-00_EM/

docfaraday commented 2 years ago

So, if I understand correctly, it is [A-Za-z0-9\-_]{1} from 8851 vs. [A-Za-z0-9]{1,16} for one-byte header and [A-Za-z0-9]{0,255} for two-byte header from 8852 Is that correct?

8851 says [A-Za-z0-9-]{1,} (or [A-Za-z0-9-]+)

docfaraday commented 2 years ago

So I guess there are the following questions:

  1. Do we want to specify an error to throw if the length of the rid is valid according to RFC 8852, but longer than the implementation is able to handle? I think this is probably something we want to do, given the implementation state.
  2. Do we want to forbid the empty string? It is unclear to me right now whether 8852 ought to forbid that, but I think we ought to forbid it one way or another.
  3. Do we want to switch our reference to 8852, or do we want to put the rules in-line without an RFC reference? I'm fine with either approach, but if we are going to be adding extra validation (eg; empty string) we may want to just do all of it in-line.
alvestrand commented 2 years ago

In practice, people use single-character RIDs, because anything bigger bloats the RTP header. (We also had an issue with the size of MID, which ended up recommending 16 bytes maximum, I believe)

docfaraday commented 2 years ago

In practice, people use single-character RIDs, because anything bigger bloats the RTP header. (We also had an issue with the size of MID, which ended up recommending 16 bytes maximum, I believe)

That's another reason to have additional rules (besides RFC 8852 and 8851) about rid on setParameters/addTransceiver, I think.

fluffy commented 2 years ago

I think this needs to be resolved in the RFC that create the problem.

docfaraday commented 2 years ago

I think this needs to be resolved in the RFC that create the problem.

I could file an erratum on RFC 8851. We may also want to clarify in RFC 8852 that a rid SDES of length 0 (ie; the empty string) is not valid?

fluffy commented 2 years ago

My view is both 8851 and 8852 should be updated to be say 1 or more alphanumeric characters. If this update was made, it might also be good to change the max size from 255 to 16.

Right now from a language lawyer point of view, one might argue that they are not actually in conflict, one limits what you can do in RTP and the other limits what you can do in SDP. JSEP requires you to use the same in both so that would mean the only really useable was the intersection of the two. 8851 requires length longer than 0, and 8852 requires no _ or - characters. Both allow a length up to 255.

jan-ivar commented 2 years ago

JSEP requires you to use the same in both so that would mean the only really useable was the intersection of the two.

IOW, webrtc-pc should require rids to conform to both RFC8851 and RFC8852?

aboba commented 2 years ago

Why not file an errata on both? Seems unlikely that the differences between RFC 8851 and 8852 were intentional.

fluffy commented 2 years ago

Why not file an errata on both? Seems unlikely that the differences between RFC 8851 and 8852 were intentional.

Yes - I agree. I think this is just a simple case of a bug in the spec between those two RFCs

docfaraday commented 2 years ago

Filed https://www.rfc-editor.org/errata/eid7132

docfaraday commented 2 years ago

I discussed the possibility of restricting the size to 16 instead of 255 with abr, and he pointed out that this might be something worth doing in a bis, but it would be hard to justify as an erratum.

henbos commented 9 months ago

Do the implementations limit to 16 characters?