SDP support

gavv commented 5 years ago

Implement SDP parser and formatter. We can then use it for session negotiation (RTSP or SIP, see #34), and announcement (SAP; not planned yet).

References: RFC 4566

gavv commented 5 years ago

Implementation

We need a new module roc_sdp with three components:

SDP Description (passive structure)
SDP Parser (parses bytes and fills SDP Description structure)
SDP Composer (formats SDP Description structure to bytes)
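
A rough sketch of what the passive structure and the two entry points might look like. All type and field names here are hypothetical (only format_sdp() is a name mentioned later in this thread), and the standard library is used just for brevity; the real module would have to work with our own allocator.

```cpp
#include <string>
#include <vector>

// Hypothetical passive structures; all names and fields are illustrative.

struct ConnectionData {      // c=
    std::string addr_type;   // "IP4" or "IP6"
    std::string address;
};

struct MediaDescription {    // m=
    std::string media_type;                  // e.g. "audio"
    int port;
    std::vector<int> payload_ids;            // payload IDs (formats)
    std::vector<ConnectionData> connections; // media-level c=, may be empty

    MediaDescription() : port(0) {}
};

struct SessionDescription {
    int version;                 // v=
    std::string origin;          // o=, kept as raw text in this sketch
    bool has_connection;         // true if a session-level c= is present
    ConnectionData connection;   // session-level c=
    std::vector<MediaDescription> media;

    SessionDescription() : version(0), has_connection(false) {}
};

// Parser and composer entry points (declarations only, not implemented here).
bool parse_sdp(const std::string& text, SessionDescription& out);
std::string format_sdp(const SessionDescription& desc);
```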

To ensure that our implementation is compatible, we also need a set of unit tests with real-world SDP samples captured from other software.

The parser can be implemented using a parser generator like YACC or Bison. RFC 4566 provides an ABNF grammar for SDP. The generator must meet the following requirements:

the generator itself should be portable
the generated code should be portable (e.g. it shouldn't be glibc-specific) and ideally suitable for use without OS
we should be able to control memory allocations and employ our own allocator

gavv commented 5 years ago

Fields and attributes

Here is the subset of the SDP fields and attributes that we should be able to handle in the first implementation.

These fields should be handled:

v= (sdp version)
o= (session origin / session ID)
c= (connection type and address)
m= (media type, port, and payload ID)

These attributes should be handled:

a=rtpmap (dynamic payload ID: encoding name, sample rate, channel set)
a=recvonly (session mode / direction)
a=sendrecv (--//--)
a=sendonly (--//--)
a=inactive (--//--)
a=type (session type; defines default session mode if omitted)
a=fmtp (codec-specific parameters; we'll need it for Opus)
a=fec-source-flow (FECFRAME; see RFC 6364)
a=fec-repair-flow (--//--)
a=repair-window (--//--)
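
To illustrate the kind of data carried by a=rtpmap, here is a minimal hand-written sketch that parses the attribute value in the RFC 4566 form `<payload type> <encoding name>/<clock rate>[/<encoding parameters>]`. The names RtpMap, parse_uint and parse_rtpmap are made up for this example; this is not the planned generated parser, just a way to show which parts we need to extract.

```cpp
#include <string>

// Hypothetical result structure for one a=rtpmap attribute.
struct RtpMap {
    long payload_id;
    std::string encoding;
    long clock_rate;
    long channels; // for audio; defaults to 1 when omitted
};

// Parses a short non-negative decimal number; rejects empty or non-digit input.
static bool parse_uint(const std::string& s, long& out) {
    if (s.empty() || s.size() > 9) {
        return false;
    }
    long v = 0;
    for (std::string::size_type i = 0; i < s.size(); i++) {
        if (s[i] < '0' || s[i] > '9') {
            return false;
        }
        v = v * 10 + (s[i] - '0');
    }
    out = v;
    return true;
}

// `value` is the text after "a=rtpmap:", e.g. "96 L16/44100/2".
bool parse_rtpmap(const std::string& value, RtpMap& out) {
    const std::string::size_type npos = std::string::npos;

    const std::string::size_type sp = value.find(' ');
    if (sp == npos) {
        return false;
    }
    const std::string::size_type slash1 = value.find('/', sp + 1);
    if (slash1 == npos || slash1 == sp + 1) {
        return false; // missing clock rate or empty encoding name
    }
    const std::string::size_type slash2 = value.find('/', slash1 + 1);

    const std::string rate = slash2 == npos
        ? value.substr(slash1 + 1)
        : value.substr(slash1 + 1, slash2 - slash1 - 1);

    RtpMap result;
    result.channels = 1;
    if (!parse_uint(value.substr(0, sp), result.payload_id)
        || !parse_uint(rate, result.clock_rate)) {
        return false;
    }
    if (slash2 != npos && !parse_uint(value.substr(slash2 + 1), result.channels)) {
        return false;
    }
    result.encoding = value.substr(sp + 1, slash1 - sp - 1);

    out = result;
    return true;
}
```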

These fields should not be handled but it would be nice to log them:

s= (session human-readable name)
i= (session or media human-readable description)
u= (session URI)
e= (session email)
p= (session phone number)

Other fields and attributes may be ignored.
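
The three categories above could be expressed as a simple dispatch on the field type character. This is only a sketch; FieldAction and classify_field are hypothetical names, and attributes (a=) would still need a second step that filters by attribute name.

```cpp
// What to do with a field, based on the lists above.
enum FieldAction {
    Field_Handle,  // parse and store in the description
    Field_LogOnly, // do not store, but log the value
    Field_Ignore   // skip silently
};

// `type` is the character before '=' in an SDP line, e.g. 'v' for "v=0".
FieldAction classify_field(char type) {
    switch (type) {
    case 'v': // version
    case 'o': // origin
    case 'c': // connection data
    case 'm': // media description
    case 'a': // attribute; supported names are checked separately
        return Field_Handle;
    case 's': // session name
    case 'i': // description
    case 'u': // URI
    case 'e': // email
    case 'p': // phone number
        return Field_LogOnly;
    default:
        return Field_Ignore;
    }
}
```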

gavv commented 5 years ago

We also need to add a command-line option to read an SDP description from a file:// URI, on both sender and receiver.

gavv commented 4 years ago

We decided to use Ragel for parsing. We already started using it to parse URIs (see roc_address module).

alexandremgo commented 4 years ago

Hey! I can try to implement the new module roc_sdp. I guess I can look at roc_rtp to have a good idea on how to implement this new module ?

I'll also look at roc_address for the parser

gavv commented 4 years ago

Great!

> I guess I can look at roc_rtp to have a good idea on how to implement this new module ?

Yes.

To add a new module, simply create a directory with .h, .cpp and .rl files and add the module name to ROC_MODULES in SConstruct.

> I'll also look at roc_address for the parser

Yes. See also #282.

I also suggest reading RFC 4566 (if you haven't read it before) and taking a look at the official Ragel PDF. Feel free to ask questions if you need help. (BTW, for extensive discussions I'd suggest using the mailing list, but GitHub is also OK.)

Among other things, the SDP RFC provides an ABNF grammar. Ragel does not support ABNF directly, but in this specific case I think it will be straightforward to translate the ABNF to a regular grammar supported by Ragel.

gavv commented 4 years ago

After implementing roc_sdp module we will want to use it. The main usage will be in RTSP, but it will take some time to implement it.

To start with something simple, we can implement a very primitive form of session negotiation where the server prints SDP to a file or stdout, the client reads SDP from a file or stdin, and the user is responsible for transferring the SDP from server to client.

This will be useful for debugging and probably for some specific cases when users employ their own signaling protocols. We can think about it after merging the roc_sdp module.

gavv commented 4 years ago

Maybe this will also be useful: https://gavv.github.io/articles/minisaplistener/

gavv commented 4 years ago

Features missing in #300 (to be done in future PRs):

[ ] session name (s=), this field should be mandatory
[ ] attributes (a=), session- and media description-level (I think we don't need generic interface for setting and getting attributes of any type; instead, we should have specific getters and setters for the supported attributes)
[ ] number of addresses for c= in <base multicast address>[/<ttl>]/<number of addresses>
[ ] specifying the number of addresses N should produce a description identical to the one produced by specifying N individual c= fields with consecutive IP addresses
[ ] allow unknown / unsupported fields and attributes
[ ] add MediaDescription getter to retrieve payload IDs (formats) other than the default one; we need them for FEC support
[ ] add format_sdp()
[ ] more unit tests:
- [ ] cover all supported features
- [ ] ensure that we correctly handle fields that we're supposed to ignore (like i= or unknown attributes)
- [ ] add tests for various unsupported and invalid session descriptions
- [ ] ensure that we can get identical textual session description by parsing it and then formatting again ("SDP -> parse -> format -> SDP"); this should work only for the supported subset of SDP of course
- [ ] ensure that we can get identical parsed session description by formatting it and then parsing again ("object -> format -> parse -> object"); this should work only for the supported subset of SDP of course
[ ] gather a few real-world SDP examples from some existing tools and add a few regression tests checking that our parser correctly handles them
[ ] make parser a bit more generous to white-space symbols:
- allow just "\n" (LF) where only "\r\n" (CRLF) is allowed by spec
- allow multiple white-space characters where only a single space (SP) is allowed by spec; but we should ensure that this change won't break anything; if it would lead to incorrect parsing of some descriptions, we should avoid it (e.g. maybe a space may be the first character of some field, and squashing two spaces into one would incorrectly drop that first space?)
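
For the line-ending item, the lenient behavior could look roughly like the sketch below: split on LF and strip an optional preceding CR, so both "\r\n" and "\n" are accepted. split_sdp_lines is a hypothetical helper written with the standard library for brevity; in the real parser this would more likely be part of the Ragel grammar. Note that this sketch also silently skips empty lines, which is an extra assumption.

```cpp
#include <string>
#include <vector>

// Splits an SDP text into lines, accepting both CRLF and bare LF terminators.
// Empty lines are skipped. The last line may lack a terminator.
std::vector<std::string> split_sdp_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::string::size_type start = 0;

    while (start < text.size()) {
        std::string::size_type end = text.find('\n', start);
        if (end == std::string::npos) {
            end = text.size();
        }

        std::string::size_type len = end - start;
        // Drop the CR of a CRLF pair, if present.
        if (len > 0 && text[end - 1] == '\r') {
            len--;
        }
        if (len > 0) {
            lines.push_back(text.substr(start, len));
        }

        start = end + 1;
    }

    return lines;
}
```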

gavv commented 4 years ago

   A session description MUST contain either at least one "c=" field in
   each media description or a single "c=" field at the session level.
   It MAY contain a single session-level "c=" field and additional "c="
   field(s) per media description, in which case the per-media values
   override the session-level settings for the respective media.

So we need to implement the following behavior:

If MediaDescription does not have connection data, it should use SessionDescription's connection data.
If MediaDescription does not have connection data, and SessionDescription also doesn't have one, the parsing should fail.
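
A minimal sketch of these two rules, using simplified stand-in types (SimpleMedia and SimpleSession are hypothetical and much smaller than the real MediaDescription and SessionDescription). An empty string stands for "no c= field" here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for MediaDescription / SessionDescription.
struct SimpleMedia {
    std::string connection; // media-level c=, empty if absent
};

struct SimpleSession {
    std::string connection; // session-level c=, empty if absent
    std::vector<SimpleMedia> media;
};

// Applies the session-level c= to every media description that lacks its own.
// Returns false (parsing should fail) if some media description has no
// connection data and there is no session-level one to fall back to.
bool resolve_connections(SimpleSession& session) {
    for (std::size_t i = 0; i < session.media.size(); i++) {
        SimpleMedia& media = session.media[i];
        if (!media.connection.empty()) {
            continue; // per-media value overrides the session-level one
        }
        if (session.connection.empty()) {
            return false;
        }
        media.connection = session.connection;
    }
    return true;
}
```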

gavv commented 4 years ago

   Multiple addresses or "c=" lines MAY be specified on a per-media
   basis only if they provide multicast addresses for different layers
   in a hierarchical or layered encoding scheme.  They MUST NOT be
   specified for a session-level "c=" field.

   The slash notation for multiple addresses described above MUST NOT be
   used for IP unicast addresses.

We should check that if /<number of addresses> is present, or multiple c= fields are present, the address is multicast (SocketAddr has a method to check this). If it's not, the parsing should fail.
Since we don't support hierarchical / layered encoding currently, let's parse all c= fields, but use only the first one for now.
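
The slash-notation part of this check could be sketched as follows, for IPv4 only. It expands `<base address>/<ttl>/<number of addresses>` into consecutive addresses and fails if the count is greater than one for a non-multicast base. All function names are hypothetical, and the sketch does its own address arithmetic instead of using SocketAddr; the case of multiple separate c= lines is not covered.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Parses dotted-quad IPv4 text into a 32-bit value (lenient sscanf-based sketch).
static bool parse_ipv4(const std::string& s, unsigned long& out) {
    unsigned a, b, c, d;
    char tail;
    if (std::sscanf(s.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &tail) != 4) {
        return false;
    }
    if (a > 255 || b > 255 || c > 255 || d > 255) {
        return false;
    }
    out = ((unsigned long)a << 24) | ((unsigned long)b << 16)
        | ((unsigned long)c << 8) | (unsigned long)d;
    return true;
}

static std::string format_ipv4(unsigned long v) {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "%lu.%lu.%lu.%lu",
                  (v >> 24) & 0xff, (v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff);
    return buf;
}

// IPv4 multicast range is 224.0.0.0/4.
static bool is_multicast_ipv4(unsigned long v) {
    return (v >> 28) == 0xE;
}

// Expands a base address and a number of addresses into individual addresses.
// Returns false if the input is invalid or the slash notation is used with
// a unicast address, or if the range leaves the multicast block.
bool expand_connection(const std::string& base, unsigned count,
                       std::vector<std::string>& out) {
    unsigned long addr = 0;
    if (count == 0 || !parse_ipv4(base, addr)) {
        return false;
    }
    std::vector<std::string> result;
    for (unsigned i = 0; i < count; i++) {
        if (count > 1 && !is_multicast_ipv4(addr + i)) {
            return false;
        }
        result.push_back(format_ipv4(addr + i));
    }
    out.swap(result);
    return true;
}
```

With the RFC 4566 example `c=IN IP4 224.2.1.1/127/3`, calling expand_connection("224.2.1.1", 3, out) yields 224.2.1.1, 224.2.1.2 and 224.2.1.3; for now the caller would use only out[0].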

roc-streaming / roc-toolkit

SDP support #200
