Closed eceozturk closed 4 years ago
The regular expression (([A-Za-z0-9\-\._~?-?]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&'()*+;=:@]))+(,(([A-Za-z0-9\-\._~?-?]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&'()*+;=:@]))*)*
does not allow the slash ('/') in a profile name
Separately, the profile name `http://dashig.org/guidelines/dash264` does not comply with the 4th edition definition of a URL profile identifier, which states:
When a URL is used, it should also contain a month-date in the form mmyyyy;
Hi Paul, thanks for digging into this. Are you saying that the regex does not permit the required slash, or something else? Also, the date is a "should" and thus not conformance testable. I believe that should be removed from the regex.
@mikedo correct that the regex does not permit the slash needed for a URL.
just add the necessary /s to the regex '(([A-Za-z0-9-.~?-?]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&/'()*+;=:@]))+(,(([A-Za-z0-9-.~?-?]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&'/()+;=:@])))*'
That said, I am not sure what this part, `~?-?`, of the regex is supposed to match.
OK, `~?-?` is just a formatting anomaly from `~ -ÿ`.
Stepping back, either the regex conforms to the 4th Ed or not. If it does, then the example test vectors are non-conformant and DASH-IF would have to decide how to resolve this (change test content or propose an amendment to 4th Ed). If the regex does not conform to the 4th Ed, then it first needs to be fixed.
If @profiles were a space-separated list of profiles, then a reasonable (although not perfectly constrained) data type could be simply formed from a "list of xs:anyURI", but alas....
Below is a summary of the relevant 4th Ed provisions:
The 4th Ed clause 5 clearly constrains them to only URL syntax, yet clause 8 clearly says they are either URLs or URNs. Since the 4th Ed defines specific URN profiles, one might assume the intent was per clause 8 and that the normative statements in clause 5 apply to the URL syntax only. Clause 8 also adds the comma-separated list syntax. With that interpretation, it is a comma-separated list of either URLs as constrained in clause 5 or URNs as constrained in clause 8...
From 4th Ed, clause 5.3.1.2 for MPD@profiles:
The contents of this attribute shall conform to either the pro-simple or pro-fancy productions of IETF RFC 6381:2011, subclause 4.5, without the enclosing DQUOTE characters, i.e. including only the unencodedv or encodedv elements respectively. As profile identifier, the URI defined for the conforming Media Presentation profiles as described in Clause 8 shall be used.
RFC 6381, clause 4.5:
pro-simple := "profiles" "=" unencodedv
pro-fancy := "profiles*" "=" encodedv
RFC 6381, clause 3.2:
The BNF syntax is as follows:
codecs := cod-simple / cod-fancy
cod-simple := "codecs" "=" unencodedv
unencodedv := id-simple / simp-list
simp-list := DQUOTE id-simple *( "," id-simple ) DQUOTE
id-simple := element
; "." reserved as hierarchy delimiter
element := 1*octet-sim
octet-sim := <any TOKEN character>
; Within a 'codecs' parameter value, "." is reserved
; as a hierarchy delimiter
cod-fancy := "codecs*" "=" encodedv
encodedv := fancy-sing / fancy-list
fancy-sing := [charset] "'" [language] "'" id-encoded
; Parsers MAY ignore <language>
; Parsers MAY support only US-ASCII and UTF-8
fancy-list := DQUOTE [charset] "'" [language] "'" id-list DQUOTE
; Parsers MAY ignore <language>
; Parsers MAY support only US-ASCII and UTF-8
id-list := id-encoded *( "," id-encoded )
id-encoded := encoded-elm *( "." encoded-elm )
; "." reserved as hierarchy delimiter
encoded-elm := 1*octet-fancy
octet-fancy := ext-octet / attribute-char
DQUOTE := %x22 ; " (double quote)
4th Ed, clause 8:
A profile has an identifier, which is a URI. The profiles with which an MPD complies are indicated in the MPD@profiles attribute as a comma‐separated list of profile identifiers. Profile identifiers defined in this document are URNs and shall conform to IETF RFC 8141.
RFC 8141, clause 2:
namestring = assigned-name [ rq-components ] [ "#" f-component ]
assigned-name = "urn" ":" NID ":" NSS
NID = (alphanum) 0*30(ldh) (alphanum)
ldh = alphanum / "-"
NSS = pchar *(pchar / "/")
rq-components = [ "?+" r-component ] [ "?=" q-component ]
r-component = pchar *( pchar / "/" / "?" )
q-component = pchar *( pchar / "/" / "?" )
f-component = fragment
The question mark character "?" can be used without percent-encoding inside r-components, q-components, and f-components. Other than inside those components, a "?" that is not immediately followed by "=" or "+" is not defined for URNs and SHOULD be treated as a syntax error by URN-specific parsers and other processors.
Note that RFC 8141, clause 2 has more constraints on the above.
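To make the "comma-separated list of URLs or URNs" reading concrete, here is a rough Python sketch. It checks only the assigned-name production quoted above (no rq-components or f-component, and none of the further RFC 8141 constraints), and the URL test is a loose stand-in of my own (http/https scheme plus a host), not the clause 5 rule. The names `classify` and `classify_profiles` are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# pchar from RFC 3986: unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f][0-9A-Fa-f])"

# assigned-name = "urn" ":" NID ":" NSS  (only this production is checked)
URN_ASSIGNED_NAME = re.compile(
    r"urn:"
    r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]"   # NID: 2 to 32 characters
    r":"
    + PCHAR + r"(?:" + PCHAR + r"|/)*",           # NSS = pchar *(pchar / "/")
    re.IGNORECASE,                                 # the "urn" prefix is case-insensitive
)

def classify(identifier):
    """Return 'urn', 'url', or 'invalid' for a single profile identifier."""
    if URN_ASSIGNED_NAME.fullmatch(identifier):
        return "urn"
    parts = urlsplit(identifier)
    # Assumption: treat any http(s) URI with a host as a URL profile identifier.
    if parts.scheme in ("http", "https") and parts.netloc:
        return "url"
    return "invalid"

def classify_profiles(value):
    """Split an MPD@profiles value on commas and classify each item."""
    return [(item, classify(item)) for item in value.split(",")]

print(classify_profiles(
    "urn:mpeg:dash:profile:isoff-live:2011,http://dashig.org/guidelines/dash264"
))
```

One wrinkle this exposes: pchar itself permits ",", so splitting on commas first means a URN containing an unencoded comma can never appear in the list.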
So, I read it that the schema is incorrect.
The profile name is a URI, and MPEG-defined profiles use URNs to satisfy this:
"Profile identifiers defined in this document are URNs and shall conform to IETF RFC 8141."
The schema, as published, supports this. But clause 8 then also says:
"Externally defined profiles may use profile identifiers that are URNs or URLs"
The schema, as published, does not support this.
Looking at the 3rd -> 4th edition diff, there do not seem to be any spec text changes, but the schema changed from
<xs:attribute name="profiles" type="xs:string" use="required"/>
to
<xs:simpleType name="ListOfProfilesType">
<xs:restriction base="xs:string">
<xs:pattern value="(([A-Za-z0-9\-\._~ -ÿ]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&'()*+;=:@]))+(,(([A-Za-z0-9\-\._~ -ÿ]|(%[0-9A-Fa-f][0-9A-Fa-f])|[!$&'()*+;=:@]))*)*"/>
</xs:restriction>
</xs:simpleType>
I concur. There are several issues with this I think. I'm tempted to restore the 3rd Ed data type, at least temporarily so that users (e.g. DASH-IF) can otherwise exercise the schema against old test assets and author new MPDs right. Any concerns about me committing a PR to dev branch while we sort this out?
As an additional note, apart from the forward slash character (`/`), the number character (`#`) is also not included in the profile string restriction regex.
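For anyone who wants to reproduce this outside the validator, here is a small Python sketch. It makes two assumptions: that the `~ -ÿ` in the published pattern is the range U+00A0 to U+00FF, and that Python's `re.fullmatch` is a fair stand-in for the implicitly anchored XSD pattern. `build_pattern` and its `extra` parameter are hypothetical helpers. The "patched" variant simply adds the two characters to the punctuation class and is not necessarily what any actual schema fix does.

```python
import re

def build_pattern(extra=""):
    """Rebuild the ListOfProfilesType pattern, optionally allowing extra characters."""
    # Assumption: "~ -ÿ" in the schema is the range U+00A0..U+00FF.
    char = (
        r"(?:[A-Za-z0-9\-._~\u00a0-\u00ff]"
        r"|%[0-9A-Fa-f][0-9A-Fa-f]"
        r"|[!$&'()*+;=:@" + re.escape(extra) + r"])"
    )
    # Same shape as the schema: one or more characters, then ("," characters*)*
    return re.compile(char + r"+(?:," + char + r"*)*")

published = build_pattern()
patched = build_pattern("/#")   # hypothetical fix: also allow '/' and '#'

samples = [
    "urn:mpeg:dash:profile:isoff-live:2011",
    "http://dashig.org/guidelines/dash264",
    "urn:example:profile#fragment",
]
for value in samples:
    print(value,
          bool(published.fullmatch(value)),
          bool(patched.fullmatch(value)))
```

Under those assumptions the URN passes both patterns, while the URL and the identifier containing `#` pass only the patched one.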
author new MPDs right
@mikedo what do you mean by DASH-IF authoring new MPDs right? What is wrong with the MPDs? I am totally lost since there are observations in the above thread that the schema is incorrect...
If @profiles were a space-separated list of profiles, then a reasonable (although not perfectly constrained) data type could be simply formed from a "list of xs:anyURI", but alas....
Agree - can't we deprecate the use of comma-separated lists for items that do not contain spaces (i.e. identifiers)? It seems that spaces in identifiers are no longer allowed (or at least frowned upon).
You could deprecate it from being allowed in specifications, but I don't think you could modify it in the profiles attribute in DASH - space separated isn't even a valid option there at the moment.
@waqarz As Paul and I have (also) concluded, there are several errors in the profiles regex, making the 4th Ed schema unusable for validating MPDs with URL profiles syntax. This forces ISO users that want URL profiles syntax (including DASH-IF and DASH-IF users) to either create their own schemas or revert to the 3rd Ed schema, neither of which is a good idea since either would allow new, potentially non-conformant MPDs to be created. I did not say DASH-IF had non-conformant MPDs. Sorry for any misunderstanding.
@eceozturk Thanks for the revised schema. Let us know when everyone is happy with it.
From @eceozturk in m52459, there are missing '/' and '#' chars. A fix for the regex (at least for DASH-IF test assets) is in the private commit here: https://github.com/eceozturk/DASHSchema/commit/3e411a603950c6ee01acb20ef2ad13583e978f2c
The MPD@profiles string in most of the DASH test vectors that are available at the link fails validation when tested with the 4th edition DASH schema. Further information is provided below.
Error message:
NOTE that the value of the MPD@profiles string given in the error message is just an example. It changes according to the tested MPD.
Tested DASH schema location (4th edition): https://raw.githubusercontent.com/MPEGGroup/DASHSchema/21e8bf2c973c20ad02db3df9f61bbd8759bb16f1/DASH-MPD.xsd
Testing URL: http://54.72.87.160/conformance/current/Conformance-Frontend/Conformancetest.php?schema=https://raw.githubusercontent.com/MPEGGroup/DASHSchema/21e8bf2c973c20ad02db3df9f61bbd8759bb16f1/DASH-MPD.xsd
Some example test vectors with fail status: