WGBH / PBCore2.0

Public Broadcasting Metadata Dictionary Project
http://www.pbcore.org

For 3.0, make required elements first in each sequence #96

Open mccalluc opened 9 years ago

mccalluc commented 9 years ago

In the current schema, some required elements come after optional elements. If a required element is missing from a document and you try to validate it, the error message is hard to understand: it may list every optional element that could have appeared at that point, obscuring the real problem.

This change in the schema would not be backwards compatible, so it cannot be done before 3.0.

awead commented 9 years ago

Would making validation not dependent on node order solve this?

mccalluc commented 9 years ago

I don't know, and I feel like that could introduce other problems, at least as far as human readability.

awead commented 9 years ago

I'm probably in the minority, but I think XML, and PBCore for that matter, should never be written for human readability. Computers should be parsing it, not us. Furthermore, making elements order-independent reduces the likelihood of additional problems later. For me, getting exported PBCore to validate always involved an extra step of reordering the nodes. Removing that requirement would make the process easier.
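To make the order-independent idea concrete, here is a minimal sketch of a check that counts child elements and ignores their position. It is not how an XSD validator works, and the element names and cardinality rules in `RULES` are hypothetical placeholders rather than the actual PBCore constraints; it also assumes a document without namespaces.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical cardinality rules: child name -> (min, max).
# max=None means "unbounded". These are NOT the real PBCore rules.
RULES = {
    "identifier": (1, None),
    "title": (1, None),
    "description": (0, None),
    "date": (0, 1),
}


def check_children(xml_text, rules=RULES):
    """Return a list of cardinality problems, ignoring child order."""
    root = ET.fromstring(xml_text)
    # Count direct children by tag name; position is never consulted.
    counts = Counter(child.tag for child in root)
    problems = []
    for name, (low, high) in rules.items():
        found = counts.get(name, 0)
        if found < low:
            problems.append(f"{name}: expected at least {low}, found {found}")
        elif high is not None and found > high:
            problems.append(f"{name}: expected at most {high}, found {found}")
    # Report children that no rule mentions.
    for name in counts:
        if name not in rules:
            problems.append(f"{name}: unexpected element")
    return problems
```

With a check like this, a missing required element is reported by name, and a document whose children are merely out of order produces no errors at all.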

johnnypass commented 9 years ago

I'm with you on this one, Adam. Also, is this related to an already closed issue (#19) from 4 years ago? I can't tell exactly from the comments why the issue was closed.

mccalluc commented 9 years ago

I don't understand those comments either, sorry.

A less ordered schema can work, but getting the cardinality constraints right might be tricky: I don't know that there's a way with xsd:all to say a given element must occur exactly once. If there's only one required element, you could imagine something like

xsd:sequence
   xsd:all (except required)
   required
   xsd:all (except required)

... but if there is more than one required element, then you'd have to have choice for all the different possible orderings ...
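As a rough illustration of why the choice-of-orderings approach scales badly, the sketch below enumerates the orderings a schema would have to spell out for a given set of required elements. The function name and the element names are hypothetical; the point is only that the count grows as n factorial.

```python
from itertools import permutations


def required_orderings(required):
    """List every order in which the required elements could appear."""
    return list(permutations(required))


# Hypothetical required element names, for illustration only.
for n in range(1, 6):
    names = [f"required{i}" for i in range(1, n + 1)]
    print(n, "required elements ->", len(required_orderings(names)), "orderings")
```

Each ordering would need its own branch in the xsd:choice, so even a handful of required elements makes the schema unwieldy.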

One way of resolving this is to segregate the required elements; actually, the xhtml schema is a good example of this: body is free-form, but you want at most one title, and so it goes in the head element instead, which is tighter.

Similarly, the handling of repeating elements will need to be reconsidered... Will it be confusing if occurrences are separated? Perhaps not.