OpenMath / OMSTD

The OpenMath Standard (starting with OpenMath 2)

Specify or forbid whitespace normalisation #24

Open kohlhase opened 7 years ago

kohlhase commented 7 years ago

see https://github.com/OpenMath/OM3/issues/134

kohlhase commented 7 years ago

This was already raised by David on the CDs repository. I think we should clarify this in the standard and the normative RNCs (actually everywhere). Assigning David and James to come up with a pull request.

davidcarlisle commented 7 years ago

The original OM3 issue says:

It is common OpenMath CD authoring practice (not one that I personally like, but it exists) to assume whitespace normalisation, e.g. <Name> foo </Name> is assumed equivalent to <Name>foo</Name>. However, in contrast to MathML, this behaviour is not specified. It should be; otherwise all CDs using that syntax are invalid.

I think the initial comment is not correct actually. The existing examples with spaces are valid.

In prose, the standard says that CDName contains the name; for syntactic requirements it explicitly defers to the RELAX NG schema, which says

CDName = element CDName { xsd:NCName }

which implies that the content is tokenized, including whitespace normalisation, and that the resulting token is checked against the NCName production. So

<CDName> arith1 </CDName>

is valid and equivalent to

<CDName>arith1</CDName>

and

<CDName>a r ith1</CDName>

is invalid.
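
As a rough illustration of the rule above, here is a minimal Python sketch of the collapse-then-validate steps. The names `collapse`, `is_valid_cdname` and `NCNAME_ASCII` are made up for this example, and the pattern is an ASCII-only approximation of the NCName production (the real production also admits many non-ASCII name characters), so treat it as a sketch rather than a validator.

```python
import re

# Assumption: ASCII-only approximation of the NCName production.
# The real production allows many more (non-ASCII) name characters.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def collapse(text: str) -> str:
    """Apply XSD whiteSpace="collapse" to a string.

    Tabs, line feeds and carriage returns become spaces, runs of spaces
    are squeezed to one, and leading/trailing spaces are removed.
    """
    replaced = re.sub(r"[\t\n\r]", " ", text)
    squeezed = re.sub(r" +", " ", replaced)
    return squeezed.strip(" ")


def is_valid_cdname(raw: str) -> bool:
    """Check element content as an NCName after whitespace collapse."""
    return NCNAME_ASCII.fullmatch(collapse(raw)) is not None


if __name__ == "__main__":
    for content in (" arith1 ", "arith1", "a r ith1"):
        print(repr(content), is_valid_cdname(content))
```

Surrounding whitespace disappears in the collapse step, whereas interior spaces survive it and then fail the name check.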

So no change is needed, but personally I'd be quite happy to remove the spaces from the core CDs: they may not be invalid, but they are not a good example to follow.

The same implicit tokenisation applies to xsd:date, xsd:integer, etc., so the only elements where the status of the space was (and is) ambiguous are CDBase and CDURL. These use xsd:anyURI, whose definition depends on exactly which version of XSD you use (and is self-contradictory in XSD version 1).

That said, just because a change isn't needed doesn't mean an explanatory note couldn't be added somewhere.

kohlhase commented 7 years ago

personally I'd be quite happy to remove the spaces from the core CDs: they may not be invalid, but they are not a good example to follow.

I think that is the least we should do. The spaces are indeed very bad practice. Would you?

kohlhase commented 7 years ago

That said, just because a change isn't needed doesn't mean an explanatory note couldn't be added somewhere.

I think this would be helpful; it should also say that the surrounding spaces are "bad practice" (deprecated). Would you make a pull request for this?