ivoa-std / VOTable

VOTable Format Definition

Removing the "recommendation" to use xmlns to do utype prefix binding. #35

Closed. msdemlei closed this 1 year ago

msdemlei commented 1 year ago

The idea has been to have utypes like "ssa:foo.bar", where ssa would have been bound to some URI by an xmlns:ssa attribute somewhere in the tree.

This has been a terrible idea from the start (for instance because utype isn't defined as a QName, and thus sane XML processors would remove what is, from their perspective, a gratuitous prefix binding). It also never had usable semantics (are two utypes different if their prefixes are bound to two different URIs in that way?).
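To make the semantics question concrete, here is a minimal sketch. The `expand` helper, the two binding tables, and the `example.org` URIs are all hypothetical and not taken from any standard; the sketch only shows that a literal string comparison and a QName-style comparison can disagree for the same utype text.

```python
def expand(utype, bindings):
    """Interpret a utype as if it were a QName: return (bound URI, local part).

    This is the interpretation the issue argues against; it is shown only
    to illustrate the ambiguity.
    """
    prefix, sep, local = utype.partition(":")
    if not sep:
        # No colon at all: nothing to resolve.
        return (None, utype)
    return (bindings.get(prefix), local)


# Hypothetical prefix bindings, as two documents might declare via xmlns:ssa.
doc_a_bindings = {"ssa": "http://example.org/ssa/v1"}
doc_b_bindings = {"ssa": "http://example.org/ssa/v2"}

utype_a = "ssa:foo.bar"
utype_b = "ssa:foo.bar"

# Treated as opaque strings, the two utypes are identical.
literal_equal = utype_a == utype_b

# Treated as QNames, they differ because the prefixes resolve to different URIs.
expanded_equal = expand(utype_a, doc_a_bindings) == expand(utype_b, doc_b_bindings)

print(literal_equal, expanded_equal)
```

With the binding recommendation in place, a client would have no defined way to choose between these two answers.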

Let's just drop the language; whatever is still done with utypes (e.g., datalink's adhoc prefix) works just fine without it, and the potential confusion (e.g., people could be tempted to interpret utypes as QNames/CURIEs) makes this not only dead but actually dangerous text.

mbtaylor commented 1 year ago

(I don't understand the point about XML processors removing the prefix binding; presumably the binding is still present in the XML infoset and can be recovered by applications if they want to see it and use it to decode utype value prefixes. But that's not really relevant because...)

I agree that the idea of namespace bindings adds complication without buying us anything we need, since utype usage is not so pervasive that namespace collisions are a significant worry. Since the existing text only recommends rather than requires the xmlns binding, I think it's unlikely that existing code relies on or uses such bindings, so I don't expect withdrawing this recommendation to cause problems.

But the attribute="prefix:value" form typically taken by utypes is suggestive of XML namespaces, so to avoid confusion in future I would suggest adding a comment here that these used to be considered as xmlns prefixes, but should in fact not be interpreted as such.

msdemlei commented 1 year ago

On Thu, Mar 02, 2023 at 01:19:41AM -0800, Mark Taylor wrote:

> (I don't understand the point about XML processors removing the prefix binding, presumably the binding is still present in the XML infoset and can be recovered by applications if they want to see it

No, prefix bindings are parsing details. In the infoset, namespaced tags are pairs of the bound URI and the tag name, which is why something coming in as vot:RESOURCE may come out again as ns1:RESOURCE or even just RESOURCE, and it's still the same infoset.
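As a rough illustration with one common parser, the sketch below round-trips a small element through Python's `xml.etree.ElementTree`. The `ssa` namespace URI and the utype value are made up for the example, and other XML toolkits may behave differently; this only shows what one widely used processor does with a prefix that no element or attribute name uses.

```python
import xml.etree.ElementTree as ET

# Illustrative input: the ssa URI and the utype value are assumptions,
# chosen only to mirror the "ssa:foo.bar" example from this issue.
SRC = (
    '<vot:RESOURCE xmlns:vot="http://www.ivoa.net/xml/VOTable/v1.3" '
    'xmlns:ssa="http://example.org/ssa" utype="ssa:foo.bar"/>'
)

root = ET.fromstring(SRC)

# The tag is exposed as a (URI, local name) pair; the "vot" prefix is gone.
print(root.tag)  # {http://www.ivoa.net/xml/VOTable/v1.3}RESOURCE

# The utype is an ordinary string attribute; its "ssa:" part is left untouched.
print(root.get("utype"))

# On re-serialization, only namespaces used by element or attribute names
# are declared, so the xmlns:ssa binding is not written out again.
out = ET.tostring(root, encoding="unicode")
print(out)
```

After such a round trip the utype still reads `ssa:foo.bar`, but nothing in the output binds `ssa` to a URI any more, so a client relying on the binding would have nothing to resolve the prefix against.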

> But the attribute="prefix:value" form typically taken by utypes is suggestive of XML namespaces, so to avoid confusion in future I would suggest adding a comment here that these used to be considered as xmlns prefixes, but should in fact not be interpreted as such.

I've tried my hand at some language. I force-pushed the change so this remains a single commit. I now notice that talk of QNames may be a bit overly scary. Feel free to write something else...