Closed rdeltour closed 5 years ago
My answer that you quoted still stands. Although the subsequent discussion in that thread was diverted into issues about the usage of WebIDL and not whether to use WebIDL or not (which also led to substantive changes), the remark I made there:
All that being said, I understand and share your unease about the usage of WebIDL; I think we would be happy to consider an alternative. We just did not find any...
still stands.
Although I admit I only had a cursory look at https://github.com/w3c/manifest/issues/611 (it is a very long thread), I did not see any alternative emerging either. At some point I raised the (half-serious) idea of using TypeScript for the same purpose; I guess it would be perfectly readable and concise (I actually did, for my own learning, a version of the data structure definition), but we also agreed that it would not be wise to bind to an existing programming language.
That being said: if there is an accepted formalism coming to the fore in the coming months to replace WebIDL, switching to it should be an editorial change and not a substantive one. If that is the case, I think it would be acceptable to switch to it while we are in CR. I agree we should watch this space but, in my view, we should not consider this issue as a road block.
We don't rely on WebIDL in quite the same way that web app manifest does (i.e., they have WebIDL written into their definitions and processing). I can see why they'd want to refactor all of that out. Whether we use WebIDL is, in some ways, largely irrelevant to our specification since it's just a general reference to how the processed data gets structured internally; it could be swapped out for anything else quite easily. (Our processing steps only use pseudo code with JSON examples, after all.)
I'm fine with alternatives, but given there is no consensus, whatever we pick has an uncertain future. We might just want to punt on this and see if the landscape changes before we move on from CR. Swapping in something else at CR wouldn't materially change our specification, so it doesn't seem like it would be a controversial change.
One strong argument against WebIDL, though, is that it gives the appearance of an API to our specification, where one doesn't exist (something we do try to clarify, though).
One tangential concern I've had growing in my mind lately is the prominence of the WebIDL. Given that it's just a reference for developers, having it so high in the spec may lead to authoring confusion -- i.e., what is required in the internal representation does not always match up with what is required to be authored. I don't trust readers not to just look at the WebIDL and expect that it describes what has to be in their manifests. It might be better to return it to an appendix.
It might be better to return it to an appendix.
Or perhaps moving it into the processing section might be the most appropriate place.
Although I admit I only had a cursory look at w3c/manifest#611 (it is a very long thread), I did not see any alternative emerging either.
It seems that they have a plan to describe the structure using the types from the Infra standard.
That being said: if there is an accepted formalism coming to the fore in the coming months to replace WebIDL, switching to it should be an editorial change and not a substantive one. If that is the case, I think it would be acceptable to switch to it while we are in CR. I agree we should watch this space but, in my view, we should not consider this issue as a road block.
Good! This at least makes me feel a bit less guilty of having raised this issue 😅
I think it would be acceptable to switch to it while we are in CR.
Heh, I didn't even notice we were saying the same thing. That's reading on the weekend for you... :)
It might be better to return it to an appendix.
Or perhaps moving it into the processing section might be the most appropriate place.
I just wanted to make this proposal:-)
to describe the structure using the types from the Infra standard
We might want to go all in on infra for the processing. We use the general language, but fall back on some loosely defined concepts.
It might not be much harder than adding the conversion to infra types step and switching /object/Map/ and /array/list/.
But, as far as I can see, infra is trying to unify the processing step language. Which is very useful. But it does not give a general view of the data structure like the current WebIDL does
But, as far as I can see, infra is trying to unify the processing step language.
It also defines general types... and how to convert JSON into those types.
Which is very useful.
Indeed :)
But it does not give a general view of the data structure like the current WebIDL does
It doesn't provide a syntax for defining those structures. But it does give the data types.
I was actually wondering about using, simply, the original OMG IDL. After all, this is at the basis of WebIDL but, if we use this, we take away the ambiguities around the fact that we do not define any API, the data structure can be used by a Web processor but, also, by something else, etc. On the other hand, it is not a big departure from WebIDL.
I have not fully absorbed the spec, but I noticed one thing. What OMG IDL calls char (and strings, which consist of char-s) are strictly 8-bit, essentially ASCII characters. No good for us. It also has, however, "wide" characters, called wchar (and, consequently, wstring). It does not really say what a "wide" character is, but it can consist of several bytes, i.e., it could be used to store Unicode code points encoded in, e.g., UTF-8. Unfortunately, the spec does not refer to Unicode at all (for a spec that was updated in 2018, it is a bit surprising). This is a detail we must check if we go down that route.
But it does not give a general view of the data structure like the current WebIDL does
It doesn't provide a syntax for defining those structures. But it does give the data types.
Ya, I'm not suggesting we drop the webidl, at least not yet. All I'm suggesting is that for the processing steps we use the infra spec more completely.
For example, instead of these two steps to parse the json:
Let manifest be the result of parsing text as JSON [ecmascript]. If parsing throws an error, this is a fatal error. Return failure.
If typeof(manifest) is not Object [ecmascript], this is a fatal error. Return failure.
By the infra spec we could instead use something like:
- Let manifest be the result of parsing JSON into Infra values given text.
- If manifest is not a map, this is a fatal error. Return failure.
After that, there's only a few instances where we use different data type names, which we assume are widely understood but wouldn't hurt to tie to the infra datatypes instead.
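To make the two Infra-based steps above more concrete, here is a rough TypeScript sketch. It is only an illustration under assumptions: `InfraValue`, `toInfra`, and `parseManifest` are hypothetical names, Infra maps and lists are approximated with `Map` and arrays, and the "fatal error" is modelled as a `"failure"` return value.

```typescript
// Hypothetical approximation of Infra values using TypeScript built-ins.
type InfraValue =
  | null
  | boolean
  | number
  | string
  | InfraValue[]
  | Map<string, InfraValue>;

// Convert a value produced by JSON.parse into Infra-like values:
// JSON objects become ordered maps, JSON arrays become lists.
function toInfra(value: unknown): InfraValue {
  if (
    value === null ||
    typeof value === "boolean" ||
    typeof value === "number" ||
    typeof value === "string"
  ) {
    return value;
  }
  if (Array.isArray(value)) {
    return value.map(toInfra);
  }
  // JSON.parse only yields plain objects at this point.
  const source = value as Record<string, unknown>;
  const map = new Map<string, InfraValue>();
  for (const key of Object.keys(source)) {
    map.set(key, toInfra(source[key]));
  }
  return map;
}

// Returns the manifest as a map, or "failure" on a fatal error.
function parseManifest(text: string): Map<string, InfraValue> | "failure" {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return "failure"; // parsing threw: fatal error
  }
  const manifest = toInfra(parsed);
  if (!(manifest instanceof Map)) {
    return "failure"; // top-level value is not a map: fatal error
  }
  return manifest;
}
```

With this sketch, `parseManifest('["a"]')` and `parseManifest('{oops')` both return `"failure"`, while a JSON object yields a `Map` whose nested objects and arrays have been converted in the same way.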
I am fine rewording the steps if it is not tooooo much trouble (I do not know how stable infra is). But this looks orthogonal to the original issue.
In the context of the TAG issue "the proliferation of manifests at W3C" by @tantek, I believe that things like how to describe data structures are typically worth looking into, to see whether these specs can or need to adopt a unified approach.
I worry about the proliferation of manifests, too. I've been experimenting with manifests that are both pub manifests and web app manifests (with link rel="manifest publication"). At least online web app manifest validators don't seem bothered by all our contexts and extra information; they just return the members they recognize. But I haven't tried to get such a manifest to actually install on Android.
I also think it is important for ordinary web developers to be able to do useful things with our manifest. Having things defined in terms of infra might help?
I do not know how stable infra is
Very. Its sole purpose is to be the bedrock on which other standards are built.
I also think it is important for ordinary web developers to be able to do useful things with our manifest. Having things defined in terms of infra might help?
I think this conflates separate concerns. Infra just gives us generic data types. We still need for those things to be processed in a logical way into some canonical form.
I think I need to do the conversion to Infra with web manifest to show how this works; then the pub specs can leverage the data processing algorithms to piggyback on top of web manifest. That is, assuming the pub spec can be used on top of web manifest.
This issue was discussed in a meeting.
Another insane idea...
The reason we have WebIDL is to have a programming-language independent, but easily graspable overview of the data structure that is generated. I believe it is important to have this. The problem with using something like TypeScript is that it is a specific programming language and could be misunderstood.
However, what about embracing the TypeScript option and putting, side by side, the same data structure in other programming languages that do have typing? So we could put into an informative appendix the same data structure in TypeScript, Rust, Java... I do not know about typing in Swift; unfortunately, JavaScript or Python would not qualify because we cannot really express types for those. But if we have at least those three, this would take care of the possible misunderstanding of a single language, and we could drop WebIDL.
I know it is insane... but maybe it works nevertheless
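As a rough illustration of what one such side-by-side rendering could look like, here is a small TypeScript sketch. It is not the specification's data structure: the interface names and members (`PublicationManifest`, `LinkedResource`, `name`, `readingOrder`, and so on) are hypothetical placeholders chosen only to show the style.

```typescript
// Illustrative only: names and members are hypothetical placeholders,
// not the definitions from the specification.
interface LinkedResource {
  url: string;
  encodingFormat?: string; // optional media type of the resource
  name?: string;           // optional human-readable label
}

interface PublicationManifest {
  id?: string;                    // optional identifier
  name: string[];                 // one or more titles
  readingOrder: LinkedResource[]; // default reading order
}

// A value that type-checks against the sketch above.
const exampleManifest: PublicationManifest = {
  name: ["Moby Dick"],
  readingOrder: [{ url: "chapter1.html", encodingFormat: "text/html" }],
};
```

An equivalent Rust or Java version would declare the same members with that language's own optional and list types, which is what would keep any single rendering from being read as normative.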
The reason we have WebIDL is to have a programming-language independent, but easily graspable overview of the data structure that is generated.
I think this was an original goal, but now it defines how JS and C++ (and maybe Rust a tiny fraction of the time) pass data between each other in a somewhat secure, type-coercing, error-handling manner... amongst other things.
But if we have at least those three, this would take care of the possible misunderstanding of a single language, and we could drop WebIDL.
I think we may be getting ahead of ourselves here. I think we need to answer: Who, exactly, is supposed to process the data in this specification?
If it’s a text editor, then JSON-Schema or maybe TypeScript might be very useful. If it’s a browser (and the data never ends up in a JS environment), then Infra types are best, for instance.
Let’s start by answering the question above. What is the primary conforming user agent you are targeting?
As an example, note that Web Manifest links to a non-normative JSON-Schema that was created for use with Visual Studio.
Let’s start by answering the question above. What is the primary conforming user agent you are targeting?
There isn't a single answer to that question, which is what complicates a lot of our decision making. It could be a browser. It could be a JS-based reading app. It could be a standalone reading app.
We define an informative json schema, as well, for authoring, but there's greater flexibility in authoring so it doesn't show the expected internal representation (not clearly, anyway).
A major chunk of processing is normalizing the data so we can be flexible to the kinds of patterns people use with schema.org metadata (i.e., not following strict typing). The idea being that you don't have to define SEO metadata separately from the manifest metadata.
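A minimal sketch of that kind of normalization, in TypeScript, might look like the following. It assumes a hypothetical member that authors may write as a bare string, a single object, or an array mixing both; `Entity`, `AuthoredEntities`, and `normalizeEntities` are illustrative names, not terms from the specification.

```typescript
// Hypothetical canonical form: always an array of objects with a name.
interface Entity {
  name: string;
}

// Hypothetical authored forms: a string, an object, or an array of either.
type AuthoredEntities = string | Entity | Array<string | Entity>;

function normalizeEntities(value: AuthoredEntities): Entity[] {
  // Wrap a single value in an array so every case is handled uniformly.
  const items = Array.isArray(value) ? value : [value];
  // Expand bare strings into objects; copy objects as they are.
  return items.map((item) =>
    typeof item === "string" ? { name: item } : { name: item.name }
  );
}
```

Under these assumptions, `normalizeEntities("Herman Melville")` and `normalizeEntities([{ name: "Herman Melville" }])` produce the same canonical array, which is the point of the processing step: authors keep their flexibility while consumers see one shape.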
I'm still partial to presenting this kind of data visualization aid outside the specification. It may be helpful, but when it's not critical and causes confusion it's probably not worth the hassle.
I think this was an original goal, but now it defines how JS and C++ (and maybe Rust a tiny fraction of the time) pass data between each other in a somewhat secure, type-coercing, error-handling manner... amongst other things.
Yep, that is the problem we are fighting with: WebIDL has outgrown that original usage, hence the awkwardness of using it here.
To answer your question, @marcoscaceres: the processing of the data is supposed to be done by what this community calls a "reading system". That can be a separate application rendering audiobooks (or other forms of publications), a plugin in a browser, or the browser itself (as was the case, for EPUB, in the now defunct Edge functionality on EPUB-s).
Our fearless editor, @mattgarrish, is working on transforming the text to use the terminology/style of infra. But I still believe that something like (the original:-) WebIDL would be useful, if for nothing else to make the spec, and the "target" of the processing step more understandable to whoever reads the spec. Hence my (insane:) proposal to possibly use several examples on how that data structure is represented in specific languages (that would not be normative).
B.t.w., we also have a json schema for the manifest...
This issue was discussed in a meeting.
Closed by virtue of the merge of PR #103
The Publication Manifest specification relies on WebIDL to define the internal data structure of manifests.
One of the reasons we chose WebIDL is that we were influenced by its use in Web App Manifest. Now, it seems that the Web App Manifest folks are reconsidering that, see the entire discussion in w3c/manifest#611. (Thanks @marcoscaceres 🎂 for hinting at this discussion at TPAC).
The TAG also commented on our use of WebIDL, to which @iherman replied.
Are we sure that WebIDL is the right approach and won't create more future issues than it solves? (I’m not an expert in WebIDL myself, and still need to digest the issues raised for Web App Manifest).
In the context of the TAG issue "the proliferation of manifests at W3C" by @tantek, I believe that things like how to describe data structures are typically worth looking into, to see whether these specs can or need to adopt a unified approach.
I know it’s late in our editing process, and we’ve had lengthy discussions on using WebIDL, and even more time spent on actual editing. But we'd better make sure we're on the right track before moving to CR.