WebVMT: Investigate better out-of-band link handling

w3c / sdw

Repository for the Spatial Data on the Web Working Group

https://www.w3.org/2020/sdw/

WebVMT: Investigate better out-of-band link handling #1107

Open rjksmith opened 5 years ago

rjksmith commented 5 years ago

VMT files include a url attribute which defines the linked media file. This information is duplicated in the src attribute of an HTML5 <video> element when displayed in a web page.

The SDW IG review (#1094) raised the question of whether there is a better way to handle out-of-band (or non-embedded) links, to mitigate errors due to file renaming.

After further thought, the current approach seems entirely consistent with HTML's basic design. For example, an HTML file containing <img src="myimage.jpg"> displays an image in a web page and suffers exactly the same issue if the linked image file is renamed, which suggests that the current design is in keeping with established practice.

HTML makes best effort to display the page, despite errors, so probably the best approach is to trigger a warning message if the URL is incorrect and cannot be resolved. However, I'm interested to hear any constructive alternatives.

tidoust commented 5 years ago

Not that I believe that this issue is that crucial but I think the above summary misses the point that I am trying to make, so allow me to reformulate:

In the image example, the main starting point is the HTML page. If the link is broken, the image won't render. Such an error is easy to detect and fix because it has a practical consequence: the browser fails to fetch the image, cannot render the page properly, and can report a warning. Or the reader sees the broken image and yells one way or the other. In short, someone notices.

The image itself does not link back to the HTML page. Metadata in the image itself could probably include a link back to the HTML page that embeds it. However, such a link is much more error-prone because nothing breaks when that link is broken: browsers won't fetch the link to render the image or to render the HTML page, and these are by far the main usage scenarios. Browsers won't see the error. Users won't see anything wrong either. In short, no one notices.

What could "trigger a warning message if the URL is incorrect and cannot be resolved"? Search engines when they index the content? How would they report the warning? Authoring tools would be able to detect the error, but that supposes that people will use authoring tools, and experience with HTML shows that people are actually happy to write code by hand and copy-and-paste things from one place to another, etc. WebVMT files being text-based would see the same kind of behavior.

I believe the same thing applies to WebVTT, WebVMT, and other out-of-band tracks. The main usage is going to be rendering the video. The starting point will be either the audio/video file, or the HTML page that embeds it, so the link should be from the audio/video file or the HTML page to the WebVMT file, and that's going to be the only link that can be somewhat trusted.

For better or worse, links on the Web are not bidirectional. Search engines that index images will manage to reconstruct the link from the image to HTML page(s) simply because they will crawl HTML pages to start with. Similarly, indexing can reconstruct the link from the WebVMT file to the media file(s) by crawling HTML pages and/or media files to start with.
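
As a rough illustration of the crawling argument above, the following Python sketch rebuilds the track-to-media association from HTML alone. It assumes, hypothetically, that each page names the media in a `<video src>` attribute and the companion track file in a nested `<track src>` attribute. The class name, function name, and file names are invented for this example and do not come from any WebVMT tooling.

```python
from html.parser import HTMLParser


class TrackLinkCollector(HTMLParser):
    """Collect (track URL, media URL) pairs from <video>/<track> markup."""

    def __init__(self):
        super().__init__()
        self.current_media = None  # src of the <video> currently open, if any
        self.pairs = []            # (track_src, media_src) tuples found so far

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video":
            self.current_media = attributes.get("src")
        elif tag == "track" and self.current_media and attributes.get("src"):
            self.pairs.append((attributes["src"], self.current_media))

    def handle_endtag(self, tag):
        if tag == "video":
            self.current_media = None


def reverse_index(pages):
    """Map each track URL to the set of media URLs it is paired with."""
    index = {}
    for html in pages:
        collector = TrackLinkCollector()
        collector.feed(html)
        collector.close()
        for track_src, media_src in collector.pairs:
            index.setdefault(track_src, set()).add(media_src)
    return index


# Hypothetical page: the file names are illustrative only.
page = """
<video controls src="walk.mp4">
  <track kind="metadata" src="walk.vmt">
</video>
"""
print(reverse_index([page]))
```

The index is derived entirely from the forward links in the pages, so it stays correct only as long as those pages are re-crawled after files are renamed.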

In summary, I'm suggesting that having a link to the media file from the WebVMT file may not be a good idea because it can break easily without practical consequence, and thus is an unreliable source of information. The alternative is not to have a link from the WebVMT file to the media file / HTML page: the link goes the other way round and that seems enough.

rjksmith commented 5 years ago

Thanks for your clarification.

If I've understood correctly, your concerns are:

Errors in the WebVMT media url cannot be easily identified, so this information may be unreliable.
Web links should not be bidirectional, e.g. the image should not contain a link to the HTML.

Firstly, warnings could be displayed by the WebVMT engine in the rendered webpage, e.g. by displaying text in the map element instead of the rendered map. Is there a precedent for displaying errors or warnings within the browser, e.g. broken image link?

Secondly, to clarify my webpage-image analogy, the VMT file contains a URL to the media file in the same way as an HTML file contains a URL to the image file. I can't see a bidirectional link - there's no link from the VMT to the HTML. Have I misunderstood?

HTML -> image
VMT -> media

While the HTML use case is important for WebVMT, it's not the only use case. The VMT file should contain sufficient data for all use cases, and the purpose of the WebVMT media block is to identify the linked media - particularly in non-HTML use cases, e.g. for the mobile demo. In the HTML use case, some of the data can be duplicated into the HTML file, which is a product of the VMT source data, e.g. by a PHP script.

If the linked media URL is not in the VMT file, where would that information be recorded?

tidoust commented 5 years ago

In the WebVMT case, my working assumption is that the main starting point is either the HTML page (HTML -> media, VMT) or the media file (media -> VMT). Use cases that start from the VMT file seem less likely to me. That may be where we have different perspectives.

> Firstly, warnings could be displayed by the WebVMT engine in the rendered webpage, e.g. by displaying text in the map element instead of the rendered map. Is there a precedent for displaying errors or warnings within the browser, e.g. broken image link?

Sure, that's typically the kind of information that you get on the developer console when you browse a page. However, that information is always restricted to resources that the browser needs to fetch to render the page. If it already has the URL of the media file and of the WebVMT file, the WebVMT engine does not need to fetch the link between the VMT file and the media file to render things. That would take extra time, CPU and network usage. So I wouldn't expect it to report on it being broken.

> Secondly, to clarify my webpage-image analogy, the VMT file contains a URL to the media file in the same way as an HTML file contains a URL to the image file. I can't see a bidirectional link - there's no link from the VMT to the HTML. Have I misunderstood?

The analogy does not fully hold because there may be 3 resources at play in your case (HTML, media, VMT). In the webpage-image analogy, the main starting point for most use cases is the HTML page, so the link is HTML -> image.

> While the HTML use case is important for WebVMT, it's not the only use case. The VMT file should contain sufficient data for all use cases, and the purpose of the WebVMT media block is to identify the linked media - particularly in non-HTML use cases, e.g. for the mobile demo. In the HTML use case, some of the data can be duplicated into the HTML file, which is a product of the VMT source data, e.g. by a PHP script.

In the demo you link to, I wouldn't expect users to point the player at the VMT file directly, in the same way as I wouldn't expect users to point a video player at a subtitles file directly (that is, it may work, but I wouldn't expect the video to appear as if by magic).

> If the linked media URL is not in the VMT file, where would that information be recorded?

I don't know. How is the link between a video file and a subtitles file recorded? I'd record it as metadata in the video file (again because that's the starting point for me).

rjksmith commented 5 years ago

> In the WebVMT case, my working assumption is that the main starting point is either the HTML page (HTML -> media, VMT) or the media file (media -> VMT). Use cases that start from the VMT file seem less likely to me. That may be where we have different perspectives.

I agree that the HTML is a starting point (HTML -> media, VMT), and that is an important use case.

I disagree that the media file is a starting point, as it contains no reference to the VMT file. Perhaps I've misunderstood. Do you mean 'starting point' for DOM parsing, or something else?

The VMT file would be pretty meaningless without the linked media (VMT -> media), though technically it could just run on a timer in isolation. However, there are HTML use cases where this information is essential - see Playlist Use Case below.

> Sure, that's typically the kind of information that you get on the developer console when you browse a page. However, that information is always restricted to resources that the browser needs to fetch to render the page. If it already has the URL of the media file and of the WebVMT file, the WebVMT engine does not need to fetch the link between the VMT file and the media file to render things. That would take extra time, CPU and network usage. So I wouldn't expect it to report on it being broken.

If the media URL is in the HTML file, I agree that there is no need for the WebVMT engine to fetch the resource (and it doesn't). However, the media URL may not be in the HTML, which comes back to the question of 'where is it recorded?' - see Playlist Use Case below.

> The analogy does not fully hold because there may be 3 resources at play in your case (HTML, media, VMT). In the webpage-image analogy, the main starting point for most use cases is the HTML page, so the link is HTML -> image.

In the case you describe, HTML -> media & VMT and also VMT -> media. There's some duplication in this case, but I don't think it's bidirectional.

> In the demo you link to, I wouldn't expect users to point the player at the VMT file directly, in the same way as I wouldn't expect users to point a video player at a subtitles file directly (that is, it may work, but I wouldn't expect the video to appear as if by magic).

In the mobile demo, the user must load both files: media and VMT. Just loading one file doesn't magically make the other appear, but sufficient information is required to allow the browser to make the correct association between them. In this case, there is no information in the HTML as the files are loaded afterwards by the user via an HTML input element, as you can see from this HTML excerpt:

<input id="vmt-load" multiple type="file" accept="video/*,video/mp4,.vmt">
<video id="vmt-video" controls src="">
   <track kind="metadata" map-id="vmt-map" tile-url="https://{s}.tile.openstreetmap.com/{z}/{x}/{y}.png?apikey=MY_KEY" />
</video>

The Playlist Use Case below explores this in more detail.

rjksmith commented 5 years ago

Playlist Use Case

This use case is designed to demonstrate why linked media information is required, even if it's not present in HTML.

Consider media and VMT files loaded into a webpage to dynamically create a media playlist. The user loads multiple files via an input element to play the media in sequence, in a similar way to a YouTube playlist. Note that the HTML contains no URLs for either the media or VMT files.

<input id="vmt-load" multiple type="file" accept="video/*,video/mp4,.vmt">
<video id="vmt-video" controls src="">
   <track kind="metadata" map-id="vmt-map" tile-url="https://{s}.tile.openstreetmap.com/{z}/{x}/{y}.png?apikey=MY_KEY" />
</video>

The browser must create a playlist and correctly pair the media and VMT files, though no association details exist in the HTML. The only sources of information are the files that are loaded by the user, and the media files are unmodified by WebVMT, so the only place this information can be recorded is in the VMT file.

It should be noted that:

HTML successfully supports this use case without WebVMT content;
This use case is equally applicable to WebVTT content.
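
The pairing step in this use case could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the media block is assumed to be a line reading `MEDIA` followed by `key:value` lines (including `url:`) up to the next blank line, which is a simplification and not the normative WebVMT syntax. The function names and file names are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def media_url(vmt_text):
    """Return the url value of the first MEDIA block, or None if absent.

    Assumes a simplified layout: a line reading MEDIA followed by
    key:value lines until the next blank line.
    """
    in_media = False
    for line in vmt_text.splitlines():
        stripped = line.strip()
        if stripped == "MEDIA":
            in_media = True
        elif in_media and not stripped:
            break  # a blank line ends the block
        elif in_media and stripped.startswith("url:"):
            return stripped[len("url:"):].strip()
    return None


def pair_files(media_names, vmt_files):
    """Pair loaded media names with VMT files via the url's final path segment.

    media_names: names of the media files chosen by the user
    vmt_files: dict mapping VMT file name -> file text
    Returns (pairs, unmatched), where pairs maps VMT name -> media name.
    """
    available = set(media_names)
    pairs, unmatched = {}, []
    for vmt_name, text in vmt_files.items():
        url = media_url(text)
        base = posixpath.basename(urlsplit(url).path) if url else None
        if base in available:
            pairs[vmt_name] = base
        else:
            unmatched.append(vmt_name)
    return pairs, unmatched


# Hypothetical files loaded through the input element.
loaded_vmt = {
    "trip.vmt": "WEBVMT\n\nMEDIA\nurl:videos/trip.mp4\n\n",
    "orphan.vmt": "WEBVMT\n\nMEDIA\nurl:missing.mp4\n",
}
print(pair_files(["trip.mp4", "city.mp4"], loaded_vmt))
```

Only the final path segment is compared because a file picker typically exposes file names and not full paths. A VMT file whose media is not among the loaded files ends up in the unmatched list, which is where a warning could be raised.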

tidoust commented 5 years ago

> This use case is equally applicable to WebVTT content.

Precisely. There is nothing specific to WebVMT there and so it seems useful to look at how things get done for other companion track files.

The usual media-centric response is to embed all tracks in the media container. WebVMT starts on the premise that the file will be maintained externally. However, I note that people might still choose to embed WebVMT content within the media container, as done for WebVTT in Matroska.

For external subtitles, a naming convention seems to be the most common pattern: given a local media file, media players will look for subtitle files that have the same name (and e.g. that have an srt extension) in the same folder. That is not very satisfactory, but this seems to be how things are done.
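
A minimal Python sketch of such a naming convention is shown below. The extension list and the function name are assumptions made for illustration; real media players each apply their own matching rules.

```python
from pathlib import PurePosixPath


def find_companions(media_name, folder_listing, extensions=(".srt", ".vtt", ".vmt")):
    """Return files in folder_listing that share media_name's stem
    and carry one of the companion extensions."""
    stem = PurePosixPath(media_name).stem
    return sorted(
        name for name in folder_listing
        if PurePosixPath(name).stem == stem
        and PurePosixPath(name).suffix.lower() in extensions
    )


# Hypothetical folder contents.
listing = ["holiday.mp4", "holiday.srt", "holiday.vmt", "other.srt"]
print(find_companions("holiday.mp4", listing))
```

The function takes the folder listing as an input, so the convention only helps where such a listing can be obtained.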

I note that a media file can be associated with a number of external files, for instance external subtitles, a WebVMT file, a file that describes notable events, a video file with sign language, etc. The linking problem extends to all of these files. I'd say that there are four main ways to manage the links:

Through a file naming convention. I don't think that works over HTTP where one cannot list available resources on an HTTP server to know which ones to fetch.
By embedding the files into a common media container. That is the usual media-centric response. It requires more tooling and makes editing "by hand" impossible.
By including links to external files in the media container. I'm not sure whether container formats allow such metadata to be included. That doesn't seem to be something that gets done in practice.
By recording the links at a different layer as typically done in HTML through the track element. I guess one could also envision HTTP Link headers (which the media player could parse when it fetches the initial media file). And obviously this could be done using RDF or a simple JSON or text-based data structure. That is how things are done on the Web. The obvious drawback is that one needs to manage yet another file.

Putting a link to the media file in each of the external track files seems wrong to me since it does not give you a way to assemble these files together unless you already know that all of these files exist. But then, if you know that all of the files exist as in the playlist use case, the common practice seems to be an implicit link through a naming convention.

All in all, I don't have a good solution to propose for managing these links, I just feel that this is better addressed at a different level and not within WebVMT, where it wouldn't provide a solution for all use cases. That's all your call though, feel free to proceed with the WebVMT media block as currently specified!

rjksmith commented 5 years ago

Thanks for your feedback.

I'm not sure I understand what you mean by:

> it does not give you a way to assemble these files together unless you already know that all of these files exist

If we know that one file exists, there are three possibilities:

a. Starting from an HTML file, I agree that the media and VMT files are already identified in HTML, unless they're loaded by the client. The link information is duplicated in the VMT file and the HTML takes precedence. In the static HTML case there is redundancy, and for dynamic HTML (e.g. from a PHP script) the VMT file includes the required information to construct the HTML.

b. Starting from a media file, there's no embedded information so the media file is unaware of the VMT file and ignores it. This allows metadata to be added without modifying existing media, which is important for certain use cases, e.g. YouTube demo, Police Evidence.

c. Starting from the VMT file, the media block identifies the linked media. I acknowledge that this could be achieved with file naming conventions in most cases, though I think that would be weak design, and I don't see how it would work for the YouTube use case as the media file is not accessed directly and its name is unknown. Including explicit link (i.e. url) details removes ambiguity about which media is meant, which I believe is important for an XML variant of the data model, to ensure correct validation.

@lvdbrink: Can you offer a comment on this XML validation issue?

> it wouldn't provide a solution for all use cases

Although I hadn't considered embedding VMT content within the media file, the media block could point to the current file in this case.

Perhaps I've misunderstood. Please outline a use case where the media block would not provide a solution.

lvdbrink commented 5 years ago

Concerning XML validation: in an XML variant of the VMT file, it would be possible to define (in an XML Schema) that there must be an explicit link and the contents of the link element must be of type anyURI. XML validation does not allow you to validate if the link target exists, however.
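
The distinction between checking that a link is present and checking that its target exists can be sketched in Python. This is a standard-library stand-in for the schema constraint, not XML Schema validation itself, and the element names `webvmt`, `media` and `url` are hypothetical since no XML variant is defined in this thread.

```python
import xml.etree.ElementTree as ET


def check_media_link(xml_text):
    """Return a list of problems with the media link; an empty list means
    the link is present and non-empty.

    Only the document structure is inspected. Whether the target can be
    fetched is deliberately not checked, mirroring schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]
    url = root.find("./media/url")  # hypothetical element names
    if url is None:
        return ["missing <media><url> element"]
    if not (url.text or "").strip():
        return ["<url> element is empty"]
    return []


print(check_media_link("<webvmt><media><url>video.mp4</url></media></webvmt>"))
print(check_media_link("<webvmt><media/></webvmt>"))
```

A document that passes this check can still point at a file that has been renamed or removed; detecting that would need a separate step that actually tries to fetch the target.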

rjksmith commented 5 years ago

@lvdbrink Thanks. Good points.

Is there any best practice guidance that says linked files/media should be identified in XML, or do you know of existing examples of this?
Is there a way to validate that URI target resources can be accessed?

tidoust commented 5 years ago

> I'm not sure I understand what you mean by:
>
> > it does not give you a way to assemble these files together unless you already know that all of these files exist

In the generic case, there is a list of files to assemble together, for instance a video file, a subtitles file, and a WebVMT file. If you start from any of these files, you cannot find the other two. A link from the WebVMT file to the media file won't give you the subtitles file.

In the media world, the usual approach to link these files together is to embed them in a media container to get back to a situation where there is only one file. On the Web, the usual approach is to create a fourth file that links to the 3 files, and to use this fourth file as starting point.
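
The "fourth file" approach might look like the following Python sketch, which assumes a small JSON manifest listing the other three files. The manifest keys, file names and the example.org URL are purely illustrative and are not part of any specification.

```python
import json
from urllib.parse import urljoin


def resolve_manifest(manifest_text, manifest_url):
    """Resolve every link in a manifest relative to the manifest's own URL."""
    manifest = json.loads(manifest_text)
    return {role: urljoin(manifest_url, link) for role, link in manifest.items()}


# Hypothetical manifest tying the three files together.
manifest_text = json.dumps({
    "media": "trip.mp4",
    "subtitles": "trip.vtt",
    "map": "trip.vmt",
})
resolved = resolve_manifest(manifest_text, "https://example.org/videos/manifest.json")
print(resolved)
```

Starting from the manifest, all three files can be found, whereas none of them needs to know about the others.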

rjksmith commented 5 years ago

Thanks for the clarification.

In the three file case you describe, and further to items b and c above, the WebVMT media block allows the media to be found from the VMT file, which addresses your issue. It doesn't allow the subtitle file to be found, which is beyond the scope of WebVMT, or allow the media file to find the VMT file as no content is embedded, i.e. by design.

Perhaps we should include the media world embedded use case in the Editor's Draft as a way to explore that idea, either as an example or a use case. I hadn't considered using WebVMT in this way, and agree that it would potentially create a bidirectional link, though this could be addressed in the media block by omitting the media url, as there is no linked media if it's already embedded.

The HTML file (item a above) provides the fourth file for the webpage use case.

Does that address your concerns?

tidoust commented 5 years ago

> Does that address your concerns?

Nope but I don't object to it, please proceed! :)