hypothesis / via

Proxies third-party PDF files and HTML pages with the Hypothesis client embedded, so you can annotate them
https://via.hypothes.is/
BSD 2-Clause "Simplified" License

Implement general video annotation based on HTML `<video>` and related standards #1293

Closed robertknight closed 7 months ago

robertknight commented 8 months ago

The current video annotation tool works with YouTube embeds. We'd like to extend it to more sources, starting with Canvas Studio. Preferably, we'd like to avoid building a new player UI for every video service that we integrate with. One way to do this is to build a more general-purpose video annotation feature using browser-native video support. The way this would work is:

  1. The Hypothesis LMS app, or another service, obtains the media and transcript URLs, e.g. by using APIs provided by the media owner.
  2. The LMS app, or other service, launches the video annotation tool in Via (e.g. via a form submission or URL query params), supplying the following data:
    • Media URLs for use in a `<video>` element's `<source>` list. These must be in widely supported formats such as MP4. These may be temporary signed URLs.
    • The canonical URL for the media, to store with annotations as the document URL
    • Metadata (title etc.) for the media, to store as document metadata
    • The content of the transcript, or a URL pointing to it, in WebVTT or SRT format
  3. The Via backend would fetch, or parse, the transcript and convert it into the JSON format that the video annotation frontend consumes. The backend would then serve the video annotation frontend, along with the transcript data and canonical URL. The frontend would display the video using a `<video>` element, and the rest of the UI would work the same way it does with YouTube.
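
To make step 3 more concrete, here is a rough sketch of how a backend might turn SRT or WebVTT text into JSON segments. This is an illustration only, not Via's actual code: the function names `parse_transcript` and `transcript_json`, and the `{"segments": [{"start", "end", "text"}]}` output shape, are assumptions, since the JSON format the frontend consumes is not described in this issue.

```python
import json
import re

# Matches a cue timing line such as "00:00:01,000 --> 00:00:04,500" (SRT)
# or "00:01.000 --> 00:04.500" (WebVTT, where the hours field is optional).
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    """Convert the captured timestamp fields into seconds as a float."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_transcript(text):
    """Convert SRT or WebVTT text into a list of segment dicts.

    The segment keys (start, end, text) are a hypothetical shape chosen
    for this sketch.
    """
    segments = []
    # In both formats, cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                break
        else:
            # No timing line: a "WEBVTT" header, NOTE or STYLE block.
            continue
        # Everything after the timing line is the cue text.
        cue_text = " ".join(part.strip() for part in lines[index + 1:] if part.strip())
        if not cue_text:
            continue
        fields = match.groups()
        segments.append(
            {
                "start": _seconds(*fields[:4]),
                "end": _seconds(*fields[4:]),
                "text": cue_text,
            }
        )
    return segments


def transcript_json(text):
    """Serialize the parsed segments for the frontend (assumed shape)."""
    return json.dumps({"segments": parse_transcript(text)})
```

Because the timing pattern accepts both `,` and `.` as the millisecond separator and treats the hours field as optional, the same function handles both formats. It ignores WebVTT cue settings and inline markup, which a real implementation would probably need to strip or interpret.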

robertknight commented 8 months ago

In the case of Canvas Studio specifically, there are APIs for downloading the media, which return a signed URL for an MP4 file (or possibly other formats?), plus the transcript in SRT format (or possibly other formats?). We need to check whether there are any other formats we might get back that we'll have to handle.
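
Until the set of possible formats is confirmed, one defensive option is to sniff the transcript format before parsing and fail loudly on anything unexpected. The sketch below is an assumption about how that could look, not something Canvas Studio or Via provides: `detect_transcript_format` is a hypothetical helper. It relies only on two format facts: WebVTT files begin with the string `WEBVTT`, and SRT cues start with a numeric counter followed by a timing line that uses a comma before the milliseconds.

```python
import re

# SRT timing lines use a comma before the milliseconds,
# e.g. "00:00:01,000 --> 00:00:04,500".
SRT_TIMING = re.compile(r"\d{2,}:\d{2}:\d{2},\d{3}\s+-->\s+\d{2,}:\d{2}:\d{2},\d{3}")


def detect_transcript_format(text):
    """Return "vtt", "srt", or None if the format is not recognized."""
    # Drop an optional byte order mark and any leading whitespace.
    body = text.lstrip("\ufeff \t\r\n")
    if body.startswith("WEBVTT"):
        return "vtt"
    lines = body.splitlines()
    if (
        len(lines) >= 2
        and lines[0].strip().isdigit()
        and SRT_TIMING.match(lines[1].strip())
    ):
        return "srt"
    return None
```

A caller could treat a `None` result as an error and log it, which would surface any format not yet accounted for.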

robertknight commented 7 months ago

This was implemented in https://github.com/hypothesis/via/pull/1294.