hypothesis / client

The Hypothesis web-based annotation client.
Other
641 stars 197 forks source link

Get Hypothesis client working in new VitalSource reader #3590

Closed robertknight closed 2 years ago

robertknight commented 3 years ago

The Hypothesis-enabled version of the VitalSource Bookshelf application has been updated to use a new version of their reader. It can be tested at https://hypothesis.instructure.com/courses/125/assignments/1518. (Caveat: As of 2021-07-19 there is currently an issue that the new reader is not used if the window size is below some threshold (~1200px?))

The Hypothesis client is being loaded and configured in the new reader, but the sidebar is not visible. When a selection is made the Annotate and Highlight buttons appear, but often obscured by the existing VS controls. Clicking on the buttons does not cause the sidebar to appear however.

Update 2021-07-25: The issue with the sidebar not appearing has been resolved, but this now reveals an additional problem where the sidebar controls become inaccessible when the viewport is scrolled. See https://github.com/hypothesis/client/issues/3590#issuecomment-886229080. We need to explore some options and talk with the VS team about ways to resolve this.

robertknight commented 3 years ago

Ah - I think the problem here is one that we encountered while looking into a new inter-frame communication system. The client's current logic for the sidebar discovering guests involves the sidebar recursively enumerating frames in the window using window.frames. However that property does not include iframes inside Shadow DOM. The new VitalSource viewer renders the book content inside an iframe that is contained within a shadow root belonging to a <mosaic-book> element.

robertknight commented 3 years ago

Using this hack enables the sidebar to discover the host frame even if the host frame is contained within a shadow root:

diff --git a/src/shared/discovery.js b/src/shared/discovery.js
index faa1297a8..a5caa2663 100644
--- a/src/shared/discovery.js
+++ b/src/shared/discovery.js
@@ -110,7 +110,7 @@ export default class Discovery {

     // Perform a top-down, breadth-first traversal of frames in the current
     // window and send messages to them.
-    const queue = [this.target.top];
+    const queue = [this.target.parent];
     while (queue.length > 0) {
       const parent = /** @type {Window} */ (queue.shift());
       if (parent !== this.target) {

I tested this by unloading the production client from within the frame in VitalSource viewer that contains the book chapter content and Hypothesis client, and then loading in my local development client:

var s = document.createElement('script');s.src='http://localhost:5000/embed.js';document.body.appendChild(s);

This allows the Hypothesis sidebar to show up, although it looks odd because it is constrained to the iframe where the book content is displayed:

Hypothesis client in new reader
robertknight commented 3 years ago

Looking at the URL that the Hypothesis client gets for the test document I'm looking at, there is something that looks like a junk/test query param (favre):

https://jigsaw.vitalsource.com/books/9781400847402/epub/OEBPS/13.chapterfive.xhtml?favre=brett

We need to make sure annotations are not lost if that query param is later lost or changed. We might want to parse the book ID and chapter reference out of the URL and turn that into some custom URN that we can use as an alternative search URI for the document, similar to how the document fingerprint is used for PDFs.

robertknight commented 3 years ago

Following https://github.com/hypothesis/client/pull/3599, Hypothesis is now minimally functional with the new VitalSource reader. See test assignment at https://hypothesis.instructure.com/courses/125/assignments/1518.

Hypothesis new VS reader

Leaving aside the fact that the current sidebar UI design looks weird floating in mid-air, the main functional issue is that the iframe in which the book content is presented and highlights appear is not contrained to the real viewport but is in fact really tall:

Tall iframe

As a result, UI elements that should appear at the top and bottom of the viewport in the viewer do not. Instead they are outside the visible area:

Sidebar toolbar off screen

I'm not sure in this situation if there is a way for the Hypothesis client to determine what proportion of the document is visible in the containing document. Might be worth testing whether this is possible with IntersectionObserver.

I suspect having a really tall iframe whose viewport lies outside the bounds of what is actually visible to the user is going to cause problems in browsers as well as for us. I also wonder whether it might affect browser rendering optimizations that make use of knowledge about what content is inside or outside the viewport.

I'm going to re-open this issue and update the PR description.

robertknight commented 3 years ago

An additional issue with both the old and new VitalSource readers is that they don't work in Safari, when the VS reader is loaded inside an iframe. Upon loading the assignment the user is greeted with:

VS reader Safari

Additionally the "Support Centre" link doesn't work. Clicking it results the following console error:

Refused to display 'https://support.apple.com/en-gb/guide/safari/sfri11471/mac' in a frame because it set 'X-Frame-Options' to 'SAMEORIGIN, SAMEORIGIN'.

I expect this problem relates to Safari's tracking prevention features. The Support Centre link that this form tries to direct Safari users to doesn't clearly help them. The only obviously relevant option is "Block all cookies", which is de-selected by default, and just making sure that is unchecked is not enough in current Safari versions.

There is also a "Prevent cross-site tracking" option. Unchecking that does allow the reader to work in Safari, but we really don't want to have ask users to make that global change, even if only temporarily.

robertknight commented 3 years ago

A possible solution for the tall-iframe issue raised in https://github.com/hypothesis/client/issues/3590#issuecomment-886229080 is to load the Hypothesis sidebar into the parent frame of the one that actually contains the book chapter content.

The structure of the frame tree when looking at a VitalSource + Hypothesis assignment, configured to load in a new tab, is:

1. lms.hypothes.is (our LMS app)
|-- 2. hypothesis.vitalsource.com (frame we created to launch VS book viewer)
      |-- 3. jigsaw.vitalsource.com/mosaic/wrapper.html (container of chapter content)
           |-- 4. jigsaw.vitalsource.com/books/{bookID}/... (actual chapter content)

The Hypothesis client is currently being loaded into frame (4). We could instead load the client into frame (3) and use the client's support for annotating same-origin iframes to support annotating in frame (4) while displaying the sidebar in frame (3). I tried doing this manually by unloading the client from frame (4), adding an enable-annotation attribute to the <iframe> for frame (4) and then loading the client into frame (3). Initially this didn't work because the <iframe> element containing frame (4) is contained within a shadow root in frame (3). It seems that our code for discovering annotation-enabled iframes is not able to find iframes in shadow DOM. I then tried manually loading the client into frame (4), but configured to operate in guest-only mode by setting the subFrameIdentifier: "some-random-strike" configuration. This worked:

Sidebar correct placement

The screenshot above shows that the sidebar now displays correctly when the content is scrolled, unlike before. Having the content in a separate frame from the sidebar causes other problems however:

  1. The bucket bar does not currently display indicators for highlights that are not in the host frame (ie. where the sidebar is)
  2. Clicking an annotation in the sidebar will scroll the frame containing the highlight to the correct location, but does not scroll the parent frame if needed. In this case frame (4)'s viewport is a really tall iframe where all of the chapter content lies within the iframe's viewport at all times, but only part of that iframe is visible in the parent frame.
  3. The Hypothesis client has a number of long-standing issues related to what happens when multiple iframes are connected to the sidebar at once. For example:
    • Annotations for all guests are sent to all frames, not just the annotations that are relevant for a particular frame. Amongst other issues, this can result if an annotation sometimes incorrectly being marked as an orphan if the anchoring result comes back as a failure from a frame that the annotation should not have been sent to in the first place
    • Various UI elements in the sidebar only reflect the URL of the top-level frame (the one containing the sidebar), not any additional guest frames
  4. When navigating between chapters, if that changes the document in frame (4), we need to make sure that the information about connected frames in the sidebar is correctly updated. I'm not sure this happens currently.

Fully resolving all issues with multi-frame support in (3) is probably quite a large project. In this case we might be able to simplify the problem by instead still supporting only a single guest, but allow it to be a different frame than the sidebar.

In addition to the above issues, some code in the new reader is triggering frequent console errors when the client is loaded into frame (4) and annotation is happening in frame (3):

Message errors

This error is happening in Discovery._onMessage because it gets passed a message event which is not in fact a MessageEvent but instead a CustomEvent with type "message". This event is being programmatically generated internally by some code in the VitalSource viewer that looks like this:

const triggerBrowserEvent = (eventName, data = {}) => {
  let event = new CustomEvent(eventName, {
    detail: { ...data },
  });

  // trigger the event in this wrapping window case other inner docs care
  dispatchEvent(event);
  // also pass the event up the chain (this method could use a better name)
  sendEventUp(event, data);
};

The problem of unexpected objects being passed to message event handlers can be partly addressed by the ongoing work in the client to replace window.postMessage-based messaging with MessageChannel. However we'll still need to be mindful of this hazard because even in a MessageChannel-based world, there will still be a small amount of window.postMessage usage in order to set up the initial MessageChannel-based connections. Any handlers for the message event on a Window will need to either verify that events are MessageEvent instances or make sure they check for all the existence of all fields they rely on.

robertknight commented 3 years ago

The console errors mentioned in https://github.com/hypothesis/client/issues/3590#issuecomment-887336734 should be fixed by https://github.com/hypothesis/client/pull/3611 which eliminates most usage of window.postMessage for inter-frame communication, except for the initial connection which is required to transfer a MessagePort for subsequent communication using the Channel Messaging API.

robertknight commented 3 years ago

VitalSource have modified their reader so that it is no longer using a tall iframe. This means that loading the Hypothesis client directly into the frame where the current book content is displayed is now an option again:

vs-chapter-frame

Loading the client into the parent frame still has an advantage in that it avoids the client being rebooted each time you switch between frames, which can allow us to fetch and display annotations for the new content more quickly.

robertknight commented 2 years ago

The Hypothesis client is now working with the new VS reader for ebooks, in the case where the client is loaded directly into the content frame, as it currently is in the LMS. When the client is loaded into the parent frame, as is the case when using the browser extension for example, things partly work but there is big a list of issues with our multi-frame support that we are working our way through. There is a tracking issue at https://github.com/hypothesis/client/issues/3798 which lists all the parts of this.

One major element of fixing these multi-frame support issues is the deployment of new infrastructure in the client for setting up the communication between different frames. We have now shipped this, but ran into issues with errors in various cases when a certain frames try to connect. See https://github.com/hypothesis/client/issues/3986. It turns out that many of these are longstanding issues which just hadn't come to light before now. In many cases these may be due to issues of the environment beyond our control (eg. a web-based proxy breaks browser APIs in strange ways) and so we want to just ignore them. Nevertheless we're having to work our way through them and make sure we haven't broken anything important.

In future we will probably want to make VS in the LMS context work like it does in the browser extension, with the client loaded into the container frame. This will make navigating between chapters / PDF pages quicker, as the client doesn't have to be reloaded each time, and also allow us to preserve state when navigating between chapters.

Work to support PDF-based books is ongoing. We deployed initial support and have two major tasks ahead of us: