readium / ts-toolkit

A toolkit for ebooks, audiobooks and comics written in Typescript
BSD 3-Clause "New" or "Revised" License

Navigator + navigator-html-injectables implementation #12

Closed chocolatkey closed 2 years ago

chocolatkey commented 2 years ago

Changelog

Note: the navigator is not yet complete in terms of the public API functions expected of a Readium navigator; this will be updated as functionality is added.

Review/Implementor Notes

(that should probably be a part of documentation in the future)

General

Expect builds to fail or be incomplete. Along with the switch to Yarn PnP, the move to Yarn Workspaces (read this for more info) will be made, allowing the packages to have separate build processes and dependencies while still depending directly on each other. Originally, the idea for this repo seems to have been to have one single build for everything, but with the advent of packages such as navigator-html, it has become evident that the builds need to be split up. Trying to build completely separate bundles with different dependencies and browser support requirements from a single build is a pain; it is much easier to have them be separate packages. In addition, tsdx, which was originally used to simplify building the repo, has unfortunately fallen behind current trends and is inadequate for certain build scenarios that are now needed, so a build system inspired by what tsdx originally provided (ts, jest etc.) will be implemented.

navigator-html package

Communication ("Comms")

Message data sent over the communications channel are simple JSON objects with this structure:

export interface CommsMessage {
    _readium: number; // Sanity/version-checking field
    _channel: string; // Channel ID
    id?: string; // Optional (but recommended!) unique identifier
    strict?: boolean; // Whether or not the event *must* be handled by the receiver
    key: CommsEventKey | CommsCommandKey; // The "key" for identification to the listener
    data: unknown; // The data to be sent to the module
}

An example message could look like this:

{
    "_readium": 1,
    "_channel": "11f714d8-4fee-47de-94ab-1bc971a61529",
    "id": "1927-x7jbpd-13c5hmd",
    "data": [
        "--USER__appearance",
        "readium-night-on"
    ],
    "key": "set_property",
    "strict": false
}
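Because the IFrame initially listens for messages from anywhere, a receiver will likely want to check the shape of incoming data before acting on it. The following is a minimal sketch of such a check, not the package's actual validation. `isCommsMessage`, `COMMS_VERSION` and `SketchCommsMessage` are hypothetical names, the version value 1 is taken from the example message above, `key` is typed as a plain string for brevity, and `data` is treated as optional because the _ping example further down carries none.

```typescript
// Hypothetical constant: the example message above uses 1 in the _readium field.
const COMMS_VERSION = 1;

// Simplified copy of the CommsMessage interface for this sketch only.
// `key` is a plain string because CommsEventKey/CommsCommandKey are not reproduced here.
interface SketchCommsMessage {
    _readium: number;
    _channel: string;
    id?: string;
    strict?: boolean;
    key: string;
    data?: unknown;
}

// Returns true only if `value` has the required fields with the expected types.
function isCommsMessage(value: unknown): value is SketchCommsMessage {
    if (typeof value !== "object" || value === null) return false;
    const m = value as Record<string, unknown>;
    return (
        m._readium === COMMS_VERSION &&
        typeof m._channel === "string" &&
        typeof m.key === "string" &&
        (m.id === undefined || typeof m.id === "string") &&
        (m.strict === undefined || typeof m.strict === "boolean")
    );
}
```

A receiver could call `isCommsMessage(event.data)` first and silently ignore anything that fails, which also filters out unrelated messages from other scripts on the page.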

The high-level overview for establishing communication with the IFrame ("IFrame" can be substituted with "Webview"):

  1. Comms class is constructed in the IFrame. It starts listening for messages from anywhere.
  2. The Navigator sends a _ping message to the IFrame, using its origin (on the web, blob:xxx) to ensure it is the only recipient, with data that looks like this:
     {
         "_readium": 1,
         "_channel": "11f714d8-4fee-47de-94ab-1bc971a61529",
         "id": "1917-b4ciqy-10fdb1g",
         "key": "_ping",
         "strict": false
     }
  3. The IFrame receives this _ping message and restricts any further messages to those coming from the same origin and using the same channel ID (from the _channel key). The source of the message is also stored, and used to then send any future messages along with the relevant channel ID and origin. Overall, this helps enhance security and prevent accidental message propagation to and from the wrong IFrame.
  4. The IFrame unlocks normal communication, and sends a _pong message back to the Navigator.
  5. The Navigator receives the _pong and is now ready for normal communication as well.

Note that since module initialization can happen before or at the same time as this "handshake", the IFrame stores a buffer of unsent log messages originating from the modules, which are then all sent after step 4. This is the only type of message that modules should attempt to send on their own initiative during initialization.
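The IFrame side of the handshake, including the log buffer, can be sketched as a small state machine. This is an illustration under stated assumptions, not the actual Comms class. `FrameSideHandshake`, `HandshakeMessage` and `PostFn` are hypothetical names, the transport is an injected function rather than postMessage so the sketch runs without a browser, and the "log" key used for buffered log messages is an assumption.

```typescript
interface HandshakeMessage {
    _readium: number;
    _channel: string;
    key: string;
    data?: unknown;
}

// Hypothetical transport: delivers a message to the given target origin.
type PostFn = (message: HandshakeMessage, targetOrigin: string) => void;

class FrameSideHandshake {
    private post: PostFn;
    private channelId: string | null = null;
    private peerOrigin: string | null = null;
    private bufferedLogs: unknown[] = [];

    constructor(post: PostFn) {
        this.post = post;
    }

    // Modules may log at any time; before the handshake the entry is buffered.
    log(data: unknown): void {
        if (this.channelId === null || this.peerOrigin === null) {
            this.bufferedLogs.push(data);
            return;
        }
        // "log" is an assumed key name for this sketch.
        this.post({ _readium: 1, _channel: this.channelId, key: "log", data }, this.peerOrigin);
    }

    // Returns true if the message is accepted, false if it is ignored.
    receive(message: HandshakeMessage, senderOrigin: string): boolean {
        if (this.channelId === null) {
            // Step 1: locked state, only a _ping is accepted.
            if (message.key !== "_ping") return false;
            // Step 3: remember the channel ID and origin for all later traffic.
            this.channelId = message._channel;
            this.peerOrigin = senderOrigin;
            // Step 4: unlock and answer with _pong, then flush buffered logs.
            this.post({ _readium: 1, _channel: message._channel, key: "_pong" }, senderOrigin);
            for (const entry of this.bufferedLogs.splice(0)) this.log(entry);
            return true;
        }
        // After the handshake: same channel ID and same origin only.
        return message._channel === this.channelId && senderOrigin === this.peerOrigin;
    }
}
```

In this sketch the _pong is posted before the buffered logs, matching the note that the logs are sent after step 4.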

Note that both the channel ID and the message ID are random strings. The _channel ID should be a secure random (on the web, generated using window.crypto.randomUUID()). The message id doesn't have to be secure, but it should be sufficiently random to avoid collisions. The main reason not to use a secure random for message IDs, and why the IFrame does not do so, is that message performance is critical to a smooth reading experience, and secure randoms tend to take longer to generate, which would slow down communication. The IDs used by the IFrame and web implementation follow a makeshift "mid" scheme, implemented [here](https://github.com/readium/ts-toolkit/blob/navigator-html/navigator-html/src/comms/mid.ts).

The message IDs are useful for advanced debugging, but they are also essential for the Navigator to associate a received _ack event, which acknowledges the handling of a command, with the command that was originally sent. Both the navigator-html and navigator packages wrap this system, hiding it from developers for convenience. In a module, you simply call the ack callback with true/false, and on the web Navigator side, the send function optionally accepts a callback that is invoked when the _ack is received. Example:

// Web Navigator
frame.send("go_prev", undefined, ok => {
    console.log("Did we go back a page?", ok);
});
// IFrame (navigator-html)
comms.register("go_prev", ColumnSnapper.moduleName, (_, ack) => {
    const result = doTheWork(); // Result is false if we weren't able to go to previous
    ack(result);
});
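Behind that convenience wrapper, the Navigator needs some way to match each _ack to the command that caused it. A minimal sketch of that bookkeeping follows. `AckTracker` and its methods are hypothetical names, the ID generator is a cheap stand-in and not the linked "mid" implementation, and it is an assumption that an _ack carries the original message ID together with a boolean result.

```typescript
type AckCallback = (ok: boolean) => void;

interface OutgoingCommand {
    id: string;
    key: string;
    data: unknown;
}

class AckTracker {
    private counter = 0;
    private pending = new Map<string, AckCallback>();

    // Cheap, non-secure ID: an incrementing counter plus a random base-36 suffix.
    // The counter alone already keeps IDs unique within one tracker instance.
    private nextId(): string {
        this.counter += 1;
        return this.counter.toString(36) + "-" + Math.random().toString(36).slice(2, 10);
    }

    // Builds the command to send and remembers the callback, if any, under its ID.
    send(key: string, data: unknown, onAck?: AckCallback): OutgoingCommand {
        const id = this.nextId();
        if (onAck) this.pending.set(id, onAck);
        return { id, key, data };
    }

    // Called when an _ack arrives. Returns false if no callback was waiting for that ID.
    handleAck(id: string, ok: boolean): boolean {
        const callback = this.pending.get(id);
        if (!callback) return false;
        this.pending.delete(id); // each command is acknowledged at most once
        callback(ok);
        return true;
    }
}
```

Deleting the entry before invoking the callback keeps the map from growing over a long reading session and makes duplicate acknowledgements harmless.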

Modules

The module class is pretty simple to implement:

export abstract class Module {
    static readonly moduleName: string;
    abstract mount(wnd: Window, comms: Comms): boolean;
    abstract unmount(wnd: Window, comms: Comms): boolean;
}

You just need a unique module name (snake_case), and to implement a mount and unmount function.

The mount function is called by the Loader, where modules are mounted one after another, and provided the IFrame window and the Comms instance. The Module can then register any commands it should listen to using comms.register, and send any commands using comms.send. Modules must be designed to not use comms.send before the communication handshake has finished (so not during the mount function). The Module can, however, use the comms.log function at any time to send logs for debugging purposes, equivalent to using console.log on the web.

The unmount function should completely clean up any event listeners or active modification of the HTML content, and cease all activity. This is because an implementer should be able to mount and unmount Modules at any time, to, for example, switch reading modes. There should be no difference in behavior after mounting and unmounting a Module many times.

Implementers wishing to add their own modules can do so. Exact/official methods for doing so are not set in stone at the moment, but you can add a module to the ModuleLibrary JS Map, then load and unload it at will by name.
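To make the mount/unmount contract concrete, here is a sketch of a custom module that cleans up after itself and is registered in a name-keyed Map. Everything in it is illustrative: `WindowLike` and `CommsLike` are minimal stand-ins for the real Window and Comms so the sketch runs without a browser, `ScrollCounter` is a hypothetical module, and `sketchLibrary` only mirrors the idea of the ModuleLibrary Map, whose actual value type is not shown in this description.

```typescript
// Minimal stand-ins for the Window and Comms instances passed to a real Module.
interface WindowLike {
    addEventListener(type: string, listener: () => void): void;
    removeEventListener(type: string, listener: () => void): void;
}
interface CommsLike {
    log(data: unknown): void;
}

abstract class SketchModule {
    static readonly moduleName: string;
    abstract mount(wnd: WindowLike, comms: CommsLike): boolean;
    abstract unmount(wnd: WindowLike, comms: CommsLike): boolean;
}

// Hypothetical module that counts scroll events while it is mounted.
class ScrollCounter extends SketchModule {
    static readonly moduleName = "scroll_counter";
    count = 0;

    // One stable listener reference, so unmount removes exactly what mount added.
    private readonly onScroll = () => {
        this.count += 1;
    };

    mount(wnd: WindowLike, comms: CommsLike): boolean {
        wnd.addEventListener("scroll", this.onScroll);
        comms.log("scroll_counter mounted"); // logging is allowed during mount
        return true;
    }

    unmount(wnd: WindowLike, _comms: CommsLike): boolean {
        wnd.removeEventListener("scroll", this.onScroll);
        return true;
    }
}

// Hypothetical registry keyed by module name, storing factories for this sketch.
const sketchLibrary = new Map<string, () => SketchModule>();
sketchLibrary.set(ScrollCounter.moduleName, () => new ScrollCounter());
```

Note that mount only registers a listener and logs; it does not send commands, in line with the rule that comms.send must not be used before the handshake has finished.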

Note that not all basic modules have been implemented yet.

navigator package

This work is inspired by the Kotlin toolkit, and at the moment implements the reflowable EPUB navigator only. As previously mentioned, the EPUB navigator uses the navigator-html, and implements communication with the IFrame. Because this is being done on a web platform, and the IFrames are blobs which are fully accessible by parent frames, the navigator-html is actually being executed in the parent, not in the IFrame. This makes things much easier, as we don't have to inject a build of the navigator-html into the IFrame, and the relevant JS is already loaded and warmed (in the JIT/cache) before we even create the IFrame. It might seem somewhat strange that the IFrame helper scripts are being run outside of the IFrame, but this possibility is intentional by design, for flexibility. If you're trying to communicate with an IFrame cross-domain (e.g. loading by URL instead of using blobs, for whatever reason), then you would of course, in your custom web Navigator implementation, need to inject the navigator-html into the frame on the server side for security reasons. You could then still communicate with the IFrame.

On a related note, at the moment the Navigator still accesses the IFrame window directly for certain functionality, but the goal is to remove this entirely at a later time, so that the navigator can theoretically support interaction with and modification of the IFrame purely through Comms messages. In addition, the current IFrame's window is exposed to the implementer through a getter (EpubNavigator.cframe) as a temporary solution for implementers to meddle with the IFrame directly. It will probably stay "temporary" for quite a while, until the full scope of the necessary IFrame communication has been worked out. Avoid using it where possible.

Note: the EpubNavigator is not complete; that's being worked on.

Frame pool

This feature should be transparent to the implementer, and is intended to greatly enhance the user experience. As the name implies, instead of just a single IFrame being used to load reflowable content, there is a pool of (hidden) IFrames that are preloaded as the user browses the publication. When the user reaches the end of an element, the next element in the readingOrder can be shown nearly instantaneously. This is done by:

  1. Storing resource content (with injections) as blobs in-memory for the entire duration of the reading session. This means that any re-visitation of a resource requires no additional retrieval request (on the web, an HTTP fetch).
  2. Preloading a limited number of resources before and after the user's current position (as in Readium locator position) in the publication.

As the user goes forwards, backwards, jumps etc. in the publication, the FramePoolManager determines what the current IFrame is, as well as which frames to load and dispose of. I consider the use of the position list for this to be by far the "smartest" choice. A more rudimentary piece of logic could, for example, just use the readingOrder, and say "load the 2 resources in the reading order before and after the current one". But this would result in much more inefficient preloading. If there is, for example, a very long chapter, why preload the next resource when it will take the user 50 pages to get to it anyway? Why not instead load it only 5 pages before? That is exactly what the usage of the position list enables, even as a user navigates through pages within a single resource. I tried to make a diagram illustrating the sliding disposal/creation window, but it didn't turn out that well. I'll leave it here though, in case someone finds it useful:

[Image: illustration of the sliding disposal/creation window]

It's probably best just to look at the implementation itself though; it's not very long.
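The idea of a position-based window can be sketched in a few lines. This is not the FramePoolManager's actual logic, only an illustration of the principle. `PositionLike`, `resourcesToKeep` and `planFrames` are hypothetical names, each position is reduced to the href of the resource it belongs to, and the radius of 5 positions used in the usage note comes from the "5 pages before" example above.

```typescript
// Simplified stand-in for an entry of the positions list: only the resource href is kept.
interface PositionLike {
    href: string;
}

// Returns the hrefs of all resources that have at least one position
// within `radius` positions of the current index, in reading order.
function resourcesToKeep(positions: PositionLike[], currentIndex: number, radius: number): string[] {
    const start = Math.max(0, currentIndex - radius);
    const end = Math.min(positions.length, currentIndex + radius + 1);
    const hrefs: string[] = [];
    positions.slice(start, end).forEach(position => {
        if (hrefs.indexOf(position.href) === -1) hrefs.push(position.href);
    });
    return hrefs;
}

// Compares the frames currently loaded with the wanted set and
// reports which frames to create and which to dispose of.
function planFrames(loaded: string[], wanted: string[]): { create: string[]; dispose: string[] } {
    return {
        create: wanted.filter(href => loaded.indexOf(href) === -1),
        dispose: loaded.filter(href => wanted.indexOf(href) === -1),
    };
}
```

With a 50-position chapter followed by a short one and a radius of 5, `resourcesToKeep` returns only the long chapter until the user is within 5 positions of its end, at which point the next resource joins the wanted set and `planFrames` reports it under `create`.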

Note that right now, the Readium CSS that is injected into the IFrame is hardcoded to a jsDelivr URL. This will be changed to use a local version soon.

Also note that right now, because presentation settings are not yet a part of the package, they cannot be sent before the IFrame is shown to the user. So despite the preloading of the IFrame, implementations that use EpubNavigator.cframe to inject presentation settings into the IFrame won't be able to do it fast enough to prevent a FOUC (flash of unstyled content), which is especially noticeable with dark mode theme switching.

Known issues

github-actions[bot] commented 2 years ago

size-limit report 📦

Path Size
dist/shared.cjs.production.min.js 11.49 KB (0%)
dist/shared.esm.js 10.29 KB (0%)