pressbooks / pressbooks

Open publishing. Open web. Open source.
https://pressbooks.org/
GNU General Public License v3.0

[Research] Improve handling of interactive elements in EPUB/PDF exports #2287

Closed SteelWagstaff closed 2 years ago

SteelWagstaff commented 3 years ago

When a book includes an audio, video, iframe, or H5P element, we insert a fallback placeholder informing the reader that this element has been excluded from the EPUB or PDF export and providing a link to the element in the book itself. See https://github.com/pressbooks/pressbooks/tree/3217c734b9f3d25d840c247a5ac5b16b6c1698cb/templates/interactive

I'd like us to research the feasibility of the following:

  1. When the excluded activity is an H5P, can we print the full activity itself and let the user know that "This is a static version of an interactive H5P activity. To see the full interactive element, visit chapterlink#h5pAnchor". See https://github.com/pressbooks/pressbooks/issues/2189 for similar idea.
  2. When there are multiple audio, video, or iframe elements inserted consecutively (without non-media content between them), can we simply print a single placeholder that reads: "Multiple audio, video, or iframe elements have been excluded from this version of the text. You can find them online here: 'link to first of the excluded elements' (ideally with anchor)." instead of adding each of them separately? See screenshot below for an example of what we want to avoid: Screenshot from 2021-07-29 07-05-32.png
  3. Can we insert and link to anchors for all other excluded activities like we do now for H5P activities? https://github.com/pressbooks/pressbooks/pull/2153

richard015ar commented 3 years ago

I have been working to understand how media content is processed in our XHTML export routine. I tried to book a meeting with Os at the end of the sprint, but we could not arrange a time because we both had many meetings. This is what I debugged and learned about our media content processing:

richard015ar commented 2 years ago

Google PageSpeed supports server-side screenshots. I tested it with, for example: https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://integrations.pressbooks.network/ltidemo/chapter/licao-1/#h5p-1&category=CATEGORY_UNSPECIFIED&strategy=DESKTOP&key= It takes too long to process and, as expected, it does not work locally (since the generated URLs are local).
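One detail worth checking in a request like the one above: the page URL is passed unencoded, so by standard URL syntax everything from `#h5p-1` onward is treated as the fragment of the API URL and is not sent as query parameters. A minimal Python sketch of building the same request with the `url` parameter encoded follows. The helper name `build_pagespeed_url` is hypothetical, and the sketch only constructs the URL; it does not call the API or say anything about what the response contains.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the test URL in this thread.
API = "https://pagespeedonline.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_pagespeed_url(page_url, api_key="", strategy="DESKTOP"):
    """Hypothetical helper: build a runPagespeed URL with page_url percent-encoded."""
    params = {
        "url": page_url,
        "category": "CATEGORY_UNSPECIFIED",
        "strategy": strategy,
    }
    if api_key:
        params["key"] = api_key
    # urlencode escapes '#', ':' and '/' so the page URL stays inside the query string.
    return API + "?" + urlencode(params)


page = "https://integrations.pressbooks.network/ltidemo/chapter/licao-1/#h5p-1"
encoded = build_pagespeed_url(page)

# The unencoded form used in the test above, for comparison.
raw = API + "?url=" + page + "&category=CATEGORY_UNSPECIFIED&strategy=DESKTOP&key="

print(urlsplit(raw).fragment)      # everything after the first '#'
print(urlsplit(encoded).fragment)  # empty: nothing is lost to the fragment
```

Whether encoding the anchor changes what the service captures is an open question; this only makes sure the anchor and the remaining parameters reach it.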

SteelWagstaff commented 2 years ago

For #1, can we try to render the content of the HTML activity in the XHTML file before producing the PDF? Look at the pattern the H5P plugin uses to generate its shortcode output. If this works, render this instead of the current H5P interactive content fallback, with this message: "This is a static version of an interactive H5P activity. To see the full interactive element, visit chapterlink#h5pAnchor"

For #2, can we try a CSS rule in EPUB and PDF exports that looks something like .interactive-content + .interactive-content { display: none; }? We can decide about a modified message later.

For #3, I'd like us to add anchor links for all interactive content that we replace in the EPUB/PDF exports, just like we do with H5P activities.

richard015ar commented 2 years ago

For point 2: with the suggested CSS rule, the PDF and EPUB output looks like this: Screenshot from 2021-11-11 09-15-27

Instead of: Screenshot from 2021-11-11 09-15-20

It seems to work as expected!

SteelWagstaff commented 2 years ago

@richard015ar The PR you submitted works as expected for audio and video elements, but doesn't add anchor links for oEmbed elements or other iframed content. Is it possible to add anchors for these other elements as well?

richard015ar commented 2 years ago

What I got so far:

The next step in this task is to research how H5P activities can be rendered for exports.

richard015ar commented 2 years ago

For H5P rendering in exports

The next step here would be to understand how the scripts work to render the content, and to use them in our export routine.

Unfortunately, H5P does not provide an on-demand way to render the content. I suspect the main reason is that most of the content uses a lot of JavaScript, which is necessary not only for the interaction but also for rendering the content.

richard015ar commented 2 years ago

I asked the H5P team whether it is possible to get a preview of an activity that we could export: https://h5p.org/documentation/setup/wordpress?page=2#comment-44029

fdalcin commented 2 years ago

Tested pressbooks/pressbooks#2508 on integrations and it works as expected, see below:

EPUB

image image

PDF

image image

The PDF version has the long URL that we discussed.

SteelWagstaff commented 2 years ago

Findings:

  1. We don't have a good native H5P render method -- producing these in our XHTML export would be very difficult
  2. We switched to using a shared template for most interactive elements and added anchor links to all interactive fallback elements EXCEPT custom iframes. In order to do it for these, we would need to inspect post content and hook into these. Doable, but not done yet.
  3. We did not implement a method for hiding consecutive interactive elements. There are two possible approaches, both of which require analyzing the whole DOM for each chapter. Method 1 is simpler but less performant: analyze the DOM, remove the line breaks that Blade generates between interactive elements, and then apply CSS to hide any element that immediately follows another. Method 2 is to analyze the DOM and simply remove activities that follow others, which would require a bigger refactor than Method 1. A third, simpler method would be to make sure that the Blade templates never produce empty paragraphs after them and then write CSS like: .interactive-content + br + .interactive-content, .interactive-content + .interactive-content { display: none; }
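
To make Method 2 more concrete, here is a rough sketch of the DOM pass in Python. It is an illustration only, not the Pressbooks export code (which is PHP), and it rests on several assumptions: the chapter is well-formed XHTML, every fallback element carries the `interactive-content` class from the CSS rule above, and a `<br/>` or an empty `<p>` between two placeholders does not count as content. The function name `collapse_placeholders` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER_CLASS = "interactive-content"  # class name from the CSS rule in this thread


def _local(tag):
    # Drop an XML namespace prefix such as "{http://www.w3.org/1999/xhtml}".
    return tag.rsplit("}", 1)[-1]


def _is_placeholder(el):
    return PLACEHOLDER_CLASS in el.get("class", "").split()


def _is_filler(el):
    # A <br/> or an empty <p>: markup that carries no content of its own.
    if _local(el.tag) == "br":
        return True
    return _local(el.tag) == "p" and len(el) == 0 and not (el.text or "").strip()


def collapse_placeholders(xhtml):
    """Keep only the first placeholder in each run of consecutive placeholders."""
    root = ET.fromstring(xhtml)
    for parent in list(root.iter()):
        first = None   # first placeholder of the current run, if any
        fillers = []   # filler elements seen since the last placeholder
        for child in list(parent):
            tail = child.tail
            if _is_placeholder(child):
                if first is None:
                    first = child
                else:
                    for el in fillers + [child]:
                        parent.remove(el)
                    first.tail = tail  # keep any text that followed the removed element
                fillers = []
            elif first is not None and _is_filler(child):
                fillers.append(child)
            else:
                first, fillers = None, []
            if (tail or "").strip():
                # Real text after this element ends the run.
                first, fillers = None, []
    return ET.tostring(root, encoding="unicode")


sample = (
    "<div>"
    "<p>Intro</p>"
    '<div class="interactive-content" id="a1">audio</div>'
    "<br/>"
    '<div class="interactive-content" id="v1">video</div>'
    '<div class="interactive-content" id="f1">iframe</div>'
    "<p>Text</p>"
    '<div class="interactive-content" id="v2">video</div>'
    "</div>"
)
print(collapse_placeholders(sample))
```

In the sample, the first placeholder of the run and the one after the paragraph are kept, while the line break and the two placeholders that follow the first are removed. A real implementation would also need to swap the surviving placeholder's message for the "multiple elements" wording, which this sketch does not attempt.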