atc0005 / notes

Various notes, quick references and topics I want to explore further
MIT License

Repackaging and validating epub files #76

Open atc0005 opened 9 months ago

Overview

Several times now I've purchased ebooks from packtpub.com, only to find later that when I read them in Google Play Books, code snippets are displayed far to the left, outside of the left margin:

[Image: negative text indent example]

This appears to be caused by the pre style rule having a negative text-indent value:

  pre {
    /*overflow: auto;*/
      -epub-hyphens: none;
      -epub-ruby-position: over;
      font-size: 0.85rem;
      line-height: 1.03rem;
    margin-top: 0.3rem;
    margin-bottom: 0;
    margin-left: 0rem;
    margin-right: 0;
    text-align: left;
    text-decoration: none;
    text-indent: -4.6rem;
  }

Likewise, I purchase titles from leanpub.com and occasionally find that their epub stylesheets contain invalid properties that prevent those titles from being imported into Google Play Books. To resolve this, I have to manually modify those stylesheets and repackage the epub archives before I can read the titles on that platform.

Generate epubcheck container for validation purposes

Before we modify any stylesheets, we first build an epubcheck container image from the current stable tag (v5.1.0 at the time of writing).

I used Ubuntu 22.04 via WSLv2 for this work.

cd /mnt/t/github
git clone https://github.com/w3c/epubcheck
cd epubcheck
git tag -l | sort -V
git checkout v5.1.0
podman build . -t epubcheck-v5.1.0

Having this container image lets us check both the original epub file and any repackaged files (with hotfixes) for validation errors.
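
The `git tag -l | sort -V` step above lists every tag, including any pre-release ones, and the stable tag is then picked by eye. A minimal sketch of automating that choice follows. The function name `latest_stable_tag` is hypothetical, and it assumes that stable tags look like `vMAJOR.MINOR.PATCH` with no suffix, which is an assumption about the repository's tagging scheme rather than something confirmed here.

```shell
# Hypothetical helper: read tag names on stdin, print the highest stable one.
# Assumption: stable tags are "v" followed by dot-separated numbers only,
# so anything with a suffix (e.g. "-rc") is filtered out.
latest_stable_tag() {
    grep -E '^v[0-9]+(\.[0-9]+)*$' | sort -V | tail -n 1
}
```

Possible usage inside the cloned repository: `git checkout "$(git tag -l | latest_stable_tag)"`.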

Hotfixes

Negative indentation

For the negative indentation problem, a reliable hotfix appears to be removing or commenting out that specific property, then repackaging the epub and re-uploading it to Google Play Books.

Steps:

  1. Unzip the epub file (e.g., 9781804611654.epub)
  2. Move the original epub file into a temporary directory (e.g., original_file)
  3. Open the */OEBPS/css/style-JRserifv6.css file
  4. Comment out or remove the text-indent property
  5. Save changes

Changes to the file:

  pre {
    /*overflow: auto;*/
      -epub-hyphens: none;
      -epub-ruby-position: over;
      font-size: 0.85rem;
      line-height: 1.03rem;
    margin-top: 0.3rem;
    margin-bottom: 0;
    margin-left: 0rem;
    margin-right: 0;
    text-align: left;
    text-decoration: none;
-    text-indent: -4.6rem;
+    /*text-indent: -4.6rem;*/
  }
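
The manual edit above could also be scripted when several titles share the same problem. The following is a minimal sketch, not a tested general-purpose tool: the function name `comment_out_negative_text_indent` is hypothetical, and it assumes the property sits alone on its own line, as it does in the stylesheet shown here.

```shell
# Hypothetical helper: wrap any negative text-indent declaration in a CSS
# comment, leaving positive values and already-commented lines untouched.
# Assumption: the declaration is on a line of its own.
comment_out_negative_text_indent() {
    css_file=$1
    # \1 keeps the original indentation, \2 is the declaration itself.
    sed -E 's|^([[:space:]]*)(text-indent:[[:space:]]*-[^;]*;)|\1/*\2*/|' \
        "$css_file" > "$css_file.tmp" && mv "$css_file.tmp" "$css_file"
}
```

Possible usage from the extracted epub directory: `comment_out_negative_text_indent OEBPS/css/style-JRserifv6.css`.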

CSS-001 epubcheck validation error

I encountered a vague error when attempting to upload a recent copy of the PowerShell 101 book from leanpub.com to Google Play Books. As I have learned to do when encountering those vague upload errors, I used epubcheck to list the specific errors (I presume that Google Play Books uses the same or a similar tool behind the scenes).

In this case, I received a CSS-001 error.

$ podman run -it --rm -v /mnt/t/temp:/data epubcheck-v5.1.0 powershell101.epub
Validating using EPUB version 3.3 rules.
ERROR(CSS-001): powershell101.epub/OEBPS/stylesheet.css(325,3): The "direction" property must not be included in an EPUB Style Sheet.

Check finished with errors
Messages: 0 fatals / 1 error / 0 warnings / 0 infos

EPUBCheck completed

The hotfix:

  1. Unzip the epub file (e.g., powershell101.epub)
  2. Move the original epub file into a temporary directory (e.g., original_file)
  3. Open the */OEBPS/stylesheet.css file
  4. Comment out the div.ltr rule
  5. Save changes

This was the block that was commented out:

div.ltr {
  direction: ltr;
}
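
epubcheck reports the offending location as (line,column), here (325,3). To see every stylesheet line that sets this property before editing, a search over the extracted epub directory along these lines may help. This is a sketch: the function name `find_direction_props` is hypothetical, and the pattern is only a simple text match, not a CSS parser.

```shell
# Hypothetical helper: list "direction:" declarations in all .css files
# under the given directory, as path:line:content.
# The leading character class avoids matching longer property names
# that merely end in "direction", such as flex-direction.
find_direction_props() {
    grep -rnE --include='*.css' '(^|[^-a-zA-Z])direction[[:space:]]*:' "$1"
}
```

Possible usage: `find_direction_props /mnt/t/temp/powershell101`, where that path is an assumed extraction directory.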

Repackage epub

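Continuing with the packtpub.com ebook example, we first add the mimetype file to a new epub archive (without compression) before adding the other content from the original epub archive (using maximum compression). The order matters: the EPUB container format requires mimetype to be the first entry in the archive and to be stored uncompressed.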

  1. cd /path/to/extracted/epub/dir
    • e.g., cd "/mnt/t/temp/9781804611654"
  2. zip -X0 9781804611654.epub mimetype
  3. zip -9 -r 9781804611654.epub META-INF OEBPS

The resulting 9781804611654.epub file is ready for validation and re-upload to Google Play Books.
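
Because the mimetype entry is added first, uncompressed (`-0`) and without extra file attributes (`-X`), its name and content should sit at a fixed position right after the 30-byte zip local file header. The sketch below checks for that before running the full validation. The function name `check_epub_mimetype` is hypothetical, and the check assumes the archive was written as described above.

```shell
# Hypothetical helper: succeed only if bytes 30-57 of the archive are the
# entry name "mimetype" immediately followed by its stored content.
# Assumption: mimetype is the first entry, uncompressed, with no extra field.
check_epub_mimetype() {
    header=$(dd if="$1" bs=1 skip=30 count=28 2>/dev/null)
    [ "$header" = "mimetypeapplication/epub+zip" ]
}
```

Possible usage: `check_epub_mimetype 9781804611654.epub && echo "mimetype entry looks right"`.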

Validating epub

Continuing with the packtpub.com ebook example, we validate the repackaged file:

$ cd /mnt/t/temp/9781804611654
$ podman run -it --rm -v "$PWD":/data epubcheck-v5.1.0 9781804611654.epub
Validating using EPUB version 3.3 rules.
No errors or warnings detected.
Messages: 0 fatals / 0 errors / 0 warnings / 0 infos

EPUBCheck completed
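
Since the same podman invocation is used for every title, it can be wrapped in a small function. This is a minimal sketch: the name `validate_epub` and the `CONTAINER_CMD` and `EPUBCHECK_IMAGE` variables are hypothetical conveniences, not epubcheck or podman settings. The `-it` flags are dropped here on the assumption that the function may run non-interactively.

```shell
# Hypothetical helper: validate one epub with the locally built image.
# CONTAINER_CMD and EPUBCHECK_IMAGE are illustrative overrides only;
# the defaults match the commands used in these notes.
validate_epub() {
    epub_path=$1
    # Mount the directory that holds the file, using an absolute path.
    epub_dir=$(cd "$(dirname "$epub_path")" && pwd) || return 1
    epub_name=$(basename "$epub_path")
    "${CONTAINER_CMD:-podman}" run --rm \
        -v "$epub_dir":/data \
        "${EPUBCHECK_IMAGE:-epubcheck-v5.1.0}" "$epub_name"
}
```

Possible usage: `validate_epub /mnt/t/temp/9781804611654/9781804611654.epub`.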
