standardebooks / tools

The Standard Ebooks toolset for producing our ebook files.

About

A collection of tools Standard Ebooks uses to produce its ebooks, including basic setup of ebooks, text processing, and build tools.

Installing this toolset using pipx makes the se command line executable available. Its various commands are described below, or you can use se help to list them.

Installation

The toolset requires Python >= 3.8 and <= 3.12.
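
As a quick pre-flight check, the version requirement above can be sketched as a small shell function. The name python_supported is hypothetical and not part of the toolset, and the sketch assumes that any 3.12.x patch release counts as "<= 3.12".

```shell
# Hypothetical helper, not part of the toolset.
# Succeeds if the given version string (e.g. "3.11" or "3.11.4")
# falls in the supported range 3.8 through 3.12.
# Assumption: patch releases such as 3.12.4 count as within range.
python_supported() {
    _major=${1%%.*}      # text before the first dot
    _rest=${1#*.}        # text after the first dot
    _minor=${_rest%%.*}  # second component only
    [ "$_major" -eq 3 ] && [ "$_minor" -ge 8 ] && [ "$_minor" -le 12 ]
}

# Example use (requires python3 on the PATH):
#   python_supported "$(python3 -c 'import platform; print(platform.python_version())')" \
#       && echo "Python version is in the supported range"
```

If the check fails, the platform sections below show how to install a supported interpreter and point pipx at it.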

To install the toolset locally for development and debugging, see Installation for toolset developers.

Optionally, install Ace; when it is available, the se build --check command automatically runs it as part of the checking process.

Ubuntu 20.04 (Focal) users

# Install some pre-flight dependencies.
sudo apt install -y calibre default-jre git python3-dev python3-pip python3-venv

# Install pipx.
python3 -m pip install --user pipx
python3 -m pipx ensurepath

# Install the toolset.
pipx install --python=3.12 --fetch-missing-python standardebooks

Optional: Install shell completions

# Install ZSH completions.
sudo ln -s $HOME/.local/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/zsh/_se /usr/share/zsh/vendor-completions/_se && hash -rf && compinit

# Install Bash completions.
sudo ln -s $HOME/.local/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/bash/se /usr/share/bash-completion/completions/se

# Install Fish completions.
ln -s $HOME/.local/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/fish/se $HOME/.config/fish/completions/se.fish

Fedora 41 users

# Install some pre-flight dependencies.
sudo dnf install pipx python3.12 python3.12-devel gcc libxslt-devel calibre git java-21-openjdk-headless

# Ensure PATH environment variable is correctly set up for pipx
pipx ensurepath

# Install the toolset.
pipx install --python=3.12 standardebooks
pipx inject standardebooks setuptools

Optional: Install shell completions

# Install ZSH completions.
sudo ln -s $HOME/.local/share/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/zsh/_se /usr/share/zsh/vendor-completions/_se && hash -rf && compinit

# Install Bash completions.
sudo ln -s $HOME/.local/share/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/bash/se /usr/share/bash-completion/completions/se

# Install Fish completions.
ln -s $HOME/.local/share/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/fish/se $HOME/.config/fish/completions/se.fish

macOS users

  1. Install the Homebrew package manager. Or, if you already have it installed, make sure it’s up to date:

    brew update
  2. Install dependencies:

    # Install some pre-flight dependencies.
    brew install cairo calibre git openjdk pipx python@3.11
    pipx ensurepath
    sudo ln -sfn $(brew --prefix)/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
    
    # Install the toolset.
    pipx install --python python3.11 standardebooks
    
    # Optional: Bash users who have set up bash-completion via brew can install tab completion.
    ln -s $HOME/.local/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/bash/se $(brew --prefix)/etc/bash_completion.d/se
    
    # Optional: Fish users can install tab completion.
    ln -s $HOME/.local/pipx/venvs/standardebooks/lib/python3.*/site-packages/se/completions/fish/se $HOME/.config/fish/completions/se.fish

OpenBSD 6.6 Users

These instructions were tested on OpenBSD 6.6 and may also work on the 6.5 release.

  1. Create a text file to feed into pkg_add called ~/standard-ebooks-packages. It should contain the following:

    py3-pip--
    py3-virtualenv--
    py3-gitdb--
    jdk--%11
    calibre--
    git--
  2. Install dependencies using doas pkg_add -ivl ~/standard-ebooks-packages. Follow the linking instructions that pkg_add prints to save keystrokes, unless you want to keep multiple Python and pip versions installed. In my case, I ran doas ln -sf /usr/local/bin/pip3.7 /usr/local/bin/pip.

  3. Add ~/.local/bin to your path.

  4. Run pip install --user pipx

  5. If you’re using ksh from base and have already added ~/.local/bin, you can skip pipx ensurepath because this step is for bash users.

  6. The rest of the process is similar to that used on other platforms:

    # Install the toolset.
    pipx install standardebooks
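
Step 3 above (adding ~/.local/bin to your path) could look something like the following sketch. The function name add_to_path is hypothetical. The sketch assumes a POSIX-style shell such as ksh, and it only changes PATH for the current session; to make the change permanent you would likely place it in your shell's startup file (for example ~/.profile, depending on your setup).

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not
# already listed, so sourcing this more than once does not add duplicates.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$1:$PATH" ;;      # otherwise put it at the front
    esac
}

add_to_path "$HOME/.local/bin"
export PATH
```

Wrapping PATH in colons before matching means a directory is recognized whether it appears at the start, middle, or end of the list.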

Installation for toolset developers

If you want to work on the toolset source, it’s helpful to tell pipx to install the package in “editable” mode. This will allow you to edit the source of the package live and see changes immediately, without having to uninstall and re-install the package.

To do that, follow the general installation instructions above; but instead of doing pipx install standardebooks, do the following:

git clone https://github.com/standardebooks/tools.git
pipx install --editable ./tools

Now the se binary is in your path, and any edits you make to source files in the tools/ directory are immediately reflected when executing the binary.

Running commands on the entire corpus

As a developer, it’s often useful to run an se command like se lint or se build on the entire corpus for testing purposes. A regular invocation (like se lint /path/to/ebook/repos/*) can be very time-consuming, because it processes each argument sequentially. Instead, use GNU Parallel to start multiple invocations in parallel, with each one processing a single argument. For example:

# Slow, each argument is processed in sequence
se lint /path/to/ebook/repos/*

# Fast, multiple invocations each process a single argument in parallel
export COLUMNS; parallel --keep-order se lint ::: /path/to/ebook/repos/*

The toolset tries to detect when it’s being invoked from parallel, and it adjusts its output to accommodate.

We export COLUMNS because se lint needs to know the width of the terminal so that it can format its tabular output correctly. We pass the --keep-order flag to output results in the order we passed them in, which is useful if comparing the results of multiple runs.
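
If GNU Parallel is not installed, a rough substitute can be sketched with find and xargs. This is a different technique from the one recommended above, not an equivalent: xargs has no counterpart to --keep-order, output from concurrent invocations may interleave, and the toolset's parallel detection presumably does not apply. The function name run_per_repo is hypothetical, and the sketch assumes a find with -print0 and an xargs with -0 and -P (both GNU and BSD versions provide these).

```shell
# Hypothetical helper: run a command once per immediate subdirectory of a
# corpus directory, with up to N invocations running at the same time.
# Usage: run_per_repo CORPUS_DIR JOBS COMMAND [ARGS...]
run_per_repo() {
    _corpus_dir=$1
    _jobs=$2
    shift 2
    # Each subdirectory is appended as the last argument of one invocation.
    find "$_corpus_dir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -n 1 -P "$_jobs" "$@"
}

# Example use, four invocations at a time:
#   run_per_repo /path/to/ebook/repos 4 se lint
```

Because results arrive in completion order, this form is less suitable than parallel --keep-order when you want to compare the output of multiple runs.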

Linting with pylint and mypy

Before we can use pylint or mypy on the toolset source, we have to inject them (and additional typings) into the venv pipx created for the standardebooks package:

pipx inject standardebooks pylint==3.2.2 mypy==1.10.0 types-requests==2.32.0.20240602 types-setuptools==70.0.0.20240524 types-Pillow==10.2.0.20240520

Then make sure to call the pylint and mypy binaries that pipx installed in the standardebooks venv, not any other globally-installed binaries:

cd /path/to/tools/repo
$HOME/.local/pipx/venvs/standardebooks/bin/pylint tests/*.py se
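
The installation sections above use two different pipx venv locations ($HOME/.local/pipx/venvs on Ubuntu and macOS, $HOME/.local/share/pipx/venvs on Fedora), so the hard-coded path may not match your system. A minimal sketch that probes both, assuming those are the only two candidates; the function name find_se_venv is hypothetical:

```shell
# Hypothetical helper: print the path of the standardebooks venv created by
# pipx, checking the two locations used elsewhere in this README.
# Returns non-zero if neither directory exists.
find_se_venv() {
    for _base in "$HOME/.local/pipx/venvs" "$HOME/.local/share/pipx/venvs"; do
        if [ -d "$_base/standardebooks" ]; then
            printf '%s\n' "$_base/standardebooks"
            return 0
        fi
    done
    return 1
}

# Example use, from the root of the tools repository:
#   "$(find_se_venv)/bin/pylint" tests/*.py se
```

If pipx is configured with a custom home directory, neither location will match and the path has to be supplied by hand.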

Testing with pytest

Instructions are found in the testing README.

Code style

Help wanted

We need volunteers to take the lead on the following goals:

Tool descriptions

What a Standard Ebooks source directory looks like

Many of these tools act on Standard Ebooks source directories. Such directories have a consistent minimal structure:

.
|__ images/
|   |__ cover.jpg
|   |__ cover.source.jpg
|   |__ cover.svg
|   |__ titlepage.svg
|
|__ src/
|   |__ META-INF/
|   |   |__ container.xml
|   |
|   |__ epub/
|   |   |__ css/
|   |   |   |__ core.css
|   |   |   |__ local.css
|   |   |   |__ se.css
|   |   |
|   |   |__ images/
|   |   |   |__ cover.svg
|   |   |   |__ logo.svg
|   |   |   |__ titlepage.svg
|   |   |
|   |   |__ text/
|   |   |   |__ colophon.xhtml
|   |   |   |__ imprint.xhtml
|   |   |   |__ titlepage.xhtml
|   |   |   |__ uncopyright.xhtml
|   |   |
|   |   |__ content.opf
|   |   |__ onix.xml
|   |   |__ toc.xhtml
|   |
|   |__ mimetype
|
|__ LICENSE.md

./images/ contains source images for the cover and titlepages, as well as ebook-specific source images. Source images should be in their maximum available resolution, then compressed and placed in ./src/epub/images/ for distribution.

./src/epub/ contains the actual epub files.
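
As an illustration of the minimal structure shown above, the following sketch checks a directory for a handful of the listed paths and reports any that are missing. The function name check_se_layout is hypothetical, the selection of paths is only a subset of the tree, and this is not how the toolset itself validates a source directory.

```shell
# Hypothetical helper: report which of a few expected paths are missing
# from a Standard Ebooks source directory (default: current directory).
# Returns non-zero if anything is missing.
check_se_layout() {
    _dir=${1:-.}
    _status=0
    for _path in \
        images \
        src/META-INF/container.xml \
        src/epub/css \
        src/epub/images \
        src/epub/text \
        src/epub/content.opf \
        src/epub/toc.xhtml \
        src/mimetype \
        LICENSE.md
    do
        if [ ! -e "$_dir/$_path" ]; then
            printf 'missing: %s\n' "$_path" >&2
            _status=1
        fi
    done
    return "$_status"
}

# Example use:
#   check_se_layout /path/to/ebook/repo && echo "layout looks complete"
```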