TTS-ing an entire chapter per audio file output

lukegotjellyfish commented 12 months ago

Edit: This would require a change in the way TTS calls are handled to bypass character limits to work properly.

In an earlier version of Librera I had been using a Line Spacing/LineHeight value of 0 with the TTS recording function so that each book chapter was output to a single audio file. This made the audio files much easier to manage, as each numbered file corresponded to a sequential chapter.

I have been replicating this functionality with Custom CSS to achieve the same result.

Could an option be added, perhaps to the Preferences > Advanced Settings page (or in the TTS menu itself?), to force all of a chapter's text onto a single page for TTS, or to export TTS to a single file per chapter?

I see this is a very niche use case but I think it would enhance the Record TTS feature.

Nitrousoxide commented 8 months ago

This would be nice as it would also allow a stitching application after the fact to build an m4b file with the chapters appropriately flagged for an audiobook listener.
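
For anyone wondering what the "chapters appropriately flagged" step could look like, here is a minimal sketch that builds an ffmpeg metadata (FFMETADATA1) chapter file from per-chapter durations. The function name and the "duration_ms<TAB>title" input format are assumptions made for illustration, not anything Librera produces, and titles containing =, ;, # or \ would need escaping that this sketch does not do.

```shell
#!/bin/bash
# Hypothetical helper: reads "duration_ms<TAB>title" lines on stdin and prints
# an FFMETADATA1 file with one [CHAPTER] entry per input line.
make_chapter_metadata() {
  local start=0 duration title
  printf ';FFMETADATA1\n'
  while IFS=$'\t' read -r duration title; do
    [ -n "$duration" ] || continue   # skip blank lines
    # Each chapter starts where the previous one ended (times in milliseconds)
    printf '[CHAPTER]\nTIMEBASE=1/1000\nSTART=%d\nEND=%d\ntitle=%s\n' \
      "$start" "$((start + duration))" "$title"
    start=$((start + duration))
  done
}

# Example with made-up durations:
#   printf '1000\tChapter 1\n2500\tChapter 2\n' | make_chapter_metadata > chapters.txt
```

The resulting file could then be given to ffmpeg as a second input (with -map_metadata 1) when building the m4b. The durations themselves would have to come from something like ffprobe run on each chapter file.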

Nitrousoxide commented 8 months ago

> I have been replicating this functionality with Custom CSS to achieve the same result.

BTW, what custom CSS are you currently using?

lukegotjellyfish commented 8 months ago

> > I have been replicating this functionality with Custom CSS to achieve the same result.
>
> BTW, what custom CSS are you currently using?

This will depend on how your EPUB is formatted, as every EPUB I have uses <p> tags for the chapter body, with no images.

Go to Preferences > Reading Settings > Styles, tap the "option" button on the far right, and add:

p {line-height: 0 !important}

When you want to quickly enable/disable it, just add a char (such as /) before the p to invalidate the rule, like so:

/p {line-height: 0 !important}

I'm no CSS whiz, so there may be a better universal solution that covers more HTML elements, but this was all I needed.

Nitrousoxide commented 8 months ago

Hmm, thanks. I gave this a shot but noticed that it sometimes reads an error line saying "character count max reached" into the end of the audio files.

If this ever gets implemented in the app itself, it may have to do page-by-page recordings of the TTS and then stitch them together afterward to avoid that issue.

Nitrousoxide commented 8 months ago

As a workaround for folks here, I have a bash script that will merge the individual page MP3s according to the directory structure. Maybe there's a better way to do it via a range CSV or something, but this seems to work more reliably.

Organize the pages like this:

Book Name
  |
  |--Chapter 1
  |      |--page-0002.mp3
  |      |--page-0003.mp3
  |
  |--Chapter 2
         |--page-0004.mp3
         |--page-0005.mp3

Then put this script in your "Book Name" directory:

#!/bin/bash

# Set the root directory where the book and chapter directories are located
root_directory="."

# Create the "output" directory if it doesn't exist
output_directory="$root_directory/output"
mkdir -p "$output_directory"

# Loop through each chapter directory
for chapter_dir in "$root_directory"/*; do
  if [ -d "$chapter_dir" ] && [ "$chapter_dir" != "$output_directory" ]; then
    # Get the chapter name (directory name)
    chapter_name=$(basename "$chapter_dir")

    # Prepare the output file name
    output_file="$output_directory/$chapter_name.mp3"

    # Merge the page MP3s in the chapter into a single MP3
    ffmpeg -i "concat:$(ls -v "$chapter_dir"/*.mp3 | tr '\n' '|')" -c copy "$output_file"
  fi
done

and run it. It should spit out the merged, chapter-titled MP3s into an "output" directory under your root "Book Name" directory.

Edit: You can also set the root directory to $PWD, chmod +x the script to make it executable, and drop it somewhere in your $PATH. You can then run it from any book directory, so you only need to keep one copy of the script around and can invoke it whenever you need it.

#!/bin/bash

# Set the root directory where the book and chapter directories are located
root_directory="$PWD"
...
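
A possible variation, for page names that contain awkward characters: instead of parsing ls output into a concat: URL, ffmpeg's concat demuxer can read a list file. The sketch below only writes that list. The function name is made up for illustration, and it assumes the page numbers are zero-padded so that the shell's sorted glob order matches page order.

```shell
#!/bin/bash
# Hypothetical helper: print an ffmpeg concat-demuxer list for every MP3 in a
# chapter directory. Bash expands the glob in sorted order, which matches page
# order as long as page numbers are zero-padded (page-0002, page-0010, ...).
write_concat_list() {
  local chapter_dir=$1 page escaped
  for page in "$chapter_dir"/*.mp3; do
    [ -e "$page" ] || continue   # the glob matched nothing
    # Inside the list's single quotes, a literal ' has to be written as '\''
    escaped=$(printf '%s' "$page" | sed "s/'/'\\\\''/g")
    printf "file '%s'\n" "$escaped"
  done
}

# Example (not run here, needs ffmpeg and real page files):
#   write_concat_list "$PWD/Chapter 1" > chapter1.txt
#   ffmpeg -f concat -safe 0 -i chapter1.txt -c copy "output/Chapter 1.mp3"
```

Relative paths in the list are resolved relative to the list file's directory, which is why the example passes an absolute chapter path.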

lukegotjellyfish commented 8 months ago

> Hmm, thanks. I gave this a shot but noticed that it sometimes reads an error line saying "character count max reached" into the end of the audio files.
>
> If this ever gets implemented in the app itself, it may have to do page-by-page recordings of the TTS and then stitch them together afterward to avoid that issue.

You're absolutely right and I completely missed that 0_0.

Before using my line-height: 0 method, I had been recording with the default viewing settings, which generate multiple files per chapter, and combining chapter ranges (e.g. 1-50) into <500MB MP3 files for playback on my mobile device.

I wrote a few simple PowerShell/batch scripts and used mp3cat to join them all together.
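
Those scripts aren't posted here, but the grouping step can be sketched in bash roughly as follows. This is an illustration under assumptions, not the original PowerShell: the function name and output format are invented, and it only plans the batches, leaving the actual joining to mp3cat or ffmpeg.

```shell
#!/bin/bash
# Hypothetical helper: assign files, in the order given, to numbered batches so
# that each batch stays under max_bytes. Prints "batch_number<TAB>file".
# A single file that is itself over the limit still gets a batch of its own.
plan_batches() {
  local max_bytes=$1 batch=1 total=0 size file
  shift
  for file in "$@"; do
    size=$(wc -c < "$file")
    # Start a new batch if adding this file would reach the limit
    if [ "$total" -gt 0 ] && [ $((total + size)) -ge "$max_bytes" ]; then
      batch=$((batch + 1))
      total=0
    fi
    total=$((total + size))
    printf '%d\t%s\n' "$batch" "$file"
  done
}

# Example (not run here): plan 500 MB batches over the merged chapter files
#   plan_batches $((500 * 1024 * 1024)) output/*.mp3
```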

I hadn't actually listened to or checked my single-file chapter audio files.

I had incorrectly assumed that the TTS calls were made per sentence, as another mobile app I've used (gmathi/NovelLibrary) did something like that.

lukegotjellyfish commented 8 months ago

> As a workaround for folks here, I have a bash script that will merge the individual page MP3s according to the directory structure. Maybe there's a better way to do it via a range CSV or something, but this seems to work more reliably.

As you mention here, it might be better, or simpler, to instead have a single text file exported alongside the audio files that specifies the length of each generated chapter, much like the "Contents" chapter list button does in-app.

foobnix / LibreraReader

TTS-ing an entire chapter per audio file output #1136