🫴 Generate, sync, and manage subtitle files for any media; generate your own audiobook subs similar to Kindle's Immersion Reading 📖🎧
For a glimpse of some of the technologies & techniques we're using depending on the arguments, here's a short list:
Currently I am only developing this tool for Japanese use, though rumor has it the lang flag can be used for other languages too.
It requires a modern GPU with decent VRAM, CPU, and RAM. There's also a community-built Google Colab notebook available on Discord.
Current State of SubPlz alignments:
How does this compare to Alass for video subtitles?
Current State of Alass alignments:
Support for this tool can be found on KanjiEater's thread on The Moe Way Discord
Support for any tool by KanjiEater can be found on KanjiEater's Discord
The Deep Weeb Podcast - Sub Please
If you find my tools useful, please consider supporting via Patreon. I have spent countless hours making these useful not only for myself but for others as well, and am now offering them completely free.
If you can't contribute monetarily, please consider following on a social platform, joining the Discord & sharing a kind message, or sharing this with a friend.
Audio/video files: m4b, mkv, or any other audio/video file
Text files: srt, vtt, ass, txt, or epub
/sync/
├── /Harry Potter 1/
│   ├── Im an audio file.m4b
│   └── Harry Potter.epub
└── /Harry Potter 2 The Spooky Sequel/
    ├── Harry Potter 2 The Spooky Sequel.mp3
    └── script.txt
The -d parameter can take multiple audiobooks to process, like: subplz sync -d "/mnt/d/sync/Harry Potter 1/" "/mnt/d/sync/Harry Potter 2 The Spooky Sequel/"
Run subplz sync -d "<full folder path>", using something like /mnt/d/sync/Harry Potter 1
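Before running sync, it can help to check that a folder really holds a media file and a matching text file. The following is a minimal sketch and not part of SubPlz: `classify_files` and the two extension sets are hypothetical, and "any other audio/video file" is approximated by a few common extensions.

```python
from pathlib import PurePath

# Assumed extension sets, based on the formats listed above.
MEDIA_EXTS = {".m4b", ".mkv", ".mp3", ".avi"}
TEXT_EXTS = {".srt", ".vtt", ".ass", ".txt", ".epub"}


def classify_files(names):
    """Split file names into (media, text) lists, each sorted by name."""
    media = sorted(n for n in names if PurePath(n).suffix.lower() in MEDIA_EXTS)
    text = sorted(n for n in names if PurePath(n).suffix.lower() in TEXT_EXTS)
    return media, text


media, text = classify_files(["Im an audio file.m4b", "Harry Potter.epub", "cover.jpg"])
print("media:", media)
print("text:", text)
```

If either list comes back empty, the folder is probably missing one half of the pair.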
m4b, mkv, or any other audio/video file
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising EP00.mkv
    └── NeoOtaku Uprising EP01.avi
The -d parameter can take multiple files to process, like: subplz gen -d "/mnt/d/NeoOtaku Uprising The Anime" --model large-v3
Run subplz gen -d "<full folder path>" --model large-v3, using something like /mnt/d/sync/NeoOtaku Uprising The Anime. Large models are highly recommended for gen (unlike sync).
Use --lang-ext az to set a language you wouldn't otherwise need as a designated "AI subtitle", and use it as a fallback when sync doesn't work or you don't have existing subtitles already.
m4b, mkv, or any other audio/video file
A subtitle file with the *.en.srt extension in the folder, for videos without embedded subs
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.srt
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.en.srt
    └── NeoOtaku Uprising With No Embedded Eng Subs EP01.srt
The -d parameter can take multiple files to process, like: subplz sync -d "/mnt/d/NeoOtaku Uprising The Anime" --alass --lang-ext "ja" --lang-ext-original "en"
Add --lang-ext-incorrect "ja" if you had NeoOtaku Uprising With No Embedded Eng Subs EP01.ja.srt instead of NeoOtaku Uprising With No Embedded Eng Subs EP01.srt. This is the incorrectly timed sub that Alass aligns.
The tool will extract embedded subs and write them with the --lang-ext-original extension, make sure the subtitles are sanitized, convert subs to the same format for Alass if need be, and align the incorrect timings with the timed subs to give you correctly timed subs like below:
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.en.srt (embedded)
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.ja.srt (timed)
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.srt (original/incorrect timings)
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.en.srt (no change)
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.ja.srt (timed)
    └── NeoOtaku Uprising With No Embedded Eng Subs EP01.srt (original/incorrect timings)
Put a video(s) & sub file(s) that need alignment in a folder.
m4b, mkv, or any other audio/video file
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── 1.srt
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    └── 2.ass
Run subplz rename -d "/mnt/v/Videos/J-Anime Shows/NeoOtaku Uprising The Anime/" --lang-ext "ab" --dry-run to see what the rename would be.
If the renames look right, run it again without the --dry-run flag: subplz rename -d "/mnt/v/Videos/J-Anime Shows/NeoOtaku Uprising The Anime/" --lang-ext ab
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.ab.srt
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    └── NeoOtaku Uprising With No Embedded Eng Subs EP01.ab.ass
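To make the dry run above concrete, here is a rough sketch of how such a rename plan could be computed. It is not SubPlz's actual logic: it assumes videos and subs are paired purely by sorted name order, and `plan_renames` plus the extension sets are hypothetical.

```python
from pathlib import PurePath

# Assumed extension sets for this sketch only.
VIDEO_EXTS = {".mkv", ".mp4", ".avi"}
SUB_EXTS = {".srt", ".ass", ".vtt"}


def plan_renames(names, lang_ext):
    """Pair sorted videos with sorted subs and return (old, new) sub names."""
    videos = sorted(n for n in names if PurePath(n).suffix.lower() in VIDEO_EXTS)
    subs = sorted(n for n in names if PurePath(n).suffix.lower() in SUB_EXTS)
    if len(videos) != len(subs):
        raise ValueError("expected exactly one sub file per video")
    return [
        (sub, f"{PurePath(video).stem}.{lang_ext}{PurePath(sub).suffix}")
        for video, sub in zip(videos, subs)
    ]


plan = plan_renames(
    [
        "NeoOtaku Uprising With Embedded Eng Subs EP00.mkv",
        "1.srt",
        "NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv",
        "2.ass",
    ],
    "ab",
)
for old, new in plan:
    print(f"{old} -> {new}")  # a dry run only prints, nothing is renamed
```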
Put a video(s) & sub file(s) that match names in a folder.
m4b, mkv, or any other audio/video file
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.ab.cc.srt (notice the hearing impaired cc)
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    └── NeoOtaku Uprising With No Embedded Eng Subs EP01.ab.srt
Run subplz rename -d "/mnt/v/Videos/J-Anime Shows/NeoOtaku Uprising The Anime/" --lang-ext jp --lang-ext-original ab
to get:
/sync/
└── /NeoOtaku Uprising The Anime/
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.mkv
    ├── NeoOtaku Uprising With Embedded Eng Subs EP00.jp.srt (notice the removed cc)
    ├── NeoOtaku Uprising With No Embedded Eng Subs EP01.mkv
    └── NeoOtaku Uprising With No Embedded Eng Subs EP01.jp.srt
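The language-code swap above (including the dropped cc tag) can be pictured with a small sketch. This is only an illustration under the assumption that a sub belongs to a video when its name starts with the video's stem; `retag_sub` is a hypothetical helper, not the tool's real implementation.

```python
from pathlib import PurePath


def retag_sub(sub_name, video_name, new_ext, original_ext):
    """Return the renamed sub, or None if it doesn't match the video or code."""
    stem = PurePath(video_name).stem
    if not sub_name.startswith(stem + "."):
        return None
    tags = sub_name[len(stem) + 1:].split(".")  # e.g. ["ab", "cc", "srt"]
    if len(tags) < 2 or tags[0] != original_ext:
        return None
    # Keep only the new language code and the file type; extra tags like "cc" are dropped.
    return f"{stem}.{new_ext}.{tags[-1]}"


video = "NeoOtaku Uprising With Embedded Eng Subs EP00.mkv"
sub = "NeoOtaku Uprising With Embedded Eng Subs EP00.ab.cc.srt"
print(retag_sub(sub, video, "jp", "ab"))
```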
Currently supports Docker (preferred), Windows, and Unix-based OSes like Ubuntu 22.04 on WSL2. Primarily supports Japanese, but other languages may work as well with limited dev support.
Create a sync folder on the root of MyDrive.
Put your files in the sync folder.
Use -d "/content/drive/MyDrive/sync/Harry Potter 1/" for the quick guide example.
docker run -it --rm --name subplz \
  -v <full path to up to content folder>:/sync \
  -v <your folder path>:/SyncCache \
  kanjieater/subplz:latest \
  sync -d "/sync/<content folder>/"
Example:
/mnt/d/sync/
└── /変な家/
    ├── 変な家.m4b
    └── 変な家.epub
docker run -it --rm --name subplz \
  --gpus all \
  -v /mnt/d/sync/変な家/:/sync \
  -v /mnt/d/SyncCache:/app/SyncCache \
  kanjieater/subplz:latest \
  sync -d "/sync/"
a. Optional: --gpus all will allow you to run with GPU. If this doesn't work, make sure you've enabled your GPU in Docker (outside the scope of this project).
b. -v <your folder path>:/sync, ex: -v /mnt/d/sync:/sync. This is where the files that you want to sync are. The part to the left of the : is your machine; the part to the right is what the app will see as the folder name.
c. The SyncCache part works the same way as the sync folder mapping. It just maps where things are locally on your machine. As long as the app can find the SyncCache folder, it will be able to resync things much faster.
d. <command> <params>, ex: sync -d /sync/. This runs subplz <command> <params> as you would outside of Docker.
docker run --entrypoint ./helpers/subplz.sh -it --rm --name subplz --gpus all -v "/mnt/v/Videos/J-Anime Shows/Under Ninja/Season 01":/sync -v /home/ke/code/subplz/SyncCache:/app/SyncCache kanjieater/subplz:latest /sync/
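If you script the Docker call, the pieces from notes a through d can be assembled programmatically. A minimal sketch, using only the flags shown above: `docker_sync_argv` is a hypothetical helper, and the host paths in the usage line are placeholders.

```python
import shlex


def docker_sync_argv(content_dir, cache_dir, use_gpu=True):
    """Build the argv for the `docker run ... sync -d /sync/` call shown above."""
    argv = ["docker", "run", "-it", "--rm", "--name", "subplz"]
    if use_gpu:
        argv += ["--gpus", "all"]  # optional, see note a
    argv += [
        "-v", f"{content_dir}:/sync",          # host content folder -> /sync
        "-v", f"{cache_dir}:/app/SyncCache",   # host cache folder -> /app/SyncCache
        "kanjieater/subplz:latest",
        "sync", "-d", "/sync/",
    ]
    return argv


argv = docker_sync_argv("/mnt/d/sync/book", "/mnt/d/SyncCache")
print(shlex.join(argv))  # paste into a shell, or pass argv to subprocess.run
```

Building a list instead of a single string keeps paths with spaces intact without manual quoting.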
Install ffmpeg and make it available on the path
git clone https://github.com/kanjieater/SubPlz.git
Use python >= 3.11.2 (latest working version is always specified in pyproject.toml)
pip install .
You can get a full list of cli params from subplz sync -h
If you're using a single file for the entire audiobook with chapters, you are good to go. If an audio file is too long, it may use up all of your RAM. You can use the docker image m4b-tool to make a chaptered audio file. Trust me, you want the improved codecs that are included in the docker image. I tested both and noticed a huge drop in sound quality without them. When lossy formats like mp3 are transcoded they lose quality, so it's important to use the docker image to retain the best quality if you plan to listen to the audio file.
Run subplz sync -d "<full folder path>", eg subplz sync -d "/mnt/d/Editing/Audiobooks/かがみの孤城/". This runs each file to get a character-level transcript. It then creates a sub format that can be matched to the script.txt. Each character-level subtitle is merged into a phrase level, and your result should be a <name>.srt file. The video or audio file can then be watched with MPV, playing audio in time with the subtitle.
By default, the -d parameter will pick up the supported files in the directory(s) given. Ensure that your OS sorts them in the order you want them patched together. Sort them by name, and as long as all of the audio files are in order and all of the text files are in the same order, they'll be "zipped" up individually with each other.
By default the tool will overwrite any existing srt named after the audio file's name. If you don't want it to do this you must explicitly tell it not to.
subplz sync -d "/mnt/v/somefolder" --no-overwrite
For different use cases, different parameters may be optimal.
subplz sync -d "/mnt/d/sync/Harry Potter"
An m4b file will allow us to split up the audio and do things in parallel.
Results can vary between epub and txt files, like where full character spaces aren't picked up in epub but are in txt. A chaptered epub may be faster, but you can have more control over what text gets synced from a txt file if you need to manually remove things (but epub is still probably the easier option, and very reliable).
Use --no-respect-grouping to let the algorithm remove content for you.
--model "tiny" seems to work well, and is much faster than other models. If your transcript is inaccurate, consider using a larger model to compensate.
subplz sync --model large-v3 -d "/mnt/v/Videos/J-Anime Shows/Sousou no Frieren"
Use --model "large-v3", as subtitles often have sound effects or other things that won't be picked up by transcription models. A large model will take much longer (a 24 min episode can go from 30 seconds to 4 mins for me), but it will be much more accurate.
--respect-grouping. If you find your subs frequently have very long subtitle lines, consider using --no-respect-grouping.
Let's say you want to automate getting the best subs for every piece of media in your library. SubPlz takes advantage of how well video players integrate with language codes by overriding them to map them to algorithms, instead of different languages. This makes it so you can quickly switch between a sub on the fly while watching content, and easily update your preferred option for a series later on if your default doesn't work.
Just run ./helpers/subplz.sh with a sub like sub1.ja.srt and video1.mkv and it will generate the following:
| Algorithm | Default Language Code | Mnemonic | Description |
|---|---|---|---|
| Bazarr | ab | B for Bazarr | Default potentially untimed subs in target language |
| Alass | as | S for Alass | Subs that have been aligned using en & ab subs via Alass |
| SubPlz | ak | K for KanjiEater | Generated alignment from AI with the ab subs text |
| FasterWhisper | az | Z for the last option | Generated purely based on audio. Surprisingly accurate but not perfect. |
| Original | en | Anime subs tend to be in EN | This would be the original timings used for Alass, and what would be extracted from your videos automatically |
| Preferred | ja | Your target language | This is a copy of one of the other options, named with your target language so it plays this by default |
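As a rough picture of how the "Preferred" copy could be chosen from the other codes, here is a sketch. The priority order and the `choose_preferred` helper are assumptions for illustration; the real helper script may rank or select subs differently.

```python
# Hypothetical priority: prefer SubPlz (ak), then Alass (as),
# then FasterWhisper (az), then the untouched Bazarr sub (ab).
DEFAULT_PRIORITY = ("ak", "as", "az", "ab")


def choose_preferred(stem, names, priority=DEFAULT_PRIORITY, sub_ext="srt"):
    """Return the first existing `<stem>.<code>.<ext>` by priority, else None."""
    available = set(names)
    for code in priority:
        candidate = f"{stem}.{code}.{sub_ext}"
        if candidate in available:
            return candidate
    return None


names = ["video1.mkv", "video1.ab.srt", "video1.as.srt", "video1.az.srt"]
source = choose_preferred("video1", names)
print(f"copy {source} -> video1.ja.srt")
```

Changing your preferred option for a series later then only means copying a different code over the ja file.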
The Anki support currently takes your m4b file in <full_folder_path> named <name>.m4b, where <name> is the name of the media, and it outputs srs audio and a TSV file that is sent via AnkiConnect to Anki. This is useful for searching across GoldenDict to find sentences that use a word, or to merge automatically with custom scripts (more releases to support this coming hopefully).
Set ANKICONNECT as an environment variable: add export ANKICONNECT=localhost:8765 or export ANKICONNECT="$(hostname).local:8765" to your ~/.zshrc or ~/.bashrc & activate it.
Set ANKI_MEDIA_DIR to your Anki profile's media path: export ANKI_MEDIA_DIR="/mnt/f/Anki2/KanjiEater/collection.media/". You need to change this path.
cd ./AudiobookTextSync
pip install . (only needs to be done once)
pip install .[anki] (only needs to be done once)
Copy ./anki_importer/mapping.template.json to ./anki_importer/mapping.json. mapping.json is your personal configuration file that you can and should modify to set the mapping of fields that you want populated.
My actual config looks like this:
{
  "deckName": "!優先::Y メディア::本",
  "modelName": "JapaneseNote",
  "fields": {
    "Audio": 3,
    "Expression": 1,
    "Vocab": ""
  },
  "options": {
    "allowDuplicate": true
  },
  "tags": [
    "mmi",
    "suspendMe"
  ]
}
The number next to the Expression and Audio maps to the fields like so
1: Text of subtitle: `ใใซในใซๆด่ปใๆฑใใฆใใใฎใงใใใ`
2: Timestamps of sub: `90492-92868`
3: Sound file: `[sound:アルスラーン戦記9　旌旗流転_90492-92868.mp3]`
4: Image (not really useful for audiobooks): <img src='アルスラーン戦記9　旌旗流転_90492-92868.jpg'>
5: Sub file name: アルスラーン戦記9　旌旗流転.m4b,アルスラーン戦記9　旌旗流転.srt
Notice you can also set fields and tags manually. You can set multiple tags. Or like in my example, you can set Vocab to be empty, even though it's my first field in Anki.
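To show how the numbers in the config line up with the TSV columns, here is a sketch that turns one row into an AnkiConnect addNote request. It is not the importer's actual code: `build_note` is a hypothetical helper, the deck name and row values are placeholders, and integers are assumed to be 1-based column numbers while anything else is used as a literal value.

```python
import json

mapping = {
    "deckName": "Example::Deck",  # placeholder deck name
    "modelName": "JapaneseNote",
    "fields": {"Audio": 3, "Expression": 1, "Vocab": ""},
    "options": {"allowDuplicate": True},
    "tags": ["mmi", "suspendMe"],
}

# One TSV row, in the column order listed above (placeholder values).
row = [
    "subtitle text",
    "90492-92868",
    "[sound:book_90492-92868.mp3]",
    "<img src='book_90492-92868.jpg'>",
    "book.m4b,book.srt",
]


def build_note(mapping, row):
    """Resolve column numbers to row values; keep literal values unchanged."""
    fields = {}
    for name, source in mapping["fields"].items():
        fields[name] = row[source - 1] if isinstance(source, int) else source
    return {
        "deckName": mapping["deckName"],
        "modelName": mapping["modelName"],
        "fields": fields,
        "options": mapping["options"],
        "tags": mapping["tags"],
    }


request = {"action": "addNote", "version": 6, "params": {"note": build_note(mapping, row)}}
print(json.dumps(request, ensure_ascii=False, indent=2))
```

The printed JSON is the body you would POST to the AnkiConnect address configured in ANKICONNECT.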
Command:
./anki.sh "<full_folder_path>"
Example:
./anki.sh "/mnt/d/sync/kokoro/"
It's not recommended. You will have a bad time.
If your audiobook is huge (eg 38 hours long & 31 audio files), then break up each section into an m4b or audio file with a text file for it: one text file per one audio file. This will work fine.
But it can work in very specific circumstances. The exception to the Sort Order rule is if we find one transcript and multiple audio files. We'll assume that's something like a bunch of mp3s or other audio files that you want to sync to a single transcript like an epub. This only works if the epub chapters and the mp3 files match. Txt files don't work very well for this case currently. I still don't recommend it.
Please use m4b for audiobooks. I know you may have gotten them in mp3 and it's an extra step, but it's the audiobook format.
I've heard of people using https://github.com/yermak/AudioBookConverter
Personally, I use the docker image for m4b-tool. If you go down this route, make sure you use the docker version of m4b-tool, as the improved codecs are included in it. I tested m4b-tool without the docker image and noticed a huge drop in sound quality. When lossy formats like mp3 are transcoded they lose quality, so it's important to use the docker image to retain the best quality. I use helpers/merge2.sh to merge audiobooks together in batch with this method.
Alternatively you could use ChatGPT to help you combine them. Something like this:
!for f in "/content/drive/MyDrive/name/成瀬は天下を取りに行く/"*.mp3; do echo "file '$f'" >> mylist.txt; done
!ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.mp3
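The same list file can be generated in Python, which makes it easier to control ordering and to handle file names containing a single quote. A minimal sketch: `concat_list` is a hypothetical helper, and the quoting follows ffmpeg's rule that a literal ' inside a single-quoted string is written as '\''.

```python
def concat_list(paths):
    """Return the text of an ffmpeg concat list, one `file '...'` line per path."""
    lines = []
    for path in sorted(paths):
        escaped = path.replace("'", "'\\''")  # close quote, escaped quote, reopen
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


listing = concat_list(["02.mp3", "01.mp3", "it's 03.mp3"])
print(listing, end="")

# Writing the file fresh each time avoids the duplicate entries that
# `>>` appends if the shell loop above is run twice:
# with open("mylist.txt", "w", encoding="utf-8") as fh:
#     fh.write(listing)
```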
Besides the ones already mentioned & installed, this project uses other open source projects: subs2cia & anki-csv-importer
https://github.com/gsingh93/anki-csv-importer
https://github.com/kanjieater/subs2cia
https://github.com/ym1234/audiobooktextsync
The GOAT delivers again; The best Japanese reading experience ttu-reader paired with SubPlz subs
A cool tool to turn these audiobook subs into Visual Novels