Split by silence - Githubissues

sandreas / m4b-tool

m4b-tool is a command line utility to merge, split and chapterize audiobook files such as mp3, ogg, flac, m4a or m4b

MIT License

Split by silence #229

Closed markfaine closed 1 year ago

markfaine commented 1 year ago

I don't have chapters or a chapters file, and the book is unfortunately not on MusicBrainz. I just realized that split doesn't work unless you already have chapters, in which case I wouldn't need it. I assumed it would split by silence. Is that a possible feature? It could be adjustable based on the existing -a -b options. I think generally 3 seconds of silence seems to be just about right for the chapter marks.

sandreas commented 1 year ago

If you use the latest pre-release (which I highly recommend), it should already work:

m4b-tool split --by-silence my-book.m4b

You can also pass a value in milliseconds (e.g. --silence-min-length=1250; the default is 1750) to increase the number of silences that are detected. Beware that a smaller value might make the process take longer.
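To illustrate why a smaller minimum length yields more detected silences, here is a minimal sketch in Python. It is not how m4b-tool detects silence internally; the function name `find_silences`, the one-value-per-millisecond loudness list and the -40 dB threshold are all assumptions made for the example.

```python
from typing import List, Tuple


def find_silences(levels_db: List[float], threshold_db: float,
                  min_length_ms: int) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) runs that stay below threshold_db
    for at least min_length_ms. One list entry represents one millisecond."""
    silences = []
    run_start = None
    for i, level in enumerate(levels_db):
        if level < threshold_db:
            if run_start is None:
                run_start = i  # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_length_ms:
                silences.append((run_start, i))
            run_start = None
    # Handle a quiet run that lasts until the end of the signal.
    if run_start is not None and len(levels_db) - run_start >= min_length_ms:
        silences.append((run_start, len(levels_db)))
    return silences


# Toy signal (hypothetical): speech at -20 dB with pauses of 1.3 s and 2 s.
levels = ([-20.0] * 1000 + [-60.0] * 1300 + [-20.0] * 1000
          + [-60.0] * 2000 + [-20.0] * 500)

print(find_silences(levels, threshold_db=-40.0, min_length_ms=1750))
print(find_silences(levels, threshold_db=-40.0, min_length_ms=1250))
```

With the 1750 ms minimum only the 2-second pause qualifies; lowering it to 1250 ms also picks up the 1.3-second pause.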

markfaine commented 1 year ago

Thanks, this is awesome, and I will keep that in my back pocket for next time. My use case was to split the book into files and then merge them to get the chapters fixed. I've since figured out the chapters, but now I need options to tell m4b-tool merge or chapters to use my chapters.txt file exactly as it is and only my chapters without adjustments. I don't have a MusicBrainz ID. Is that possible? Also, I'm using Docker; is the pre-release available with a Docker tag?

sandreas commented 1 year ago

but now I need options to tell m4b-tool merge or chapters to use my chapters.txt file exactly as it is and only my chapters without adjustments.

You don't need to do a full merge with m4b-tool for this. Just use tone:

Create a file <m4b-filename>.chapters.txt (e.g. harry-potter.chapters.txt, when the audio file is called harry-potter.m4b), then:

tone tag --auto-import=chapters test.m4b

If you don't want to install the latest tone, you could also use the tone version in the docker image:

docker run -it --entrypoint=tone --rm -u $(id -u):$(id -g) -v "$(pwd)":/mnt sandreas/m4b-tool tag --auto-import=chapters test.m4b

Depending on which docker image you use, you have to change sandreas/m4b-tool to the name of the image you are using...
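If the chapter positions come from a script rather than from hand editing, the chapters file described above could be written with a few lines of Python. This is only a sketch: the line layout `HH:MM:SS.mmm Title` is an assumption here, so compare it with a chapters.txt exported by the tool before relying on it. The function names and the example titles are hypothetical.

```python
from typing import Iterable, Tuple


def format_timestamp(ms: int) -> str:
    """Format a position in milliseconds as HH:MM:SS.mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_chapters_txt(chapters: Iterable[Tuple[int, str]]) -> str:
    """Build the file content from (start_ms, title) pairs.
    Assumed layout: one 'HH:MM:SS.mmm Title' line per chapter."""
    lines = [f"{format_timestamp(start)} {title}" for start, title in chapters]
    return "\n".join(lines) + "\n"


# Hypothetical chapter list; write the result to harry-potter.chapters.txt.
content = build_chapters_txt([(0, "Intro"), (3_723_045, "Chapter 1")])
print(content, end="")
```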

Just for the next time: If you need to chapterize an audiobook, that does not contain any chapters and you cannot get the metadata from anywhere, you can also use the merge command to auto create chapters like this:

m4b-tool merge --max-chapter-length=300,900 --no-conversion input.m4b -o output.m4b

It will:

Detect silences
Create chapters in the window between 5 and 15 minutes and try to match silences
Re-use existing chapters if present
Not re-encode the file, to keep the quality as good as possible
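The idea of placing chapters inside a 5 to 15 minute window and snapping them to silences can be sketched roughly as follows. This is not m4b-tool's actual algorithm: choosing the longest silence in the window, cutting at its midpoint and falling back to the maximum length when no silence fits are all assumptions made for illustration.

```python
from typing import List, Tuple


def pick_chapter_starts(silences: List[Tuple[float, float]], total_s: float,
                        min_len_s: float = 300,
                        max_len_s: float = 900) -> List[float]:
    """silences: (start_s, end_s) pairs. Returns chapter start times in seconds."""
    starts = [0.0]
    # Keep cutting while the remaining audio is longer than one maximum chapter.
    while total_s - starts[-1] > max_len_s:
        lo = starts[-1] + min_len_s
        hi = starts[-1] + max_len_s
        in_window = [(s, e) for s, e in silences if lo <= s and e <= hi]
        if in_window:
            # Assumption: prefer the longest silence and cut at its midpoint.
            s, e = max(in_window, key=lambda r: r[1] - r[0])
            starts.append((s + e) / 2)
        else:
            # No usable silence: fall back to a hard cut at the maximum length.
            starts.append(hi)
    return starts


# Hypothetical silences in a 2000-second file.
silences = [(100, 102), (400, 403), (700, 701), (1500, 1502)]
print(pick_chapter_starts(silences, total_s=2000))
```

In this toy input the first cut lands in the 3-second silence at 400 s, while the second falls back to a hard cut because no silence lies inside its window.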

merge is very powerful and not only does merge stuff :-)

Question: Would you be interested in sponsoring $2 a month to get access to a regularly updated private repository, offering tutorials and scripts to get metadata from internet providers like this:

m4b-tool tag --url="https://metadataprovider.com/?isbn=978-xxxxxx" my-file.m4b

It is just a concept for now but I would implement it, if it leads to some supportive money.

markfaine commented 1 year ago

Awesome, I will give this a try. Will tone split off to its own docker container at some point? I guess what I'm asking is if there is any reason for me to have both tone and m4btool installed or can I continue to use just the one docker container? I do really prefer using the docker versions since it's so much simpler with fewer dependencies to worry about.

It's only really an issue on books that are split at non-chapter breaks or entire books in one file without chapters. If each file is a chapter, then it's usually fairly easy to merge those with the file start being the chapter time index.

For those problem cases, I am currently using a Python script that uses pydub to get the silences and write them out to a file. It works, but unfortunately there doesn't seem to be a standard for pauses. Most newer books seem to be around 3-4 seconds, but it can vary, and you can end up getting pauses that are not chapter breaks, or missing chapter breaks because the minimum silence length is too long. Frustrating, but nothing can be done about it, I guess.

I like the idea of using m4b-tool or tone for this instead, but unless it can solve that problem I don't see how it would help. I know it's not completely reasonable, but I would be bothered by chapters anywhere other than the actual chapter start, even if that means I have to hand edit some of the chapter indexes in the text file.

As for being a sponsor, I don't know, maybe. I'm feeling a bit of subscription fatigue right now; it seems everything that exists costs per month to use these days. Though if you can make it so I never have to hand edit chapter files again, I would certainly consider it :)

sandreas commented 1 year ago

Will tone split off to its own docker container at some point?

It already has its own docker container. See https://hub.docker.com/u/sandreas for my "official" containers.

is any reason for me to have both tone and m4btool installed or can I continue to use just the one docker container?

The tone version bundled in the m4b-tool image may not be the latest one. Depending on your task, this is usually OK, but if you need the latest tone features, you may have to upgrade.

I do really prefer using the docker versions since it's so much simpler with fewer dependencies to worry about.

tone is a single binary without any dependencies and I try to keep it like that. No libraries, no VM, no Framework. Just tone.

As for being a sponsor, I don't know maybe, I'm feeling a bit of subscription fatigue right now, seems every thing that exists costs per month to use it these days,

Absolutely understandable. That's why I do open source software... it should be for free, I'm just considering ways to invest more time into my tools, because they would really need some polish.

though if you can make it so I never have to hand edit chapter files again, I would certainly consider it :)

It is really unlikely that there is a silver bullet solving all chapter problems, but I could add some chapter providers where you can get chapters from (e.g. MusicBrainz, Spotify, etc.). I have also already considered reading chapters from the epub of the ebook by estimating the text length of each chapter in percent and applying this to the audio book duration. This worked pretty well for some of my books where I also own the epub file. Another possibility would be speech detection. I also own the domain https://chapter-db.org, but I never finished the API. No time ATM :-) With enough donations, that might change... but I cannot guarantee anything.
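The epub approach mentioned above boils down to simple proportional arithmetic, sketched here in Python. The function name and the character counts are hypothetical, and the sketch assumes the narration pace is constant and that the audio contains nothing the text lacks (intro music, credits), which is rarely exactly true.

```python
from typing import List


def estimate_chapter_starts(chapter_char_counts: List[int],
                            audio_duration_s: float) -> List[float]:
    """Map each chapter's share of the text onto the audio duration."""
    total = sum(chapter_char_counts)
    if total <= 0:
        raise ValueError("the epub chapters contain no text")
    starts = []
    consumed = 0  # characters that precede the current chapter
    for count in chapter_char_counts:
        starts.append(audio_duration_s * consumed / total)
        consumed += count
    return starts


# Hypothetical book: three chapters of text, 10 hours (36000 s) of audio.
print(estimate_chapter_starts([1000, 3000, 6000], audio_duration_s=36000))
```

The estimates could then be snapped to the nearest detected silence, which would absorb some of the drift from uneven reading speed.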

markfaine commented 1 year ago

All good info, and I'm going to check out tone soon. I love this idea. I recently did that for a book. There was no way to tell if the chapter count was correct without going through each chapter mark, so I looked to see how many chapters there were in the epub. Of course I did this manually; your idea is much better though.

I have been trying to convert older mp3 books to m4b with chapters, and I've come to the conclusion that I don't really need perfection; what I need is a very fast review process. The goal would be to get all of the silences that could be chapters, even if some of them will obviously not be chapters, and then have a method of just listening briefly, answering yes/no to "is this a chapter?", and then skipping to the next marker. The end of the process would result in a chapters file that can be used to set the actual chapters. Even better would be the option to upload the results to a database, so that we can start to build something akin to MusicBrainz but for audio books.
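The yes/no review described here could be structured like the following Python sketch. Audio playback is deliberately left out, and every name (`review_candidates`, `ask_on_terminal`) is hypothetical; the decision step is passed in as a function so it can be a terminal prompt, a GUI button or a test stub.

```python
from typing import Callable, List


def review_candidates(candidates_ms: List[int],
                      is_chapter: Callable[[int], bool]) -> List[int]:
    """Walk through candidate positions in order and keep the confirmed ones."""
    return [pos for pos in sorted(candidates_ms) if is_chapter(pos)]


def ask_on_terminal(position_ms: int) -> bool:
    """Interactive decision step; a real tool would play a short preview first."""
    answer = input(f"{position_ms} ms: is this a chapter? [y/n] ")
    return answer.strip().lower().startswith("y")


# Non-interactive demo with a stub that rejects the candidate at 95 seconds.
confirmed = review_candidates([310_000, 0, 95_000], lambda ms: ms != 95_000)
print(confirmed)
```

For interactive use, `ask_on_terminal` would be passed as the second argument instead of the stub.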

sandreas commented 1 year ago

The goal would be to get all of the silences that could be chapters, even if some of them will obviously not be chapters, and then have a method of just listening briefly, answering yes/no to "is this a chapter?", and then skipping to the next marker, the end of the process resulting in a chapters file that can be used to set the actual chapters. Even better would be the option to upload the results to a database, so that we can start to build something akin to MusicBrainz but for audio books.

That was my plan. Write a UI-App with an Audio-Player that guesses possible chapters from different sources:

epub
spotify, musicbrainz, audible
detected silences
....

Then a few buttons:

Next potential chapter
Prev potential chapter
rewind 10 seconds
fast-forward 10 seconds
Chapter found

where the "Chapter found" button pauses the playback and opens a window where you can change the chapter title and go on.

After this you see a text window with the chapters in chapters.txt format where you can manually edit and save it to the audiobook file AND optionally upload it to https://chapter-db.org.

But this is a project for the future... where I have unlimited time :-)

Meanwhile you could try a hidden feature of m4b-tool ;) Please backup your files before doing this...!

m4b-tool chapters --epub="book.epub" "audiobook.m4a" --epub-dump

markfaine commented 1 year ago

Sounds perfect, just about the only viable solution I think. I will definitely give that hidden feature a try. Interested to see how close it gets. :)