Open ROBERT-MCDOWELL opened 19 hours ago
Can I see the full terminal log?
Or at least the command you used to run it?
Update your docker image, as I had to fix a bug from a previous push I did.
docker pull athomasson2/ebook2audiobookxtts:huggingface
Run this command in the dir where your Audiobooks and input_folder folders are located:
sudo chmod -R 777 $(pwd)/input-folder $(pwd)/Audiobooks
This will give full read, write, and execute permissions to all users for the input-folder and Audiobooks directories. After that, rerun the Docker command.
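For anyone who wants to confirm the chmod took effect before rerunning, here is a rough Python sketch. The helper name world_accessible is made up for illustration; it only inspects the permission bits for "other" users, which is what chmod 777 changes.

```python
import os
import stat
import tempfile


def world_accessible(path):
    """Return True if 'other' users have read, write and execute bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o007) == 0o007


if __name__ == "__main__":
    # Demo on a throwaway directory instead of the real folders.
    with tempfile.TemporaryDirectory() as demo_dir:
        os.chmod(demo_dir, 0o700)  # owner-only, like a freshly created folder
        print("before chmod 777:", world_accessible(demo_dir))
        os.chmod(demo_dir, 0o777)  # same effect as chmod 777 on one directory
        print("after chmod 777:", world_accessible(demo_dir))
```

In practice you would call world_accessible("input-folder") and world_accessible("Audiobooks") from the directory that holds them.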
I did it first on Audiobooks only, then did it on input_folder, and now it works well! lol :o) Just some ffmpeg warnings:
[aac @ 0x55c43272adc0] Too many bits 8192.000000 > 6144 per frame requested, clamping to max
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c4328b2580] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c4328d1240] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c4328ee440] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c43290a540] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c432927740] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c432944940] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c432961b40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c43297ed40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c43299bf40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328bf540] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328d0600] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328ee840] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43290ba40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432927b40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432944d40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432961f40] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43297f140] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43299c340] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328bf540] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328cf5c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328f2700] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43290a8c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432928b00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432945d00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432961e00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43297f000] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43299c200] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328bf540] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328cf5c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c4328f2700] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43290a8c0] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432928b00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432945d00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c432961e00] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43297f000] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43299c200] deprecated pixel format used, make sure you did set range correctly
Output #0, ipod, to './Audiobooks/demo_mini_story_chapters_Drew.m4b':
They seem to be video-related warnings, but we're only passing audio into the m4b file anyway.
I could add -vn to the FFmpeg command in the python script, which will explicitly force FFmpeg to treat the output as audio-only (it disables video processing), and that should stop the video-related warnings.
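As a rough illustration of the -vn idea, here is a minimal sketch of how the flag could be slotted into an FFmpeg argument list. The function name build_m4b_command and the exact arguments are assumptions for illustration; the real command in the script is not shown in this thread.

```python
def build_m4b_command(input_wav, output_m4b, audio_only=True):
    """Build a hypothetical ffmpeg argument list for the wav -> m4b step."""
    cmd = ["ffmpeg", "-i", input_wav]
    if audio_only:
        # -vn is an output option, so it goes after the input and before the output file.
        cmd.append("-vn")
    # Assumed codec choice; the log above only tells us an aac encoder was active.
    cmd += ["-c:a", "aac", output_m4b]
    return cmd


if __name__ == "__main__":
    # Paths taken from the log lines in this thread.
    cmd = build_m4b_command(
        "/tmp/combined.wav",
        "./Audiobooks/demo_mini_story_chapters_Drew.m4b",
    )
    print(" ".join(cmd))
```

One thing to check before adopting it: if the script embeds the cover image as a video stream, -vn would presumably drop that stream too.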
yup you should lol! :D
I'll add it when I find time 🤔
In the meantime I'll keep this issue open to remind me of it lol
Ok great! Anyhow, I'm going to try it now outside of docker, not a fan of it ;) In your README, it may be more accurate to say python >= 3.9, < 3.12 must be installed, since coqui does not install (yet) on python >= 3.12. Are you using coqui TTSv2?
Oh yeah the readme should say Python 3.10 cause that's the env I use it in
And yeah I think it's using XttsV2
Just updated readme replacing python 3.* with 3.10
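If the script ever wanted to enforce the version range suggested earlier (python >= 3.9, < 3.12), a startup check could look roughly like this. The bounds come from that comment, not from testing, and the names are made up for illustration.

```python
import sys

# Bounds taken from the suggestion in this thread; treat them as assumptions.
MIN_VERSION = (3, 9)
MAX_EXCLUSIVE = (3, 12)


def python_supported(version=None):
    """Return True if the (major, minor) version falls inside the suggested range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) < MAX_EXCLUSIVE


if __name__ == "__main__":
    if python_supported():
        print("Python version looks fine for coqui.")
    else:
        print("Python %d.%d is outside the suggested 3.9 to 3.11 range." % sys.version_info[:2])
```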
Ah, also I just realized that all path options must be absolute, right? It would be cool to specify that in the help and/or README.
What, the -h or -help command to just list the extra parameters?
lol it's mentioned like 6 times in the readme and also its output
Idk what you mean by the all option path 🤨
Argh, ok. So it's weird: adding --voice /path/to/the/wav/file fails with "no such file or directory". Does it strictly have to be in input_folder?
all command options that require a path.
If it's the docker, the default settings provided make it so that the only folders the container can see on your computer are the Audiobooks and input_folder folders.
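To make the "docker only sees the mounted folders" point concrete, here is a small sketch that translates a host path into the path the container would see, or returns None if the file is outside every mount. The host directories and the Audiobooks container path are assumptions for illustration; only /home/user/app/input_folder appears in this thread.

```python
from pathlib import PurePosixPath
from typing import Optional


def to_container_path(host_path: str, mounts: dict) -> Optional[str]:
    """Map a host path to its in-container path, or None if it is not mounted."""
    path = PurePosixPath(host_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(PurePosixPath(host_dir))
        except ValueError:
            continue  # not under this mount, try the next one
        return str(PurePosixPath(container_dir) / relative)
    return None


if __name__ == "__main__":
    # Hypothetical mount table: host directory -> container directory.
    mounts = {
        "/home/me/work/input-folder": "/home/user/app/input_folder",
        "/home/me/work/Audiobooks": "/home/user/app/Audiobooks",
    }
    print(to_container_path("/home/me/work/input-folder/voice.wav", mounts))
    print(to_container_path("/home/me/voices/voice.wav", mounts))
```

So a --voice file would need to sit inside one of the mounted folders and be referenced by its container-side path.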
argh, that's why I don't like docker :). but it's ok for dev tests.
You know tho if you managed to figure out how to give docker network access then you could just run the gui as a public link
Which you'd be able to access on your phone
Ngl, Fedora is the only Linux OS I've ever run into that's annoying with docker lol
It's so security and privacy oriented that it seems to get in the way
Windows, Mac, arm Mac, Ubuntu, arch Linux etc never have these issues and run the public gradio server just fine lol
That sounded mean but like XD
It's true I only run into issues with fedora XD
Well, my goal is to run this on my server in headless mode and convert hundreds of millions of ebooks, to help sight-deficient and blind people enjoy listening to a good book. I have another script I developed to translate as well as possible.....
So everything must run silently in the background.
do you know where to set "attention_mask"? I get:
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Oh yeah I have no idea lol, it doesn't seem to cause any issues tho so I've been ignoring that lol.
Yeah, Fedora has been the pre-release for new Red Hat features for 12 years now... so it's kind of obvious that security is a must. But I swear I've removed many, many security restrictions to give myself more freedom without any security loss.
Oh yeah well best get it locally installed and working then lol,
Super cool what you're doing with it tho!
here is some clarification of attention_mask https://discuss.huggingface.co/t/clarification-on-the-attention-mask/1538
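For what the warning is about, here is a tiny pure-Python sketch of an attention mask: 1 for real tokens, 0 for padding. It assumes right padding and a made-up token id that serves as both pad and eos. Where such a mask would actually be passed inside the coqui/XTTS code is not something this thread establishes.

```python
def pad_with_mask(sequences, pad_id):
    """Right-pad token id lists to equal length and build the matching attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # The mask is built from the known lengths, not from the ids themselves.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


if __name__ == "__main__":
    EOS_AND_PAD = 2  # hypothetical id used for both eos and pad
    ids, mask = pad_with_mask([[5, 6, EOS_AND_PAD], [7, EOS_AND_PAD]], EOS_AND_PAD)
    print(ids)   # [[5, 6, 2], [7, 2, 2]]
    print(mask)  # [[1, 1, 1], [1, 1, 0]]
```

In the second row the two trailing 2s look identical, which is why the mask cannot be inferred from the ids when pad and eos share a token.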
Audiobooks are nice when travelling too... you can close your eyes, watch the landscape, or drive while listening to something that nourishes your mind, rather than stupid radio :)
If you look at the Hugging Face files, it might help to check its requirements.txt and packages.txt files lol
Oh and miniconda for easy python 3.10 installation lol
https://huggingface.co/spaces/drewThomasson/ebook2audiobookXTTS
yeah I use conda or python -m venv it's fair enough
Do you know how to show the progress of the total remaining segments? For now we can only see the chapter(?) segments in progress.
I'll see about adding another TQDM loading bar for that lol
good idea! :D
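A minimal sketch of the overall-progress idea, assuming the text is already split into chapters that each hold a list of segments (the data layout and the name overall_progress are assumptions, not the script's actual structure). It counts the total up front so a second bar, for example tqdm(total=total) with update(1) per segment, would have something to show.

```python
def overall_progress(chapters):
    """Yield (done, total, chapter_number, segment) across all chapters."""
    total = sum(len(chapter) for chapter in chapters)
    done = 0
    for chapter_number, chapter in enumerate(chapters, start=1):
        for segment in chapter:
            done += 1
            yield done, total, chapter_number, segment


if __name__ == "__main__":
    # Hypothetical book: two chapters with two and one segments.
    book = [["Once upon a time.", "There was a cat."], ["The end."]]
    for done, total, chapter_number, segment in overall_progress(book):
        print(f"[{done}/{total}] chapter {chapter_number}: {segment}")
```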
Also add -y to the ffmpeg command to overwrite any existing output file with the same name.
Actually if you just run it like this with the yes in the front it should auto say yes to everything
yes | python app.py
Doing that should make it say y to every terminal prompt given to the user
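For the -y suggestion, a small sketch of adding the flag inside the script rather than piping yes into it. The helper name with_overwrite is made up, and the argument list is only an example of what the script's ffmpeg call might look like.

```python
def with_overwrite(cmd):
    """Return a copy of an ffmpeg argument list with the global -y flag added once."""
    if "-y" in cmd:
        return list(cmd)
    # -y is a global option, so placing it right after the program name is safe.
    return [cmd[0], "-y", *cmd[1:]]


if __name__ == "__main__":
    cmd = ["ffmpeg", "-i", "/tmp/combined.wav", "./Audiobooks/demo_mini_story_chapters_Drew.m4b"]
    print(" ".join(with_overwrite(cmd)))
```

Unlike yes | python app.py, this only affects ffmpeg's overwrite prompt and leaves any other prompts in the script alone.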
Traceback (most recent call last):
  File "/usr/bin/ebook-meta", line 21, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/calibre/calibre/ebooks/metadata/cli.py", line 220, in main
    with open(opts.get_cover, 'wb') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/home/user/app/input_folder/demo_mini_story_chapters_Drew.jpg'
Error extracting eBook metadata or cover: Command '['ebook-meta', '/home/user/app/input_folder/demo_mini_story_chapters_Drew.epub', '--get-cover', '/home/user/app/input_folder/demo_mini_story_chapters_Drew.jpg']' returned non-zero exit status 1.
Combined audio saved to /tmp/combined.wav
....