DrewThomasson / ebook2audiobookXTTS

Generates an audiobook with chapters and ebook metadata using Calibre and Xtts from Coqui tts, and with optional voice cloning, and supports multiple languages
MIT License
574 stars 60 forks

permissions issues with docker #22

Open ROBERT-MCDOWELL opened 19 hours ago

ROBERT-MCDOWELL commented 19 hours ago

Traceback (most recent call last):
  File "/usr/bin/ebook-meta", line 21, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/calibre/calibre/ebooks/metadata/cli.py", line 220, in main
    with open(opts.get_cover, 'wb') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/home/user/app/input_folder/demo_mini_story_chapters_Drew.jpg'
Error extracting eBook metadata or cover: Command '['ebook-meta', '/home/user/app/input_folder/demo_mini_story_chapters_Drew.epub', '--get-cover', '/home/user/app/input_folder/demo_mini_story_chapters_Drew.jpg']' returned non-zero exit status 1.
Combined audio saved to /tmp/combined.wav
....

DrewThomasson commented 18 hours ago

Can I see the full terminal log?

Or at least the command you used to run it?

DrewThomasson commented 18 hours ago

Update your docker image, as I had to fix a bug from a previous push I did.

To update your docker image:

docker pull athomasson2/ebook2audiobookxtts:huggingface

DrewThomasson commented 18 hours ago

I think it might be your Fedora distro being picky with permissions lol

Run this command in the dir where your Audiobooks and input_folder folders are located

sudo chmod -R 777 "$(pwd)/input_folder" "$(pwd)/Audiobooks"

This will give full read, write, and execute permissions to all users for the input_folder and Audiobooks directories. After that, rerun the Docker command.
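For anyone scripting this step instead of typing it, here is a minimal Python sketch of what a recursive chmod to mode 777 does. The function name `make_world_accessible` is made up for illustration and is not part of the project. Like the command above, it opens the folders to every user on the machine, so it is only a rough equivalent and not a recommendation.

```python
import os
import stat


def make_world_accessible(root: str) -> None:
    """Recursively set mode 0o777 (rwx for owner, group, others) under root.

    Hypothetical helper, roughly what `chmod -R 777 root` does.
    """
    os.chmod(root, 0o777)
    for dirpath, dirnames, filenames in os.walk(root):
        # Directories are changed before os.walk descends into them.
        for name in dirnames + filenames:
            os.chmod(os.path.join(dirpath, name), 0o777)


def permission_bits(path: str) -> int:
    """Return only the permission bits of path, e.g. 0o777."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling `make_world_accessible` on each of the two mounted folders and then checking `permission_bits` on a file inside them should report `0o777` on a typical Linux filesystem.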

ROBERT-MCDOWELL commented 13 hours ago

I first did it on Audiobooks only; after doing it on input_folder too, it now works well! lol :o) Just some ffmpeg warnings:

[aac @ 0x55c43272adc0] Too many bits 8192.000000 > 6144 per frame requested, clamping to max
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c4328b2580] deprecated pixel format used, make sure you did set range correctly
[swscaler @ 0x55c4328a3c40] [swscaler @ 0x55c4328d1240] deprecated pixel format used, make sure you did set range correctly
[... the same swscaler warning repeats dozens of times, differing only in the addresses ...]
[swscaler @ 0x55c4328b2580] [swscaler @ 0x55c43299c200] deprecated pixel format used, make sure you did set range correctly
Output #0, ipod, to './Audiobooks/demo_mini_story_chapters_Drew.m4b':

DrewThomasson commented 12 hours ago

I don't think those warnings mean anything.

They seem to be video-related warnings, but we're only passing audio into the m4b file anyway.

I might be able to stop those warnings by adding

-vn

To the FFmpeg command in the python script

That explicitly tells FFmpeg to leave video out of the output (audio only), which should stop the video-related warnings.
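A minimal sketch of where `-vn` could sit in such a command, assuming the script builds the FFmpeg call as an argument list. The function name, file names, and the `-c:a aac` codec choice are illustrative assumptions; the project's real command is not shown in this thread.

```python
from typing import List


def build_m4b_command(input_audio: str, output_m4b: str) -> List[str]:
    """Build an audio-only FFmpeg command (hypothetical, not the project's own)."""
    return [
        "ffmpeg",
        "-i", input_audio,  # input file
        "-vn",              # output option: do not write any video stream
        "-c:a", "aac",      # assumed audio codec for the m4b container
        output_m4b,
    ]
```

One thing to keep in mind: FFmpeg handles embedded cover art as a video stream, so `-vn` would presumably drop the cover image from the m4b as well.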

ROBERT-MCDOWELL commented 12 hours ago

yup you should lol! :D

DrewThomasson commented 12 hours ago

I'll add it when I find time 🤔

In the meantime I'll keep this issue open to remind me of it lol

ROBERT-MCDOWELL commented 12 hours ago

OK great! Anyhow, I'm going to try it outside of Docker now, not a fan of it ;) In your README, it may be more accurate to say that Python >= 3.9, < 3.12 must be installed, since Coqui does not install (yet) on Python >= 3.12. Are you using Coqui TTSv2?

DrewThomasson commented 12 hours ago

Oh yeah the readme should say Python 3.10 cause that's the env I use it in

DrewThomasson commented 12 hours ago

And yeah I think it's using XttsV2

DrewThomasson commented 12 hours ago

Just updated readme replacing python 3.* with 3.10
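The version range suggested earlier in the thread (Python >= 3.9, < 3.12) could also be checked at startup. This is only a sketch: the bounds come from the comment above, not from Coqui's own metadata, and the names are made up for illustration.

```python
import sys
from typing import Sequence

# Bounds as suggested in this thread (an assumption, not an official requirement).
MIN_PYTHON = (3, 9)
MAX_PYTHON_EXCLUSIVE = (3, 12)


def is_supported_python(version: Sequence[int] = sys.version_info) -> bool:
    """Return True if the major.minor version lies in [3.9, 3.12)."""
    major_minor = tuple(version[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE
```

A script could call `is_supported_python()` first and exit with a clear message instead of failing later during installation or import.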

ROBERT-MCDOWELL commented 12 hours ago

Ah, also I just realized that all option path must be absolute, right? It would be cool to specify that in the help and/or README.

DrewThomasson commented 12 hours ago

What, the -h or -help command, or just the extra parameters?

lol it's mentioned like 6 times in the readme and also its output

DrewThomasson commented 12 hours ago

Idk what you mean by the all option path 🤨

ROBERT-MCDOWELL commented 12 hours ago

Argh. OK. So it's weird: adding --voice /path/to/the/wav/file fails with "no such file or directory". Does it strictly have to be in input_folder?

ROBERT-MCDOWELL commented 12 hours ago

all command options that require a path.

DrewThomasson commented 12 hours ago

If it's the docker,

the default settings provided make it so that the only folders the Docker container can see on your computer are Audiobooks and input_folder
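That is why a `--voice` path outside those folders is not found: inside the container the file simply does not exist. Here is a small sketch of the mapping, assuming the input folder is mounted at `/home/user/app/input_folder` (the path that appears in the traceback at the top of this issue). The host paths and the function name are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def to_container_path(host_path: str, mounts: Dict[str, str]) -> Optional[str]:
    """Translate a host path to its path inside the container.

    `mounts` maps host directories to container directories.
    Returns None when the file is not under any mounted directory,
    i.e. the container cannot see it.
    """
    path = PurePosixPath(host_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(host_dir)
        except ValueError:
            continue  # not under this mount, try the next one
        return str(PurePosixPath(container_dir) / relative)
    return None
```

With a mount such as `{"/srv/books/input_folder": "/home/user/app/input_folder"}`, a voice file copied into the host's input_folder would be passed as `/home/user/app/input_folder/<file>`, while a path elsewhere on the host maps to `None`.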

ROBERT-MCDOWELL commented 12 hours ago

argh, that's why I don't like docker :). but it's ok for dev tests.

DrewThomasson commented 12 hours ago

You know tho if you managed to figure out how to give docker network access then you could just run the gui as a public link

Which you'd be able to access on your phone

Ngl, Fedora is the only Linux OS I've ever run into that's annoying with docker lol

It's so security and privacy oriented that it seems to get in the way

Windows, Mac, arm Mac, Ubuntu, arch Linux etc never have these issues and run the public gradio server just fine lol

DrewThomasson commented 12 hours ago

That sounded mean but like XD

It's true I only run into issues with fedora XD

ROBERT-MCDOWELL commented 12 hours ago

Well, my goal is to run this on my server in headless mode and convert hundreds of millions of ebooks, to help visually impaired and blind people enjoy listening to a good book. I have another script I developed to translate as well as possible... so everything must run silently in the background. Do you know where to set "attention_mask"? I get: The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results

DrewThomasson commented 12 hours ago

Oh yeah I have no idea lol, it doesn't seem to cause any issues tho so I've been ignoring that lol.

ROBERT-MCDOWELL commented 12 hours ago

Yeah, Fedora has been the pre-release of new Red Hat features for 12 years now... so it's kind of obvious that security is a must. But I swear I've removed many, many security restrictions to be freer, without any security loss.

DrewThomasson commented 11 hours ago

Oh yeah well best get it locally installed and working then lol,

Super cool what you're doing with it tho!

ROBERT-MCDOWELL commented 11 hours ago

here is some clarification of attention_mask https://discuss.huggingface.co/t/clarification-on-the-attention-mask/1538
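To illustrate what the warning is about, here is a toy sketch in plain Python (no Hugging Face or Coqui code involved, and it does not show where XTTS would need the mask). When the pad token and the EOS token share the same id, a mask guessed from the padded ids also hides the real EOS token, which is why the library asks for an explicit mask. All names and token ids below are made up.

```python
from typing import List, Sequence, Tuple


def pad_with_mask(
    sequences: Sequence[Sequence[int]], pad_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad sequences and build the mask from the true lengths.

    Mask value 1 marks a real token, 0 marks padding.
    """
    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [pad_id] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask


def infer_mask(padded: Sequence[Sequence[int]], pad_id: int) -> List[List[int]]:
    """Guess the mask from token ids alone; unreliable if pad_id == eos_id."""
    return [[int(tok != pad_id) for tok in row] for row in padded]
```

For two sequences ending in an EOS id of 0, padded with the same id 0, `pad_with_mask` keeps each EOS token visible, whereas `infer_mask` marks it as padding.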

ROBERT-MCDOWELL commented 11 hours ago

Audiobooks are nice when travelling too... you can close your eyes, watch the landscape, or drive while listening to something that nourishes your mind, rather than stupid radio :)

DrewThomasson commented 11 hours ago

If you look at the huggingface files, it might help to look at its requirements.txt and packages.txt files lol

Oh and miniconda for easy python 3.10 installation lol

https://huggingface.co/spaces/drewThomasson/ebook2audiobookXTTS

ROBERT-MCDOWELL commented 11 hours ago

Yeah, I use conda or python -m venv; it's fair enough.

ROBERT-MCDOWELL commented 11 hours ago

Do you know how to show the progress of the total remaining segments? For now we can only see the chapter(?) segments in progress.

DrewThomasson commented 11 hours ago

I'll see about adding another TQDM loading bar for that lol
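A rough sketch of what a second bar could look like, with one overall tqdm bar across all segments and one bar per chapter. How the project actually splits chapters into segments is not shown in this thread, so `chapters` and `synthesize` are placeholders. tqdm is a third-party package, and the sketch falls back to running without bars if it is not installed.

```python
from typing import Callable, Sequence

try:
    from tqdm import tqdm  # third-party: pip install tqdm
except ImportError:
    tqdm = None  # the sketch still runs, just without progress bars


def process_book(
    chapters: Sequence[Sequence[str]], synthesize: Callable[[str], None]
) -> int:
    """Run synthesize() on every segment and return how many were processed."""
    total = sum(len(segments) for segments in chapters)
    overall = None
    if tqdm is not None:
        overall = tqdm(total=total, desc="All segments", position=0)
    done = 0
    for index, segments in enumerate(chapters, start=1):
        chapter_iter = segments
        if tqdm is not None:
            # Per-chapter bar on the second line, cleared when finished.
            chapter_iter = tqdm(
                segments, desc=f"Chapter {index}", position=1, leave=False
            )
        for segment in chapter_iter:
            synthesize(segment)
            done += 1
            if overall is not None:
                overall.update(1)
    if overall is not None:
        overall.close()
    return done
```

The overall bar needs the total segment count up front, so the text would have to be split into segments for all chapters before synthesis starts.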

ROBERT-MCDOWELL commented 11 hours ago

good idea! :D

ROBERT-MCDOWELL commented 10 hours ago

Also add -y to the ffmpeg command to overwrite any existing output file with the same name.
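For a headless run, the overwrite behaviour can be made explicit rather than left to a prompt. This is a sketch assuming the script builds its FFmpeg call as a list; the function name is hypothetical. `-y` overwrites without asking, `-n` refuses to overwrite and exits, and `-nostdin` disables interactive input.

```python
from typing import List


def ffmpeg_prefix(overwrite: bool) -> List[str]:
    """Return the start of a non-interactive FFmpeg command."""
    return [
        "ffmpeg",
        "-nostdin",                   # never wait for keyboard input
        "-y" if overwrite else "-n",  # overwrite, or exit if the output exists
    ]
```

The rest of the arguments (inputs, codec options, output path) would be appended to this list before handing it to `subprocess.run`.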

DrewThomasson commented 9 hours ago

Actually if you just run it like this with the yes in the front it should auto say yes to everything

yes | python app.py

Doing that should make it say y to every terminal prompt given to the user