Open batchfileframework opened 7 months ago
The creator of #29 here.
In my view it is entirely doable for developers / contributors to make a webui-user.bat or start.bat like the classic A1111 Stable Diffusion, especially since most dependencies can be downloaded easily on Windows without compatibility issues with the OS, Python, or anything else (except espeak-ng, which, as I mentioned in #29, can't be downloaded through the command prompt).
It should be very doable; we just need to replace espeak with another timestamp aligner.
I made an API version that is compatible with Windows (currently only for TTS, not speech modification). See https://github.com/lukaszliniewicz/VoiceCraft_API. If you test it, please let me know if everything works. It is not exactly what you're looking for, and it uses conda (I think it's a very good method, but everyone has their preference). Still, you can use the modified audiocraft files and the USER and espeak solution from api.py, and run inference with Python in a venv. It comes with espeak. I will make an automatic installer for it or at least include it with my audiobook app (https://github.com/lukaszliniewicz/Pandrator).
@yumlevi
Espeak is not a problem. You can install it using the official Windows installer, copy the contents of its folder in Program Files into a new espeak directory in the main directory of the repo, and then do this (or use my fork, which already has the espeak folder and the files):
import getpass
import os

# Get the current username
username = getpass.getuser()
# Set the USER environment variable to the username (Windows does not define it by default)
os.environ['USER'] = username
# Point phonemizer at the bundled espeak-ng library
os.environ['PHONEMIZER_ESPEAK_LIBRARY'] = './espeak/libespeak-ng.dll'
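A slightly more defensive variant of the snippet above, as a rough sketch only: it resolves the DLL to an absolute path (so it still works if the script is started from another directory) and fails early if the espeak folder was not copied. The function name `configure_espeak` and the `repo_root` parameter are made up for illustration; the folder layout is the one described above.

```python
import getpass
import os
from pathlib import Path


def configure_espeak(repo_root="."):
    """Point phonemizer at a bundled espeak-ng DLL and make sure USER is set.

    Hypothetical helper: assumes <repo_root>/espeak/libespeak-ng.dll exists,
    i.e. the espeak folder was copied into the repo as described above.
    """
    dll = Path(repo_root).resolve() / "espeak" / "libespeak-ng.dll"
    if not dll.is_file():
        raise FileNotFoundError(f"espeak-ng library not found at {dll}")
    # Windows normally defines USERNAME but not USER, so fill it in if missing.
    if "USER" not in os.environ:
        os.environ["USER"] = getpass.getuser()
    # Use an absolute path so the current working directory does not matter.
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = str(dll)
    return dll
```

Call `configure_espeak()` before anything imports or uses phonemizer, so the variables are already in place when the library is looked up.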
I added VoiceCraft to my audiobook/dubbing generator app: https://github.com/lukaszliniewicz/Pandrator. It has a one-click Windows installer and installs the API (https://github.com/lukaszliniewicz/VoiceCraft_API).
Vall-E-EX did a great job with a cross-platform Gradio frontend for TTS that just works. It has lots of cool features beyond basic TTS, such as recording audio from a microphone, pasting in transcripts, managing voice presets and so on. It might be a good inspiration, or even adaptable with attribution.
Hi,
This issue asks for a way to install on Windows, preferably (if possible at all) without using WSL / Docker / conda.
Just stock Python & pip, maybe a venv, maybe some PowerShell, but preferably a pure batch install.
In reference to previous attempts:
https://github.com/jasonppy/VoiceCraft/issues/28 https://github.com/jasonppy/VoiceCraft/issues/29
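For what it's worth, the "stock Python & pip, maybe a venv" route could look roughly like the sketch below, using only the standard library. This is not an installer from the repo: the function names are made up, and the `requirements.txt` file name is an assumption (the repo may list its dependencies elsewhere). A start.bat would then only need to call this script with the system Python.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(venv_dir):
    """Return the interpreter path inside a venv for the current platform."""
    sub = "Scripts/python.exe" if sys.platform == "win32" else "bin/python"
    return Path(venv_dir) / sub


def install_commands(venv_dir, requirements="requirements.txt"):
    """Build the pip commands to run inside the venv (file name is assumed)."""
    py = str(venv_python(venv_dir))
    return [
        [py, "-m", "pip", "install", "--upgrade", "pip"],
        [py, "-m", "pip", "install", "-r", requirements],
    ]


def bootstrap(venv_dir="venv", requirements="requirements.txt"):
    """Create the venv with pip available, then install the dependencies."""
    venv.EnvBuilder(with_pip=True).create(venv_dir)
    for cmd in install_commands(venv_dir, requirements):
        # check=True stops at the first failing step instead of continuing.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    bootstrap()
```

Running pip as `python -m pip` through the venv's own interpreter avoids having to activate the environment first, which keeps the batch side trivial.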