Hay Say is a user interface for generating pony voices. From a single UI, you can generate voices or perform voice conversion from AI architectures such as so-vits-svc (https://github.com/svc-develop-team/so-vits-svc), Controllable TalkNet (https://github.com/SortAnon/ControllableTalkNet) and RVC (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI). At the moment, all supported AI architectures can run locally and do not require an internet connection except to download character models.
Over the past few years, many AI architectures have emerged for accomplishing text-to-speech generation and voice conversion ("speech-to-speech"). Many of these new technologies did not have a graphical user interface when they first became available. Each time a new technology has entered into the awareness of the Pony Preservation Project thread at /mlp/ on 4chan, someone has had to develop a new UI to make the technology accessible to others. This has led to a collection of user interfaces over time, each one unique to a particular AI architecture and with features that are not available in the other UIs. The primary motivation behind Hay Say is to create a unified user experience for all the voice generation solutions. Preprocessing and postprocessing options are separated from architecture-specific options so that every architecture can benefit from new pre- or post-processing features. With a UI framework already in place, it should be possible to give new architectures a usable UI more quickly by integrating them with Hay Say.
A secondary motivation behind Hay Say is to reduce user frustration with installing software. Falling into Python's "dependency hell" seems to be a common pain point when users try to install voice generation software locally. Hay Say addresses this issue by pre-installing all the AI solutions into their own Docker images. Users simply need to install Docker and pull the pre-built images.
Important: Be aware that Hay Say will need to download at least 49 GB of compressed Docker images, so expect it to take a while to start up the first time you run it. It should come up quickly after that. The exact amount of time required for the initial startup depends, of course, on your internet speed.
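To get a rough feel for how long that first download might take, here is a minimal sketch of the arithmetic. It assumes 1 GB = 10^9 bytes and a steady connection speed; the function name and the example speeds are made up for illustration, and real downloads will be slower because of protocol overhead and image extraction.

```python
def estimated_download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Return the approximate download time in minutes.

    size_gb    -- download size in gigabytes (assuming 1 GB = 10**9 bytes)
    speed_mbps -- sustained download speed in megabits per second
    """
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    size_megabits = size_gb * 8 * 1000  # gigabytes -> megabits
    return size_megabits / speed_mbps / 60  # seconds -> minutes


# 49 GB is the figure quoted above; the speeds are example values only.
for speed in (25, 100, 500):
    minutes = estimated_download_minutes(49, speed)
    print(f"{speed:>4} Mbps: about {minutes:.0f} minutes")
```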
Before you install Hay Say, I recommend you take a look at the Testing Data / Benchmarks section at the bottom of this page to get a ballpark figure on how fast or slow Hay Say might be on your machine.
I also tested Hay Say on MacOS 13.3.1 (Ventura) with Apple Silicon M2. I was unable to generate audio with Controllable TalkNet, and so-vits-svc (3.0 and 4.0) was unusably slow even though the machine was brand new (see Testing Data / Benchmarks section). I do not recommend running Hay Say on Apple Silicon. It is unknown whether it works any better on Macs with Intel chips.
wsl --update
, and then start Docker Desktop again.
curl.exe --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker volume create models
docker volume create audio_cache
docker compose up
Note: You might get a Windows Defender Firewall popup. You can safely close that window. No special firewall rules are required to run Hay Say.
Hay Say will take time to download the large Docker images (~49 GB). Once it is done, you should see output like the following:
Open a web browser and go to the following URL:
http://127.0.0.1:6573/
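If the page does not load, it can help to check whether anything is answering on that port at all. The following is a small sketch using only the Python standard library; the function name is hypothetical and it is not part of Hay Say. It only tells you that some HTTP server responded, not that Hay Say has finished starting up.

```python
import urllib.error
import urllib.request


def ui_is_reachable(url: str = "http://127.0.0.1:6573/", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # Bypass any system proxy so that the request really goes to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False
```

Calling ui_is_reachable() with no arguments probes the default address shown above.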
To stop Hay Say, go to the Command Prompt that you ran "docker compose up" in earlier and type CTRL+C. It will take 10-30 seconds for Docker to gracefully shut down all of the containers. I also recommend shutting down Docker Desktop. Right click on the Whale icon in the taskbar and select "Quit Docker Desktop".
To start Hay Say again, first make sure that the Docker Engine is running. You can check for Docker Engine in the taskbar.
If it is not running, you can start it by launching Docker Desktop.
Then, open Command Prompt and type the following command:
docker compose up
Note: When you run that command, you must be in the folder where docker-compose.yaml is located (you downloaded it earlier when you executed the curl command in step 3 above). If you get an error stating "no configuration file provided: not found", then cd to that directory first.
Install Docker Engine. This can be done in several ways according to preference. See https://docs.docker.com/engine/install/ubuntu. Note: By default, Docker needs to be run with superuser privileges. However, it is possible to install it in a way that lets you run it in rootless mode if you prefer (see https://docs.docker.com/engine/security/rootless/).
Open a terminal and execute the following commands:
wget https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
sudo docker volume create models
sudo docker volume create audio_cache
sudo docker compose up
Hay Say will take time to download the large Docker images (~49 GB). Once it is done, you should see output like the following:
Open a web browser and go to the following URL:
http://127.0.0.1:6573/
To stop Hay Say, go to the terminal where you ran "docker compose up" earlier and type CTRL+C. It will take 10-30 seconds for Docker to gracefully shut down all of the containers.
To start Hay Say again, open a terminal and type the following command:
sudo docker compose up
Note: When you run that command, you must be in the folder where docker-compose.yaml is located (you downloaded it earlier when you executed the wget command in step 2 above). If you get an error stating "no configuration file provided: not found", then cd to that directory first.
Important! Hay Say did not run well on Apple Silicon during my testing. See "A Note on MacOS" in the Recommended Hardware and Software section, above. But here are the steps in case you want to give it a try anyway.
curl --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker volume create models
docker volume create audio_cache
docker compose up
Hay Say will take time to download the large Docker images (~49 GB). Once it is done, you should see output like the following:
Open a web browser and go to the following URL:
http://127.0.0.1:6573/
To stop Hay Say, go to the Terminal where you ran "docker compose up" earlier and type CTRL+C. It will take 10-30 seconds for Docker to gracefully shut down all of the containers. I also recommend shutting down Docker Desktop. Right click on the Whale icon in the taskbar and select "Quit Docker Desktop".
To start Hay Say again, first make sure that the Docker Engine is running. You can check for Docker Engine in the taskbar.
If it is not running, you can start it by launching Docker Desktop.
Then open a terminal and type the following command:
docker compose up
Note: When you run that command, you must be in the folder where docker-compose.yaml is located (you downloaded it earlier when you executed the curl command in step 3 above). If you get an error stating "no configuration file provided: not found", then cd to that directory first.
Note: If you last installed/updated Hay Say before Aug 19, 2023, please see Special Instructions for the Aug 19, 2023 update.
If you last installed/updated Hay Say between Aug 19, 2023 and Dec 16, 2023, please see Special Instructions for the Dec 16, 2023 update.
To update Hay Say, first download the latest docker-compose.yaml file by executing the following command. Please note that this will overwrite your existing docker-compose file. If you made any edits to your old docker-compose file (e.g. uncommenting lines to enable Hay Say to use your Cuda GPU or commenting out lines to make it download only specific architectures), you will need to make the same edits to the new file:
Linux:
wget https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
Windows:
curl.exe --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
MacOS:
curl --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
Next, execute the following commands to make sure that your containers are stopped, to pull the latest images, and to start Hay Say again:
Linux:
sudo docker compose stop
sudo docker compose pull
sudo docker compose up
Windows and MacOS:
docker compose stop
docker compose pull
docker compose up
When you pull an updated Docker image, the old image is not automatically deleted. The old image, which is no longer used, is referred to as a "dangling" image. You can remove dangling images to save space by executing the following command:
Linux:
sudo docker image prune
Windows and MacOS:
docker image prune
Windows users must also follow the instructions in Additional Required Steps for Windows Users afterwards to finish freeing disk space.
Following the update on Aug 19, 2023, the "model pack" images have become obsolete and can be removed. Furthermore, every other image has been updated, so you can save some space by deleting all of your existing Hay Say Docker images first and then re-downloading them (doing so will prevent "dangling" images from taking up excessive space during the update). Lastly, Hay Say now expects the presence of a "models" docker volume, so you must create it. If you installed Hay Say before Aug 19, 2023, please execute these commands instead:
Linux:
sudo docker compose down --rmi all
wget https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
sudo docker volume create models
sudo docker compose up
MacOS:
docker compose down --rmi all
curl --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker volume create models
docker compose up
Windows:
docker compose down --rmi all
Now follow the steps in Additional Required Steps for Windows Users to clear disk space. After that, open Docker Desktop again. Once it has loaded, execute the following commands:
curl.exe --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker volume create models
docker compose up
Every Docker image in the Hay Say project was updated in the Dec 16 update, so you can save some space by deleting all of your existing Hay Say Docker images first and then re-downloading them (doing so will prevent "dangling" images from taking up excessive space during the update). If you last installed or updated Hay Say between Aug 19 and Dec 16, 2023, please execute these commands instead:
Linux:
sudo docker compose down --rmi all
wget https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
sudo docker compose up
MacOS:
docker compose down --rmi all
curl --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker compose up
Windows:
docker compose down --rmi all
Now follow the steps in Additional Required Steps for Windows Users to clear disk space. After that, open Docker Desktop again. Once it has loaded, execute the following commands:
curl.exe --output docker-compose.yaml https://raw.githubusercontent.com/hydrusbeta/hay_say_ui/main/docker-compose.yaml
docker compose up
Hay Say used to download character models via Docker by downloading special, data-only images called "model packs". Model packs proved to be inefficient with disk space usage, so Hay Say was updated to allow users to download individual characters directly from Mega, Google Drive, and Huggingface Hub instead. The existing model packs should still work, however, and are available as a fallback in case there is an issue with downloading models individually. Please note that model packs will be deprecated in the future.
You can configure Hay Say to download a model pack by "uncommenting" the relevant lines in the docker-compose.yaml file. For example, to download the singing models for so-vits-svc 3.0, uncomment (remove the hashtag at the start of) the following lines:
#so_vits_svc_3_model_pack_1:
#  image: hydrusbeta/hay_say:so_vits_svc_3_model_pack_1
#  volumes:
#    - so_vits_svc_3_model_pack_1:/home/luna/hay_say/so_vits_svc_3_model_pack_1
So that they look like this instead:
so_vits_svc_3_model_pack_1:
  image: hydrusbeta/hay_say:so_vits_svc_3_model_pack_1
  volumes:
    - so_vits_svc_3_model_pack_1:/home/luna/hay_say/so_vits_svc_3_model_pack_1
Be sure to save the file, then restart Hay Say (type ctrl+c in Hay Say's terminal if it is running and then execute "docker compose up" again).
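If you would rather script the edit than do it by hand, here is a rough sketch of the idea in Python. The function name is hypothetical and the nesting rule is my own assumption: a commented line is treated as part of the block while the text after its hashtag is indented more deeply than the key line. Only the first hashtag on each line is removed, so check the indentation of the result before starting Hay Say.

```python
import re


def uncomment_block(compose_text: str, key: str) -> str:
    """Uncomment the commented-out line `key:` and the lines nested beneath it."""
    header = re.compile(r"^(\s*)#(\s*)" + re.escape(key) + r":\s*$")
    nested = re.compile(r"^(\s*)#(\s*)\S")
    result = []
    key_depth = None  # indentation of the key line while inside its block
    for line in compose_text.splitlines():
        if key_depth is None:
            match = header.match(line)
            if match:
                key_depth = len(match.group(1)) + len(match.group(2))
                line = line.replace("#", "", 1)
        else:
            match = nested.match(line)
            if match and len(match.group(1)) + len(match.group(2)) > key_depth:
                line = line.replace("#", "", 1)
            else:
                key_depth = None  # block ended; leave the rest untouched
        result.append(line)
    trailing_newline = "\n" if compose_text.endswith("\n") else ""
    return "\n".join(result) + trailing_newline
```

For example, uncomment_block(text, "so_vits_svc_3_model_pack_1") would enable the model pack discussed above, where text is the content of docker-compose.yaml. If the same key is commented out in several places, every occurrence is uncommented.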
Here is a table showing which characters are included in each model pack:
Model Pack Name | Characters |
---|---|
controllable_talknet_model_pack_0 | Apple Bloom, Applejack, Applejack (singing), Big McIntosh, Cadance, Celestia, Chrysalis, Cozy Glow, Discord, Fluttershy, Fluttershy (singing), Granny Smith, hifire, hifis, Luna, Maud Pie, Mayor Mare, Pinkie Pie, Pinkie Pie (singing), Rainbow Dash, Rainbow Dash (singing), Rarity, Rarity (singing), Scootaloo, Shining Armor, Spike, Starlight Glimmer, Sunset Shimmer, Sweetie Belle, Tirek, Trixie Lulamoon, Trixie Lulamoon (singing), Twilight Sparkle, Twilight Sparkle (singing), Twilight Sparkle (whispering), Zecora |
so_vits_svc_3_model_pack_0 | Apple Bloom, Applejack, Bon Bon, Discord, Fluttershy, Pinkie Pie, Rainbow Dash, Rarity, Scootaloo, Sweetie Belle, Trixie Lulamoon, Twilight Sparkle |
so_vits_svc_3_model_pack_1 | Applejack (singing), Cadance (singing), Celestia (singing), Luna (singing), Rarity (singing), Starlight Glimmer (singing), Twilight Sparkle (singing) |
so_vits_svc_4_model_pack_0 | Apple Bloom, Applejack, Celestia, Chrysalis, Derpy Hooves, Discord, Fluttershy, Pinkie Pie, Rainbow Dash, Rarity, Saffron Masala, Shining Armor, Tree Hugger, Trixie Lulamoon, Trixie Lulamoon (singing), Twilight Sparkle |
so_vits_svc_4_model_pack_1 | Apple Bloom (singing), Apple Bloom (singing, PS1), Applejack (singing), Applejack (singing, PS1), Cadance (singing), Cadance (singing, PS1), Celestia (singing), Celestia (singing, alt), Celestia (singing, PS1), Fluttershy (singing), Fluttershy (singing, PS1), Luna (singing), Luna (singing, PS1), Pinkie Pie (singing), Pinkie Pie (singing, PS1), Rainbow Dash (singing), Rainbow Dash (singing, alt), Rainbow Dash (singing, PS1), Rarity (singing), Rarity (singing, PS1), Scootaloo (singing), Scootaloo (singing, alt), Scootaloo (singing, PS1), Starlight Glimmer (singing, evil), Starlight Glimmer (singing, good), Sweetie Belle (singing), Sweetie Belle (singing, PS1), Twilight Sparkle (singing), Twilight Sparkle (singing, PS1) |
so_vits_svc_4_model_pack_2 | Pinkie Pie (angry), Pinkie Pie (annoyed), Pinkie Pie (anxious), Pinkie Pie (fearful), Pinkie Pie (happy), Pinkie Pie (neutral), Pinkie Pie (nonverbal), Pinkie Pie (sad), Pinkie Pie (sad shouting), Pinkie Pie (shouting), Pinkie Pie (surprised), Pinkie Pie (tired), Pinkie Pie (whispering) |
so_vits_svc_5_model_pack_0 | Applejack (singing, mane6), Fluttershy (singing, mane6), Pinkie Pie (singing), Pinkie Pie (singing, mane6), Rainbow Dash (singing, mane6), Rarity (singing, mane6), Twilight Sparkle (singing, mane6) |
rvc_model_pack_0 | Babs Seed, Big McIntosh, Braeburn, Bunni Bunni, Cozy Glow, Cream Heart, Derpy Hooves, Diamond Tiara, Doctor Whooves, Gallus, Octavia Melody, Thorax, Twilight Sparkle (singing), Vinyl Scratch |
rvc_model_pack_1 | Applejack, Applejack (alt), Fluttershy, Fluttershy (alt), Pinkie Pie, Pinkie Pie (alt), Rainbow Dash (alt), Rarity (alt), Twilight Sparkle (alt) |
If you have acquired or trained a model that is not included with Hay Say, you can add it to Hay Say by copying it to the relevant characters folder inside the main docker container:
/home/luna/hay_say/models/[architecture]/characters/
where [architecture] is one of: controllable_talknet, rvc, so_vits_svc_3, so_vits_svc_4, or so_vits_svc_5
docker container ls
Note: You may need to expand the command window to properly display the output, which is arranged like a wide table. You should see a column titled "IMAGE" in the output. Look for the entry "hydrusbeta/hay_say:hay_say_ui" and find the corresponding container name under the "NAMES" column. You will need that name in a moment. The name you see might differ a little from the one used in the examples below. For example, another name I have seen on someone else's machine was "hay_say_ui-hay_say_ui-1".
{
"speaker": "<name of speaker>"
}
where <name of speaker> should match one of the strings under "spk" in the config.json file.
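As a sanity check, here is a small sketch that builds that JSON text and refuses a speaker name that is not listed under "spk". It assumes config.json has already been parsed into a dictionary and that "spk" is either a mapping of speaker names or a plain list of names; the function name is made up for this example.

```python
import json


def speaker_json(config: dict, speaker: str) -> str:
    """Return the JSON text for `speaker`, checking it against config["spk"]."""
    # list() yields the keys of a mapping or the items of a list.
    known_speakers = list(config.get("spk", {}))
    if speaker not in known_speakers:
        raise ValueError(
            f"{speaker!r} is not listed under 'spk'; expected one of {known_speakers}"
        )
    return json.dumps({"speaker": speaker}, indent=4)
```

With a real model you would obtain the dictionary by calling json.load on the model's config.json file.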
Copy the model folder into the container with the docker cp command. For example, if you have a folder named "Rainbowshine_Custom" on your desktop containing a so-vits-svc 4.0 model, you can copy it by executing the following on Linux or MacOS:
docker cp ~/Desktop/Rainbowshine_Custom/. hydrusbeta-hay_say_ui-1:/home/luna/hay_say/models/so_vits_svc_4/characters/RainbowShine_Custom
or the following command on Windows:
docker cp %HOMEDRIVE%%HOMEPATH%/Desktop/Rainbowshine_Custom/. hydrusbeta-hay_say_ui-1:/home/luna/hay_say/models/so_vits_svc_4/characters/RainbowShine_Custom
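The same command can be assembled programmatically, which makes it harder to mistype the destination path. This is only a sketch: the function name is hypothetical, while the folder layout and the list of architectures are the ones given above.

```python
ARCHITECTURES = (
    "controllable_talknet",
    "rvc",
    "so_vits_svc_3",
    "so_vits_svc_4",
    "so_vits_svc_5",
)
MODELS_ROOT = "/home/luna/hay_say/models"


def docker_cp_command(local_folder: str, container_name: str,
                      architecture: str, character_name: str) -> list:
    """Build the `docker cp` argument list for copying a custom model folder."""
    if architecture not in ARCHITECTURES:
        raise ValueError(f"architecture must be one of {ARCHITECTURES}")
    # The trailing "/." copies the folder's contents rather than the folder itself.
    source = local_folder.rstrip("/\\") + "/."
    destination = (
        f"{container_name}:{MODELS_ROOT}/{architecture}/characters/{character_name}"
    )
    return ["docker", "cp", source, destination]
```

The returned list can be passed to subprocess.run. Note that "~" is expanded by the shell, not by subprocess, so pass a full path (for example via os.path.expanduser) when running it that way.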
GPU integration is turned off by default in Hay Say. This is to prevent an error for users who do not have a Cuda-capable GPU. If you do have a Cuda-capable GPU on a Windows or Linux machine, you can enable GPU integration:
docker compose up instead of sudo docker compose up); if that is the case, then be sure to follow the steps in https://docs.docker.com/engine/security/rootless/ as well.
deploy:
  restart_policy:
    condition: on-failure
    window: 30s
#  resources:
#    reservations:
#      devices:
#        - driver: nvidia
#          count: all
#          capabilities: [gpu]
Remove the hashtags so that it looks like this instead:
deploy:
  restart_policy:
    condition: on-failure
    window: 30s
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
There are a couple of ways you can reduce the disk usage of Hay Say.
Important! Windows users must complete additional steps to free disk space after following any of these methods. See Additional Required Steps for Windows Users.
Launch Hay Say and click on the "Manage Models" button at the top of the screen.
This will open a screen where you can delete characters.
By default, Hay Say downloads all supported AI architectures. This currently includes Controllable TalkNet, so-vits-svc 3.0, so-vits-svc 4.0, so-vits-svc 5.0, and Retrieval-based Voice Conversion (RVC). Each of those takes about 10 GB. If you want to reclaim some disk space by deleting undesired architectures, you must first disable the architecture so that Hay Say does not automatically re-download it. Open the docker-compose.yaml file and look for sections named like architectureName_server. For example, here is the section that defines the so-vits-svc 3.0 server:
# This container provides a web service interface to so-vits-svc 3.0.
so_vits_svc_3_server:
  depends_on:
    - redis
  image: hydrusbeta/hay_say:so_vits_svc_3_server
  working_dir: /home/luna/hay_say/so_vits_svc_3
  volumes:
    - so_vits_svc_3_model_pack_0:/home/luna/hay_say/so_vits_svc_3_model_pack_0
    - so_vits_svc_3_model_pack_1:/home/luna/hay_say/so_vits_svc_3_model_pack_1
    - models:/home/luna/hay_say/models
    - audio_cache:/home/luna/hay_say/audio_cache
  # GPU integration is disabled by default to prevent an error on machines that do not have a Cuda-capable GPU.
  # Uncomment the lines below to enable it for so-vits-svc 3.0 if you wish.
  # deploy:
  #   resources:
  #     reservations:
  #       devices:
  #         - driver: nvidia
  #           count: all
  #           capabilities: [gpu]
You can disable so-vits-svc 3.0 by commenting out this section, i.e., add hashtags at the beginnings of the lines to make it look like this:
# This container provides a web service interface to so-vits-svc 3.0.
# so_vits_svc_3_server:
#   depends_on:
#     - redis
#   image: hydrusbeta/hay_say:so_vits_svc_3_server
#   working_dir: /home/luna/hay_say/so_vits_svc_3
#   volumes:
#     - so_vits_svc_3_model_pack_0:/home/luna/hay_say/so_vits_svc_3_model_pack_0
#     - so_vits_svc_3_model_pack_1:/home/luna/hay_say/so_vits_svc_3_model_pack_1
#     - models:/home/luna/hay_say/models
#     - audio_cache:/home/luna/hay_say/audio_cache
# GPU integration is disabled by default to prevent an error on machines that do not have a Cuda-capable GPU.
# Uncomment the lines below to enable it for so-vits-svc 3.0 if you wish.
#   deploy:
#     resources:
#       reservations:
#         devices:
#           - driver: nvidia
#             count: all
#             capabilities: [gpu]
Next, delete both the Docker container and Docker image for so-vits-svc 3.0 to free disk space. Open a command prompt or terminal and execute the following command to list all containers:
docker container ls -a
The output should be similar to the following:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4d3098ae4c2f hydrusbeta/hay_say:so_vits_svc_3_server "/bin/sh -c '/home/l…" 9 days ago Exited (137) 9 days ago hay_say-so_vits_svc_3_server-1
0c6c9eac8573 hydrusbeta/hay_say:so_vits_svc_5_server "/bin/sh -c '/home/l…" 9 days ago Exited (137) 9 days ago hay_say-so_vits_svc_5_server-1
7defd670649c hydrusbeta/hay_say:controllable_talknet_server "/bin/sh -c '/home/l…" 9 days ago Exited (137) 9 days ago hay_say-controllable_talknet_server-1
9b85ad39fea9 hydrusbeta/hay_say:so_vits_svc_4_server "/bin/sh -c '/home/l…" 9 days ago Exited (137) 9 days ago hay_say-so_vits_svc_4_server-1
48bc80452718 hydrusbeta/hay_say:rvc_server "/bin/sh -c '/home/l…" 9 days ago Exited (137) 9 days ago hay_say-rvc_server-1
d0343f8f00d4 hydrusbeta/hay_say:hay_say_ui "/bin/sh -c 'python …" 9 days ago Exited (137) 9 days ago hay_say-hay_say_ui-1
d82816c5889a redis "docker-entrypoint.s…" 9 days ago Exited (0) 9 days ago hay_say-redis-1
c26692c3240b hydrusbeta/hay_say:so_vits_svc_4_model_pack_1 "/bin/sh" 9 days ago Exited (0) 9 days ago hay_say-so_vits_svc_4_model_pack_1-1
e49ad2b1ff83 hydrusbeta/hay_say:so_vits_svc_3_model_pack_1 "/bin/sh" 9 days ago Exited (0) 9 days ago hay_say-so_vits_svc_3_model_pack_1-1
35913b4b7f7e hydrusbeta/hay_say:so_vits_svc_4_model_pack_2 "/bin/sh" 9 days ago Exited (0) 9 days ago hay_say-so_vits_svc_4_model_pack_2-1
fd2f36c568e3 hydrusbeta/hay_say:so_vits_svc_3_model_pack_0 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-so_vits_svc_3_model_pack_0-1
feb9c22c9d1b hydrusbeta/hay_say:controllable_talknet_model_pack_0 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-controllable_talknet_model_pack_0-1
d3c8114e4639 hydrusbeta/hay_say:so_vits_svc_5_model_pack_0 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-so_vits_svc_5_model_pack_0-1
3830263165be hydrusbeta/hay_say:so_vits_svc_4_model_pack_0 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-so_vits_svc_4_model_pack_0-1
9c5f0eb3478a hydrusbeta/hay_say:rvc_model_pack_0 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-rvc_model_pack_0-1
d4824c05694a hydrusbeta/hay_say:rvc_model_pack_1 "/bin/sh" 11 days ago Exited (0) 9 days ago hay_say-rvc_model_pack_1-1
In the NAMES column, look for the name of the architecture you want to delete, followed by "_server". In this case, we have hay_say-so_vits_svc_3_server-1. Delete that container by executing the following command:
docker container rm <name of the container you want to delete>
In this case,
docker container rm hay_say-so_vits_svc_3_server-1
Next, execute the following command to list all Docker images:
docker image ls
The output should be similar to the following:
REPOSITORY TAG IMAGE ID CREATED SIZE
redis latest 8e69fcb59ff4 5 weeks ago 130MB
hydrusbeta/hay_say hay_say_ui 381f9d276433 6 weeks ago 1.47GB
hydrusbeta/hay_say rvc_server a619e2e6e6ee 6 weeks ago 11.1GB
hydrusbeta/hay_say rvc_model_pack_0 032a923041bf 6 weeks ago 1.34GB
hydrusbeta/hay_say so_vits_svc_5_server 2b15ea8db246 8 weeks ago 10GB
hydrusbeta/hay_say so_vits_svc_5_model_pack_0 65410b0d89b7 8 weeks ago 1.11GB
hydrusbeta/hay_say controllable_talknet_server 65bfaae29689 2 months ago 8.13GB
hydrusbeta/hay_say so_vits_svc_4_server 4bae6175c9d4 2 months ago 6.89GB
hydrusbeta/hay_say so_vits_svc_3_server e3224b5a2c79 2 months ago 6.12GB
hydrusbeta/hay_say so_vits_svc_4_model_pack_0 8f73a6838a1d 2 months ago 8.33GB
hydrusbeta/hay_say so_vits_svc_3_model_pack_0 8c92571c4566 2 months ago 8.4GB
hydrusbeta/hay_say controllable_talknet_model_pack_0 972e53accb49 3 months ago 3.96GB
Look for the desired architecture name in the "TAG" column and then delete the desired architecture by executing the following command:
docker image rm hydrusbeta/hay_say:<tag of image you would like to delete>
So, for so-vits-svc 3.0 for example, that would be:
docker image rm hydrusbeta/hay_say:so_vits_svc_3_server
Optional Step: You can hide the architecture in the UI by editing the docker-compose.yaml file. Look for the following lines:
command: ["/bin/sh", "-c", "
celery --workdir ~/hay_say/hay_say_ui/ -A celery_download:celery_app worker --loglevel=INFO --concurrency 5 --include_architecture ControllableTalkNet --include_architecture SoVitsSvc3 --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
celery --workdir ~/hay_say/hay_say_ui/ -A celery_generate_gpu:celery_app worker --loglevel=INFO --concurrency 1 --cache_implementation file --include_architecture ControllableTalkNet --include_architecture SoVitsSvc3 --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
celery --workdir ~/hay_say/hay_say_ui/ -A celery_generate_cpu:celery_app worker --loglevel=INFO --concurrency 1 --cache_implementation file --include_architecture ControllableTalkNet --include_architecture SoVitsSvc3 --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
gunicorn --config=server_initialization.py --workers 1 --bind 0.0.0.0:6573 'wsgi:get_server(enable_model_management=True, update_model_lists_on_startup=True, enable_session_caches=False, migrate_models=True, cache_implementation=\"file\", architectures=[\"ControllableTalkNet\", \"SoVitsSvc3\", \"SoVitsSvc4\", \"SoVitsSvc5\", \"Rvc\"])'
"]
The architecture's name will appear four times. Delete "--include_architecture <architectureName>" on the 2nd, 3rd, and 4th lines and also delete the architecture name from the "architectures=[...]" list on the 5th line. For example, here's the result after removing so-vits-svc 3.0:
command: ["/bin/sh", "-c", "
celery --workdir ~/hay_say/hay_say_ui/ -A celery_download:celery_app worker --loglevel=INFO --concurrency 5 --include_architecture ControllableTalkNet --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
celery --workdir ~/hay_say/hay_say_ui/ -A celery_generate_gpu:celery_app worker --loglevel=INFO --concurrency 1 --cache_implementation file --include_architecture ControllableTalkNet --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
celery --workdir ~/hay_say/hay_say_ui/ -A celery_generate_cpu:celery_app worker --loglevel=INFO --concurrency 1 --cache_implementation file --include_architecture ControllableTalkNet --include_architecture SoVitsSvc4 --include_architecture SoVitsSvc5 --include_architecture Rvc &
gunicorn --config=server_initialization.py --workers 1 --bind 0.0.0.0:6573 'wsgi:get_server(enable_model_management=True, update_model_lists_on_startup=True, enable_session_caches=False, migrate_models=True, cache_implementation=\"file\", architectures=[\"ControllableTalkNet\", \"SoVitsSvc4\", \"SoVitsSvc5\", \"Rvc\"])'
"]
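The edit described above can also be expressed as a small text transformation. The sketch below is an illustration only; the function name is hypothetical, and it assumes the command text is formatted like the example shown, with backslash-escaped quotes around each name in the architectures list.

```python
import re


def hide_architecture(command_text: str, name: str) -> str:
    """Remove one architecture from the celery options and the gunicorn list."""
    # Drop every "--include_architecture <name>" option passed to celery.
    text = re.sub(
        r"[ \t]*--include_architecture\s+" + re.escape(name) + r"\b",
        "",
        command_text,
    )
    # The name as it appears in the list, with or without backslash-escaped quotes.
    quoted = r'\\?"' + re.escape(name) + r'\\?"'
    text = re.sub(quoted + r"\s*,\s*", "", text)  # entry followed by a comma
    text = re.sub(r"\s*,\s*" + quoted, "", text)  # entry at the end of the list
    return text
```

Applying hide_architecture(text, "SoVitsSvc3") to the original command block should give the result shown above. The sketch does not handle the case where the architecture is the only entry left in the list.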
Windows users must complete additional steps to free disk space after deleting characters or images. You have a couple of options:
wsl --shutdown
diskpart
select vdisk file="C:\path\to\your\vhdx\file.vhdx"
attach vdisk readonly
compact vdisk
detach vdisk
exit
If you are running Windows 10 Pro, you can use just these two commands instead of all the ones in steps 4 and 5:
wsl --shutdown
optimize-vhd -Path "C:\path\to\your\vhdx\file.vhdx" -Mode full
If you have a drive with enough free space to save a copy of the vhdx file, another option is to use wslcompact. wslcompact saves a copy of the vhdx file first, operates on the copy, and then overwrites the original vhdx file if the operation is successful.
irm get.scoop.sh | iex
scoop bucket add .oki https://github.com/okibcn/Bucket
scoop install wslcompact
$env:TEMP="Z:/specify/a/folder/on/the/drive"
wslcompact -c -d docker-desktop-data
It is possible that your WSL distro name is different. If the command in step 5 does not work, then execute the following to list all distro names:
wslcompact -l
and search for a distro name with the word "docker" or "ubuntu" in it.
Hay Say runs on the Docker Engine. On Windows, Docker typically runs on a virtualization platform called "Windows Subsystem for Linux, version 2", or WSL2, which stores all of its data on a virtual hard disk (a .vhdx file). The vhdx file will automatically grow in size as data is added to it (e.g. when you download a new model in Hay Say), but it will not automatically shrink when you delete files. To reclaim unused disk space from WSL2, you need to manually shrink the .vhdx file. There is an open feature request for Microsoft to make WSL automatically release disk space, which is discussed here:
https://github.com/microsoft/WSL/issues/4699
Linux and MacOS users are unaffected by this issue and should see an immediate increase in disk space after deleting models or architectures.
The user interface code for Hay Say runs in its own Docker container, hay_say_ui, and the UI is accessed by the user via a web browser. Each AI architecture (e.g. so-vits-svc or ControllableTalkNet) is installed in its own container and a simple Flask web server runs in each one, listening for connections. Each Flask web server defines a /generate method which invokes the AI architecture to generate an audio file.
After the user enters their desired options and presses the "Generate!" button, hay_say_ui first preprocesses the audio and saves the result to a mounted volume, audio_cache. It then makes a web service call to the container with the desired AI architecture, instructing it to generate audio. That container reads the preprocessed audio from audio_cache, generates a pony voice from it, and saves it back to audio_cache. The hay_say_ui container then reads the generated audio, performs any postprocessing, and presents the result to the user via an HTML audio element.
Hence, there are 2 mechanisms of communication between the docker containers: calling web services and passing files over a shared volume.
Weights for the neural networks are stored in Docker volumes that are named like "controllable_talknet_model_pack_0" or "so_vits_svc_4_model_pack_1". Within each volume is a collection of folders, one for each pony character. Each character folder contains files with the model weights for that character. The volumes are initially populated by pulling a Docker image containing models for many characters and mounting the corresponding model_pack volume to that image, which automatically copies all the models from the image to the volume.
Hay Say also has an "audio_cache" volume for maintaining a small cache of audio files. Any time the user uploads an audio file to Hay Say, the file is saved to audio_cache/raw. When a file is preprocessed, the result is saved to audio_cache/preprocessed. When an audio file is generated, the result is saved to audio_cache/output and, finally, when an output audio is postprocessed, the result is saved in audio_cache/postprocessed. Up to 25 files can be stored in each subfolder before Hay Say begins to automatically delete the oldest files. The purpose of the caching system is twofold. First, by storing the file at various stages of production, Hay Say will be able to present a playback button at each stage, so users can compare the "before" and "after" of a particular step. For example, the user can play back the raw audio and also play back the preprocessed audio to compare them. Second, some operations take a lot of computational power to complete, such as generating output from one of the AI architectures. By caching that output, Hay Say can allow the user to tweak the post-processing options and listen to the results over and over without needing to invoke the AI architecture each time.
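To make the eviction rule concrete, here is a minimal sketch of a cache that keeps at most 25 files per stage folder and deletes the oldest ones first. This is not Hay Say's actual implementation: the function names are hypothetical, and using file modification time to decide which files are "oldest" is an assumption.

```python
from pathlib import Path

MAX_FILES_PER_FOLDER = 25  # the limit described above
STAGES = ("raw", "preprocessed", "output", "postprocessed")


def prune_oldest(folder: Path, max_files: int = MAX_FILES_PER_FOLDER) -> list:
    """Delete the oldest files in `folder` until at most `max_files` remain."""
    files = sorted(
        (p for p in folder.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,  # oldest first
    )
    excess = files[: max(0, len(files) - max_files)]
    for path in excess:
        path.unlink()
    return excess


def save_to_cache(cache_root: Path, stage: str, filename: str, data: bytes) -> Path:
    """Write `data` to cache_root/<stage>/<filename>, then enforce the limit."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}")
    folder = cache_root / stage
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / filename
    target.write_bytes(data)
    prune_oldest(folder)
    return target
```

Keeping one folder per stage is what lets each stage be played back or reused independently, as described above.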
The code for the main UI is in this repository. Code for the Flask servers for the other containers can be found in various other code repositories on hydrusbeta's account. See https://github.com/hydrusbeta?tab=repositories.
Here are some tasks I consider important. I intend to work on them "soon", in no particular order:
Plus, there are numerous minor code improvement opportunities that I have marked with "todo" throughout the codebase.
Hay Say will run on most machines but will be very slow on older hardware. Here are the results of some tests I ran on various computers I could get my hooves on. Hopefully it will help inform you how fast or slow you can expect Hay Say to run on your machine. They are ordered here from best-performing machines to worst.
If it looks like your machine will perform poorly with Hay Say, an alternative for generating pony voices would be to try out one of the many online Google Colab UI projects listed throughout the following document, which run on Google's servers instead of your local machine:
https://docs.google.com/document/d/1y1pfS0LCrwbbvxdn3ZksH25BKaf0LaO13uYppxIQnac/edit
Note: At the time of this writing, Hay Say is not affiliated with any of the Google Colab projects, and Hay Say itself is not available in Google Colab.
I discovered during testing that, unfortunately, loading an audio file into Hay Say can take a significant amount of time on some machines. I have recorded those times here as well. In this case, I was uploading a short (<3-second) audio recording to use as an audio input.
The tests were performed manually, and times were taken with a stopwatch. Each AI architecture was invoked 2-3 times to generate a pony saying "Testing, testing 1 2 3!" and then the times were averaged. In each case, GPU integration was disabled, so the CPU was used exclusively.
Note: Generating audio with any architecture was noticeably slower for the first audio output and faster on subsequent generations, even with a different character selected.
Machine Name | Age of Computer (years) as of May 2023 | CPU | RAM | Loading time for a short audio recording (s) | Controllable TalkNet Generation time (s) | so-vits-svc 3.0 Generation time (s) | so-vits-svc 4.0 Generation time (s) |
---|---|---|---|---|---|---|---|
Custom Built Tower | 0.5 | 13th gen i9 | 32 GB (6000 MHz) | 0 | 8 | 5 | 6 |
Aspire TC-895 | 2 | 10th gen i5 | 12 GB (2600 MHz) | 3 | 26.5 | 30.5 | 27.5 |
Dell XPS 13 9380 | 4 | 8th gen i7 | 16 GB (1200 MHz) | 1 | 44 | 38 | 41 |
Toshiba Qosmio laptop | 10 | 4th gen i7 | 16 GB (1600 MHz) | 10 | 198 | 174 | 227 |
Mac Mini | 0 | M2 Apple Silicon | 8 GB | 27 | N/A [1] | 589 | 575 |
[1] Test failed with an error. Unable to generate audio.