Closed by Aamir3d 1 week ago
Is it just sequential batching (i.e., run 1-1-1-1-1 automatically, then return all 5) or are you looking for a 5-at-once scenario? Also, do you want the same prompt every time (A-A-A-A-A), or the ability to give many prompts at once, one per line, like:
water
water 320Kbps
loud water 320Kbps
quiet water
water stream
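A minimal sketch of how a one-prompt-per-line input like the list above could be split into individual prompts. The function name `parsePromptLines` is hypothetical and not taken from the project; it only assumes the prompts arrive as a single textarea string.

```typescript
// Hypothetical helper: split a textarea value into one prompt per line.
// Blank lines are skipped and surrounding whitespace is trimmed.
function parsePromptLines(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

const prompts = parsePromptLines(
  "water\nwater 320Kbps\nloud water 320Kbps\nquiet water\nwater stream"
);
console.log(prompts.length); // number of prompts that would be queued
```

Each returned string would then become one generation request, so a five-line input produces five outputs.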
You have two great ideas here! The ideal would be to start a batch, let it finish, and show the generations below in a row format (like in your outputs section). In this case, Audio1-Audio2-Audio3 ... Audio7 would work, giving 7 files with random seeds but the same parameters.
The idea for multiple prompts is even better! One could do variations for each batch and choose the best output.
And while we're discussing this, the same approach could potentially be applied to the first screen, where you have several buttons that say "Generate 1", "Generate 2", etc. Simplifying the interface to a field for the number of audio files plus a single generate button would make it more consistent.
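A rough sketch of the sequential variant described above: same parameters, a fresh random seed per file, one generation after another. All type and function names here (`GenerationParams`, `GenerateFn`, `makeBatch`, `runBatch`) are hypothetical, and the `generate` callback stands in for whatever backend call the UI actually uses.

```typescript
// Hypothetical parameter and result shapes; the real UI's types may differ.
interface GenerationParams {
  prompt: string;
  seed: number;
}

interface GenerationResult {
  params: GenerationParams;
  audioUrl: string;
}

// Assumed backend call: takes parameters, resolves with one generated file.
type GenerateFn = (params: GenerationParams) => Promise<GenerationResult>;

// Draw a fresh 32-bit seed for each generation.
function randomSeed(): number {
  return Math.floor(Math.random() * 2 ** 32);
}

// Create `count` parameter sets that differ only in their seed.
function makeBatch(
  base: Omit<GenerationParams, "seed">,
  count: number
): GenerationParams[] {
  return Array.from({ length: count }, () => ({ ...base, seed: randomSeed() }));
}

// Run the batch one generation at a time and collect the results in order.
async function runBatch(
  batch: GenerationParams[],
  generate: GenerateFn
): Promise<GenerationResult[]> {
  const results: GenerationResult[] = [];
  for (const params of batch) {
    results.push(await generate(params));
  }
  return results;
}
```

Keeping the seed in each result makes it possible to regenerate a favourite output later with the exact same parameters.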
That sounds good! I am thinking about doing it for the React UI, since that's a lot easier than the Gradio one. Would you be OK with using the React UI, at least for this use case?
From the end user's perspective, as long as there's a good GUI, the back-end development should not be an issue. Whatever is easier for you works, and I know you're improving the GUI constantly.
Hopefully this resolves it: https://github.com/rsxdalv/tts-generation-webui/pull/281
Thank you - I'm looking to test this out over the weekend! Will share updates.
Issue: Currently, we have to click generate every time we want to generate a new sound in the Musicgen tab (for all the Facebook models). It gets slightly tedious when one wants to generate different sounds and select a sound that is good enough to be used.
Feature request: It would be good to have a "Batch number" where one can select the number of generations that are done by the application. For example, one can select "5" and have the system output 5 tunes.
Also good to have: a "Ctrl+Enter" shortcut key to start generation (as in A1111, Fooocus, and other applications)
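For the shortcut, a minimal sketch of how the key check might look in a React UI. `KeyLike` and `isGenerateShortcut` are hypothetical names; the check relies only on the standard `key` and `ctrlKey` fields of keyboard events.

```typescript
// Minimal subset of a keyboard event, so the check can run without a browser.
interface KeyLike {
  key: string;
  ctrlKey: boolean;
}

// True when the event is Ctrl+Enter.
function isGenerateShortcut(event: KeyLike): boolean {
  return event.ctrlKey && event.key === "Enter";
}
```

In a component this could be wired up as `onKeyDown={(e) => { if (isGenerateShortcut(e)) startGeneration(); }}`, where `startGeneration` is a placeholder for whatever the generate button already calls.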