mepc36 opened this issue 1 year ago
Turns out I'm doing this wrong. I don't need to pass in a mask_image_id. Just change the seed_image_id
to point at the audio I want to change (i.e., the techno bass), and don't pass a mask image at all:
{
  "alpha": 0.75,
  "num_inference_steps": 50,
  "seed_image_id": "bass",
  "start": {
    "prompt": "cello",
    "seed": 42,
    "denoising": 0.55,
    "guidance": 7.0
  },
  "end": {
    "prompt": "cello",
    "seed": 42,
    "denoising": 0.55,
    "guidance": 7.0
  }
}
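For anyone else wiring this up, here's a minimal Python sketch of sending that payload to the GPU box (the port matches the request I describe further down; the shape of the response is an assumption, so inspect the keys before decoding anything):

import base64

import requests

GPU_IP = "1.2.3.4"  # replace with your GPU host
url = f"http://{GPU_IP}:3013"

payload = {
    "alpha": 0.75,
    "num_inference_steps": 50,
    "seed_image_id": "bass",
    "start": {"prompt": "cello", "seed": 42, "denoising": 0.55, "guidance": 7.0},
    "end": {"prompt": "cello", "seed": 42, "denoising": 0.55, "guidance": 7.0},
}

resp = requests.post(url, json=payload, timeout=600)
resp.raise_for_status()
result = resp.json()

# The exact response shape depends on the handler, so look at the keys first.
print(list(result.keys()))

# If the audio comes back as base64 under an "audio" key (an assumption), decode it:
if "audio" in result:
    with open("cello_bass.mp3", "wb") as f:
        f.write(base64.b64decode(result["audio"]))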
Now the challenge is to enable dynamic seed images at the Banana server...
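One possible way to tackle that (just a sketch of an approach, not how the repo works today; the seed_image_b64 field and the seed_images path are made up): accept a base64-encoded spectrogram in the request, write it into the seed-images folder under a fresh id, and rewrite seed_image_id to point at it before running inference.

import base64
import uuid
from pathlib import Path

SEED_IMAGES_DIR = Path("seed_images")  # assumption: wherever the repo looks up seed images

def register_dynamic_seed_image(request_json: dict) -> dict:
    """If the request carries a base64 seed spectrogram, persist it and rewrite
    seed_image_id so the normal inference path can find it.

    "seed_image_b64" is a hypothetical field; the stock API only understands
    seed_image_id.
    """
    seed_b64 = request_json.pop("seed_image_b64", None)
    if seed_b64 is not None:
        SEED_IMAGES_DIR.mkdir(parents=True, exist_ok=True)
        image_id = f"dynamic_{uuid.uuid4().hex[:8]}"
        (SEED_IMAGES_DIR / f"{image_id}.png").write_bytes(base64.b64decode(seed_b64))
        request_json["seed_image_id"] = image_id
    return request_json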
Hey there, thanks a lot for the repo, man!
My goal is to do audio-to-audio with a text prompt using this banana-riffusion repo. More specifically, I want to pass in a techno-sounding bass guitar, send a text prompt like "cello" along with it, and get back new base64 data that represents that techno bass guitar but with the "cello" prompt applied to it. Here's a screenshot of Riffusion's streamlit app showing the UI I want to access programmatically:
I tried to do this by passing in a mask_image_id parameter pointing to those rock-and-roll drums in the request, and something broke. I'm trying to use a file called bass.png that exists at ~/bass.png as the mask_image. I created bass.png using the audio-to-image command from the Riffusion CLI, then uploaded it manually to the server using scp and put it in the ~/seed_images folder (a rough sketch of that prep step is below).

Unfortunately, I got the following error when trying to do this. I have this banana-riffusion repo running on an AWS GPU in the cloud. I'm not running the repo inside a Docker container there, just starting it with the following command:

I use the AWS GPU for testing this repo, but my prod env obviously hooks up to a banana-hosted instance of it. Using an AWS GPU for dev cuts down on dev time, since I don't have to wait for the Banana pipeline's artifact build or cold starts. I have not yet tested whether I'd get this error in the banana-hosted instance of this repo.
Here's the request I'm sending. I send it to http://{{GPU_IP}}:3013:

Here's the error I'm getting back:
Any help please? Does the mask_image_id parameter not represent one of the images in an img2img / audio_to_audio conversion? Can this repo not do audio_to_audio at all, meaning I'd have to build my own banana-riffusion to achieve that?

Again, really amazing repo. It saved me about 3 weeks of dev time that I'd otherwise have spent putting riffusion together for banana myself. Thanks!