buscon opened this issue 1 year ago
Can you describe your request in detail? Maybe I can help implement it.
Thanks!
This is what I imagine:
audioldm --mode "transfer" --file_path trumpet.wav 70% cello.wav 30% -t "Children Singing"
If no percentage is given, each prompt sample should have the same influence on the output. I think you are already doing something similar by mixing the influence of the audio prompt with the influence of the text prompt.

I have a lot of ideas for implementing your idea; wait for my arXiv paper this year (I am working on something else at the moment).
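In the meantime, one rough workaround is to blend the inputs by weight into a single file and feed that to the existing single-file transfer mode. This is purely a sketch under my own assumption that pre-mixing waveforms roughly approximates weighted audio prompts; audioldm has no such option today, and the `weighted_mix` helper below is hypothetical:

```python
# Naive pre-mixing workaround (assumption: blending the waveforms by weight
# before the existing single-file transfer mode approximates weighted audio
# prompts; audioldm itself does not support multiple weighted inputs).
import numpy as np
import soundfile as sf

def weighted_mix(inputs, out_path="mixed.wav"):
    """inputs: list of (wav_path, weight) pairs; weights are normalized to sum to 1."""
    total = sum(weight for _, weight in inputs)
    sample_rate, mixed = None, None
    for path, weight in inputs:
        audio, sr = sf.read(path)
        if audio.ndim > 1:                 # collapse stereo to mono
            audio = audio.mean(axis=1)
        if mixed is None:
            sample_rate, mixed = sr, np.zeros(len(audio))
        assert sr == sample_rate, "resample all inputs to a common rate first"
        n = min(len(mixed), len(audio))    # trim to the shortest file
        mixed = mixed[:n] + (weight / total) * audio[:n]
    sf.write(out_path, mixed, sample_rate)
    return out_path

# 70% trumpet, 30% cello, then run the existing transfer mode on the result:
#   audioldm --mode "transfer" --file_path mixed.wav -t "Children Singing"
weighted_mix([("trumpet.wav", 0.7), ("cello.wav", 0.3)])
```

This is only an approximation; a proper implementation would presumably mix in the model's latent space rather than on raw waveforms.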
@buscon does transfer learning from an input audio sample of human speech to generate a specific human voice already work with the giga_speech model?
I think it does, though I tested it a long time ago and cannot remember right now. I will try it again soon and report back here.
@buscon did you manage to find a way? Or any success?
Not yet; I cannot install audioldm with pip anymore. I think it's related to the overall upgrade to Python 3.12. I will report back once I've figured that out.
For sound design purposes, it would be interesting to use the audio-to-audio feature with multiple samples.
Is that possible? If not, any hint on how I could add such a feature?