suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

What does "we limit the audio history prompts" mean? #126

Closed DDBE12 closed 1 year ago

DDBE12 commented 1 year ago

What does it mean when you guys say, "we limit the audio history prompts to a limited set of Suno-provided, fully synthetic options to choose from for each language"?

Does it mean you limit the number of prompt commands to those c. 10 prompt commands of "laughs", "sings", "clears throat", etc.? Or do you mean we can't clone our own voices with our own source material? I'm asking because I don't see any code or instructions on how to clone a voice by means of Bark or Suno.

St33lMouse commented 1 year ago

I have the same question. Cloning is the most interesting thing. If I'm limited to a set of boring voices I don't like, I might as well use Silero. It's fast, has no overhead, and has a large number of boring voices I don't like. But at least it's fast and resource cheap.

St33lMouse commented 1 year ago

I don't see a point to limiting Bark's capabilities, unless you are afraid of lawsuits. If that's the fear, fine, I get it, but it is kind of sad.

You can already clone voices in ElevenLabs and several other services, and you can do it with Tortoise, and maybe even Coqui. So it's not like limiting Bark is going to stop people from cloning voices for nefarious or mean purposes. The only thing you get is potential lawsuit protection and a GUARANTEE that Bark will not be the final choice for millions of future AI users who want unique and interesting voices for their characters.

The responsibility for misuse of a gun, knife, car, hammer, coconut or Bark lies entirely with the user.

DDBE12 commented 1 year ago

Just to add to the above people chiming in and just to make sure, you guys at Suno-AI who are writing Bark do know about the technique of inaudible watermarks, right? It's how the entire sector of voice cloning is preventing nefarious use of the technology. So, if you're worried somebody could start defamation campaigns or spread disinformation, the file can easily be analyzed to find the inaudible watermark saying, "This is an artificial voice, no human has ever said this."

gkucsko commented 1 year ago

closing for now since we moved away from a functional issue. feel free to move to discussions