Open souvikqb opened 10 months ago
Is your audio input 24-bit?
I passed in webm and mp3 files, how do I check this?
Using this file - https://upload.wikimedia.org/wikipedia/commons/c/c5/CA_AG_Kamala_Harris_2013_CADEM_Convention.webm
Can you elaborate more?
from pydub import AudioSegment

def convert_webm_to_wav(input_file, output_file):
    # Decode the webm file and write it back out as WAV
    audio = AudioSegment.from_file(input_file, format="webm")
    audio.export(output_file, format="wav")

def crop_audio(input_file, output_file, seconds):
    audio = AudioSegment.from_wav(input_file)
    # pydub slices are in milliseconds, so keep the first `seconds` seconds
    processed_audio = audio[:seconds * 1000]
    processed_audio.export(output_file, format="wav")

# Usage:
input_wav = 'CA_AG_Kamala_Harris_2013_CADEM_Convention.webm'
converted_wav = 'converted.wav'
cropped_wav = 'cropped.wav'
seconds_to_crop = 60

convert_webm_to_wav(input_wav, converted_wav)
crop_audio(converted_wav, cropped_wav, seconds_to_crop)
2. We can determine the bit depth of the audio recording in code, but I used a website instead (I was too lazy to write code ;)): https://www.advalify.io/audio-validator
<img width="355" alt="image" src="https://github.com/serp-ai/bark-with-voice-clone/assets/36342074/e1479ef6-4c57-43cf-9abd-c89f2136c79d">
3. Your audio is 32 bit:
<img width="235" alt="image" src="https://github.com/serp-ai/bark-with-voice-clone/assets/36342074/a5ea7610-993c-4643-9fd2-1754ed047afd">
4. Use this to convert to 24 bit: https://onlineaudioconverter.com/
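For step 2, the bit depth can also be read in code rather than through a website. Below is a minimal sketch that uses only Python's standard `wave` module. It assumes the file is an uncompressed integer PCM WAV (such as the `converted.wav` produced above). Floating-point WAV files are not supported by `wave` and raise `wave.Error`; for those, the `soundfile` package used elsewhere in this thread is probably a better fit. The names `wav_bit_depth` and `write_silent_wav` are hypothetical and exist only for this illustration.

```python
import os
import tempfile
import wave


def wav_bit_depth(path):
    """Return the bit depth (bits per sample) of an integer PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        # getsampwidth() returns bytes per sample, e.g. 3 for 24-bit audio
        return wav_file.getsampwidth() * 8


def write_silent_wav(path, sample_width_bytes, sample_rate=16000, seconds=1):
    """Hypothetical helper: write a silent mono WAV so there is a file to inspect."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(sample_width_bytes)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00" * sample_width_bytes * sample_rate * seconds)


if __name__ == "__main__":
    # Demo on a generated file; replace demo_path with your own WAV file
    demo_path = os.path.join(tempfile.gettempdir(), "bit_depth_demo.wav")
    write_silent_wav(demo_path, sample_width_bytes=3)
    print(wav_bit_depth(demo_path), "bit")
```

To check your own recording, call `wav_bit_depth('converted.wav')` after running the conversion step.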
I see,
Thanks for taking the effort.
But how should I use this to improve the video cloning performance?
If I understand correctly, do you want to make a deepfake for a video with a voice change?
If yes, here is the code to convert to 24 bits (https://stackoverflow.com/questions/44812553/how-to-convert-a-24-bit-wav-file-to-16-or-32-bit-files-in-python3):
import soundfile
input_wav = 'input.wav' # Maybe 32 bit?
output_wav = 'output.wav'
data, samplerate = soundfile.read(input_wav)
soundfile.write(output_wav, data, samplerate, subtype='PCM_24')
Yes, that's a possibility.
But for now I would just like a voice-cloned audio file.
Say, reading a normal speech but in a celebrity's voice or a user-defined speaker's voice.
Does converting it to 24 bits help in the video cloning process?
@souvikqb In fact, I have run into difficulties with voice cloning myself. Unfortunately, no one has answered my question yet, but the 24-bit conversion option gives me a little hope of success. I will try it on my own data...
Thanks 👍
Do let me know if you get anything
Also can we tag the owner of this repository?
@souvikqb I think we can tag Francis @francislabountyjr.
I'm also stuck on my issue ;( #49
@souvikqb How can I contact you? I found another solution (from another project). I won't post it here, because it does not apply to this project.
Please email me at autocar2060 @ gmail . com
@souvikqb @BrasD99, if you succeeded in generating better voice clones, could you please post your outputs here?
@BrasD99 Is it simply the bit depth difference causing the issue? I'd love to hear if there are other factors one could use to improve the clone.
I'm having difficulty getting good results even with my own voice.
Literally one time out of a handful did I hear my voice, and it was a single "umm" at the start before it switched back to some person who does not sound like me, haha.
@Shyk92 did you ever make progress on this? I'm in the same boat.
Facing the same issue.
I am using the clone_voice.ipynb notebook (https://github.com/serp-ai/bark-with-voice-clone/blob/main/clone_voice.ipynb) to generate audio clips that sound like a sample I provide.
While the code ran fine, the resulting audio was not very good. I am using speakers with common American and British accents.
Any tips for tuning the model to get better results, or any parameters to play with?