Open hongseoi opened 6 months ago
I used SoX to remove silence from the audio files. It's not a complete success yet, but there have been some improvements.
import glob
import os
import subprocess


def remove_silence(input_file, output_file):
    try:
        # sox: trim silence from the start, reverse, trim again (the original end), reverse back
        subprocess.run([
            'sox', input_file, output_file,
            'silence', '2', '0.1', '1%', 'reverse',
            'silence', '2', '0.1', '1%', 'reverse'
        ], check=True)
        print(f'Successfully removed silence from {input_file} and saved to {output_file}')
    except subprocess.CalledProcessError as e:
        print(f'Error occurred: {e}')


def process_folder(input_folder, output_folder):
    # Create the output folder if it does not exist
    os.makedirs(output_folder, exist_ok=True)
    # Process all of the wav files in input_folder
    for wav_file in glob.glob(os.path.join(input_folder, '*.wav')):
        file_name = os.path.basename(wav_file)
        output_wav = os.path.join(output_folder, file_name)
        remove_silence(wav_file, output_wav)


# Expand '~' explicitly: glob and os.makedirs do not expand it
input_folder = os.path.expanduser('~/data/train')
output_folder = os.path.expanduser('~/data/processed_train')
process_folder(input_folder, output_folder)
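To check whether the trimming actually did anything, one option is to compare clip durations before and after. This is only a rough sketch, not part of the script above: the helper names `wav_duration_seconds` and `report_trimmed` are made up for illustration. It assumes uncompressed PCM WAV files (the only kind the standard-library `wave` module reads) and the same file names in both folders.

```python
import os
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, 'rb') as wav:
        return wav.getnframes() / wav.getframerate()


def report_trimmed(input_folder, output_folder):
    """Return {file_name: seconds_removed} for WAV files present in both folders."""
    removed = {}
    for file_name in sorted(os.listdir(output_folder)):
        original = os.path.join(input_folder, file_name)
        # Skip non-WAV files and files without a matching original
        if not file_name.lower().endswith('.wav') or not os.path.exists(original):
            continue
        processed = os.path.join(output_folder, file_name)
        removed[file_name] = (
            wav_duration_seconds(original) - wav_duration_seconds(processed)
        )
    return removed
```

Calling `report_trimmed(input_folder, output_folder)` after `process_folder` gives a per-file number of seconds removed. A value near zero suggests the silence threshold did not trigger for that clip. A value close to the full clip length suggests the settings cut too much.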
It was a really simple problem
Hi! I trained tacotron2 for more than 60,000 steps, but I cannot get a proper alignment. The alignment graph is as follows. Does anyone know the cause of this?
I'm training on 100 samples of elderly voice data selected from the Common Voice dataset.
Training performance was not good in previous attempts, so I looked for other issues.
But sadly it didn't work.