deezer / spleeter

Deezer source separation library including pretrained models.
https://research.deezer.com/projects/spleeter.html
MIT License

[Bug] looping over files will cause a race condition #815

Open binbinxue opened 1 year ago

binbinxue commented 1 year ago

Description

When looping over audio files, running separation and saving the result appears to cause a race condition. For example, feeding file1, file2, file3 to the separator produces outputs corresponding to file2, file2, file1. I tried setting multiprocess to False and also tried separator.separate_to_file(); neither helped, and the behaviour was the same.

Steps to reproduce

import os
from pathlib import Path

import librosa
from scipy.io.wavfile import write
from spleeter.audio.adapter import AudioAdapter
from spleeter.separator import Separator
from tqdm import tqdm

# Two-stem model (vocals and accompaniment), with multiprocessing disabled
separator = Separator('spleeter:2stems', multiprocess=False)
audio_loader = AudioAdapter.default()

def extract_vocals(input_dir, sampling_rate, rm_original=False):
    # Collect every supported audio file under input_dir
    toprocess = []
    for root, dirs, files in os.walk(input_dir):
        for file in files:
            if file.endswith(('.wav', '.flac', '.m4a', '.mp3')):
                toprocess.append(Path(root) / file)

    # Separate each file in turn and write its vocals next to the original
    for fpath in tqdm(toprocess, unit='files'):
        waveform, _ = audio_loader.load(fpath, sample_rate=sampling_rate)
        prediction = separator.separate(waveform)
        vocals = librosa.to_mono(prediction['vocals'].T)
        # Drop the extension whatever its length (.wav, .flac, ...) before adding the suffix
        write(str(fpath.with_suffix('')) + '_vocals.wav', sampling_rate, vocals)
        if rm_original:
            os.remove(fpath)

extract_vocals('input_folder', 16000, False)

binbinxue commented 1 year ago

Same behavior with the command line:

spleeter separate -o outputFolder -p spleeter:2stems-16kHz inpFolder/*.wav

With 3 files in the input folder, the output folder has one file repeated and one file missing, and the order is scrambled too.
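
For anyone trying to confirm the "one file repeated" symptom without listening to every output, a rough check is to hash the separated files and look for identical content. The sketch below is only an illustration: find_duplicate_outputs is a hypothetical helper (not part of spleeter), it assumes the outputs are .wav files somewhere under the output folder, and it assumes a repeated result is byte-identical to its twin.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_outputs(output_dir, pattern="*.wav"):
    """Return groups of files under output_dir whose bytes are identical.

    Hypothetical helper for this issue; it only compares file contents
    and knows nothing about spleeter itself.
    """
    by_digest = defaultdict(list)
    # Walk the output folder recursively; sorting keeps the report stable
    for path in sorted(Path(output_dir).rglob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path)
    # Keep only the digests shared by more than one file
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicate_outputs('outputFolder') after the command above should return an empty list if every input produced a distinct result; any non-empty group points at outputs that were written from the same audio.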

bmcfee commented 1 year ago

:wave: just chiming in to note that we're also observing this behavior here (spleeter 2.3.2). Not sure if this is a dupe of #809 but it certainly seems related.