spotify / pedalboard

🎛 🔊 A Python library for audio.
https://spotify.github.io/pedalboard
GNU General Public License v3.0
5.23k stars 262 forks

vst_plugin in a for loop - multiprocessing issue #181

Open cweaver-logitech opened 1 year ago

cweaver-logitech commented 1 year ago

I'm using Pedalboard (which is amazing, thank you) to process a large number (8K) of audio files with a VST3 plugin (Waves Clarity Vx Pro). I'm calling the plugin in a simple for-loop:

if __name__ == '__main__':
    for audio_file in tqdm.tqdm(raw_files):
        vst_plugin(
            audio_file, 
            plugin_location=f'{str(clarity)}', 
            output_directory=output_directory
        )

After a while (I never got past 150 files) I hit an error, which I must confess is quite beyond my Python skills:

%|█▋                                                                                                                                       | 91/7559 [03:04<4:07:24,  1.99s/it]Not auto-determining label: found {'', 'dB'}
zsh: segmentation fault  python cleaning_tests_mac.py
(cleaning_eval) ec2-user@ip-172-31-16-239 cleaning_eval % /usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

I'm open to suggestions for getting around this.

cweaver-logitech commented 1 year ago

Here is the vst_plugin function:

def vst_plugin(input_audio, output_directory, plugin_location):
    if plugin_location.endswith("WaveShell1-VST3 14.5.vst3"):
        vst3 = VST3Plugin(plugin_location, plugin_name="Clarity Vx Pro Mono")
        # vst3.reset = True
    else:
        vst3 = VST3Plugin(plugin_location)
    plugin_name = vst3.name.replace(" ", "_")
    output_audio = input_audio.with_stem(input_audio.stem + "_" + plugin_name)
    output_audio = output_directory / output_audio.name

    if vst3.name == 'RX 10 Spectral De-noise':
        vst3.artifact_control = 10.0
        vst3.noise_reduction_db = 20.0
        vst3.linked_reduction_db = 20.0
        vst3.quality = 'Adv.+Extr.'
        vst3.artifact_control = 10.0
        vst3.adaptive_learning  = True
        vst3.multi_resolution = True 
        vst3.smoothing = 10.0
        vst3.synthesis = 10.0
        vst3.enhancement = 10.0
        vst3.masking = 10.0
    else:
        # plugin_name=
        vst3.bank="Improve Voice Detail"

    with AudioFile(str(input_audio), 'r') as f:
        audio = f.read(f.frames)
        samplerate = f.samplerate
        effected = vst3.process(audio, samplerate)

    with AudioFile(
        str(output_audio), 
        'w', 
        samplerate, 
        effected.shape[0]
    ) as f:
        f.write(effected)
    vst3.reset()

psobot commented 1 year ago

Hi @cweaver-logitech!

That's a bit tricky: a segmentation fault should never happen in Pedalboard (or Python), but I see three areas to investigate (both on the Pedalboard side):

Unfortunately, investigating the latter two options would require a copy of the plugin in question, which I don't have access to. Are you able to reproduce this with any other plugin, or just Clarity Vx Pro Mono?

If you're feeling adventurous, you could use the instructions in CONTRIBUTING.md to build Pedalboard from source in Debug mode, which would allow you to attach a debugger like lldb or gdb and get a native stack trace from the C++ code. That would tell us if it's Pedalboard or the plugin's code crashing. (Another, less useful option: using the faulthandler Python module to print out a Python stack trace, which will at least tell us which line of Python is triggering the issue.)
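The faulthandler option mentioned above can be turned on from the command line (`python -X faulthandler script.py`) or from inside the script. A minimal sketch of the in-script form, assuming it sits at the top of the processing script; `process_all` is a hypothetical placeholder for the per-file loop and is not part of Pedalboard:

```python
import faulthandler


# Ask Python to dump the traceback of every thread to stderr when the
# process receives a fatal signal such as SIGSEGV or SIGBUS.
faulthandler.enable()


def process_all(files):
    """Hypothetical placeholder for the loop that calls vst_plugin()."""
    return [str(f) for f in files]


if __name__ == "__main__":
    process_all([])
```

This only reports which Python line was executing when the crash happened; it does not show the native (C++) frames, which still require a debugger such as lldb or gdb.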

cweaver-logitech commented 1 year ago

Thanks for the detailed reply. I'll give building from source a go and report back what I find.

cweaver-logitech commented 1 year ago

A colleague will help me do the C++ debug but in the meantime, I ran the program again with the faulthandler option. Here's the output:

Fatal Python error: Bus error

Thread 0x00007000097cd000 (most recent call first):
  File "/usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/threading.py", line 324 in wait
  File "/usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/threading.py", line 607 in wait
  File "/usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
  File "/usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x000000011e51a600 (most recent call first):
  File "/Users/ec2-user/Documents/cleaning_eval/cleaning_tests_mac.py", line 58 in vst_plugin
  File "/Users/ec2-user/Documents/cleaning_eval/cleaning_tests_mac.py", line 72 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator (total: 13)
zsh: bus error  python -X faulthandler cleaning_tests_mac.py
(cleaning_eval) ec2-user@ip-172-31-16-239 cleaning_eval % /usr/local/Caskroom/miniconda/base/envs/cleaning_eval/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

cweaver-logitech commented 1 year ago

I'm not 100% sure that the code compiled with the correct debug settings, but this is the output of gdb: image

image

Then this continues for about 800 lines: image

alisonbma commented 1 year ago

I'm having an issue that is potentially related to this one. I'm running the audio through a chain of plug-ins: "FabFilter Pro-Q 3" > "iZotope RX 8 De-reverb" > "iZotope RX 8 De-breath" on an M1 Mac (macOS 13.2.1). I'm currently using Audio Units.

Screenshot 2023-03-26 at 12 08 19 PM

Nintorac commented 10 months ago

I am having a potentially related problem over here https://github.com/ray-project/ray/issues/42551

After seeing this, I am thinking pedalboard may be at fault and is perhaps leaking file descriptors. This gets to the edge of my knowledge, so I don't know.

Nintorac commented 10 months ago

I was able to fix the issue in my case by adding the following workaround at the end of my render loop (for the OP, that would be the end of vst_plugin):

    del vst3
    import gc
    gc.collect()

The number of open files did not change, though, so that is not the issue.
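The workaround above (drop the last reference to the plugin, then force a garbage-collection pass after each file) can be sketched as follows. `FakePlugin` and `render_one` are hypothetical stand-ins so that the sketch runs without any plugin installed; in the OP's code the object being released would be the `VST3Plugin` instance. Whether this avoids the crash for other plugins is not established in this thread beyond the single report.

```python
import gc
import weakref


class FakePlugin:
    """Hypothetical stand-in for a loaded VST3Plugin instance."""

    def process(self, samples):
        return list(samples)


def render_one(samples, released):
    plugin = FakePlugin()
    # Record the moment the plugin object is actually destroyed.
    weakref.finalize(plugin, released.append, True)
    out = plugin.process(samples)
    # Drop the only reference, then force a collection pass so that any
    # reference cycles holding native resources are cleaned up as well.
    del plugin
    gc.collect()
    return out


released = []
result = render_one([0.0, 0.5], released)
```

The `weakref.finalize` callback is only there to make the release observable; it is not needed for the workaround itself.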

Nintorac commented 9 months ago

Just thought I'd mention that there is a reproduction in the ray issue: https://github.com/ray-project/ray/issues/42551