nomadkaraoke / python-audio-separator

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)
MIT License

I didn't quite understand the instructions for usage #111

Closed. Bebra777228 closed this issue 1 month ago

Bebra777228 commented 1 month ago

While reading the usage instructions, I couldn't tell whether it's possible to do something similar to what is shown in the screenshot below.

image

Here is the code I wrote for testing (yes, it's not perfect, but it will do for the test 😁). However, it doesn't work properly. The file names in the output do not match the ones I specified, which causes an error because the script cannot find the required file.

image

Maybe I'm doing something wrong? I would be very grateful if you could give me a hint on how to work with your program correctly, at least approximately 😅.

beveradb commented 1 month ago

If you can share your full code (as text in a code block, not as screenshots 😅) and the error you're getting I can try and help you resolve it 😄

I'm not sure exactly what you're trying to do or what error you faced, but you might find this comment from someone else's issue useful; they wanted to rename the output files: https://github.com/nomadkaraoke/python-audio-separator/issues/89#issuecomment-2234119906

Also, once you figure out how to use audio-separator for your use case, please raise a PR improving the documentation to help future people like you!

Bebra777228 commented 1 month ago

Code:

import os
from audio_separator.separator import Separator

input = "/content/Сатисфакция - Один на один.mp3"
output = "/content/out"

separator = Separator(output_dir = output)

voc_inst = True
voc = os.path.join(output, f"Vocal.wav")
inst = os.path.join(output, f"Instrumental.wav")

back_voc = True
back = os.path.join(output, f"Back Vocal.wav")

de_reverb = True
voc_no_reverb = os.path.join(output, f"Vocal (no reverb).wav")
back_no_reverb = os.path.join(output, f"Back Vocal (no reverb).wav")

if voc_inst:
  separator.load_model(model_filename='model_bs_roformer_ep_317_sdr_12.9755.ckpt')
  voc, inst = separator.separate(input)

if back_voc:
  separator.load_model(model_filename='mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt')
  back = separator.separate(voc)

if de_reverb:
  separator.load_model(model_filename='UVR-DeEcho-DeReverb.pth')
  voc_no_reverb = separator.separate(voc)
  back_no_reverb = separator.separate(back)

Full error output:

---------------------------------------------------------------------------
LibsndfileError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py in load(path, sr, mono, offset, duration, dtype, res_type)
    175         try:
--> 176             y, sr_native = __soundfile_load(path, offset, duration, dtype)
    177 

11 frames
/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py in __soundfile_load(path, offset, duration, dtype)
    208         # Otherwise, create the soundfile object
--> 209         context = sf.SoundFile(path)
    210 

/usr/local/lib/python3.10/dist-packages/soundfile.py in __init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
    657                                          format, subtype, endian)
--> 658         self._file = self._open(file, mode_int, closefd)
    659         if set(mode).issuperset('r+') and self.seekable():

/usr/local/lib/python3.10/dist-packages/soundfile.py in _open(self, file, mode_int, closefd)
   1215             err = _snd.sf_error(file_ptr)
-> 1216             raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
   1217         if mode_int == _snd.SFM_WRITE:

LibsndfileError: Error opening 'Сатисфакция - Один на один_(Instrumental)_model_bs_roformer_ep_317_sdr_12.wav': System error.

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-2-4cba904f4bc2> in <cell line: 24>()
     24 if back_voc:
     25   separator.load_model(model_filename='mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt')
---> 26   back = separator.separate(voc)
     27 
     28 if de_reverb:

/usr/local/lib/python3.10/dist-packages/audio_separator/separator/separator.py in separate(self, audio_file_path)
    733 
    734         # Run separation method for the loaded model
--> 735         output_files = self.model_instance.separate(audio_file_path)
    736 
    737         # Clear GPU cache to free up memory

/usr/local/lib/python3.10/dist-packages/audio_separator/separator/architectures/mdxc_separator.py in separate(self, audio_file_path)
    130 
    131         self.logger.debug(f"Preparing mix for input audio file {self.audio_file_path}...")
--> 132         mix = self.prepare_mix(self.audio_file_path)
    133 
    134         self.logger.debug("Normalizing mix before demixing...")

/usr/local/lib/python3.10/dist-packages/audio_separator/separator/common_separator.py in prepare_mix(self, mix)
    203         if not isinstance(mix, np.ndarray):
    204             self.logger.debug(f"Loading audio from file: {mix}")
--> 205             mix, sr = librosa.load(mix, mono=False, sr=self.sample_rate)
    206             self.logger.debug(f"Audio loaded. Sample rate: {sr}, Audio shape: {mix.shape}")
    207         else:

/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py in load(path, sr, mono, offset, duration, dtype, res_type)
    182                     "PySoundFile failed. Trying audioread instead.", stacklevel=2
    183                 )
--> 184                 y, sr_native = __audioread_load(path, offset, duration, dtype)
    185             else:
    186                 raise exc

<decorator-gen-161> in __audioread_load(path, offset, duration, dtype)

/usr/local/lib/python3.10/dist-packages/librosa/util/decorators.py in __wrapper(func, *args, **kwargs)
     57             stacklevel=3,  # Would be 2, but the decorator adds a level
     58         )
---> 59         return func(*args, **kwargs)
     60 
     61     return decorator(__wrapper)

/usr/local/lib/python3.10/dist-packages/librosa/core/audio.py in __audioread_load(path, offset, duration, dtype)
    238     else:
    239         # If the input was not an audioread object, try to open it
--> 240         reader = audioread.audio_open(path)
    241 
    242     with reader as input_file:

/usr/local/lib/python3.10/dist-packages/audioread/__init__.py in audio_open(path, backends)
    125     for BackendClass in backends:
    126         try:
--> 127             return BackendClass(path)
    128         except DecodeError:
    129             pass

/usr/local/lib/python3.10/dist-packages/audioread/rawread.py in __init__(self, filename)
     57     """
     58     def __init__(self, filename):
---> 59         self._fh = open(filename, 'rb')
     60 
     61         try:

FileNotFoundError: [Errno 2] No such file or directory: 'Сатисфакция - Один на один_(Instrumental)_model_bs_roformer_ep_317_sdr_12.wav'
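
A note on the traceback above: the path that fails to open is a bare file name (`..._(Instrumental)_model_bs_roformer_ep_317_sdr_12.wav`) with no `/content/out` prefix. That suggests `separate()` returned file names rather than full paths, and that assigning its result to `voc` and `inst` replaced the `Vocal.wav` / `Instrumental.wav` paths built earlier in the script. Below is a minimal sketch of one way to handle this, assuming the returned names are relative to `output_dir`. The helper `resolve_outputs` is hypothetical and not part of audio-separator, and the `(Vocals)` file name is assumed by analogy with the one in the error.

```python
import os

def resolve_outputs(output_dir, returned_names):
    """Join each name returned by separate() onto output_dir.

    Assumption: the names are relative to output_dir, as the traceback
    suggests. Names that are already absolute are left unchanged.
    """
    return [
        name if os.path.isabs(name) else os.path.join(output_dir, name)
        for name in returned_names
    ]

# Stand-in for the list returned by separator.separate(input).
# The "(Vocals)" name is an assumption made by analogy with the error message.
returned = [
    "song_(Instrumental)_model_bs_roformer_ep_317_sdr_12.wav",
    "song_(Vocals)_model_bs_roformer_ep_317_sdr_12.wav",
]
paths = resolve_outputs("/content/out", returned)
```

Each entry of `paths` can then be passed to the next `separate()` call or renamed, instead of the bare file name.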

Bebra777228 commented 1 month ago

Thanks! Here's what I ended up with:

import os
from audio_separator.separator import Separator

input = "/content/Сатисфакция - Один на один.mp3"
output = "/content/out"

separator = Separator(output_dir=output, vr_params={"batch_size": 1})

# Vocals and Instrumental
vocals = os.path.join(output, 'Vocals.wav')
instrumental = os.path.join(output, 'Instrumental.wav')

# Vocals with Reverb and Vocals without Reverb
vocals_reverb = os.path.join(output, 'Vocals (Reverb).wav')
vocals_no_reverb = os.path.join(output, 'Vocals (No Reverb).wav')

# Lead Vocals and Backing Vocals
lead_vocals = os.path.join(output, 'Lead Vocals.wav')
backing_vocals = os.path.join(output, 'Backing Vocals.wav')

# Splitting a track into Vocal and Instrumental
separator.load_model(model_filename='model_bs_roformer_ep_317_sdr_12.9755.ckpt')
voc_inst = separator.separate(input)
os.rename(os.path.join(output, voc_inst[0]), instrumental) # Rename file to “Instrumental.wav”
os.rename(os.path.join(output, voc_inst[1]), vocals) # Rename file to “Vocals.wav”

# Applying DeEcho-DeReverb to Vocals
separator.load_model(model_filename='UVR-DeEcho-DeReverb.pth')
voc_no_reverb = separator.separate(vocals)
os.rename(os.path.join(output, voc_no_reverb[0]), vocals_no_reverb) # Rename file to “Vocals (No Reverb).wav”
os.rename(os.path.join(output, voc_no_reverb[1]), vocals_reverb) # Rename file to “Vocals (Reverb).wav”

# Separating Back Vocals from Main Vocals
separator.load_model(model_filename='mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt')
backing_voc = separator.separate(vocals_no_reverb)
os.rename(os.path.join(output, backing_voc[0]), backing_vocals) # Rename file to “Backing Vocals.wav”
os.rename(os.path.join(output, backing_voc[1]), lead_vocals) # Rename file to “Lead Vocals.wav”

Now the code functions as I need it to and saves files with the correct names.
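
The script above relies on the order of the returned list (index 0 being the instrumental and index 1 the vocals), which may not hold for every model. A hedged alternative is to match on the stem label that appears in the output file name, as seen in the traceback earlier (`..._(Instrumental)_...`). The helper `rename_by_label` and the label strings passed to it are assumptions for illustration; check the actual output names each model produces before relying on them.

```python
import os

def rename_by_label(output_dir, produced_names, targets):
    """Rename separated stems according to the "(Label)" in their file names.

    produced_names: file names returned by separator.separate()
    targets: mapping of stem label -> desired file name, e.g.
             {"Vocals": "Vocals.wav", "Instrumental": "Instrumental.wav"}
    Returns a dict of label -> new full path. Raises KeyError if a label
    in targets matches none of the produced files.
    """
    renamed = {}
    for label, new_name in targets.items():
        marker = f"_({label})_"
        matches = [n for n in produced_names if marker in os.path.basename(n)]
        if not matches:
            raise KeyError(f"no output file contains {marker!r}")
        src = os.path.join(output_dir, matches[0])
        dst = os.path.join(output_dir, new_name)
        os.replace(src, dst)  # overwrites dst if it already exists
        renamed[label] = dst
    return renamed
```

With the variables from the script above, the first stage could then be written as `stems = rename_by_label(output, separator.separate(input), {"Vocals": "Vocals.wav", "Instrumental": "Instrumental.wav"})`, and `stems["Vocals"]` passed to the next `separate()` call.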


As for improving the documentation, I think it would be better if you or someone else took care of it. I'm not very skilled at writing it.

beveradb commented 1 month ago

Nice one, glad you got it working for you! 😄

To be honest, I reckon a link in the README to your comment above would probably be helpful enough. I'll add that just now, but if anyone else sees this and has a better idea, feel free to raise a PR improving the docs!

beveradb commented 1 month ago

Added here - thanks for the example @Bebra777228 !

https://github.com/nomadkaraoke/python-audio-separator/tree/main?tab=readme-ov-file#using-different-models-to-extract-different-stems