ZFTurbo / Music-Source-Separation-Training

Repository for training models for music source separation.
MIT License

Is there any way to parallelize across lots of audio clips? #65

Closed AshwinSankar17 closed 2 months ago

AshwinSankar17 commented 3 months ago

I have a lot of audio clips, on the order of thousands of hours, that I want to clean. Is there any way to parallelize this?

ZFTurbo commented 3 months ago

How many GPUs do you have?

AshwinSankar17 commented 3 months ago

I have 8 x 40GB GPUs (enough that VRAM is not a concern), but I'm a bit short on time. Would appreciate any help.

ZFTurbo commented 3 months ago

The easiest way is to split all the data into 8 folders and run inference.py on each folder in parallel, one process per GPU.

To choose which GPU a given run uses, prefix the command: CUDA_VISIBLE_DEVICES=1 python3 inference.py ...

Details: https://stackoverflow.com/questions/39649102/how-do-i-select-which-gpu-to-run-a-job-on

You can also select the GPU from inside inference.py. Put this at the very top of the file, before torch is imported, so the environment variables take effect:

    import os

    gpu_use = "0"  # GPU index for this process
    print('GPU use: {}'.format(gpu_use))
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu_use

AshwinSankar17 commented 3 months ago

Yes, I was already doing this, but that approach is not VRAM-efficient; I'm barely using a few GB per GPU.

ZFTurbo commented 3 months ago

Increase inference.batch_size in the config.
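For reference, the setting lives in the model's YAML config. The exact keys vary per model config, so the values below are illustrative, not defaults from the repo:

```yaml
# Illustrative fragment of a model config. Raise batch_size until the
# GPU's VRAM is nearly full, then back off slightly.
inference:
  batch_size: 8     # try larger values on a 40 GB GPU; 1 is a common starting point
  num_overlap: 4
```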