asteroid-team / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
MIT License
969 stars, 88 forks

Allow list of folders as path in AddBackgroundNoise #122

Closed. FrenchKrab closed this issue 2 years ago

FrenchKrab commented 2 years ago

Currently, the path parameter of AddBackgroundNoise (background_paths) accepts either a single folder path or a list of individual file paths. From the docstring: "background_paths: Either a path to a folder with audio files or a list of paths to audio files."

The easiest way I'm aware of to use multiple folders is to set up symbolic links. I suggest that the path parameter simply accept a list of folders as well. Am I missing the intended way to do it, or is it worth suggesting a PR?

iver56 commented 2 years ago

Hi :) Thanks for your patience!

If you're using the current version of torch-audiomentations, the best way to do it is probably to manually create a set of audio file paths from your collection of folders and give that to AddBackgroundNoise.

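Something like this:

from torch_audiomentations import AddBackgroundNoise

# get_file_paths is a placeholder for your own helper that returns
# the audio file paths found in a single folder.
my_audio_file_paths = set()
for folder in my_folders:
    my_audio_file_paths.update(get_file_paths(folder))

my_transform = AddBackgroundNoise(list(my_audio_file_paths))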
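
The get_file_paths helper in the snippet above is not defined in this thread. A minimal sketch of how it could be written with only the standard library follows. The extension set, the recursive search, and the collect_audio_files name are assumptions made for illustration, not part of torch-audiomentations.

```python
from pathlib import Path

# Assumed set of extensions; adjust it to the formats you actually use.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg"}


def get_file_paths(folder):
    """Return the audio files found under `folder`, searching subfolders too."""
    return {
        str(path)
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    }


def collect_audio_files(folders):
    """Merge the audio files of several folders into one sorted list without duplicates."""
    file_paths = set()
    for folder in folders:
        file_paths.update(get_file_paths(folder))
    return sorted(file_paths)
```

The list returned by collect_audio_files(my_folders) can then be passed to AddBackgroundNoise as its list of file paths. Sorting is only there to make the result reproducible between runs.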

A folder with symlinked folders, as you say, is also an option, at least if you're running Linux.

A third option is to add support for passing it a list of folders. That would make the code in AddBackgroundNoise more complex. Is it worth the cost? Would you want to make a PR?
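
To give a rough idea of the extra logic the third option involves, here is a sketch of a path resolver that accepts a single path or a list of paths, where each entry may be a folder or a file. The function name resolve_audio_paths and the extension set are hypothetical; this is not the code from the library or from any PR.

```python
from pathlib import Path

# Assumed extensions for illustration; not the library's own list.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg"}


def resolve_audio_paths(paths):
    """Accept one path or a list of paths (files or folders) and return audio file paths."""
    # A single string or Path is treated as a one-element list.
    if isinstance(paths, (str, Path)):
        paths = [paths]

    resolved = []
    for entry in map(Path, paths):
        if entry.is_dir():
            # Folder: collect every audio file below it, in a stable order.
            resolved.extend(
                str(p)
                for p in sorted(entry.rglob("*"))
                if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
            )
        else:
            # Anything that is not a folder is kept as an individual file path.
            resolved.append(str(entry))
    return resolved
```

The added complexity is mostly the branching on input type; the existing single-folder and list-of-files cases fall out of the same loop.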

FrenchKrab commented 2 years ago

I just opened a PR to implement the 3rd option. I found that ApplyImpulseResponse shares the logic for retrieving its audio files, so I updated the calls there too.

Even without the PR, I think the 1st option is the way to go in my case, as it's probably the only way to expose a setting/parameter that controls which files are used. Thanks for the answer!

iver56 commented 2 years ago

Thanks, I will review it 👍