Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"
https://audio-agi.github.io/Separate-Anything-You-Describe/
MIT License
1.63k stars, 118 forks

Add HF mixin #7

Closed: NielsRogge closed this 1 year ago

NielsRogge commented 1 year ago

This PR makes it possible to load AudioSep models directly from the 🤗 hub, rather than having to download them manually first. It does so simply by making the model inherit from the PyTorchModelHubMixin class. An additional benefit is that you'll see actual download numbers on the hub (as with HF models), you can add a model card to your model, etc. I've pushed a model here: https://huggingface.co/nielsr/audiosep-demo

Here's how to use it, using the familiar from_pretrained method:

from models.audiosep import AudioSep
from utils import get_ss_model

ss_model = get_ss_model('config/audiosep_base.yaml')
model = AudioSep.from_pretrained("nielsr/audiosep-demo", ss_model=ss_model)

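Inference can then be done in the same way as before:

import torch
from pipeline import inference  # assumed import path, following the repository README

# run on GPU when available, otherwise on CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

audio_file = 'exp31_water drops_mixture.wav'
text = 'water drops'
output_file = 'separated_audio.wav'

# AudioSep processes the audio at 32 kHz sampling rate
inference(model, audio_file, text, output_file, device)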

The model also directly inherits the save_pretrained and push_to_hub methods:

# save model locally
model.save_pretrained("my-awesome-audiosep-model")
# upload the model to the hub
model.push_to_hub("nielsr/my-audiosep-model")

This could be improved further by defining attributes such as input_channels and output_channels in the model's init (similar to models in HF Transformers), so that they can be saved in a config.json on the hub. One could then skip loading the ss_model and just do:

from models.audiosep import AudioSep

model = AudioSep.from_pretrained("nielsr/audiosep-demo")

However, the latter would require a breaking change, so if you'd rather avoid breaking changes you could just use the first solution (passing ss_model explicitly). Let me know whether you are interested.
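The config.json idea can be illustrated with a library-free sketch. Everything here is hypothetical: ToySeparator, its two attributes, and the file layout only mimic the pattern, and model weights are left out entirely. The point is that once the init arguments are plain JSON values, from_pretrained can rebuild the object without extra arguments.

```python
import json
import tempfile
from pathlib import Path


class ToySeparator:
    """Hypothetical model whose constructor takes only JSON-friendly values."""

    def __init__(self, input_channels: int = 1, output_channels: int = 1):
        self.input_channels = input_channels
        self.output_channels = output_channels

    def save_pretrained(self, save_directory: str) -> None:
        """Write the init arguments to <save_directory>/config.json."""
        path = Path(save_directory)
        path.mkdir(parents=True, exist_ok=True)
        config = {
            "input_channels": self.input_channels,
            "output_channels": self.output_channels,
        }
        (path / "config.json").write_text(json.dumps(config, indent=2))

    @classmethod
    def from_pretrained(cls, save_directory: str) -> "ToySeparator":
        """Rebuild the object from config.json, with no extra arguments."""
        config = json.loads((Path(save_directory) / "config.json").read_text())
        return cls(**config)


def roundtrip(input_channels: int, output_channels: int) -> ToySeparator:
    """Save a ToySeparator to a temporary directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        ToySeparator(input_channels, output_channels).save_pretrained(tmp_dir)
        return ToySeparator.from_pretrained(tmp_dir)
```

The trade-off is the one noted above: the constructor signature has to change from taking a prebuilt ss_model to taking serializable attributes, which is what makes this a breaking change.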

Kind regards,

Niels
ML @ HF

NielsRogge commented 1 year ago

Hi @liuxubo717, thanks for merging my PR! However, it might be better to host the AudioSep checkpoint under your organization rather than my personal username (nielsr). Are you interested in moving the AudioSep checkpoint from nielsr to the Audio-AGI organization? Then I can also update the README.