Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"
https://audio-agi.github.io/Separate-Anything-You-Describe/
MIT License

Error when using music_speech..._89.98.pt: pytorch-lightning_version #16

Open tomthecollins opened 1 year ago

tomthecollins commented 1 year ago

From your paper, I wasn't sure of the role/purpose of music_speech_audioset_epoch_15_esc_89.98.pt

Are these the saved model weights one should use if one wants to focus on separation of musical instruments from one another, say? Or is audiosep_base_4M_steps.ckpt still applicable in such use cases?

When I edited your example inference code from the readme to use music_speech_audioset_epoch_15_esc_89.98.pt on a Linux machine running Ubuntu, I got the following error.

Please clarify the purpose/use of this checkpoint, and if it is meant to be used, whether I need to modify the example inference code further.

Thanks!

Traceback (most recent call last):
  File "/home/blah/repos/AudioSep/sayd_infer_example.py", line 6, in <module>
    model = build_audiosep(
  File "/home/blah/repos/AudioSep/pipeline.py", line 17, in build_audiosep
    model = load_ss_model(
  File "/home/blah/repos/AudioSep/utils.py", line 387, in load_ss_model
    pl_model = AudioSep.load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/module.py", line 1532, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/saving.py", line 65, in _load_from_checkpoint
    checkpoint = _pl_migrate_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 113, in _pl_migrate_checkpoint
    old_version = _get_version(checkpoint)
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 136, in _get_version
    return checkpoint["pytorch-lightning_version"]
KeyError: 'pytorch-lightning_version'

fabiogra commented 1 year ago

I asked the same question in https://github.com/Audio-AGI/AudioSep/issues/6. It seems to be a model focused on music separation, but I wasn't able to load it.

tomthecollins commented 1 year ago

Oh cool, thanks. I did look through the issue titles but must have missed this one. Thanks for pointing it out.

Although it seems liuxubo717 closed it without solving or addressing it...

Best wishes and thanks for the work liuxubo717, Tom

On Tue, 24 Oct 2023 at 02:32, Fabio Grasso @.***> wrote:

I asked the same here https://github.com/Audio-AGI/AudioSep/issues/6, it seems a model focused on music separation but I wasn't able to load it.

MertCokelek commented 1 year ago

I was able to fix this error by copying the missing keys from the first checkpoint to the second. But this time the model parameters do not match. I guess the model definition for music separation is not given.
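For anyone wanting to reproduce that observation, here is a minimal sketch of the workaround, not code from the AudioSep repo. It works on plain dictionaries, which is what you would typically get back from `torch.load(path, map_location="cpu")`. The helper names, the version string and the parameter names in the toy data are all hypothetical.

```python
from typing import Any, Dict, Set, Tuple


def copy_missing_keys(reference: Dict[str, Any], target: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `target` with any top-level keys it lacks taken from `reference`."""
    patched = dict(target)
    for key, value in reference.items():
        # setdefault leaves keys that already exist in the target untouched.
        patched.setdefault(key, value)
    return patched


def compare_param_names(a: Dict[str, Any], b: Dict[str, Any]) -> Tuple[Set[str], Set[str]]:
    """Return (names only in a, names only in b) from each checkpoint's "state_dict"."""
    names_a = set(a.get("state_dict", {}))
    names_b = set(b.get("state_dict", {}))
    return names_a - names_b, names_b - names_a


# Toy stand-ins for the two checkpoints (hypothetical contents).
reference_ckpt = {
    "pytorch-lightning_version": "2.0.0",  # assumed value, for illustration only
    "state_dict": {"ss_model.conv.weight": 0},
}
target_ckpt = {"state_dict": {"text_branch.embeddings.weight": 0}}

patched = copy_missing_keys(reference_ckpt, target_ckpt)
only_in_reference, only_in_patched = compare_param_names(reference_ckpt, patched)
print(sorted(only_in_reference), sorted(only_in_patched))
```

Copying the metadata gets past the `KeyError`, but if the two sets printed at the end are non-empty, the weights belong to a different architecture and loading will still fail on the parameter mismatch.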

liuxubo717 commented 1 year ago

music_speech_audioset_epoch_15_esc_89.98.pt is not used for music source separation. Actually, it is used to initialise the text encoder (https://github.com/Audio-AGI/AudioSep/blob/main/models/clap_encoder.py#L13) of the AudioSep model.
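A quick way to tell what kind of file a checkpoint is before passing it to `load_from_checkpoint` is to look at its top-level keys and at the prefixes of its parameter names. The sketch below assumes the checkpoint has already been loaded into a dictionary (for example with `torch.load(path, map_location="cpu")`). The function name and the toy parameter names are hypothetical; only the `pytorch-lightning_version` key comes from the traceback above.

```python
from collections import Counter
from typing import Any, Dict, Mapping

# The key that Lightning's checkpoint migration looks up in the traceback above.
LIGHTNING_VERSION_KEY = "pytorch-lightning_version"


def describe_checkpoint(ckpt: Mapping[str, Any]) -> Dict[str, Any]:
    """Summarise a loaded checkpoint dict without touching any model code."""
    # Some files nest the weights under "state_dict"; others are the weights themselves.
    state_dict = ckpt.get("state_dict", ckpt)
    # Count parameters by the first component of their dotted name.
    prefixes = Counter(str(name).split(".")[0] for name in state_dict)
    return {
        "has_lightning_version": LIGHTNING_VERSION_KEY in ckpt,
        "top_level_prefixes": dict(prefixes),
    }


# Toy example with made-up parameter names.
toy_ckpt = {
    "epoch": 15,
    "state_dict": {
        "text_branch.embeddings.weight": 0,
        "text_branch.encoder.weight": 0,
        "audio_branch.conv.weight": 0,
    },
}
summary = describe_checkpoint(toy_ckpt)
print(summary)
```

If `has_lightning_version` is false, the file was probably not saved by Lightning, so `AudioSep.load_from_checkpoint` is the wrong entry point for it.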

nathanodle commented 1 year ago

It's caused by newer versions of transformers.

Run this script on the music_speech_audioset_epoch_15_esc_89.98.pt checkpoint: https://github.com/LAION-AI/CLAP/issues/127#issuecomment-1769770667