Open tomthecollins opened 1 year ago
I asked the same here. It seems to be a model focused on music separation, but I wasn't able to load it.
Oh cool, thanks. I did look through the issue titles but must have missed this one. Thanks for pointing it out.
Although it seems liuxubo717 closed it without solving or addressing it...
Best wishes and thanks for the work liuxubo717, Tom
On Tue, 24 Oct 2023 at 02:32, Fabio Grasso wrote:
I asked the same here (https://github.com/Audio-AGI/AudioSep/issues/6). It seems to be a model focused on music separation, but I wasn't able to load it.
I was able to fix this error by copying the missing keys from the first checkpoint to the second. But this time the model parameters do not match. I guess the model definition for music separation is not given.
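For anyone hitting the same mismatch, here is a rough sketch of how one could compare two checkpoints before copying keys between them. It is not code from the AudioSep repo: the function names `diff_state_dict_keys` and `shape_mismatches` are made up for illustration, and the stand-in objects below only mimic the `.shape` attribute of a tensor so the example runs without PyTorch. With real files you would pass the dictionaries returned by `torch.load(path, map_location="cpu")`, which may be nested under a key such as `state_dict` depending on how the checkpoint was saved (that nesting is an assumption to check, not something confirmed in this thread).

```python
from types import SimpleNamespace


def diff_state_dict_keys(expected, loaded):
    """Return (missing, unexpected) parameter names as sorted lists.

    missing: names in `expected` that `loaded` does not provide.
    unexpected: names in `loaded` that `expected` does not contain.
    """
    expected_keys = set(expected)
    loaded_keys = set(loaded)
    missing = sorted(expected_keys - loaded_keys)
    unexpected = sorted(loaded_keys - expected_keys)
    return missing, unexpected


def shape_mismatches(expected, loaded):
    """List (name, expected_shape, loaded_shape) for shared names whose shapes differ."""
    mismatched = []
    for name in sorted(set(expected) & set(loaded)):
        expected_shape = tuple(getattr(expected[name], "shape", ()))
        loaded_shape = tuple(getattr(loaded[name], "shape", ()))
        if expected_shape != loaded_shape:
            mismatched.append((name, expected_shape, loaded_shape))
    return mismatched


# Hypothetical stand-ins for two state dicts (names and shapes are invented).
first = {
    "encoder.weight": SimpleNamespace(shape=(8, 4)),
    "encoder.bias": SimpleNamespace(shape=(8,)),
}
second = {
    "encoder.weight": SimpleNamespace(shape=(16, 4)),
    "text_branch.extra": SimpleNamespace(shape=(1, 512)),
}

missing, unexpected = diff_state_dict_keys(first, second)
print("missing:", missing)
print("unexpected:", unexpected)
print("shape mismatches:", shape_mismatches(first, second))
```

If the shape list is non-empty for many layers, the two checkpoints most likely belong to different model definitions, and copying keys across will not make them loadable.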
music_speech_audioset_epoch_15_esc_89.98.pt is not used for music source separation. It is used to initialise the text encoder (https://github.com/Audio-AGI/AudioSep/blob/main/models/clap_encoder.py#L13) of the AudioSep model.
The error comes from a newer version of the transformers library.
Run this script on the music_speech_audioset_epoch_15_esc_89.98.pt checkpoint: https://github.com/LAION-AI/CLAP/issues/127#issuecomment-1769770667
From your paper, I wasn't sure of the role/purpose of music_speech_audioset_epoch_15_esc_89.98.pt.
Are these the saved model weights one should use if one wants to focus on separation of musical instruments from one another, say? Or is audiosep_base_4M_steps.ckpt still applicable in such use cases?
When I edited your example inference code from the readme to use music_speech_audioset_epoch_15_esc_89.98.pt on a Linux machine running Ubuntu, I got the following error.
Please clarify the purpose/use of this checkpoint, and if it is meant to be used, whether I need to modify the example inference code further.
Thanks!