Audio-AGI / dcase2024_task9_baseline

Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"

Issues about loading params for pretrained CLAP #1

Closed apple-yinhan closed 6 months ago

apple-yinhan commented 6 months ago

When I run the baseline, I get the following error:

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for CLAP:
    Unexpected key(s) in state_dict: "text_branch.embeddings.position_ids".

This happens in: model.load_state_dict(ckpt)

I did download music_speech_audioset_epoch_15_esc_89.98.pt and put it in the right folder. As a workaround, I set model.load_state_dict(ckpt, strict=False).

Will this affect the results of the baseline?

liuxubo717 commented 6 months ago

Can you reproduce the results on the Validation (synth)?


Evaluation on DCASE T9 synthetic validation set. Results: SDR: 5.708, SDRi: 5.673, SISDR: 3.862

apple-yinhan commented 6 months ago

Yes, I can. It seems that this does not affect anything. Thanks.
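
For readers who hit the same error: an alternative to strict=False is to drop only the keys the model does not expect and keep the load strict, so that any other mismatch still raises. The sketch below is a minimal illustration, not the baseline's code. The helper name split_checkpoint_keys and the toy key names (apart from "text_branch.embeddings.position_ids", which comes from the error above) are assumptions, and plain lists stand in for tensors so the example runs without PyTorch. The unexpected key is most likely an index buffer that the installed library version no longer stores in the state dict, which would be consistent with the unchanged validation results reported above.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_checkpoint_keys(
    ckpt: Dict[str, Any], expected_keys: Iterable[str]
) -> Tuple[Dict[str, Any], List[str], List[str]]:
    """Hypothetical helper: return (filtered_ckpt, unexpected_keys, missing_keys).

    filtered_ckpt keeps only entries whose key the model expects;
    unexpected_keys are in the checkpoint but not in the model;
    missing_keys are in the model but not in the checkpoint.
    """
    expected = set(expected_keys)
    filtered = {k: v for k, v in ckpt.items() if k in expected}
    unexpected = sorted(k for k in ckpt if k not in expected)
    missing = sorted(expected.difference(ckpt))
    return filtered, unexpected, missing


# Toy stand-ins (assumed names): in practice these would be
# model.state_dict().keys() and the dict loaded from the .pt file.
model_keys = [
    "text_branch.embeddings.word_embeddings.weight",
    "audio_branch.proj.weight",
]
ckpt = {
    "text_branch.embeddings.word_embeddings.weight": [0.1, 0.2],
    "text_branch.embeddings.position_ids": [0, 1, 2],
    "audio_branch.proj.weight": [0.3],
}

filtered, unexpected, missing = split_checkpoint_keys(ckpt, model_keys)
print("unexpected:", unexpected)
print("missing:", missing)
```

With a real model, filtered could then be passed to model.load_state_dict(filtered) with the default strict=True. If strict=False is used instead, load_state_dict returns an object with missing_keys and unexpected_keys fields, and checking that missing_keys is empty confirms that no weights were silently skipped.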