ZFTurbo / Music-Source-Separation-Training

Repository for training models for music source separation.
MIT License

Wrong htdemucs_6s weights in readme #77

Closed paulchaum closed 2 months ago

paulchaum commented 2 months ago

I think the htdemucs_6s weights should be https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th instead of https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th.

I found the 5c90dfd2-34c22ccb.th file when I ran demucs from the original repo.
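A quick way to tell whether a checkpoint file matches a model architecture, before calling `load_state_dict`, is to diff the two key sets: the result mirrors the "Missing key(s)" / "Unexpected key(s)" lists in the error below. The helper here is a hypothetical sketch (`diff_state_dict_keys` is not part of this repository), and the commented `torch.load` lines assume the demucs checkpoint stores its weights under a `"state"` entry, which may vary between releases.

```python
def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) parameter-name sets, mirroring the
    report PyTorch's load_state_dict produces on a mismatch."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    missing = model_keys - checkpoint_keys      # model expects, checkpoint lacks
    unexpected = checkpoint_keys - model_keys   # checkpoint has, model does not
    return missing, unexpected

# With a real file the key sets could be obtained roughly like this
# (requires torch; file name and "state" layout are assumptions):
#   ckpt = torch.load("5c90dfd2-34c22ccb.th", map_location="cpu")
#   missing, unexpected = diff_state_dict_keys(
#       ckpt["state"].keys(), model.state_dict().keys())
```

If both sets come back empty, the checkpoint and model agree; non-empty sets like the ones in the traceback below usually mean the weights belong to a different variant (e.g. 4-stem vs 6-stem HTDemucs).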

When I use the 955717e8-8726e21a.th file from the readme, I get this error:

RuntimeError: Error(s) in loading state_dict for HTDemucs:
    Missing key(s) in state_dict: "decoder.0.dconv.layers.0.0.weight", "decoder.0.dconv.layers.0.0.bias", "decoder.0.dconv.layers.0.1.weight", "decoder.0.dconv.layers.0.1.bias", "decoder.0.dconv.layers.0.3.weight", "decoder.0.dconv.layers.0.3.bias", "decoder.0.dconv.layers.0.4.weight", "decoder.0.dconv.layers.0.4.bias", "decoder.0.dconv.layers.0.6.scale", "decoder.0.dconv.layers.1.0.weight", "decoder.0.dconv.layers.1.0.bias", "decoder.0.dconv.layers.1.1.weight", "decoder.0.dconv.layers.1.1.bias", "decoder.0.dconv.layers.1.3.weight", "decoder.0.dconv.layers.1.3.bias", "decoder.0.dconv.layers.1.4.weight", "decoder.0.dconv.layers.1.4.bias", "decoder.0.dconv.layers.1.6.scale", "decoder.1.dconv.layers.0.0.weight", "decoder.1.dconv.layers.0.0.bias", "decoder.1.dconv.layers.0.1.weight", "decoder.1.dconv.layers.0.1.bias", "decoder.1.dconv.layers.0.3.weight", "decoder.1.dconv.layers.0.3.bias", "decoder.1.dconv.layers.0.4.weight", "decoder.1.dconv.layers.0.4.bias", "decoder.1.dconv.layers.0.6.scale", "decoder.1.dconv.layers.1.0.weight", "decoder.1.dconv.layers.1.0.bias", "decoder.1.dconv.layers.1.1.weight", "decoder.1.dconv.layers.1.1.bias", "decoder.1.dconv.layers.1.3.weight", "decoder.1.dconv.layers.1.3.bias", "decoder.1.dconv.layers.1.4.weight", "decoder.1.dconv.layers.1.4.bias", "decoder.1.dconv.layers.1.6.scale", "decoder.2.dconv.layers.0.0.weight", "decoder.2.dconv.layers.0.0.bias", "decoder.2.dconv.layers.0.1.weight", "decoder.2.dconv.layers.0.1.bias", "decoder.2.dconv.layers.0.3.weight", "decoder.2.dconv.layers.0.3.bias", "decoder.2.dconv.layers.0.4.weight", "decoder.2.dconv.layers.0.4.bias", "decoder.2.dconv.layers.0.6.scale", "decoder.2.dconv.layers.1.0.weight", "decoder.2.dconv.layers.1.0.bias", "decoder.2.dconv.layers.1.1.weight", "decoder.2.dconv.layers.1.1.bias", "decoder.2.dconv.layers.1.3.weight", "decoder.2.dconv.layers.1.3.bias", "decoder.2.dconv.layers.1.4.weight", "decoder.2.dconv.layers.1.4.bias", "decoder.2.dconv.layers.1.6.scale", 
"decoder.3.dconv.layers.0.0.weight", "decoder.3.dconv.layers.0.0.bias", "decoder.3.dconv.layers.0.1.weight", "decoder.3.dconv.layers.0.1.bias", "decoder.3.dconv.layers.0.3.weight", "decoder.3.dconv.layers.0.3.bias", "decoder.3.dconv.layers.0.4.weight", "decoder.3.dconv.layers.0.4.bias", "decoder.3.dconv.layers.0.6.scale", "decoder.3.dconv.layers.1.0.weight", "decoder.3.dconv.layers.1.0.bias", "decoder.3.dconv.layers.1.1.weight", "decoder.3.dconv.layers.1.1.bias", "decoder.3.dconv.layers.1.3.weight", "decoder.3.dconv.layers.1.3.bias", "decoder.3.dconv.layers.1.4.weight", "decoder.3.dconv.layers.1.4.bias", "decoder.3.dconv.layers.1.6.scale", "tdecoder.0.rewrite.weight", "tdecoder.0.rewrite.bias", "tdecoder.0.dconv.layers.0.0.weight", "tdecoder.0.dconv.layers.0.0.bias", "tdecoder.0.dconv.layers.0.1.weight", "tdecoder.0.dconv.layers.0.1.bias", "tdecoder.0.dconv.layers.0.3.weight", "tdecoder.0.dconv.layers.0.3.bias", "tdecoder.0.dconv.layers.0.4.weight", "tdecoder.0.dconv.layers.0.4.bias", "tdecoder.0.dconv.layers.0.6.scale", "tdecoder.0.dconv.layers.1.0.weight", "tdecoder.0.dconv.layers.1.0.bias", "tdecoder.0.dconv.layers.1.1.weight", "tdecoder.0.dconv.layers.1.1.bias", "tdecoder.0.dconv.layers.1.3.weight", "tdecoder.0.dconv.layers.1.3.bias", "tdecoder.0.dconv.layers.1.4.weight", "tdecoder.0.dconv.layers.1.4.bias", "tdecoder.0.dconv.layers.1.6.scale", "tdecoder.1.dconv.layers.0.0.weight", "tdecoder.1.dconv.layers.0.0.bias", "tdecoder.1.dconv.layers.0.1.weight", "tdecoder.1.dconv.layers.0.1.bias", "tdecoder.1.dconv.layers.0.3.weight", "tdecoder.1.dconv.layers.0.3.bias", "tdecoder.1.dconv.layers.0.4.weight", "tdecoder.1.dconv.layers.0.4.bias", "tdecoder.1.dconv.layers.0.6.scale", "tdecoder.1.dconv.layers.1.0.weight", "tdecoder.1.dconv.layers.1.0.bias", "tdecoder.1.dconv.layers.1.1.weight", "tdecoder.1.dconv.layers.1.1.bias", "tdecoder.1.dconv.layers.1.3.weight", "tdecoder.1.dconv.layers.1.3.bias", "tdecoder.1.dconv.layers.1.4.weight", "tdecoder.1.dconv.layers.1.4.bias", 
"tdecoder.1.dconv.layers.1.6.scale", "tdecoder.2.dconv.layers.0.0.weight", "tdecoder.2.dconv.layers.0.0.bias", "tdecoder.2.dconv.layers.0.1.weight", "tdecoder.2.dconv.layers.0.1.bias", "tdecoder.2.dconv.layers.0.3.weight", "tdecoder.2.dconv.layers.0.3.bias", "tdecoder.2.dconv.layers.0.4.weight", "tdecoder.2.dconv.layers.0.4.bias", "tdecoder.2.dconv.layers.0.6.scale", "tdecoder.2.dconv.layers.1.0.weight", "tdecoder.2.dconv.layers.1.0.bias", "tdecoder.2.dconv.layers.1.1.weight", "tdecoder.2.dconv.layers.1.1.bias", "tdecoder.2.dconv.layers.1.3.weight", "tdecoder.2.dconv.layers.1.3.bias", "tdecoder.2.dconv.layers.1.4.weight", "tdecoder.2.dconv.layers.1.4.bias", "tdecoder.2.dconv.layers.1.6.scale", "tdecoder.3.dconv.layers.0.0.weight", "tdecoder.3.dconv.layers.0.0.bias", "tdecoder.3.dconv.layers.0.1.weight", "tdecoder.3.dconv.layers.0.1.bias", "tdecoder.3.dconv.layers.0.3.weight", "tdecoder.3.dconv.layers.0.3.bias", "tdecoder.3.dconv.layers.0.4.weight", "tdecoder.3.dconv.layers.0.4.bias", "tdecoder.3.dconv.layers.0.6.scale", "tdecoder.3.dconv.layers.1.0.weight", "tdecoder.3.dconv.layers.1.0.bias", "tdecoder.3.dconv.layers.1.1.weight", "tdecoder.3.dconv.layers.1.1.bias", "tdecoder.3.dconv.layers.1.3.weight", "tdecoder.3.dconv.layers.1.3.bias", "tdecoder.3.dconv.layers.1.4.weight", "tdecoder.3.dconv.layers.1.4.bias", "tdecoder.3.dconv.layers.1.6.scale", "crosstransformer.norm_in.weight", "crosstransformer.norm_in.bias", "crosstransformer.norm_in_t.weight", "crosstransformer.norm_in_t.bias", "crosstransformer.layers.0.self_attn.in_proj_weight", "crosstransformer.layers.0.self_attn.in_proj_bias", "crosstransformer.layers.0.self_attn.out_proj.weight", "crosstransformer.layers.0.self_attn.out_proj.bias", "crosstransformer.layers.0.linear1.weight", "crosstransformer.layers.0.linear1.bias", "crosstransformer.layers.0.linear2.weight", "crosstransformer.layers.0.linear2.bias", "crosstransformer.layers.0.norm1.weight", "crosstransformer.layers.0.norm1.bias", 
"crosstransformer.layers.0.norm2.weight", "crosstransformer.layers.0.norm2.bias", "crosstransformer.layers.0.norm_out.weight", "crosstransformer.layers.0.norm_out.bias", "crosstransformer.layers.0.gamma_1.scale", "crosstransformer.layers.0.gamma_2.scale", "crosstransformer.layers.1.cross_attn.in_proj_weight", "crosstransformer.layers.1.cross_attn.in_proj_bias", "crosstransformer.layers.1.cross_attn.out_proj.weight", "crosstransformer.layers.1.cross_attn.out_proj.bias", "crosstransformer.layers.1.linear1.weight", "crosstransformer.layers.1.linear1.bias", "crosstransformer.layers.1.linear2.weight", "crosstransformer.layers.1.linear2.bias", "crosstransformer.layers.1.norm1.weight", "crosstransformer.layers.1.norm1.bias", "crosstransformer.layers.1.norm2.weight", "crosstransformer.layers.1.norm2.bias", "crosstransformer.layers.1.norm3.weight", "crosstransformer.layers.1.norm3.bias", "crosstransformer.layers.1.norm_out.weight", "crosstransformer.layers.1.norm_out.bias", "crosstransformer.layers.1.gamma_1.scale", "crosstransformer.layers.1.gamma_2.scale", "crosstransformer.layers.2.self_attn.in_proj_weight", "crosstransformer.layers.2.self_attn.in_proj_bias", "crosstransformer.layers.2.self_attn.out_proj.weight", "crosstransformer.layers.2.self_attn.out_proj.bias", "crosstransformer.layers.2.linear1.weight", "crosstransformer.layers.2.linear1.bias", "crosstransformer.layers.2.linear2.weight", "crosstransformer.layers.2.linear2.bias", "crosstransformer.layers.2.norm1.weight", "crosstransformer.layers.2.norm1.bias", "crosstransformer.layers.2.norm2.weight", "crosstransformer.layers.2.norm2.bias", "crosstransformer.layers.2.norm_out.weight", "crosstransformer.layers.2.norm_out.bias", "crosstransformer.layers.2.gamma_1.scale", "crosstransformer.layers.2.gamma_2.scale", "crosstransformer.layers.3.cross_attn.in_proj_weight", "crosstransformer.layers.3.cross_attn.in_proj_bias", "crosstransformer.layers.3.cross_attn.out_proj.weight", 
"crosstransformer.layers.3.cross_attn.out_proj.bias", "crosstransformer.layers.3.linear1.weight", "crosstransformer.layers.3.linear1.bias", "crosstransformer.layers.3.linear2.weight", "crosstransformer.layers.3.linear2.bias", "crosstransformer.layers.3.norm1.weight", "crosstransformer.layers.3.norm1.bias", "crosstransformer.layers.3.norm2.weight", "crosstransformer.layers.3.norm2.bias", "crosstransformer.layers.3.norm3.weight", "crosstransformer.layers.3.norm3.bias", "crosstransformer.layers.3.norm_out.weight", "crosstransformer.layers.3.norm_out.bias", "crosstransformer.layers.3.gamma_1.scale", "crosstransformer.layers.3.gamma_2.scale", "crosstransformer.layers.4.self_attn.in_proj_weight", "crosstransformer.layers.4.self_attn.in_proj_bias", "crosstransformer.layers.4.self_attn.out_proj.weight", "crosstransformer.layers.4.self_attn.out_proj.bias", "crosstransformer.layers.4.linear1.weight", "crosstransformer.layers.4.linear1.bias", "crosstransformer.layers.4.linear2.weight", "crosstransformer.layers.4.linear2.bias", "crosstransformer.layers.4.norm1.weight", "crosstransformer.layers.4.norm1.bias", "crosstransformer.layers.4.norm2.weight", "crosstransformer.layers.4.norm2.bias", "crosstransformer.layers.4.norm_out.weight", "crosstransformer.layers.4.norm_out.bias", "crosstransformer.layers.4.gamma_1.scale", "crosstransformer.layers.4.gamma_2.scale", "crosstransformer.layers_t.0.self_attn.in_proj_weight", "crosstransformer.layers_t.0.self_attn.in_proj_bias", "crosstransformer.layers_t.0.self_attn.out_proj.weight", "crosstransformer.layers_t.0.self_attn.out_proj.bias", "crosstransformer.layers_t.0.linear1.weight", "crosstransformer.layers_t.0.linear1.bias", "crosstransformer.layers_t.0.linear2.weight", "crosstransformer.layers_t.0.linear2.bias", "crosstransformer.layers_t.0.norm1.weight", "crosstransformer.layers_t.0.norm1.bias", "crosstransformer.layers_t.0.norm2.weight", "crosstransformer.layers_t.0.norm2.bias", "crosstransformer.layers_t.0.norm_out.weight", 
"crosstransformer.layers_t.0.norm_out.bias", "crosstransformer.layers_t.0.gamma_1.scale", "crosstransformer.layers_t.0.gamma_2.scale", "crosstransformer.layers_t.1.cross_attn.in_proj_weight", "crosstransformer.layers_t.1.cross_attn.in_proj_bias", "crosstransformer.layers_t.1.cross_attn.out_proj.weight", "crosstransformer.layers_t.1.cross_attn.out_proj.bias", "crosstransformer.layers_t.1.linear1.weight", "crosstransformer.layers_t.1.linear1.bias", "crosstransformer.layers_t.1.linear2.weight", "crosstransformer.layers_t.1.linear2.bias", "crosstransformer.layers_t.1.norm1.weight", "crosstransformer.layers_t.1.norm1.bias", "crosstransformer.layers_t.1.norm2.weight", "crosstransformer.layers_t.1.norm2.bias", "crosstransformer.layers_t.1.norm3.weight", "crosstransformer.layers_t.1.norm3.bias", "crosstransformer.layers_t.1.norm_out.weight", "crosstransformer.layers_t.1.norm_out.bias", "crosstransformer.layers_t.1.gamma_1.scale", "crosstransformer.layers_t.1.gamma_2.scale", "crosstransformer.layers_t.2.self_attn.in_proj_weight", "crosstransformer.layers_t.2.self_attn.in_proj_bias", "crosstransformer.layers_t.2.self_attn.out_proj.weight", "crosstransformer.layers_t.2.self_attn.out_proj.bias", "crosstransformer.layers_t.2.linear1.weight", "crosstransformer.layers_t.2.linear1.bias", "crosstransformer.layers_t.2.linear2.weight", "crosstransformer.layers_t.2.linear2.bias", "crosstransformer.layers_t.2.norm1.weight", "crosstransformer.layers_t.2.norm1.bias", "crosstransformer.layers_t.2.norm2.weight", "crosstransformer.layers_t.2.norm2.bias", "crosstransformer.layers_t.2.norm_out.weight", "crosstransformer.layers_t.2.norm_out.bias", "crosstransformer.layers_t.2.gamma_1.scale", "crosstransformer.layers_t.2.gamma_2.scale", "crosstransformer.layers_t.3.cross_attn.in_proj_weight", "crosstransformer.layers_t.3.cross_attn.in_proj_bias", "crosstransformer.layers_t.3.cross_attn.out_proj.weight", "crosstransformer.layers_t.3.cross_attn.out_proj.bias", 
"crosstransformer.layers_t.3.linear1.weight", "crosstransformer.layers_t.3.linear1.bias", "crosstransformer.layers_t.3.linear2.weight", "crosstransformer.layers_t.3.linear2.bias", "crosstransformer.layers_t.3.norm1.weight", "crosstransformer.layers_t.3.norm1.bias", "crosstransformer.layers_t.3.norm2.weight", "crosstransformer.layers_t.3.norm2.bias", "crosstransformer.layers_t.3.norm3.weight", "crosstransformer.layers_t.3.norm3.bias", "crosstransformer.layers_t.3.norm_out.weight", "crosstransformer.layers_t.3.norm_out.bias", "crosstransformer.layers_t.3.gamma_1.scale", "crosstransformer.layers_t.3.gamma_2.scale", "crosstransformer.layers_t.4.self_attn.in_proj_weight", "crosstransformer.layers_t.4.self_attn.in_proj_bias", "crosstransformer.layers_t.4.self_attn.out_proj.weight", "crosstransformer.layers_t.4.self_attn.out_proj.bias", "crosstransformer.layers_t.4.linear1.weight", "crosstransformer.layers_t.4.linear1.bias", "crosstransformer.layers_t.4.linear2.weight", "crosstransformer.layers_t.4.linear2.bias", "crosstransformer.layers_t.4.norm1.weight", "crosstransformer.layers_t.4.norm1.bias", "crosstransformer.layers_t.4.norm2.weight", "crosstransformer.layers_t.4.norm2.bias", "crosstransformer.layers_t.4.norm_out.weight", "crosstransformer.layers_t.4.norm_out.bias", "crosstransformer.layers_t.4.gamma_1.scale", "crosstransformer.layers_t.4.gamma_2.scale". 
    Unexpected key(s) in state_dict: "encoder.4.conv.weight", "encoder.4.conv.bias", "encoder.4.norm1.weight", "encoder.4.norm1.bias", "encoder.4.rewrite.weight", "encoder.4.rewrite.bias", "encoder.4.norm2.weight", "encoder.4.norm2.bias", "encoder.4.dconv.layers.0.0.weight", "encoder.4.dconv.layers.0.0.bias", "encoder.4.dconv.layers.0.1.weight", "encoder.4.dconv.layers.0.1.bias", "encoder.4.dconv.layers.0.3.lstm.weight_ih_l0", "encoder.4.dconv.layers.0.3.lstm.weight_hh_l0", "encoder.4.dconv.layers.0.3.lstm.bias_ih_l0", "encoder.4.dconv.layers.0.3.lstm.bias_hh_l0", "encoder.4.dconv.layers.0.3.lstm.weight_ih_l0_reverse", "encoder.4.dconv.layers.0.3.lstm.weight_hh_l0_reverse", "encoder.4.dconv.layers.0.3.lstm.bias_ih_l0_reverse", "encoder.4.dconv.layers.0.3.lstm.bias_hh_l0_reverse", "encoder.4.dconv.layers.0.3.lstm.weight_ih_l1", "encoder.4.dconv.layers.0.3.lstm.weight_hh_l1", "encoder.4.dconv.layers.0.3.lstm.bias_ih_l1", "encoder.4.dconv.layers.0.3.lstm.bias_hh_l1", "encoder.4.dconv.layers.0.3.lstm.weight_ih_l1_reverse", "encoder.4.dconv.layers.0.3.lstm.weight_hh_l1_reverse", "encoder.4.dconv.layers.0.3.lstm.bias_ih_l1_reverse", "encoder.4.dconv.layers.0.3.lstm.bias_hh_l1_reverse", "encoder.4.dconv.layers.0.3.linear.weight", "encoder.4.dconv.layers.0.3.linear.bias", "encoder.4.dconv.layers.0.4.content.weight", "encoder.4.dconv.layers.0.4.content.bias", "encoder.4.dconv.layers.0.4.query.weight", "encoder.4.dconv.layers.0.4.query.bias", "encoder.4.dconv.layers.0.4.key.weight", "encoder.4.dconv.layers.0.4.key.bias", "encoder.4.dconv.layers.0.4.query_decay.weight", "encoder.4.dconv.layers.0.4.query_decay.bias", "encoder.4.dconv.layers.0.4.proj.weight", "encoder.4.dconv.layers.0.4.proj.bias", "encoder.4.dconv.layers.0.5.weight", "encoder.4.dconv.layers.0.5.bias", "encoder.4.dconv.layers.0.6.weight", "encoder.4.dconv.layers.0.6.bias", "encoder.4.dconv.layers.0.8.scale", "encoder.4.dconv.layers.1.0.weight", "encoder.4.dconv.layers.1.0.bias", 
"encoder.4.dconv.layers.1.1.weight", "encoder.4.dconv.layers.1.1.bias", "encoder.4.dconv.layers.1.3.lstm.weight_ih_l0", "encoder.4.dconv.layers.1.3.lstm.weight_hh_l0", "encoder.4.dconv.layers.1.3.lstm.bias_ih_l0", "encoder.4.dconv.layers.1.3.lstm.bias_hh_l0", "encoder.4.dconv.layers.1.3.lstm.weight_ih_l0_reverse", "encoder.4.dconv.layers.1.3.lstm.weight_hh_l0_reverse", "encoder.4.dconv.layers.1.3.lstm.bias_ih_l0_reverse", "encoder.4.dconv.layers.1.3.lstm.bias_hh_l0_reverse", "encoder.4.dconv.layers.1.3.lstm.weight_ih_l1", "encoder.4.dconv.layers.1.3.lstm.weight_hh_l1", "encoder.4.dconv.layers.1.3.lstm.bias_ih_l1", "encoder.4.dconv.layers.1.3.lstm.bias_hh_l1", "encoder.4.dconv.layers.1.3.lstm.weight_ih_l1_reverse", "encoder.4.dconv.layers.1.3.lstm.weight_hh_l1_reverse", "encoder.4.dconv.layers.1.3.lstm.bias_ih_l1_reverse", "encoder.4.dconv.layers.1.3.lstm.bias_hh_l1_reverse", "encoder.4.dconv.layers.1.3.linear.weight", "encoder.4.dconv.layers.1.3.linear.bias", "encoder.4.dconv.layers.1.4.content.weight", "encoder.4.dconv.layers.1.4.content.bias", "encoder.4.dconv.layers.1.4.query.weight", "encoder.4.dconv.layers.1.4.query.bias", "encoder.4.dconv.layers.1.4.key.weight", "encoder.4.dconv.layers.1.4.key.bias", "encoder.4.dconv.layers.1.4.query_decay.weight", "encoder.4.dconv.layers.1.4.query_decay.bias", "encoder.4.dconv.layers.1.4.proj.weight", "encoder.4.dconv.layers.1.4.proj.bias", "encoder.4.dconv.layers.1.5.weight", "encoder.4.dconv.layers.1.5.bias", "encoder.4.dconv.layers.1.6.weight", "encoder.4.dconv.layers.1.6.bias", "encoder.4.dconv.layers.1.8.scale", "encoder.5.conv.weight", "encoder.5.conv.bias", "encoder.5.norm1.weight", "encoder.5.norm1.bias", "encoder.5.rewrite.weight", "encoder.5.rewrite.bias", "encoder.5.norm2.weight", "encoder.5.norm2.bias", "encoder.5.dconv.layers.0.0.weight", "encoder.5.dconv.layers.0.0.bias", "encoder.5.dconv.layers.0.1.weight", "encoder.5.dconv.layers.0.1.bias", "encoder.5.dconv.layers.0.3.lstm.weight_ih_l0", 
"encoder.5.dconv.layers.0.3.lstm.weight_hh_l0", "encoder.5.dconv.layers.0.3.lstm.bias_ih_l0", "encoder.5.dconv.layers.0.3.lstm.bias_hh_l0", "encoder.5.dconv.layers.0.3.lstm.weight_ih_l0_reverse", "encoder.5.dconv.layers.0.3.lstm.weight_hh_l0_reverse", "encoder.5.dconv.layers.0.3.lstm.bias_ih_l0_reverse", "encoder.5.dconv.layers.0.3.lstm.bias_hh_l0_reverse", "encoder.5.dconv.layers.0.3.lstm.weight_ih_l1", "encoder.5.dconv.layers.0.3.lstm.weight_hh_l1", "encoder.5.dconv.layers.0.3.lstm.bias_ih_l1", "encoder.5.dconv.layers.0.3.lstm.bias_hh_l1", "encoder.5.dconv.layers.0.3.lstm.weight_ih_l1_reverse", "encoder.5.dconv.layers.0.3.lstm.weight_hh_l1_reverse", "encoder.5.dconv.layers.0.3.lstm.bias_ih_l1_reverse", "encoder.5.dconv.layers.0.3.lstm.bias_hh_l1_reverse", "encoder.5.dconv.layers.0.3.linear.weight", "encoder.5.dconv.layers.0.3.linear.bias", "encoder.5.dconv.layers.0.4.content.weight", "encoder.5.dconv.layers.0.4.content.bias", "encoder.5.dconv.layers.0.4.query.weight", "encoder.5.dconv.layers.0.4.query.bias", "encoder.5.dconv.layers.0.4.key.weight", "encoder.5.dconv.layers.0.4.key.bias", "encoder.5.dconv.layers.0.4.query_decay.weight", "encoder.5.dconv.layers.0.4.query_decay.bias", "encoder.5.dconv.layers.0.4.proj.weight", "encoder.5.dconv.layers.0.4.proj.bias", "encoder.5.dconv.layers.0.5.weight", "encoder.5.dconv.layers.0.5.bias", "encoder.5.dconv.layers.0.6.weight", "encoder.5.dconv.layers.0.6.bias", "encoder.5.dconv.layers.0.8.scale", "encoder.5.dconv.layers.1.0.weight", "encoder.5.dconv.layers.1.0.bias", "encoder.5.dconv.layers.1.1.weight", "encoder.5.dconv.layers.1.1.bias", "encoder.5.dconv.layers.1.3.lstm.weight_ih_l0", "encoder.5.dconv.layers.1.3.lstm.weight_hh_l0", "encoder.5.dconv.layers.1.3.lstm.bias_ih_l0", "encoder.5.dconv.layers.1.3.lstm.bias_hh_l0", "encoder.5.dconv.layers.1.3.lstm.weight_ih_l0_reverse", "encoder.5.dconv.layers.1.3.lstm.weight_hh_l0_reverse", "encoder.5.dconv.layers.1.3.lstm.bias_ih_l0_reverse", 
"encoder.5.dconv.layers.1.3.lstm.bias_hh_l0_reverse", "encoder.5.dconv.layers.1.3.lstm.weight_ih_l1", "encoder.5.dconv.layers.1.3.lstm.weight_hh_l1", "encoder.5.dconv.layers.1.3.lstm.bias_ih_l1", "encoder.5.dconv.layers.1.3.lstm.bias_hh_l1", "encoder.5.dconv.layers.1.3.lstm.weight_ih_l1_reverse", "encoder.5.dconv.layers.1.3.lstm.weight_hh_l1_reverse", "encoder.5.dconv.layers.1.3.lstm.bias_ih_l1_reverse", "encoder.5.dconv.layers.1.3.lstm.bias_hh_l1_reverse", "encoder.5.dconv.layers.1.3.linear.weight", "encoder.5.dconv.layers.1.3.linear.bias", "encoder.5.dconv.layers.1.4.content.weight", "encoder.5.dconv.layers.1.4.content.bias", "encoder.5.dconv.layers.1.4.query.weight", "encoder.5.dconv.layers.1.4.query.bias", "encoder.5.dconv.layers.1.4.key.weight", "encoder.5.dconv.layers.1.4.key.bias", "encoder.5.dconv.layers.1.4.query_decay.weight", "encoder.5.dconv.layers.1.4.query_decay.bias", "encoder.5.dconv.layers.1.4.proj.weight", "encoder.5.dconv.layers.1.4.proj.bias", "encoder.5.dconv.layers.1.5.weight", "encoder.5.dconv.layers.1.5.bias", "encoder.5.dconv.layers.1.6.weight", "encoder.5.dconv.layers.1.6.bias", "encoder.5.dconv.layers.1.8.scale", "decoder.4.conv_tr.weight", "decoder.4.conv_tr.bias", "decoder.4.rewrite.weight", "decoder.4.rewrite.bias", "decoder.5.conv_tr.weight", "decoder.5.conv_tr.bias", "decoder.5.rewrite.weight", "decoder.5.rewrite.bias", "decoder.0.norm2.weight", "decoder.0.norm2.bias", "decoder.0.norm1.weight", "decoder.0.norm1.bias", "decoder.1.norm2.weight", "decoder.1.norm2.bias", "decoder.1.norm1.weight", "decoder.1.norm1.bias", "tencoder.4.conv.weight", "tencoder.4.conv.bias", "tdecoder.4.conv_tr.weight", "tdecoder.4.conv_tr.bias", "tdecoder.4.rewrite.weight", "tdecoder.4.rewrite.bias", "tdecoder.0.norm2.weight", "tdecoder.0.norm2.bias". 
    size mismatch for encoder.0.dconv.layers.0.0.weight: copying a param with shape torch.Size([12, 48, 3]) from checkpoint, the shape in current model is torch.Size([6, 48, 3]).
    size mismatch for encoder.0.dconv.layers.0.0.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.0.1.weight: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.0.1.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.0.3.weight: copying a param with shape torch.Size([96, 12, 1]) from checkpoint, the shape in current model is torch.Size([96, 6, 1]).
    size mismatch for encoder.0.dconv.layers.1.0.weight: copying a param with shape torch.Size([12, 48, 3]) from checkpoint, the shape in current model is torch.Size([6, 48, 3]).
    size mismatch for encoder.0.dconv.layers.1.0.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.1.1.weight: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.1.1.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for encoder.0.dconv.layers.1.3.weight: copying a param with shape torch.Size([96, 12, 1]) from checkpoint, the shape in current model is torch.Size([96, 6, 1]).
    size mismatch for encoder.1.dconv.layers.0.0.weight: copying a param with shape torch.Size([24, 96, 3]) from checkpoint, the shape in current model is torch.Size([12, 96, 3]).
    size mismatch for encoder.1.dconv.layers.0.0.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.0.1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.0.1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.0.3.weight: copying a param with shape torch.Size([192, 24, 1]) from checkpoint, the shape in current model is torch.Size([192, 12, 1]).
    size mismatch for encoder.1.dconv.layers.1.0.weight: copying a param with shape torch.Size([24, 96, 3]) from checkpoint, the shape in current model is torch.Size([12, 96, 3]).
    size mismatch for encoder.1.dconv.layers.1.0.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.1.1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.1.1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for encoder.1.dconv.layers.1.3.weight: copying a param with shape torch.Size([192, 24, 1]) from checkpoint, the shape in current model is torch.Size([192, 12, 1]).
    size mismatch for encoder.2.dconv.layers.0.0.weight: copying a param with shape torch.Size([48, 192, 3]) from checkpoint, the shape in current model is torch.Size([24, 192, 3]).
    size mismatch for encoder.2.dconv.layers.0.0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.0.1.weight: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.0.1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.0.3.weight: copying a param with shape torch.Size([384, 48, 1]) from checkpoint, the shape in current model is torch.Size([384, 24, 1]).
    size mismatch for encoder.2.dconv.layers.1.0.weight: copying a param with shape torch.Size([48, 192, 3]) from checkpoint, the shape in current model is torch.Size([24, 192, 3]).
    size mismatch for encoder.2.dconv.layers.1.0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.1.1.weight: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.1.1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for encoder.2.dconv.layers.1.3.weight: copying a param with shape torch.Size([384, 48, 1]) from checkpoint, the shape in current model is torch.Size([384, 24, 1]).
    size mismatch for encoder.3.dconv.layers.0.0.weight: copying a param with shape torch.Size([96, 384, 3]) from checkpoint, the shape in current model is torch.Size([48, 384, 3]).
    size mismatch for encoder.3.dconv.layers.0.0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.0.1.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.0.1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.0.3.weight: copying a param with shape torch.Size([768, 96, 1]) from checkpoint, the shape in current model is torch.Size([768, 48, 1]).
    size mismatch for encoder.3.dconv.layers.1.0.weight: copying a param with shape torch.Size([96, 384, 3]) from checkpoint, the shape in current model is torch.Size([48, 384, 3]).
    size mismatch for encoder.3.dconv.layers.1.0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.1.1.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.1.1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for encoder.3.dconv.layers.1.3.weight: copying a param with shape torch.Size([768, 96, 1]) from checkpoint, the shape in current model is torch.Size([768, 48, 1]).
    size mismatch for decoder.0.conv_tr.weight: copying a param with shape torch.Size([1536, 768, 4]) from checkpoint, the shape in current model is torch.Size([384, 192, 8, 1]).
    size mismatch for decoder.0.conv_tr.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for decoder.0.rewrite.weight: copying a param with shape torch.Size([3072, 1536, 3]) from checkpoint, the shape in current model is torch.Size([768, 384, 3, 3]).
    size mismatch for decoder.0.rewrite.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for decoder.1.conv_tr.weight: copying a param with shape torch.Size([768, 384, 8, 1]) from checkpoint, the shape in current model is torch.Size([192, 96, 8, 1]).
    size mismatch for decoder.1.conv_tr.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([96]).
    size mismatch for decoder.1.rewrite.weight: copying a param with shape torch.Size([1536, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 192, 3, 3]).
    size mismatch for decoder.1.rewrite.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for decoder.2.conv_tr.weight: copying a param with shape torch.Size([384, 192, 8, 1]) from checkpoint, the shape in current model is torch.Size([96, 48, 8, 1]).
    size mismatch for decoder.2.conv_tr.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for decoder.2.rewrite.weight: copying a param with shape torch.Size([768, 384, 3, 3]) from checkpoint, the shape in current model is torch.Size([192, 96, 3, 3]).
    size mismatch for decoder.2.rewrite.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for decoder.3.conv_tr.weight: copying a param with shape torch.Size([192, 96, 8, 1]) from checkpoint, the shape in current model is torch.Size([48, 24, 8, 1]).
    size mismatch for decoder.3.conv_tr.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for decoder.3.rewrite.weight: copying a param with shape torch.Size([384, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([96, 48, 3, 3]).
    size mismatch for decoder.3.rewrite.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([96]).
    size mismatch for tencoder.0.dconv.layers.0.0.weight: copying a param with shape torch.Size([12, 48, 3]) from checkpoint, the shape in current model is torch.Size([6, 48, 3]).
    size mismatch for tencoder.0.dconv.layers.0.0.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.0.1.weight: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.0.1.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.0.3.weight: copying a param with shape torch.Size([96, 12, 1]) from checkpoint, the shape in current model is torch.Size([96, 6, 1]).
    size mismatch for tencoder.0.dconv.layers.1.0.weight: copying a param with shape torch.Size([12, 48, 3]) from checkpoint, the shape in current model is torch.Size([6, 48, 3]).
    size mismatch for tencoder.0.dconv.layers.1.0.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.1.1.weight: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.1.1.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([6]).
    size mismatch for tencoder.0.dconv.layers.1.3.weight: copying a param with shape torch.Size([96, 12, 1]) from checkpoint, the shape in current model is torch.Size([96, 6, 1]).
    size mismatch for tencoder.1.dconv.layers.0.0.weight: copying a param with shape torch.Size([24, 96, 3]) from checkpoint, the shape in current model is torch.Size([12, 96, 3]).
    size mismatch for tencoder.1.dconv.layers.0.0.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.0.1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.0.1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.0.3.weight: copying a param with shape torch.Size([192, 24, 1]) from checkpoint, the shape in current model is torch.Size([192, 12, 1]).
    size mismatch for tencoder.1.dconv.layers.1.0.weight: copying a param with shape torch.Size([24, 96, 3]) from checkpoint, the shape in current model is torch.Size([12, 96, 3]).
    size mismatch for tencoder.1.dconv.layers.1.0.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.1.1.weight: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.1.1.bias: copying a param with shape torch.Size([24]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tencoder.1.dconv.layers.1.3.weight: copying a param with shape torch.Size([192, 24, 1]) from checkpoint, the shape in current model is torch.Size([192, 12, 1]).
    size mismatch for tencoder.2.dconv.layers.0.0.weight: copying a param with shape torch.Size([48, 192, 3]) from checkpoint, the shape in current model is torch.Size([24, 192, 3]).
    size mismatch for tencoder.2.dconv.layers.0.0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.0.1.weight: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.0.1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.0.3.weight: copying a param with shape torch.Size([384, 48, 1]) from checkpoint, the shape in current model is torch.Size([384, 24, 1]).
    size mismatch for tencoder.2.dconv.layers.1.0.weight: copying a param with shape torch.Size([48, 192, 3]) from checkpoint, the shape in current model is torch.Size([24, 192, 3]).
    size mismatch for tencoder.2.dconv.layers.1.0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.1.1.weight: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.1.1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([24]).
    size mismatch for tencoder.2.dconv.layers.1.3.weight: copying a param with shape torch.Size([384, 48, 1]) from checkpoint, the shape in current model is torch.Size([384, 24, 1]).
    size mismatch for tencoder.3.dconv.layers.0.0.weight: copying a param with shape torch.Size([96, 384, 3]) from checkpoint, the shape in current model is torch.Size([48, 384, 3]).
    size mismatch for tencoder.3.dconv.layers.0.0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.0.1.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.0.1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.0.3.weight: copying a param with shape torch.Size([768, 96, 1]) from checkpoint, the shape in current model is torch.Size([768, 48, 1]).
    size mismatch for tencoder.3.dconv.layers.1.0.weight: copying a param with shape torch.Size([96, 384, 3]) from checkpoint, the shape in current model is torch.Size([48, 384, 3]).
    size mismatch for tencoder.3.dconv.layers.1.0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.1.1.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.1.1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tencoder.3.dconv.layers.1.3.weight: copying a param with shape torch.Size([768, 96, 1]) from checkpoint, the shape in current model is torch.Size([768, 48, 1]).
    size mismatch for tdecoder.0.conv_tr.weight: copying a param with shape torch.Size([768, 384, 8]) from checkpoint, the shape in current model is torch.Size([384, 192, 8]).
    size mismatch for tdecoder.0.conv_tr.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for tdecoder.1.conv_tr.weight: copying a param with shape torch.Size([384, 192, 8]) from checkpoint, the shape in current model is torch.Size([192, 96, 8]).
    size mismatch for tdecoder.1.conv_tr.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([96]).
    size mismatch for tdecoder.1.rewrite.weight: copying a param with shape torch.Size([768, 384, 3]) from checkpoint, the shape in current model is torch.Size([384, 192, 3]).
    size mismatch for tdecoder.1.rewrite.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for tdecoder.2.conv_tr.weight: copying a param with shape torch.Size([192, 96, 8]) from checkpoint, the shape in current model is torch.Size([96, 48, 8]).
    size mismatch for tdecoder.2.conv_tr.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([48]).
    size mismatch for tdecoder.2.rewrite.weight: copying a param with shape torch.Size([384, 192, 3]) from checkpoint, the shape in current model is torch.Size([192, 96, 3]).
    size mismatch for tdecoder.2.rewrite.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for tdecoder.3.conv_tr.weight: copying a param with shape torch.Size([96, 48, 8]) from checkpoint, the shape in current model is torch.Size([48, 12, 8]).
    size mismatch for tdecoder.3.conv_tr.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([12]).
    size mismatch for tdecoder.3.rewrite.weight: copying a param with shape torch.Size([192, 96, 3]) from checkpoint, the shape in current model is torch.Size([96, 48, 3]).
    size mismatch for tdecoder.3.rewrite.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([96]).
CUDA is available, use --force_cpu to disable it.
Using device:  cuda:0
Start from checkpoint: models/htdemucs_6s/75fc33f5-1941ce65.th
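
The error above is what `load_state_dict` raises when a checkpoint's tensor shapes disagree with the instantiated model (here, a 4-stem checkpoint loaded into a 6-stem HTDemucs config, or vice versa). A quick way to see exactly which keys differ before attempting the load is to diff the two state dicts. This is a generic PyTorch sketch, not code from this repo; the helper name `diff_state_dicts` and the toy `nn.Linear` layers are stand-ins for illustration:

```python
import torch.nn as nn


def diff_state_dicts(model, ckpt_state):
    """Report missing, unexpected, and shape-mismatched keys
    between a model and a loaded checkpoint state_dict."""
    model_state = model.state_dict()
    missing = [k for k in model_state if k not in ckpt_state]
    unexpected = [k for k in ckpt_state if k not in model_state]
    mismatched = [
        (k, tuple(ckpt_state[k].shape), tuple(model_state[k].shape))
        for k in model_state
        if k in ckpt_state and ckpt_state[k].shape != model_state[k].shape
    ]
    return missing, unexpected, mismatched


# Toy example: a checkpoint layer with 4 outputs vs. a model expecting 2,
# mimicking the channel-count mismatches in the traceback above.
ckpt_layer = nn.Linear(8, 4)
model_layer = nn.Linear(8, 2)
missing, unexpected, mismatched = diff_state_dicts(
    model_layer, ckpt_layer.state_dict()
)
for key, ckpt_shape, model_shape in mismatched:
    print(f"{key}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If `missing` or `unexpected` is non-empty, the checkpoint was trained with a different architecture (wrong weights file); if only `mismatched` is populated, the architecture matches but the channel/stem configuration in the YAML does not.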
ZFTurbo commented 2 months ago

But the weights for the 6s model in the readme are correct: https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th

Can you please add more details?

paulchaum commented 2 months ago

Sorry, it was a mistake on my part. I'm closing the issue.