Hi, author! Thanks for your awesome repo.
I'm trying to run your inference demo, but loading the checkpoint fails with the following size-mismatch error:
```
RuntimeError: Error(s) in loading state_dict for RF:
size mismatch for model.cond_seq_linear.weight: copying a param with shape torch.Size([2560, 1024]) from checkpoint, the shape in current model is torch.Size([2560, 2048]).
size mismatch for model.layers.0.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.0.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.0.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.0.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.1.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.1.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.1.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.1.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.2.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.2.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.2.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.2.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.3.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.3.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.3.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.3.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.4.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.4.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.4.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.4.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.5.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.5.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.5.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.5.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.6.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.6.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.6.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.6.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.7.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.7.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.7.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.7.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.8.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.8.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.8.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.8.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.9.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.9.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.9.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.9.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.10.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.10.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.10.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.10.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.11.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.11.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.11.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.11.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.12.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.12.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.12.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.12.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.13.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.13.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.13.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.13.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.14.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.14.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.14.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.14.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.15.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.15.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.15.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.15.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.16.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.16.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.16.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.16.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.17.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.17.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.17.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.17.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.18.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.18.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.18.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.18.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.19.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.19.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.19.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.19.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.20.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.20.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.20.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.20.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.21.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.21.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.21.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.21.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.22.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.22.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.22.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.22.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.23.modC.1.weight: copying a param with shape torch.Size([5120, 2560]) from checkpoint, the shape in current model is torch.Size([15360, 2560]).
size mismatch for model.layers.23.attn.q_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.23.attn.k_norm1.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.23.attn.q_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
size mismatch for model.layers.23.attn.k_norm2.weight: copying a param with shape torch.Size([8, 320]) from checkpoint, the shape in current model is torch.Size([320]).
```
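In case it helps with debugging, here is the small helper I used to list the mismatching parameters. It's only a rough sketch: `report_shape_mismatches`, the wrapper keys it tries to unwrap (`state_dict` / `model` / `ema`), and the assumption that the checkpoint is a plain `torch.save` file are my own guesses, not anything taken from the repo.

```python
import torch
from torch import nn


def report_shape_mismatches(model: nn.Module, ckpt_path: str) -> None:
    """Print every parameter whose shape differs between the checkpoint and the model."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Some releases nest the weights under a wrapper key; unwrap it if present (assumed keys).
    for key in ("state_dict", "model", "ema"):
        if isinstance(ckpt, dict) and isinstance(ckpt.get(key), dict):
            ckpt = ckpt[key]
            break
    model_state = model.state_dict()
    for name, tensor in ckpt.items():
        if name not in model_state:
            print(f"unexpected key: {name}")
        elif model_state[name].shape != tensor.shape:
            print(f"{name}: checkpoint {tuple(tensor.shape)} vs model {tuple(model_state[name].shape)}")
    for name in model_state.keys() - ckpt.keys():
        print(f"missing key: {name}")
```

From the traceback, it looks like the checkpoint stores the q/k norm weights as [8, 320] (one row per head) while the current model expects a flat [320], and the `cond_seq_linear` / `modC` shapes differ as well, so I suspect the released checkpoint and the current model definition are out of sync. Am I missing a config flag, or should I be using a different checkpoint?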