Hello, I found that during model inference, when I print "model.internal_xlora_scalings.shape", the result is consistently [1, 1, 28, 7]. The expected shape is [batch_size, seq_len, n_layers, n_classes], but I don't understand why seq_len is always 1 (a minimal sketch of how I inspect it is below).
Model: ChatGLM3-6b-base, peft==0.10.0, transformers==4.37.2
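
For reference, here is a minimal sketch of how I trigger and inspect the scalings. The X-LoRA wrapping step is omitted because it is specific to my setup; the loading calls are the standard transformers ones, and the prompt and generation length are just illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "THUDM/chatglm3-6b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# ... wrap `model` with the X-LoRA adapter here (omitted, setup-specific);
# `internal_xlora_scalings` only exists after this step ...

inputs = tokenizer("Hello, world", return_tensors="pt")  # illustrative prompt
with torch.no_grad():
    model.generate(**inputs, max_new_tokens=8)

# Expected [batch_size, seq_len, n_layers, n_classes];
# observed [1, 1, 28, 7], i.e. seq_len is always 1
print(model.internal_xlora_scalings.shape)
```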