The saved weights are:
encoder.layers.33.self_att.self_attention.project_v.lora.lora_A tensor([[ 5.5656e-03,  1.1871e-02,  1.4404e-02,  ...,  1.3145e-02, -1.3046e-03, -2.7542e-03], ...]], dtype=torch.float16)  <class 'collections.OrderedDict'>

But the weight of the model is:
encoder.layers.33.self_att.self_attention.project_v.lora.lora_A Parameter containing:
Parameter(DistributedParameter([ 0.0030,  0.0088,  0.0114,  ...,  0.0004, -0.0066,
        -0.0021], device='cuda:0', dtype=torch.float16,
        requires_grad=True))

Is this caused by a type inconsistency (the checkpoint stores plain tensors in an OrderedDict, while the model parameter is a DistributedParameter)? If so, how can I solve it?
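For reference, here is a minimal sanity check I could run to narrow this down, assuming the checkpoint was written with torch.save; CKPT_PATH is a placeholder path and model stands for the already-constructed model object (neither name is from the original setup):

import torch

# Placeholder checkpoint path; KEY is the parameter shown above.
CKPT_PATH = "lora_checkpoint.pt"
KEY = "encoder.layers.33.self_att.self_attention.project_v.lora.lora_A"

# The checkpoint is an OrderedDict of plain fp16 tensors.
saved = torch.load(CKPT_PATH, map_location="cpu")
saved_w = saved[KEY]

# The same entry in the live model is a DistributedParameter on CUDA
# (with sharded parameters it may only hold this rank's slice, so shapes can differ).
model_w = dict(model.named_parameters())[KEY]

print(saved_w.dtype, model_w.dtype)   # both should be torch.float16
print(saved_w.shape, model_w.shape)   # a shape mismatch would point to sharding
if saved_w.shape == model_w.shape:
    # Agreement up to fp16 precision would rule out a loading problem.
    print(torch.allclose(saved_w.float(), model_w.detach().cpu().float(), atol=1e-3))

This only checks whether the in-model values actually differ from the checkpoint (beyond fp16 rounding) and whether the shapes line up; it is a diagnostic sketch, not a fix.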