OpenBMB / BMTrain

Efficient Training (including pre-training and fine-tuning) for Big Models
Apache License 2.0

temporary fix of bmtrain+opendelta load state dict #77

Closed Achazwl closed 1 year ago

liweiqing1997 commented 1 year ago

The saved weights are: encoder.layers.33.self_att.self_attention.project_v.lora.lora_A tensor([[ 5.5656e-03, 1.1871e-02, 1.4404e-02, ..., 1.3145e-02, -1.3046e-03, -2.7542e-03],...]], dtype=torch.float16) <class 'collections.OrderedDict'>

But the weight of the model is: encoder.layers.33.self_att.self_attention.project_v.lora.lora_A Parameter containing: Parameter(DistributedParameter([ 0.0030, 0.0088, 0.0114, ..., 0.0004, -0.0066, -0.0021], device='cuda:0', dtype=torch.float16, requires_grad=True))

Is it because of a type inconsistency? If so, how can it be solved?
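(For reference, a minimal plain-PyTorch sketch of how the checkpoint values could be compared against the live model's parameters to see whether the load actually took effect. `model` and the checkpoint path `delta_checkpoint.pt` are placeholders, not names from this issue, and under BMTrain's parameter partitioning this check may need to run after the full parameters have been gathered.)

```python
import torch

# Hypothetical checkpoint path and model object, for illustration only.
ckpt = torch.load("delta_checkpoint.pt", map_location="cpu")  # OrderedDict of plain tensors
model_state = model.state_dict()                              # may contain DistributedParameter entries

for name, saved in ckpt.items():
    if name not in model_state:
        print(f"missing in model: {name}")
        continue
    current = model_state[name]
    # Compare values only; the wrapper class (Tensor vs. Parameter vs.
    # DistributedParameter) does not matter for this comparison.
    if saved.shape != current.shape:
        print(f"shape mismatch for {name}: {saved.shape} vs {current.shape}")
    elif not torch.allclose(saved.float(), current.detach().cpu().float(), atol=1e-3):
        print(f"values differ for {name}")
```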

Achazwl commented 1 year ago

This is not a type inconsistency; Parameter is just a wrapper around the tensor that adds some training information.
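(A minimal plain-PyTorch illustration of that point, with no BMTrain specifics: `load_state_dict` copies values into the existing parameter objects in place, so the wrapper class on either side, a plain Tensor in the checkpoint versus a Parameter in the model, is not what decides whether the load succeeds.)

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 4)
print(isinstance(layer.weight, torch.Tensor))  # True: Parameter *is* a Tensor subclass
print(type(layer.weight))                      # <class 'torch.nn.parameter.Parameter'>

# Build a state dict of plain tensors (like the saved OrderedDict above)
# and zero out the weight so the effect of loading is visible.
plain_state = {k: v.clone() for k, v in layer.state_dict().items()}
plain_state["weight"].zero_()

layer.load_state_dict(plain_state)             # copies data in place; wrapper type unchanged
print(type(layer.weight))                      # still Parameter
print(layer.weight.abs().sum().item())         # 0.0: the plain-tensor values were copied in
```

If BMTrain's DistributedParameter behaves analogously (as its repr above suggests), then values that still differ after loading point to the load path rather than to the parameter type.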