SimpleModule
========================================
SimpleModule
|--- Linear
+--- MultiheadAttention <--- *** HERE ***
+--- NonDynamicallyQuantizableLinear (skip)
padiff_log/weight_init_SimpleLayer.log:
SimpleLayer
========================================
SimpleLayer
|--- Linear
+--- MultiHeadAttention (skip)
|--- Linear <--- *** HERE ***
|--- Linear
|--- Linear
+--- Linear
I added MultiheadAttention to the sample code and tried to copy the parameter values across, but it failed.
Versions: paddlepaddle-gpu == 2.4.2, torch == 1.12.0+cu102
Code:
Error:
RuntimeError: Error occured when trying init weights, between:
base_model: MultiheadAttention()
SimpleModule.attention.in_proj_weight
raw_model: Linear(in_features=64, out_features=64, dtype=float32)
SimpleLayer.attention.q_proj.weight
Model structure log files:
padiff_log/weight_init_SimpleModule.log:
padiff_log/weight_init_SimpleLayer.log:
How should I modify this?
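
A possible reading of the error, offered as a sketch rather than a confirmed fix: torch's MultiheadAttention keeps the q/k/v projections packed in a single in_proj_weight of shape (3 * embed_dim, embed_dim), while paddle's MultiHeadAttention holds three separate Linear layers (q_proj, k_proj, v_proj), each with a weight stored as [in_features, out_features]. A one-to-one copy from in_proj_weight into q_proj.weight therefore cannot match. The NumPy snippet below only illustrates the split and transpose. The helper name `split_packed_qkv` and the dictionary keys are hypothetical, it assumes embed_dim = 64 as in the error message, and it does not use any PaDiff API.

```python
import numpy as np


def split_packed_qkv(in_proj_weight, in_proj_bias=None):
    """Split a packed torch-style q/k/v projection into three paddle-style pieces.

    Hypothetical helper for illustration only; it is not part of PaDiff.
    in_proj_weight: array of shape (3 * embed_dim, embed_dim), rows ordered q, k, v.
    in_proj_bias:   optional array of shape (3 * embed_dim,).
    """
    embed_dim = in_proj_weight.shape[1]
    if in_proj_weight.shape[0] != 3 * embed_dim:
        raise ValueError("expected a packed weight of shape (3 * embed_dim, embed_dim)")

    names = ("q_proj", "k_proj", "v_proj")
    out = {}
    # torch stores Linear weights as (out, in); paddle uses (in, out), so transpose.
    for name, w in zip(names, np.split(in_proj_weight, 3, axis=0)):
        out[f"{name}.weight"] = np.ascontiguousarray(w.T)
    if in_proj_bias is not None:
        # Biases are one-dimensional, so they only need to be split.
        for name, b in zip(names, np.split(in_proj_bias, 3, axis=0)):
            out[f"{name}.bias"] = b.copy()
    return out


# Random stand-ins for the real parameters (assumption: embed_dim = 64).
rng = np.random.default_rng(0)
embed_dim = 64
packed_w = rng.standard_normal((3 * embed_dim, embed_dim)).astype(np.float32)
packed_b = rng.standard_normal(3 * embed_dim).astype(np.float32)
parts = split_packed_qkv(packed_w, packed_b)

# Check that the split q projection reproduces the first block of the packed projection.
x = rng.standard_normal((2, embed_dim)).astype(np.float32)
packed_out = x @ packed_w.T + packed_b               # torch convention: x @ W^T + b
q_out = x @ parts["q_proj.weight"] + parts["q_proj.bias"]  # paddle convention: x @ W + b
print(parts["q_proj.weight"].shape)
print(np.allclose(packed_out[:, :embed_dim], q_out, atol=1e-4))
```

With real models, the three resulting arrays would be assigned to q_proj, k_proj and v_proj on the paddle side, and out_proj (the NonDynamicallyQuantizableLinear entry in the log) would be transposed in the same way as any other Linear weight.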