Tencent / TurboTransformers

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc.) on CPU and GPU.

A simple bug in bert_pooler.cpp #186

Closed · LitLeo closed this issue 4 years ago

LitLeo commented 4 years ago

At line 42 of master/turbo_transformers/layers/bert_pooler.cpp: the pooler layer is essentially a fully connected layer whose weight is not transposed, so the check

```cpp
TT_ENFORCE_EQ(dense_weight_.shape(0), dense_bias_.shape(0),
              "weight and bias shape mismatch %d, %d",
              dense_weight_.shape(0), dense_bias_.shape(0));
```

should be changed to:

```cpp
TT_ENFORCE_EQ(dense_weight_.shape(1), dense_bias_.shape(0),
              "weight and bias shape mismatch %d, %d",
              dense_weight_.shape(1), dense_bias_.shape(0));
```

The current check happens to pass for the BERT pooler, because its weight has shape [768, 768] (square), but anyone who swaps in a modified pooler with a non-square weight will hit a spurious error.
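To illustrate, here is a minimal standalone sketch (not TurboTransformers code; the `Shape` struct and the `[in_features, out_features]` weight layout are assumptions based on the description above) showing how the current check rejects a valid non-square pooler weight while the fixed check accepts it:

```cpp
// Assumption: the pooler computes y = x * W + b with a NON-transposed weight,
// i.e. W has shape [in_features, out_features] and b has out_features entries.
#include <cstdio>

// Hypothetical stand-in for the library's tensor shape accessor.
struct Shape {
  long rows, cols;
  long shape(int d) const { return d == 0 ? rows : cols; }
};

int main() {
  // A modified pooler with a non-square weight: 768 inputs -> 512 outputs.
  Shape dense_weight{768, 512};
  long bias_len = 512;  // bias length always equals out_features

  // Current check: compares in_features against the bias length,
  // so it only passes when the weight happens to be square.
  bool current_ok = (dense_weight.shape(0) == bias_len);
  // Fixed check: compares out_features (shape(1)) against the bias length.
  bool fixed_ok = (dense_weight.shape(1) == bias_len);

  std::printf("current check: %s, fixed check: %s\n",
              current_ok ? "pass" : "FAIL", fixed_ok ? "pass" : "FAIL");
  // -> current check: FAIL, fixed check: pass
  return 0;
}
```

With a square 768×768 weight, `shape(0)` and `shape(1)` coincide, which is why the bug goes unnoticed for the stock BERT pooler.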

feifeibear commented 4 years ago

Thank you very much for the correction!

LitLeo commented 4 years ago

There is another instance of the same bug, at line 29 of master/turbo_transformers/layers/bert_pooler.cpp:

```cpp
output_tensor->Reshape<float>({input_tensor.shape(0), dense_weight_.shape(0)},
                              input_tensor.device_type(), input_tensor.device_id());
```

should be changed to:

```cpp
output_tensor->Reshape<float>({input_tensor.shape(0), dense_weight_.shape(1)},
                              input_tensor.device_type(), input_tensor.device_id());
```
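The reason is the same shape convention: assuming y = x · W with a non-transposed W of shape [in, out] and x of shape [batch, in], the output must be [batch, out], i.e. {input.shape(0), weight.shape(1)}. A minimal standalone sketch of the shape arithmetic (plain vectors, not the library's Reshape API):

```cpp
#include <cstdio>
#include <vector>

int main() {
  const long batch = 2, in = 768, out = 512;
  std::vector<float> x(batch * in, 1.0f);   // input  [batch, in]
  std::vector<float> w(in * out, 0.5f);     // weight [in, out], not transposed
  std::vector<float> y(batch * out, 0.0f);  // output must be [batch, out]

  // y[b][o] = sum_i x[b][i] * w[i][o]
  for (long b = 0; b < batch; ++b)
    for (long o = 0; o < out; ++o)
      for (long i = 0; i < in; ++i)
        y[b * out + o] += x[b * in + i] * w[i * out + o];

  // Sizing y as {batch, weight.shape(0)} would allocate batch*768 floats,
  // and the loop above would write out of bounds for any non-square weight.
  std::printf("output shape: [%ld, %ld]\n", batch, out);
  return 0;
}
```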

feifeibear commented 4 years ago

Sure, you could also submit a merge request: fork the project first, push your fix to a branch, and then open a request to merge that branch into master.