Tencent / TurboTransformers

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc.) on CPU and GPU.

A simple bug in bert_pooler.cpp #186

Closed · LitLeo closed this issue 4 years ago

LitLeo commented 4 years ago

At line 42 of master/turbo_transformers/layers/bert_pooler.cpp: the pooler layer is essentially a fully connected layer whose weight is not transposed, so the check

```cpp
TT_ENFORCE_EQ(dense_weight_.shape(0), dense_bias_.shape(0),
              "weight and bias shape mismatch %d, %d",
              dense_weight_.shape(0), dense_bias_.shape(0));
```

should be changed to:

```cpp
TT_ENFORCE_EQ(dense_weight_.shape(1), dense_bias_.shape(0),
              "weight and bias shape mismatch %d, %d",
              dense_weight_.shape(1), dense_bias_.shape(0));
```

The current check happens to pass for the BERT pooler, because its weight has shape [768, 768] (square), but anyone who swaps in a modified pooler with a non-square weight will hit a spurious error.
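To illustrate, here is a minimal standalone sketch (not TurboTransformers code; the `Shape` struct and the `[in_features, out_features]` weight layout are assumptions based on the description above) showing how the current check rejects a valid non-square pooler weight while the fixed check accepts it:

```cpp
// Assumption: the pooler computes y = x * W + b with a NON-transposed weight,
// i.e. W has shape [in_features, out_features] and b has out_features entries.
#include <cstdio>

// Hypothetical stand-in for the library's tensor shape accessor.
struct Shape {
  long rows, cols;
  long shape(int d) const { return d == 0 ? rows : cols; }
};

int main() {
  // A modified pooler with a non-square weight: 768 inputs -> 512 outputs.
  Shape dense_weight{768, 512};
  long bias_len = 512;  // bias length always equals out_features

  // Current check: compares in_features against the bias length,
  // so it only passes when the weight happens to be square.
  bool current_ok = (dense_weight.shape(0) == bias_len);
  // Fixed check: compares out_features (shape(1)) against the bias length.
  bool fixed_ok = (dense_weight.shape(1) == bias_len);

  std::printf("current check: %s, fixed check: %s\n",
              current_ok ? "pass" : "FAIL", fixed_ok ? "pass" : "FAIL");
  // -> current check: FAIL, fixed check: pass
  return 0;
}
```

With a square 768×768 weight, `shape(0)` and `shape(1)` coincide, which is why the bug goes unnoticed for the stock BERT pooler.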

feifeibear commented 4 years ago

Thank you very much for the correction!

LitLeo commented 4 years ago

There is another instance of the same bug, at line 29 of master/turbo_transformers/layers/bert_pooler.cpp:

```cpp
output_tensor->Reshape<float>({input_tensor.shape(0), dense_weight_.shape(0)},
                              input_tensor.device_type(), input_tensor.device_id());
```

should be changed to:

```cpp
output_tensor->Reshape<float>({input_tensor.shape(0), dense_weight_.shape(1)},
                              input_tensor.device_type(), input_tensor.device_id());
```
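The reason is the same shape convention: assuming y = x · W with a non-transposed W of shape [in, out] and x of shape [batch, in], the output must be [batch, out], i.e. {input.shape(0), weight.shape(1)}. A minimal standalone sketch of the shape arithmetic (plain vectors, not the library's Reshape API):

```cpp
#include <cstdio>
#include <vector>

int main() {
  const long batch = 2, in = 768, out = 512;
  std::vector<float> x(batch * in, 1.0f);   // input  [batch, in]
  std::vector<float> w(in * out, 0.5f);     // weight [in, out], not transposed
  std::vector<float> y(batch * out, 0.0f);  // output must be [batch, out]

  // y[b][o] = sum_i x[b][i] * w[i][o]
  for (long b = 0; b < batch; ++b)
    for (long o = 0; o < out; ++o)
      for (long i = 0; i < in; ++i)
        y[b * out + o] += x[b * in + i] * w[i * out + o];

  // Sizing y as {batch, weight.shape(0)} would allocate batch*768 floats,
  // and the loop above would write out of bounds for any non-square weight.
  std::printf("output shape: [%ld, %ld]\n", batch, out);
  return 0;
}
```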

feifeibear commented 4 years ago

Sure, you could also submit a merge request: fork the project first, push your fix to a branch, and then open a request to merge that branch into master.