autoliuweijie / FastBERT

The source code of FastBERT (ACL2020)
https://www.aclweb.org/anthology/2020.acl-main.537/
Calculation of the FLOPs of a fully-connected layer. #2

Closed dawson-chen closed 4 years ago

dawson-chen commented 4 years ago

First, thanks for your work; it's very useful for inference with BERT-like models. I hope your paper gets published soon. One thing I'm confused about is the FLOPs of the dense layer in Section 4.1. As far as I know, the FLOPs of a fully-connected layer with bias = 2 × I × O

I = number of input neurons, O = number of output neurons.

For the fully-connected layer 128 -> 128, FLOPs = 2 × 128 × 128 = 32,768. But in Table 1 the answer is 4.2M, which is much higher than what I got. Can you share your method for calculating it?
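
For reference, the 2 × I × O figure follows from counting operations per output neuron. This is the usual accounting, where each multiply and each add counts as one FLOP; it is an assumption here that the paper uses the same convention:

```latex
\begin{aligned}
\text{per output neuron} &:\;
  \underbrace{I}_{\text{multiplies}}
  + \underbrace{(I-1)}_{\text{adds in the sum}}
  + \underbrace{1}_{\text{bias add}} = 2I \\
\text{FLOPs}_{\text{dense}} &= 2 \cdot I \cdot O
  \qquad (I = O = 128 \;\Rightarrow\; 32{,}768)
\end{aligned}
```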

autoliuweijie commented 4 years ago

In BERT, the dense layer is applied to every token, so if the input sentence contains 128 words, the FLOPs should be 2 × 128 × 128 × 128 = 4,194,304 (about 4.2M).
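
The arithmetic above can be checked with a minimal sketch. The helper name `dense_flops` and its parameters are hypothetical and not taken from the FastBERT code; it assumes one FLOP per multiply and per add, and that the layer is applied independently to each token:

```python
def dense_flops(in_features: int, out_features: int, num_tokens: int = 1) -> int:
    """FLOPs of a dense layer with bias, applied independently to each token.

    Hypothetical helper for illustration only.
    """
    # Per output neuron: in_features multiplies + in_features adds (bias included).
    per_token = 2 * in_features * out_features
    return per_token * num_tokens


# A single 128-dimensional input vector: the figure from the question.
single_vector = dense_flops(128, 128)

# A 128-token sentence: the figure from the reply.
full_sentence = dense_flops(128, 128, num_tokens=128)

print(single_vector)   # 32768
print(full_sentence)   # 4194304
```

The factor of 128 between the two results is the sequence length, which accounts for the gap between 32,768 and the 4.2M quoted from Table 1.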

dawson-chen commented 4 years ago

Now I get it, thank you.