Problem in bert

Lyken17 / pytorch-OpCounter

Count the MACs / FLOPs of your PyTorch model.

MIT License

4.9k stars, 528 forks

Open zetaodu opened 1 year ago

zetaodu commented 1 year ago

I find that thop does not count the parameters in BertEmbedding. Also, if I define two self_attention blocks in one layer, it counts only one of them.

ivanstepanovftw commented 10 months ago

The second self_attention block also has to be called in the forward method; otherwise it will not be counted.
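
The point about the forward method can be illustrated without thop itself. The following is a minimal sketch in plain PyTorch, assuming a counter that, as thop appears to, collects its numbers from forward hooks during a single forward pass. The class `TwoAttentionLayer` and the helpers `children_seen_in_forward` and `count_params` are hypothetical names made up for this example, not part of thop or of any BERT implementation. A block that is defined in `__init__` but never called in `forward` never triggers its hook, so a hook-based counter cannot see it, while summing `numel()` over `model.parameters()` counts every registered parameter, including the embedding.

```python
import torch
import torch.nn as nn


class TwoAttentionLayer(nn.Module):
    """Hypothetical layer: defines two attention blocks but calls only one."""

    def __init__(self, vocab_size=100, dim=16, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.attn1 = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Defined here, but never used in forward() below.
        self.attn2 = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, ids):
        x = self.embed(ids)
        out, _ = self.attn1(x, x, x)
        return out


def children_seen_in_forward(model, *inputs):
    """Return the names of direct children whose forward hook fired."""
    seen = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            seen.append(name)
        return hook

    for name, child in model.named_children():
        handles.append(child.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(*inputs)
    for handle in handles:
        handle.remove()  # leave the model without hooks afterwards
    return seen


def count_params(module):
    """Count every registered parameter, whether or not forward() uses it."""
    return sum(p.numel() for p in module.parameters())


if __name__ == "__main__":
    model = TwoAttentionLayer()
    ids = torch.randint(0, 100, (2, 5))  # batch of 2 sequences, 5 tokens each
    print("children called in forward:", children_seen_in_forward(model, ids))
    print("all parameters:", count_params(model))
    print("parameters in the unused attn2:", count_params(model.attn2))
```

If the totals reported by a profiler disagree with `count_params(model)`, comparing the two lists above is a quick way to see which submodules were skipped. Whether a module that is called (such as an embedding) gets counted also depends on whether the profiler has a counting rule for that module type, which this sketch does not model.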