dandelin / ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Apache License 2.0

FLOPS calculation #23


junchen14 commented 2 years ago

Hi, when you compute the FLOPs in Table 6 for baseline models such as ViLBERT, do you also include the FLOPs of the feature-extraction models?

dandelin commented 2 years ago

Hi @junchen14,

Yes. For object-detection-based vision-and-language models, we calculated FLOPs by summing those of the object detection backbone, the object detection RCNN head, NMS, and the modality interaction transformer.
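The accounting described above can be sketched as a simple sum over per-component costs. This is an illustrative sketch, not the paper's measurement code; the component names follow the answer, and the GFLOPs numbers below are placeholders, not figures from the paper.

```python
# Sketch of the FLOPs accounting for a detection-based V&L model.
# Values are illustrative placeholders, NOT measurements from the paper.

def total_flops(components: dict) -> float:
    """Sum per-component FLOPs (here in GFLOPs) into one total."""
    return sum(components.values())

# Hypothetical per-component costs (GFLOPs), for illustration only.
example = {
    "detection_backbone": 100.0,        # CNN trunk of the detector
    "detection_rcnn": 500.0,            # region proposals + RoI heads
    "nms": 1.0,                         # non-maximum suppression
    "interaction_transformer": 300.0,   # modality interaction transformer
}

print(total_flops(example))  # sums all four terms: 901.0
```

For a detector-free model like ViLT, only the transformer term would remain, which is what makes its total so much smaller in the paper's comparison.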