zhoudaquan / Refiner_ViT

About the ablation study. #1

Closed TiankaiHang closed 3 years ago

TiankaiHang commented 3 years ago

Hi, thanks for your nice work!

I have one question. Here you introduce the linear projection W_A: [image] The number of parameters in W_A (r × H × H) depends on the expansion ratio r. However, in your ablation study (Table 1): [image] That confuses me... Can I ask why?

zhoudaquan commented 3 years ago

Hi Tiankai,

Thanks for your interest in this work! To answer your question, the number of parameters in the linear expansion/projection layer is calculated as #Heads^2 × #Expansion. As we typically use 12 heads, the total parameter overhead with an expansion ratio of 6 is 144 × 6 = 864. Compared with the total number of parameters in the model, this is negligible. If we reported two more decimal places of precision, the difference would show up, at a magnitude of less than 0.1M.
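The arithmetic above can be checked with a short sketch. The function name `expansion_overhead` and the list of ratios are illustrative and do not come from the repository. The sketch assumes a single bias-free projection matrix, and the thread does not say whether the count is per layer or whether a matching reduction projection is included.

```python
def expansion_overhead(num_heads: int, expansion_ratio: int) -> int:
    """Weights in one head-expansion projection: #Heads^2 * #Expansion.

    Assumption: a single bias-free matrix mapping num_heads channels
    to expansion_ratio * num_heads channels.
    """
    return num_heads ** 2 * expansion_ratio


if __name__ == "__main__":
    # Illustrative ratios only; 12 heads as stated in the reply above.
    for ratio in (1, 2, 3, 4, 6):
        params = expansion_overhead(12, ratio)
        # Express the count in millions, the unit used for model sizes.
        print(f"ratio={ratio}: {params} params = {params / 1e6:.6f}M")
```

Under these assumptions, the overhead for 12 heads and a ratio of 6 is 864 weights, far below the rounding step of a table that reports sizes in millions.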

I hope this clarifies your question. If you need any other clarification, do drop me a message.

TiankaiHang commented 3 years ago

Thanks for your kind reply :-)