whai362 / PVT

Official implementation of PVT series
Apache License 2.0

In the PVT-v2 code, have you tried not using a linear projection after the pooling layer in the spatial reduction part of the attention? #106

Open · Phuoc-Hoan-Le opened 1 year ago

I noticed that in the PVT-v2 code, you use a linear projection after the pooling layer in the spatial reduction part of the attention.

I am wondering: have you tried training the model without a linear projection after the pooling layer in the spatial reduction part of the attention? Does it still work?
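For context, here is a minimal NumPy sketch of the two variants being compared: spatial-reduction attention where the keys/values come from pooled tokens, with and without a learned projection applied after pooling. This is an illustrative simplification, not the actual PVT-v2 implementation (the real code uses PyTorch modules, multi-head attention, and a normalization/activation after the reduction); all function and variable names here are made up for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def avg_pool_tokens(x, H, W, out=2):
    # x: (H*W, C) tokens on an H x W grid -> average-pooled (out*out, C)
    C = x.shape[1]
    grid = x.reshape(H, W, C)
    hs, ws = H // out, W // out
    pooled = grid.reshape(out, hs, out, ws, C).mean(axis=(1, 3))
    return pooled.reshape(out * out, C)

def sr_attention(x, H, W, Wq, Wk, Wv, proj=None):
    # Single-head attention with spatially reduced keys/values.
    # proj is the optional linear projection applied after pooling;
    # passing proj=None is the ablation the question asks about.
    kv = avg_pool_tokens(x, H, W)
    if proj is not None:
        kv = kv @ proj
    q, k, v = x @ Wq, kv @ Wk, kv @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]))
    return attn @ v  # shape (H*W, C): one output token per input token
```

Note that the output shape is the same either way; dropping the projection only removes the learned mixing of channels after pooling, so the attention still runs, and the question is purely about its effect on accuracy.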