I read the paper; it gives a clear picture of how GFNet tackles the computational cost and complexity of using self-attention in a Vision Transformer. I would like to take it a step further and use it for a segmentation task, but with the Global Filter as an attention mechanism in a UNet model, similar to the Attention Gate.
Algorithm pseudocode
Inputs: $F_g$ - input feature dimension of the gating signal $g$, $F_l$ - input feature dimension of the local features $x$, $F_{int}$ - intermediate feature dimension, $dim$ - spatial dimensions
Outputs: filtered output, gate frequencies ($G1_{freq}$), input feature frequencies ($X1_{freq}$)
Function: AttentionFilter($g$, $x$)
Pseudocode:
$G1_{freq} \gets \text{GlobalFilter}(g, F_g, F_{int}, dim)$
$X1_{freq} \gets \text{GlobalFilter}(x, F_l, F_{int}, dim)$
$atten \gets \text{Softmax}\left(\frac{G1_{freq} \odot X1_{freq}}{\sqrt{2\pi\sigma^2}}\right)$
$x1 \gets \text{irfft2}\left(atten, s=(H, W), dim=(1, 2), \text{norm}='ortho'\right)$
$out \gets \text{NormLayer}(x1 + x)$
Return: $out$, $G1_{freq}$, $X1_{freq}$

My question is: since this is based on learning from frequencies, i.e. the network keeps learning on frequency-domain features similar to a complex-valued NN, what is your opinion of the algorithm I provided? Is there anything I have misunderstood or got wrong?
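For concreteness, here is a minimal PyTorch sketch of the pseudocode above (not the paper's code). It makes several assumptions: `GlobalFilterFreq` is a hypothetical variant of GFNet's `GlobalFilter` that returns the spectrum instead of transforming back; both inputs are assumed to already share the same $(B, H, W, C)$ shape, so the projections to $F_{int}$ are omitted; and since softmax is undefined for complex tensors, it is applied to the real and imaginary parts separately.

```python
import math

import torch
import torch.nn as nn


class GlobalFilterFreq(nn.Module):
    """Hypothetical variant of GFNet's GlobalFilter that returns the
    frequency-domain features instead of transforming back, so the caller
    can combine two spectra directly. Layout: (B, H, W, C)."""

    def __init__(self, dim, h, w):
        super().__init__()
        # Learnable complex weight stored as (real, imag) pairs, GFNet-style;
        # rfft2 halves the last spatial axis, hence w // 2 + 1.
        self.complex_weight = nn.Parameter(
            torch.randn(h, w // 2 + 1, dim, 2) * 0.02)

    def forward(self, x):
        x_freq = torch.fft.rfft2(x.float(), dim=(1, 2), norm='ortho')
        return x_freq * torch.view_as_complex(self.complex_weight)


class AttentionFilter(nn.Module):
    """Sketch of the AttentionFilter pseudocode above."""

    def __init__(self, dim, h, w, sigma=1.0):
        super().__init__()
        self.filter_g = GlobalFilterFreq(dim, h, w)
        self.filter_x = GlobalFilterFreq(dim, h, w)
        self.norm = nn.LayerNorm(dim)
        self.scale = math.sqrt(2 * math.pi * sigma ** 2)

    def forward(self, g, x):
        B, H, W, C = x.shape
        g_freq = self.filter_g(g)   # G1_freq
        x_freq = self.filter_x(x)   # X1_freq
        prod = g_freq * x_freq / self.scale
        # torch.softmax is undefined for complex tensors, so one possible
        # reading is to apply it to the real and imaginary parts separately
        # over the channel dim (this is the step where sigmoid/tanh may be
        # the better choice, as discussed below).
        attn = torch.complex(prod.real.softmax(dim=-1),
                             prod.imag.softmax(dim=-1))
        x1 = torch.fft.irfft2(attn, s=(H, W), dim=(1, 2), norm='ortho')
        out = self.norm(x1 + x)
        return out, g_freq, x_freq
```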
I think it might be interesting to try GFNet in UNet models. One of the most straightforward ideas is to directly replace (some of) the spatial convs in UNet with our global filters. I am not familiar with the Attention Gate, but it seems a bit strange to apply softmax to frequency-domain features. Maybe a sigmoid/tanh, as in the Attention Gate, is a better solution.
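If it helps, here is a minimal sketch of that swap, assuming the `AttentionFilter` sketch above. `sigmoid_gate` and the magnitude-based gating are assumptions, not something from the paper; since the magnitude is non-negative, this gate lives in $[0.5, 1)$, so tanh or centering the input may be worth trying too.

```python
import torch


def sigmoid_gate(prod: torch.Tensor) -> torch.Tensor:
    """One possible reading of the suggestion: gate each frequency bin by a
    sigmoid of its magnitude instead of applying a softmax, which avoids both
    the choice of a softmax dimension and the complex-input problem."""
    gate = torch.sigmoid(prod.abs())  # real-valued, per frequency bin
    return prod * gate                # smoothly attenuates each frequency
```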
Thank you so much for your opinion. I already tried that and, as you said, I got strange behavior from the model: it predicts correctly, but some features get lost somewhere when the softmax is applied along a dimension. I think I will try sigmoid instead; it should work better.
How's the result after you tried sigmoid?