raoyongming / GFNet

[NeurIPS 2021] [T-PAMI] Global Filter Networks for Image Classification
https://gfnet.ivg-research.xyz/
MIT License

[Q]: Can the Global Filter Network be used for image segmentation? #27

Open deep-matter opened 1 year ago

deep-matter commented 1 year ago

I read the paper; it gives a clear account of how GFNet tackles the computational cost and complexity of self-attention in Vision Transformers. I would like to take this a step further and use it for a segmentation task, using the Global Filter as an attention mechanism inside a U-Net, similar to an Attention Gate.

Algorithm pseudocode

Inputs: $F_g$ - input feature dimension of the global filter, $F_l$ - input feature dimension of the local filter, $F_{int}$ - intermediate feature dimension, $dim$ - spatial dimensions

Outputs: filtered output, gate frequencies, weight frequencies


Function: AttentionFilter($g$, $x$)

  • Input: $g$ - input global feature map, $x$ - input feature map
  • Output: $out$ - filtered output, $G1_{freq}$ - gate frequencies, $X1_{freq}$ - weight frequencies

Pseudocode:

$G1_{freq} \gets \text{GlobalFilter}(g, F_l, F_{int}, dim)$
$X1_{freq} \gets \text{GlobalFilter}(x, F_g, F_{int}, dim)$
$atten \gets \text{Softmax}\left(\frac{G1_{freq} \odot X1_{freq}}{\sqrt{2\pi\sigma^2}}\right)$
$x1 \gets \text{irfft2}\left(atten,\ s=(H, W),\ dim=(1, 2),\ \text{norm}=\text{'ortho'}\right)$
$out \gets \text{NormLayer}(x1 + x)$
Return: $out$, $G1_{freq}$, $X1_{freq}$
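
To make the pseudocode concrete, here is a minimal PyTorch sketch. It is not the repo's implementation: `FreqFilter` is a hypothetical helper that, unlike GFNet's GlobalFilter, returns the frequency-domain tensor so the two branches can be combined before the inverse FFT, and since softmax on a complex spectrum is ill-defined, the gate here is a sigmoid over the magnitude of the combined spectrum, applied multiplicatively to the $x$-branch spectrum:

```python
import math
import torch
import torch.nn as nn
import torch.fft


class FreqFilter(nn.Module):
    """Hypothetical helper: like GFNet's GlobalFilter, but returns the
    frequency-domain tensor instead of inverting it, so the two branches
    can be combined before the irfft2. Assumes NHWC input and a fixed
    feature-map size (the learnable spectrum is shaped for one H x W)."""
    def __init__(self, channels, h, w):
        super().__init__()
        # learnable complex weight over the rfft2 half-spectrum
        self.complex_weight = nn.Parameter(
            torch.randn(h, w // 2 + 1, channels, 2) * 0.02)

    def forward(self, x):
        # x: (B, H, W, C) -> complex spectrum (B, H, W//2 + 1, C)
        X = torch.fft.rfft2(x, dim=(1, 2), norm='ortho')
        return X * torch.view_as_complex(self.complex_weight)


class AttentionFilter(nn.Module):
    """Sketch of the pseudocode above; the sigmoid gate is a deviation
    from the softmax written there (see the discussion below)."""
    def __init__(self, channels, h, w, sigma=1.0):
        super().__init__()
        self.filter_g = FreqFilter(channels, h, w)
        self.filter_x = FreqFilter(channels, h, w)
        self.norm = nn.LayerNorm(channels)
        self.scale = math.sqrt(2 * math.pi * sigma ** 2)

    def forward(self, g, x):
        B, H, W, C = x.shape
        G1_freq = self.filter_g(g)
        X1_freq = self.filter_x(x)
        # real-valued gate in (0, 1) from the combined spectra
        atten = torch.sigmoid((G1_freq * X1_freq).abs() / self.scale)
        x1 = torch.fft.irfft2(atten * X1_freq, s=(H, W),
                              dim=(1, 2), norm='ortho')
        out = self.norm(x1 + x)
        return out, G1_freq, X1_freq
```

Whether to gate only the magnitude, as above, or the full complex spectrum is a design choice worth ablating.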

My question is: since this keeps the neural net learning directly on frequency-domain features, similar to a complex-valued NN, what is your opinion of the algorithm above? Is there anything I misunderstood or got wrong?
raoyongming commented 1 year ago

I think it might be interesting to try GFNet in U-Net models. One of the most straightforward ideas is to directly replace (some of) the spatial convs in the U-Net with our global filters. I am not familiar with the Attention Gate, but it seems a bit strange to apply softmax to frequency-domain features. A sigmoid/tanh gate, as in the Attention Gate, may be a better solution.
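
For reference, a minimal sketch of that replacement (the `GlobalFilter2d` name and NCHW layout are assumptions; it is adapted from the repo's GlobalFilter, which learns a complex weight over the rfft2 spectrum):

```python
import torch
import torch.nn as nn
import torch.fft


class GlobalFilter2d(nn.Module):
    """Hypothetical drop-in for a spatial conv in a U-Net block (NCHW).
    The learnable complex weight is shaped for one fixed H x W, so each
    U-Net stage needs its own instance sized to its feature maps."""
    def __init__(self, channels, h, w):
        super().__init__()
        self.complex_weight = nn.Parameter(
            torch.randn(channels, h, w // 2 + 1, 2) * 0.02)

    def forward(self, x):
        B, C, H, W = x.shape
        # mix tokens globally in the frequency domain
        X = torch.fft.rfft2(x, dim=(2, 3), norm='ortho')
        X = X * torch.view_as_complex(self.complex_weight)
        return torch.fft.irfft2(X, s=(H, W), dim=(2, 3), norm='ortho')


# e.g. in an encoder block, swap the second 3x3 conv:
#   self.conv2 = nn.Conv2d(c, c, 3, padding=1)
# becomes
#   self.conv2 = GlobalFilter2d(c, h, w)  # h, w = feature-map size at this stage
```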

deep-matter commented 1 year ago

Thank you so much for your opinion. I already tried it, and as you said, I got strange behavior: the model predicts correctly, but some features get lost somewhere, which I attribute to the softmax being applied along a dimension. I think sigmoid will work better.
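
For concreteness (hypothetical shapes): softmax normalizes the gate across a dimension, so the entries compete and most are suppressed, whereas sigmoid gates each entry independently:

```python
import torch

A = torch.randn(1, 8, 5, 16)   # hypothetical (B, H, W_freq, C) gate logits
soft = A.softmax(dim=-1)       # entries along C compete and sum to 1,
                               # so most are pushed toward zero
gate = torch.sigmoid(A)        # each entry gated independently in (0, 1)
```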

Billy-ZTB commented 6 months ago

> Thank you so much for your opinion. I already tried it, and as you said, I got strange behavior: the model predicts correctly, but some features get lost somewhere, which I attribute to the softmax being applied along a dimension. I think sigmoid will work better.

How's the result after you tried sigmoid?