Hi @nbl97, thanks for your attention.
For both v1 and v2, softmax is utilized to normalize the random matrix, see https://github.com/sail-sg/metaformer/blob/main/metaformer_baselines.py#L301
I did not put the random mixing code in the poolformer repo. The class `spatialfc` there refers to the spatial MLP.
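For reference, here is a minimal sketch of what a random mixing token mixer along these lines could look like, assuming PyTorch and a `[B, N, C]` token layout; the class name, argument names, and shapes are illustrative rather than the repo's exact implementation:

```python
import torch
import torch.nn as nn

class RandomMixing(nn.Module):
    """Token mixer with a fixed, softmax-normalized random mixing matrix."""
    def __init__(self, num_tokens=196):
        super().__init__()
        # Softmax normalizes each row of the random matrix to sum to 1;
        # requires_grad=False keeps the matrix frozen during training.
        self.random_matrix = nn.Parameter(
            torch.softmax(torch.rand(num_tokens, num_tokens), dim=-1),
            requires_grad=False)

    def forward(self, x):  # x: [B, N, C]
        # Mix tokens along the sequence dimension with the frozen matrix.
        return torch.einsum('mn,bnc->bmc', self.random_matrix, x)
```

Because the matrix is a `Parameter` with `requires_grad=False`, it is stored in the checkpoint but never updated by the optimizer, i.e. it stays frozen.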
@yuweihao Thanks for your clarification! It was my mistake that I got confused between `spatialfc` and random mixing. It seems to me that `spatialfc` is a learnable version of random mixing (ignoring softmax). Can I infer that `spatialfc` outperforms random mixing? If softmax is necessary, would `spatialfc+softmax` achieve better performance than `spatialfc`? Looking forward to your insights.
Yes, `spatialfc` can be regarded as a learnable version of random mixing. Thus, `spatialfc` will outperform random mixing because of its learnable parameters. Since random mixing's parameters cannot be learned, softmax is necessary to normalize the random matrix. I have not conducted experiments for `spatialfc+softmax`, so I am not sure whether it can achieve better performance than `spatialfc`. I guess the performance of `spatialfc+softmax` and `spatialfc` will be similar.
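To make the contrast concrete, here is a minimal sketch of a learnable spatial FC (spatial MLP) token mixer, again assuming PyTorch and a `[B, N, C]` layout; the names are illustrative and not necessarily the exact code in the poolformer repo:

```python
import torch
import torch.nn as nn

class SpatialFc(nn.Module):
    """Learnable spatial MLP: a linear layer that mixes the N token positions."""
    def __init__(self, num_tokens=196):
        super().__init__()
        # Unlike random mixing, this mixing matrix is a trainable parameter,
        # so no softmax normalization is required.
        self.fc = nn.Linear(num_tokens, num_tokens, bias=False)

    def forward(self, x):              # x: [B, N, C]
        x = x.transpose(1, 2)          # [B, C, N]: apply the FC along the token axis
        x = self.fc(x)
        return x.transpose(1, 2)       # back to [B, N, C]
```

The only structural difference from the random mixing sketch above is that the mixing matrix here is learned end to end instead of being fixed at initialization.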
Huge thanks for your explanation and experience~
Welcome~
Hi~ Many thanks for your excellent work and codebase. I still have some questions about the random mixing operator: the class `spatialfc` corresponds to random mixing, but it doesn't seem to be frozen in the codebase. If you could explain how random mixing works, I would appreciate it!