zhaozhaoooo opened this issue 1 month ago
Why doesn't the architecture need position embedding?
My understanding is that in Restormer, the self-attention module, Multi-Dconv Head Transposed Attention (MDTA), computes attention along the channel dimension rather than the spatial dimension. The attention map is C×C instead of HW×HW, so there is no sequence of spatial tokens whose order needs to be encoded, and position embedding is not necessary. Local spatial context is instead captured by the depth-wise convolutions inside MDTA.
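For intuition, here is a minimal sketch of transposed (channel-wise) attention, roughly following the MDTA description in the paper. The class name, head count, and layer names are illustrative and not taken from the official implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttentionSketch(nn.Module):
    """Sketch of transposed (channel-wise) self-attention in the spirit of MDTA.

    The attention map has shape (heads, C/heads, C/heads): it mixes channels,
    not spatial positions, so no positional embedding is needed. Local spatial
    context comes from the 3x3 depth-wise convolutions instead.
    """
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.num_heads = num_heads
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        # 1x1 conv mixes channels; 3x3 depth-wise conv encodes local spatial context
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        self.qkv_dw = nn.Conv2d(dim * 3, dim * 3, kernel_size=3, padding=1, groups=dim * 3)
        self.project_out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv_dw(self.qkv(x)).chunk(3, dim=1)

        # flatten spatial dims: (b, heads, c_per_head, h*w)
        q = q.reshape(b, self.num_heads, c // self.num_heads, h * w)
        k = k.reshape(b, self.num_heads, c // self.num_heads, h * w)
        v = v.reshape(b, self.num_heads, c // self.num_heads, h * w)

        # normalize along the spatial axis, then build a channel-to-channel map
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature  # (b, heads, c_ph, c_ph)
        attn = attn.softmax(dim=-1)

        out = attn @ v                                        # (b, heads, c_ph, h*w)
        out = out.reshape(b, c, h, w)
        return self.project_out(out)

# The attention map is C/heads x C/heads regardless of input resolution,
# so there are no spatial tokens whose order would have to be distinguished.
x = torch.randn(1, 48, 64, 64)
print(ChannelAttentionSketch(dim=48).forward(x).shape)  # torch.Size([1, 48, 64, 64])
```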