ZZZHANG-jx / DocRes

[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
MIT License

about position embedding #17

Open zhaozhaoooo opened 1 month ago

zhaozhaoooo commented 1 month ago

Why doesn't the architecture need a position embedding?

ZZZHANG-jx commented 1 month ago

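My understanding is that in Restormer, the self-attention module, Multi-Dconv Head Transposed Attention (MDTA), operates along the channel dimension rather than the spatial dimension. The attention map is computed between channels, so its size depends on the channel count and not on H×W. Since no attention is computed between spatial positions, a position embedding is not necessary; local spatial context comes from the depth-wise convolutions inside MDTA instead.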
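To make the channel-wise attention concrete, here is a minimal pure-Python sketch of the core of a transposed (channel) attention step. It is an illustration only, not the DocRes or Restormer code: the function names (`channel_attention`, `random_features`) are hypothetical, it uses a single head, and it leaves out the 1×1 and depth-wise convolutions that produce Q, K, and V, as well as the learned temperature.

```python
import math
import random


def l2_normalize(rows):
    """Scale each row (one channel's flattened spatial response) to unit length."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row)) or 1.0  # guard against all-zero rows
        out.append([x / norm for x in row])
    return out


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(q, k, v, temperature=1.0):
    """Single-head channel attention.

    q, k, v: lists of C rows, each row holding H*W values.
    Returns (output, attention_map).
    """
    q, k = l2_normalize(q), l2_normalize(k)
    # attn[i][j] is a dot product over ALL spatial positions of channels i and j,
    # so the map is C x C no matter how large H*W is.
    attn = [
        softmax([temperature * sum(a * b for a, b in zip(qi, kj)) for kj in k])
        for qi in q
    ]
    # Each output channel is a weighted mix of the value channels.
    n_pos = len(v[0])
    out = [
        [sum(attn[i][j] * v[j][p] for j in range(len(v))) for p in range(n_pos)]
        for i in range(len(q))
    ]
    return out, attn


def random_features(channels, height, width, seed):
    """Hypothetical stand-in for a feature map, flattened to C x (H*W)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(height * width)] for _ in range(channels)]


C = 4
for h, w in [(8, 8), (16, 32)]:
    q = random_features(C, h, w, seed=0)
    k = random_features(C, h, w, seed=1)
    v = random_features(C, h, w, seed=2)
    out, attn = channel_attention(q, k, v)
    print((h, w), "attention map:", len(attn), "x", len(attn[0]),
          "| output:", len(out), "x", len(out[0]))
```

For both input sizes the attention map stays 4 × 4. The "tokens" being attended over are channels, which already have a fixed identity through the learned weights, so there is no sequence of spatial tokens whose order would need to be encoded.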