facebookresearch / ToMe

A method to increase the speed and lower the memory footprint of existing vision transformers.
Other
931 stars 67 forks

Does ToMe work for focal modulation networks? #24

Closed subneed closed 1 year ago

subneed commented 1 year ago

Any help on modifying ToMe for focal modulation networks (FMN)? I guess in FMN we could apply ToMe on Q/M. FMN also has downsampling layers in each stage, so would the r value need to change at each stage, along with the model definition?
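
To make the question about r concrete, here is a rough bookkeeping sketch of how the token count might evolve when r differs per stage. Everything in it is an assumption for illustration: the function name `tokens_per_stage`, the stage depths, the 56x56 input grid, and the 4x token reduction between stages are hypothetical and not taken from ToMe or any FMN code. The only ToMe behavior it relies on is that a single merge step removes at most half of the current tokens. It only counts tokens; it does not address how a spatial downsampling layer would operate on tokens that no longer lie on a regular grid after merging.

```python
def tokens_per_stage(input_tokens, depths, r_per_stage, downsample=4):
    """Count tokens left at the end of each stage (bookkeeping only).

    input_tokens: tokens entering the first stage (hypothetical, e.g. 56 * 56).
    depths:       number of blocks in each stage (hypothetical values).
    r_per_stage:  tokens to merge per block, one value per stage.
    downsample:   assumed token reduction factor between stages (2x2 -> 4).
    """
    counts = []
    tokens = input_tokens
    for stage, (depth, r) in enumerate(zip(depths, r_per_stage)):
        if stage > 0:
            # Assumed downsampling layer at the start of every later stage.
            tokens //= downsample
        for _ in range(depth):
            # One merge step can remove at most half of the current tokens.
            tokens -= min(r, tokens // 2)
        counts.append(tokens)
    return counts


if __name__ == "__main__":
    depths = [2, 2, 6, 2]  # hypothetical stage depths
    print(tokens_per_stage(56 * 56, depths, [0, 0, 0, 0]))  # no merging
    print(tokens_per_stage(56 * 56, depths, [8, 8, 8, 8]))  # same r everywhere
```

Because each downsampling step already shrinks the token count, a fixed r removes a much larger fraction of tokens in late stages than in early ones, which is why a per-stage r seems necessary.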

dbolya commented 1 year ago

Duplicate of #23