Closed subneed closed 1 year ago
Any help on modifying ToMe for focal modulation networks (FMN)? I guess in FMN we could apply ToMe on Q/M. Also, FMN has downsampling layers in each stage, so would the r value need to change for each stage, along with the model definition?
Duplicate of #23