rayleizhu / BiFormer

[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
https://arxiv.org/abs/2303.08810
MIT License
461 stars, 36 forks

memory problem #21

Closed: yjssa closed this issue 1 year ago

yjssa commented 1 year ago

When I feed a 1×20×600×500 image into nchwBRA, memory usage reaches about 100 GB. Is there any way to reduce the memory usage? (My images cannot be cropped.) Thank you very much.

rayleizhu commented 1 year ago

It is caused by the gather operator, which makes redundant memory copies (k-fold redundancy, where k is the top-k value used for routing). The only way to avoid that is to implement a custom CUDA kernel.

I’ve only implemented an FP32 forward kernel so far, which is faster and more memory-efficient. But there are still a lot of things to do: backward kernel, FP16, tensor core support, etc. :(
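
To make the k-fold redundancy from the gather step concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that every query region gathers the key/value tokens of its top-k routed regions into a new tensor, so the gathered copy holds k times as many elements as the original keys (or values). The function name `gathered_kv_bytes`, the `topk=4` value, and the FP32 element size are illustrative assumptions, not values read from the repository. Padding is ignored, and only the gathered key/value copies are counted; other activations such as attention scores are not.

```python
def gathered_kv_bytes(height, width, channels, topk, bytes_per_elem=4, batch=1):
    """Rough size of the key+value tensors before and after a top-k gather.

    Hypothetical helper for illustration only; it ignores padding and every
    activation other than the keys and values themselves.
    """
    tokens = height * width
    # One key (or value) tensor: one `channels`-dim vector per token.
    base = batch * tokens * channels * bytes_per_elem
    # After the gather, each query region holds its own copy of the tokens
    # from `topk` regions, so the total is `topk` times the original.
    gathered = base * topk
    return {
        "kv_base_bytes": 2 * base,          # keys + values
        "kv_gathered_bytes": 2 * gathered,  # gathered keys + gathered values
        "redundancy": gathered // base,
    }


# Shape from the question (1x20x600x500); topk=4 is an assumed setting.
estimate = gathered_kv_bytes(600, 500, 20, topk=4)
print(estimate)
```

The ratio `redundancy` equals `topk` regardless of image size, which is the copy overhead a fused kernel could avoid by reading the routed regions in place instead of materializing them.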

