rayleizhu / BiFormer

[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
https://arxiv.org/abs/2303.08810
MIT License
460 stars, 36 forks

Problems with using the biformer attention mechanism #8

Closed: Gaoang1 closed this issue 1 year ago

Gaoang1 commented 1 year ago

RuntimeError: scatter_add_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation, or you can use the 'warn_only=True' option, if that's accepta...

rayleizhu commented 1 year ago

RuntimeError: scatter_add_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation, or you can use the 'warn_only=True' option, if that's accepta...

I did not encounter this error in any of my experiments, so I cannot offer suggestions. Did you change something?

emm110112 commented 1 year ago

I am having the same problem. May I ask what causes it?

Gaoang1 commented 1 year ago

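I don't know what causes it, but you can work around it by changing the call torch.use_deterministic_algorithms(True) to torch.use_deterministic_algorithms(False). As the error message itself says, passing warn_only=True is another option if you only want a warning instead of an error.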
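
For anyone landing here later, the two options named in the error message can be sketched as below. This is only a minimal illustration, not code from the BiFormer repository: the helper name `set_determinism` and its `mode` argument are made up for this example, and it assumes a PyTorch version recent enough to support the `warn_only` keyword.

```python
import torch


def set_determinism(mode: str = "warn") -> None:
    """Hypothetical helper (not part of BiFormer) to switch determinism modes.

    "strict": non-deterministic ops raise a RuntimeError
    "warn":   non-deterministic ops only emit a warning
    "off":    no determinism checks at all
    """
    if mode == "strict":
        torch.use_deterministic_algorithms(True)
    elif mode == "warn":
        # The 'warn_only=True' option mentioned in the error message.
        torch.use_deterministic_algorithms(True, warn_only=True)
    elif mode == "off":
        # The workaround reported in this thread.
        torch.use_deterministic_algorithms(False)
    else:
        raise ValueError(f"unknown mode: {mode!r}")


set_determinism("warn")

# A small scatter_add_ call, the operation named in the error message.
# With mode="warn", an op that lacks a deterministic implementation
# produces a warning instead of stopping the run.
device = "cuda" if torch.cuda.is_available() else "cpu"
index = torch.tensor([0, 1, 0, 1], device=device)
src = torch.ones(4, device=device)
out = torch.zeros(2, device=device).scatter_add_(0, index, src)
print(out)
```

Note that the flag is global, so if a training script or framework sets torch.use_deterministic_algorithms(True) somewhere before the model runs, that call is the place to change. Turning determinism off or down to a warning trades exact run-to-run reproducibility for being able to run the operation at all.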