lucidrains / performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch
MIT License
1.09k stars 144 forks

unable to import cuda code for auto-regressive Performer #15

Open batrlatom opened 4 years ago

batrlatom commented 4 years ago

Hi, I have tried to test your implementation, but I am having trouble getting it to run correctly. Do you know what the problem could be?

env: fresh conda env, python 3.69, cuda 11, pytorch 1.7; card: gtx1080ti

amirardalan9473 commented 4 years ago

Having the same issue

qazwsxal commented 4 years ago

This library requires pytorch-fast-transformers to be installed because it uses that library's implementations of causal linear attention. When fast-transformers is installed, it attempts to call the CUDA compiler nvcc to build memory-optimised CUDA versions of some functions. If the optimised versions can't be compiled for any reason, performer-pytorch falls back to its own, less efficient implementation and prints this warning. The code should still run but will use more memory.
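The fallback described above follows a common optional-import pattern. Here is a minimal sketch of that pattern, not the library's actual code: the helper names `load_optional_backend` and `causal_attention_backend` are hypothetical, and the module path `fast_transformers.causal_product` is an assumption used only for illustration.

```python
import importlib
import warnings


def load_optional_backend(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Mirrors the spirit of the warning in this issue: keep running,
        # but tell the user the optimised path is unavailable.
        warnings.warn(
            f"unable to import {module_name} ({exc}); "
            "defaulting to the slower fallback implementation"
        )
        return None


def causal_attention_backend(module_name="fast_transformers.causal_product"):
    """Report which path would be used (module path is an assumption)."""
    backend = load_optional_backend(module_name)
    return "optimised" if backend is not None else "fallback"
```

Note that a `try`/`except ImportError` only catches Python-level import failures. A crash inside a compiled extension (such as a segmentation fault) terminates the process before any `except` clause can run.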

qazwsxal commented 4 years ago

https://github.com/idiap/fast-transformers/issues/23#issuecomment-693323065 This comment should explain how to install the pytorch-fast-transformers library in a way that works

AliOskooeiTR commented 3 years ago

I get the same error and it just crashes even when I try to run it on cpu.

AliOskooeiTR commented 3 years ago

@qazwsxal I successfully installed pytorch-fast-transformers but still get this error on both CPU and GPU: "unable to import cuda code for auto-regressive Performer. will default to the memory inefficient non-cuda version", followed by "Segmentation fault". I would appreciate any help or pointers. Thank you.

qazwsxal commented 3 years ago

As this library is written in 100% Python, a segfault is unlikely to be caused by anything here. Without a full stack trace, it's also going to be difficult to figure out why the crash is happening.
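One way to get at least a Python-level stack trace from a segfault is the standard-library `faulthandler` module. This is a minimal sketch, assuming the crash happens somewhere after the import of the affected package. The commented-out import is only a placeholder for whatever code crashes on your machine.

```python
import faulthandler

# Dump the Python traceback of all threads to stderr if the process
# receives a fatal signal such as SIGSEGV or SIGABRT.
faulthandler.enable()

# Placeholder: put the code that crashes below this line, for example
# import performer_pytorch
```

The same effect is available without editing the script by running it with `python -X faulthandler your_script.py`. The resulting traceback shows which Python line was executing when the native code crashed, which is usually enough to tell which compiled extension is responsible.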

iago-suarez commented 3 years ago

Having the same issue:

vasiliyeskin commented 3 years ago

Dear friends, I solved this problem on my machine. In a terminal:

  1. Install the patches for CUDA Toolkit 10.2: https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
  2. Install wheel
  3. Install ninja
  4. Install fast-transformers from git: python3.8 -m pip install git+https://github.com/idiap/fast-transformers.git
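Before or after following the steps above, it can help to check which of the prerequisites are actually visible from the Python environment in use. Here is a rough diagnostic sketch under stated assumptions: the function name `check_build_prereqs` is hypothetical, `fast_transformers` is assumed to be the import name of the installed package, and the check only looks for presence, not for version compatibility.

```python
import importlib.util
import shutil


def check_build_prereqs():
    """Return a dict of prerequisite name -> whether it was found."""
    return {
        # CUDA compiler must be on PATH for the extensions to build.
        "nvcc": shutil.which("nvcc") is not None,
        # ninja is looked up as an executable on PATH.
        "ninja": shutil.which("ninja") is not None,
        # wheel is looked up as an importable Python package.
        "wheel": importlib.util.find_spec("wheel") is not None,
        # Assumed import name of the installed library.
        "fast_transformers": importlib.util.find_spec("fast_transformers") is not None,
    }


if __name__ == "__main__":
    for name, found in check_build_prereqs().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `nvcc` is reported missing even though the CUDA toolkit is installed, the toolkit's `bin` directory is probably not on `PATH` for the shell that runs pip.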

My system is