YanjieZe / GNFactor

[CoRL 2023 Oral] GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields
https://yanjieze.com/GNFactor/
MIT License

xFormers can't load C++/CUDA extensions. xFormers was built for: #10

Open · kjeiun opened this issue 1 month ago

kjeiun commented 1 month ago

I got an error when starting GNFactor training. My environment is:

- GPU: NVIDIA L40S
- CUDA: 11.7 (in a Docker environment)
- Ubuntu: 22.04

It seems that xformers is incompatible with torch 2.4.0, but running `pip install torchvision --upgrade` from installation.md pulls in torch==2.4.0.

Do you know how to handle this? Please help me! :(
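In case it helps others who hit this: the prebuilt xformers 0.0.18 wheel expects PyTorch 2.0.0+cu118 (that is what the warning in the log below reports), so one way out is to pin the whole stack to that pairing instead of letting `pip install torchvision --upgrade` pull in torch 2.4.0. A minimal sketch, not an official fix; the torchvision 0.15.1 pin is an assumption about the release matching torch 2.0.0:

```bash
# Sketch under the assumptions above: reinstall the torch/torchvision pair
# that the prebuilt xformers 0.0.18 wheel was compiled against.
pip install torch==2.0.0 torchvision==0.15.1 --index-url https://download.pytorch.org/whl/cu118

# Reinstall xformers without letting pip's resolver move torch again.
pip install xformers==0.0.18 --no-deps
```

The other direction, keeping torch 2.4.0 and installing an xformers release built against it, should also resolve the mismatch.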

```
Error executing job with overrides: ['method=GNFACTOR_BC', 'rlbench.task_name=GNFACTOR_BC_20240725', 'rlbench.demo_path=/GNFactor/data/train_data', 'replay.path=/GNFactor/replay/GNFACTOR_BC_20240725', 'framework.start_seed=0', 'framework.use_wandb=False', 'method.use_wandb=False', 'framework.wandb_group=GNFACTOR_BC_20240725', 'ddp.num_devices=1', 'ddp.master_port=12345']
Traceback (most recent call last):
  File "/GNFactor/GNFactor/train.py", line 100, in main
    mp.spawn(run_seed_fn.run_seed,
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 282, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method="spawn")

    train_runner.start()
  File "/GNFactor/third_party/YARR/yarr/runners/offline_train_runner.py", line 175, in start
    loss = self._step(i, batch)
  File "/GNFactor/third_party/YARR/yarr/runners/offline_train_runner.py", line 97, in _step
    update_dict = self._agent.update(i, sampled_batch)
  File "/GNFactor/GNFactor/helpers/preprocess_agent.py", line 58, in update
    return self._pose_agent.update(step, replay_sample)
  File "/GNFactor/GNFactor/agents/gnfactor_bc/qattention_stack_agent.py", line 47, in update
    update_dict = qa.update(step, replay_sample)
  File "/GNFactor/GNFactor/agents/gnfactor_bc/qattention_gnfactor_bc_agent.py", line 772, in update
    rendering_loss_dict = self._q(obs,
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/GNFactor/GNFactor/agents/gnfactor_bc/qattention_gnfactor_bc_agent.py", line 227, in forward
    rendering_loss_dict = self._neural_renderer(voxel_feat=voxel_grid_feature, \
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1636, in forward
    else self._run_dd
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=self.attention_op)
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 196, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 294, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 310, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/ops/fmha/dispatch.py", line 98, in _dispatch_fw
    return _run_priority_list(
  File "/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/ops/fmha/dispatch.py", line 73, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(1, 4096, 1, 512) (torch.float32)
     key         : shape=(1, 4096, 1, 512) (torch.float32)
     value       : shape=(1, 4096, 1, 512) (torch.float32)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
    max(query.shape[-1] != value.shape[-1]) > 128
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})
    max(query.shape[-1] != value.shape[-1]) > 128
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    max(query.shape[-1] != value.shape[-1]) > 32
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 512

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
[2024-07-25 07:46:05,821][root][INFO] - Using env device cuda:0.
[2024-07-25 07:46:05,836][root][INFO] - Evaluating seed 0.
[2024-07-25 07:46:06,043][xformers][WARNING] - WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.0+cu118 with CUDA 1108 (you have 2.4.0+cu121)
    Python  3.9.16 (you have 3.9.19)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/triton/softmax.py:30: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
  @custom_fwd(cast_inputs=torch.float16 if _triton_softmax_fp16_enabled else None)
/root/miniconda3/envs/gnfactor/lib/python3.9/site-packages/xformers/triton/softmax.py:87: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
INFO:fvcore.common.checkpoint:[Checkpointer] Loading from sd://v1-3 ...
INFO:iopath.common.file_io:URL https://huggingface.co/CompVis/stable-diffusion-v-1-3-original/resolve/main/sd-v1-3.ckpt cached in /root/.torch/iopath_cache/CompVis/stable-diffusion-v-1-3-original/resolve/main/sd-v1-3.ckpt
WARNING:fvcore.common.checkpoint:The checkpoint state_dict contains keys that are not used by the model: model_ema.{decay, num_updates}
diffusion feature dims: 512, 512, 2560, 1920, 960, 640, 512, 512
foundation model diffusion is build. loss weight: 0.01
NeuralRenderer rgb loss weight: 1.0
INFO:root:# Q Params: 44 M
[rank0]:[W725 07:45:58.101337864 reducer.cpp:1400] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
```
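The error text itself points at the right diagnostic: each "Operator wasn't built" line refers to `python -m xformers.info`, which reports per operator whether its CUDA kernel was compiled for the running torch. These are standard commands, nothing project-specific:

```bash
# Show, operator by operator, whether xFormers' CUDA kernels were built,
# and which PyTorch/CUDA build the wheel was compiled against.
python -m xformers.info

# Confirm the torch/xformers version pairing the warning complains about.
python -c "import torch, xformers; print(torch.__version__, xformers.__version__)"
```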

kjeiun commented 1 month ago

+) This is my pip list:

```
(gnfactor) root@node100:/GNFactor# pip list
Package                   Version            Editable project location
------------------------- ------------------ -------------------------
absl-py                   2.1.0
aiohttp                   3.9.5
aiosignal                 1.3.1
albumentations            1.3.0
antlr4-python3-runtime    4.8
asttokens                 2.4.1
async-timeout             4.0.3
attrs                     23.2.0
black                     24.4.2
blessed                   1.20.0
boto3                     1.34.148
botocore                  1.34.148
certifi                   2024.7.4
cffi                      1.14.2
chardet                   5.2.0
charset-normalizer        3.3.2
click                     8.1.7
clip                      1.0                /CLIP
cloudpickle               3.0.0
cmake                     3.30.1
colorama                  0.4.6
contourpy                 1.2.1
cycler                    0.12.1
decorator                 4.4.2
detectron2                0.6                /detectron2
diffdist                  0.1
docker-pycreds            0.4.0
dotmap                    1.3.30
einops                    0.3.0
exceptiongroup            1.2.2
executing                 2.0.1
filelock                  3.15.4
fonttools                 4.53.1
freetype-py               2.4.0
frozenlist                1.4.1
fsspec                    2024.6.1
ftfy                      6.2.0
future                    1.0.0
fvcore                    0.1.5.post20221221
gitdb                     4.0.11
GitPython                 3.1.43
gnfactor                  0.1.0              /GNFactor/GNFactor
gpustat                   1.1.1
grpcio                    1.65.1
h5py                      3.3.0
html-testRunner           1.2.1
huggingface-hub           0.24.2
hydra-core                1.1.0
idna                      3.7
imageio                   2.34.2
imageio-ffmpeg            0.5.1
importlib_metadata        8.2.0
importlib_resources       6.4.0
iopath                    0.1.9
ipdb                      0.13.13
ipython                   8.18.1
jedi                      0.19.1
Jinja2                    3.1.4
jmespath                  1.0.1
joblib                    1.4.2
jsonpatch                 1.33
jsonpointer               3.0.0
kiwisolver                1.4.5
kornia                    0.6.0
lazy_loader               0.4
lit                       18.1.8
Markdown                  3.6
MarkupSafe                2.1.5
mask2former               0.1
matplotlib                3.9.1
matplotlib-inline         0.1.7
mkl-fft                   1.3.8
mkl-random                1.2.4
mkl-service               2.4.0
moviepy                   1.0.3
mpmath                    1.3.0
multidict                 6.0.5
mypy-extensions           1.0.0
natsort                   8.4.0
networkx                  3.2.1
nltk                      3.8.1
numpy                     1.23.5
nvidia-cublas-cu11        11.10.3.66
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu11    11.7.101
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvrtc-cu11    11.7.99
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu11  11.7.99
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu11         8.5.0.96
nvidia-cudnn-cu12         9.1.0.70
nvidia-cufft-cu11         10.9.0.58
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu11        10.2.10.91
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu11      11.4.0.1
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu11      11.7.4.91
nvidia-cusparse-cu12      12.1.0.106
nvidia-ml-py              12.555.43
nvidia-nccl-cu11          2.14.3
nvidia-nccl-cu12          2.20.5
nvidia-nvjitlink-cu12     12.5.82
nvidia-nvtx-cu11          11.7.91
nvidia-nvtx-cu12          12.1.105
odise                     0.1                /GNFactor/third_party/ODISE
omegaconf                 2.1.1
open-clip-torch           2.0.2
opencv-python             4.6.0.66
opencv-python-headless    4.10.0.84
packaging                 24.1
pandas                    1.4.1
panopticapi               0.1
parso                     0.8.4
pathspec                  0.12.1
pexpect                   4.9.0
pillow                    10.4.0
pip                       24.0
platformdirs              4.2.2
portalocker               2.10.1
proglog                   0.1.10
prompt_toolkit            3.0.47
protobuf                  4.25.4
psutil                    6.0.0
ptyprocess                0.7.0
pure_eval                 0.2.3
pycocotools               2.0.8
pycparser                 2.22
pyDeprecate               0.3.1
pyglet                    2.0.16
Pygments                  2.18.0
pyhocon                   0.3.61
PyOpenGL                  3.1.0
pyparsing                 3.1.2
pyquaternion              0.9.9
pyre-extensions           0.0.23
pyrender                  0.1.45
PyRep                     4.1.0.3
python-dateutil           2.9.0.post0
pytorch-lightning         1.4.2
pytorch3d                 0.7.7              /pytorch3d
pytz                      2024.1
PyYAML                    6.0.1
qudida                    0.0.4
regex                     2024.7.24
requests                  2.32.3
rlbench                   1.2.0              /GNFactor/third_party/RLBench
s3transfer                0.10.2
safetensors               0.4.3
scikit-image              0.24.0
scikit-learn              1.5.1
scipy                     1.13.1
sentencepiece             0.2.0
sentry-sdk                2.11.0
setproctitle              1.3.3
setuptools                61.1.0
six                       1.16.0
smmap                     5.0.1
stable-diffusion-sdkit    2.1.3
stack-data                0.6.3
sympy                     1.13.1
tabulate                  0.9.0
tensorboard               2.17.0
tensorboard-data-server   0.7.2
termcolor                 2.4.0
test-tube                 0.7.5
threadpoolctl             3.5.0
tifffile                  2024.7.24
timeout-decorator         0.5.0
timm                      0.6.11
tokenizers                0.13.3
tomli                     2.0.1
torch                     2.4.0
torchaudio                0.10.0
torchmetrics              0.6.0
torchvision               0.19.0
tornado                   6.4.1
tqdm                      4.66.4
traitlets                 5.14.3
transformers              4.26.1
trimesh                   4.4.3
triton                    3.0.0
typing_extensions         4.11.0
typing-inspect            0.9.0
urllib3                   1.26.19
visdom                    0.2.4
wandb                     0.17.5
wcwidth                   0.2.13
websocket-client          1.8.0
Werkzeug                  3.0.3
wheel                     0.43.0
xformers                  0.0.18
yacs                      0.1.8
yarl                      1.9.4
yarr                      0.1                /GNFactor/third_party/YARR
zipp                      3.19.2
```
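The list confirms the mismatch: torch 2.4.0 alongside xformers 0.0.18, whose prebuilt wheel targets torch 2.0.0 (per the warning above). To keep a future `pip install --upgrade` from silently moving torch again, pip's constraints mechanism can hold the pin; a sketch, with `constraints.txt` as a hypothetical file name:

```bash
# Hypothetical constraints file: hold torch at the version the installed
# xformers wheel was built for (assumption based on the warning above).
echo "torch==2.0.0" > constraints.txt

# -c makes pip honor the pin while resolving the upgrade, so the resolver
# picks a torchvision release compatible with torch 2.0.0.
pip install --upgrade torchvision -c constraints.txt
```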