wusize / CLIPSelf

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
https://arxiv.org/abs/2310.01403

Environment problem #10

Open zhangyupeng123 opened 4 months ago

zhangyupeng123 commented 4 months ago

Hello, when I run the code I get this error: FViT: EvaCLIPViT: Model config for EVA02-CLIP-B-16 not found. A second problem: when installing xformers, pip says torch 2.2.0 is required, which seems to conflict with the installed mmcv and torch. How can I solve these? Thanks!

wusize commented 4 months ago

Hi, please try an older xformers version, e.g., v0.0.19.
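For reference, the plain pip form of that suggestion (note: as the next reply shows, this may pull in a different torch unless dependencies are excluded):

pip install xformers==0.0.19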

zhangyupeng123 commented 4 months ago

Thank you for your assistance. I downgraded xformers to 0.0.19, but that automatically upgraded PyTorch to version 2.0.0. As a result, do I also need to update mmcv and mmdetection?

wusize commented 4 months ago

Hi! You can install xformers from source and use --no-deps to avoid automatically updating pytorch.
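A minimal sketch of that route, assuming a CUDA toolkit with nvcc compatible with the installed torch is available and that the checked-out tag builds against it (v0.0.19 is just the version suggested above):

git clone --recursive https://github.com/facebookresearch/xformers.git
cd xformers
git checkout v0.0.19
git submodule update --init --recursive
pip install -e . --no-deps -v

The --no-deps flag is what keeps pip from replacing the existing torch/mmcv installs.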

zhangyupeng123 commented 4 months ago

Thank you for your response. I have successfully installed it following your method, but I encountered a problem: "FViT: EvaCLIPViT: Model config for EVA02-CLIP-B-16 not found." What could be the reason for this?

wusize commented 4 months ago

May I know how you installed openclip?

zhangyupeng123 commented 4 months ago

First I downloaded version 2.16.0 of open-clip-torch from GitHub, and then used the command "pip install -e . -v".

wusize commented 4 months ago

The CLIPSelf repo is adapted from open-clip 2.16.0. Please run

cd ./CLIPSelf
pip install -e . -v

to install the open_clip copy modified by us.
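A quick sanity check (not part of the official instructions) that Python is picking up the modified copy rather than a previously installed open-clip-torch:

python -c "import open_clip; print(open_clip.__file__)"

The printed path should point into the CLIPSelf source tree; if it points into site-packages, uninstall the stock open-clip-torch first.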

zhangyupeng123 commented 4 months ago

Thank you very much for your guidance. I installed it following your method, but then I encountered another problem: "FViT: EvaCLIPViT: No module named 'fused_layer_norm_cuda'". I researched it and it seems to be related to the installation of apex, but the install fails when I run

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./

Are there any requirements or precautions when installing apex? (Currently, my CUDA version is 11.6, and I installed torch 1.11.0+cu113 to match mmcv 1.7.0 and mmdetection 2.28.1.)

wusize commented 4 months ago

apex is not used in this repo. Can you give a more detailed error message?
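As the follow-up below confirms, simply removing the broken apex install resolves the missing-module error. The fused-LayerNorm import in the EVA ViT code appears to be guarded by a try/except, so once apex is gone the code falls back to torch's own LayerNorm:

pip uninstall -y apex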

zhangyupeng123 commented 4 months ago

Exception occurred: NotImplementedError

Traceback (most recent call last):
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/F-ViT/train.py", line 251, in <module>
    main()
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/F-ViT/train.py", line 240, in main
    train_detector(
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmdetection/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmcv/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmcv/mmcv/runner/epoch_based_runner.py", line 53, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmcv/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmcv/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmdetection/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmcv/mmcv/runner/fp16_utils.py", line 149, in new_func
    output = old_func(*new_args, **new_kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/mmdetection/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/F-ViT/models/fvit.py", line 43, in forward_train
    res_feats = self.backbone(img)
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/F-ViT/models/evaclip_vit.py", line 92, in forward
    x = blk(x, rel_pos_bias=rel_pos_bias)
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/src/open_clip/eva_clip/eva_vit_model.py", line 306, in forward
    x = x + self.drop_path(self.attn(self.norm1(x), rel_pos_bias=rel_pos_bias, attn_mask=attn_mask))
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/21T/zhangyupeng/code/CLIPSelf/src/open_clip/eva_clip/eva_vit_model.py", line 211, in forward
    x = xops.memory_efficient_attention(
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 306, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp)
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 104, in _dispatch_fw
    return _run_priority_list(
  File "/home/zhangyupeng/anaconda3/envs/clipself/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 79, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
    query : shape=(2, 1601, 12, 64) (torch.float16)
    key : shape=(2, 1601, 12, 64) (torch.float16)
    value : shape=(2, 1601, 12, 64) (torch.float16)
    attn_bias : <class 'NoneType'>
    p : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    has custom scale
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 64

Thank you for the reminder. After uninstalling apex, that error was gone. However, I then hit the problem shown above. I want to use CLIPSelf for downstream detection tasks. Should I use 'F-ViT/configs/ov_coco/fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_eva_original.py'?
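For the attention error above, the message itself carries the diagnosis: every backend reports "xFormers wasn't build with CUDA support", i.e. the installed xformers has no compiled CUDA kernels. xformers' own diagnostic command (referenced in the error text) shows which operators were actually built:

python -m xformers.info

If the memory-efficient attention operators are listed as unavailable there, rebuilding xformers from source against the installed torch 1.11/cu113, as suggested earlier in the thread, is the likely fix.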

zhangyupeng123 commented 4 months ago

Hi~ Have you encountered the issue mentioned above? Also, I want to use CLIPSelf for downstream detection tasks. Should I use 'F-ViT/configs/ov_coco/fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_eva_original.py'?
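For reference, a hypothetical launch with that config, assuming F-ViT/train.py keeps the usual mmdetection train-script interface (config path as the first positional argument; check the repo README for the exact command):

cd F-ViT
python train.py configs/ov_coco/fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_eva_original.py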

zhiyustar commented 2 months ago

> Hi! You can install xformers from source and use --no-deps to avoid automatically updating pytorch.

Hi! I installed xformers with

pip install ninja
pip install -v -U git+https://github.com/facebookresearch/xformers.git@7e05e2caaaf8060c1c6baadc2b04db02d5458a94

and got version '0.0.15+7e05e2c.d20240425', but I met this error:

NotImplementedError: Could not run 'xformers::efficient_attention_forward_cutlass' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'xformers::efficient_attention_forward_cutlass' is only available for these backends: [UNKNOWN_TENSOR_TYPE_ID, QuantizedXPU, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseCPU, SparseCUDA, SparseHIP, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, SparseVE, UNKNOWN_TENSOR_TYPE_ID, NestedTensorCUDA, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID, UNKNOWN_TENSOR_TYPE_ID].

When I reinstalled xformers with

pip install xformers==0.0.19 --no-deps

another error occurred:

NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
    query : shape=(1, 4097, 12, 64) (torch.float16)
    key : shape=(1, 4097, 12, 64) (torch.float16)
    value : shape=(1, 4097, 12, 64) (torch.float16)
    attn_bias : <class 'NoneType'>
    p : 0.0
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    requires A100 GPU
`smallkF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    max(query.shape[-1] != value.shape[-1]) > 32
    has custom scale
    Operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 64

How can I fix this? Thanks.
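Both failures point the same way: neither install produced usable CUDA kernels (the source build at that pinned commit likely compiled without a working nvcc, and the 0.0.19 wheel does not match the local torch). Before reinstalling, it is worth confirming with xformers' own diagnostic, then repeating the source-build sketch given earlier in the thread with a CUDA toolkit that matches the installed torch:

python -m xformers.info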