open-mmlab / mmcv

OpenMMLab Computer Vision Foundation
https://mmcv.readthedocs.io/en/latest/
Apache License 2.0

Issue in applying Mask2former: RuntimeError: "ms_deform_attn_forward_cuda" not implemented for 'Half' #2167

Open brakuta opened 2 years ago

brakuta commented 2 years ago

I am trying to train Mask2former for instance segmentation (image size 512*512), but I encountered this issue: RuntimeError: "ms_deform_attn_forward_cuda" not implemented for 'Half'.

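# Imports the snippet below relies on (mmdet 2.x module locations)
import os.path as osp

import mmcv
from mmcv import Config
from mmdet.apis import set_random_seed, train_detector
from mmdet.datasets import build_dataset
from mmdet.models import build_detector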

baseline_cfg_path='/content/mmdetection/configs/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.py'

cfg = Config.fromfile(baseline_cfg_path)

cfg.dataset_type = 'COCODataset'
cfg.classes = ('class_A',)
cfg.data.train.ann_file = r"/content/drive/MyDrive/Samples/Train.json"
cfg.data.train.img_prefix = r"/content/drive/MyDrive/Samples/Train"  # Prefix of image path
cfg.data.train.classes = cfg.classes
cfg.data.train.type = 'CocoDataset'

cfg.data.val.ann_file = r"/content/drive/MyDrive/Samples/Val.json"
cfg.data.val.img_prefix = r"/content/drive/MyDrive/Samples/Val"
cfg.data.val.classes = cfg.classes
cfg.data.val.type = 'CocoDataset'

cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

cfg.data.samples_per_gpu = 1
cfg.data.workers_per_gpu = 1
cfg.runner = dict(type='IterBasedRunner', max_iters=100000)
cfg.optimizer = dict(type='Adam', lr=0.0005, weight_decay=0.0001)
cfg.evaluation.interval = 2000

cfg.evaluation.metric='loss'

cfg.evaluation.save_best='auto' #added

cfg.checkpoint_config.interval = 2000

cfg.runner.max_iters = 40000

cfg.log_config.interval = 50

cfg.log_config.hooks = [ dict(type='TextLoggerHook'), dict(type='TensorboardLoggerHook')]

cfg.fp16 = dict(loss_scale=512.0)
meta = dict()
meta['config'] = cfg.pretty_text
cfg.device = 'cuda'

# Build dataset

datasets = [build_dataset(cfg.data.train)]

# Build the detector

model = build_detector(
    cfg.model,
    train_cfg=cfg.get('train_cfg'),
    test_cfg=cfg.get('test_cfg'))
model.CLASSES = datasets[0].CLASSES
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_detector(model, datasets, cfg, distributed=False, validate=True, meta=meta)

grimoire commented 2 years ago

Try replacing every AT_DISPATCH_FLOATING_TYPES in ms_deform_attn_cuda.cu with AT_DISPATCH_FLOATING_TYPES_AND_HALF, then recompile MMCV.
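The replacement suggested here could be scripted roughly as follows. This is only a sketch: the helper names `add_half_dispatch` and `patch_file` are made up for illustration, and the location of ms_deform_attn_cuda.cu inside your MMCV source checkout is something you need to supply yourself. The regular expression matches only the plain macro, so running it twice does not produce a doubled `_AND_HALF` suffix.

```python
import re
from pathlib import Path

# Matches the plain macro only: the trailing \b fails in front of "_AND_HALF",
# so call sites that were already patched are left untouched.
_MACRO = re.compile(r"\bAT_DISPATCH_FLOATING_TYPES\b")
_REPLACEMENT = "AT_DISPATCH_FLOATING_TYPES_AND_HALF"


def add_half_dispatch(source: str) -> str:
    """Return `source` with every plain dispatch macro widened to include Half."""
    return _MACRO.sub(_REPLACEMENT, source)


def patch_file(cu_path: str) -> int:
    """Patch a .cu file in place and return how many macros were replaced.

    `cu_path` is a hypothetical argument: pass the path to
    ms_deform_attn_cuda.cu in your own MMCV source tree.
    """
    path = Path(cu_path)
    patched, count = _MACRO.subn(_REPLACEMENT, path.read_text())
    if count:
        path.write_text(patched)
    return count
```

Editing the file changes nothing until MMCV is rebuilt from that same source tree; a return value of 0 from `patch_file` would suggest the file was already patched or the path points somewhere else.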

brakuta commented 2 years ago

Thanks for your response. I have changed the file accordingly, but I am still facing the same issue.

zhouzaida commented 2 years ago

After changing the file, did you re-compile mmcv-full?

brakuta commented 2 years ago

I am sorry for this silly question, but what should I do to re-compile mmcv-full? Restart the kernel? Reinstall?

zhouzaida commented 2 years ago

Refer to https://mmcv.readthedocs.io/en/latest/get_started/build.html#build-on-linux-or-macos
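One possible reason the error survives an edit to the .cu file is that Python is still importing a prebuilt mmcv-full package rather than the rebuilt source tree. That is an assumption, not something confirmed in this thread. A minimal way to check which copy gets imported, using only the standard library (the helper name `module_origin` is made up for illustration):

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # If this path is under site-packages rather than your edited checkout,
    # the modified ms_deform_attn_cuda.cu is not the code being run.
    print("mmcv is imported from:", module_origin("mmcv"))
```

If the printed path is not inside the directory where the file was edited, uninstalling the existing package and building from source as described in the linked guide would be the next thing to try.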

WEIZHIHONG720 commented 1 year ago

I am sorry for this silly question, but what should I do to re-compile mmcv-full? Restart the kernel? Reinstall?

Have you solved it?