Closed ZHIZIHUABU closed 10 months ago
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc" -c csrc/pairwise/pairwise.cu -o build\temp.win-amd64-3.8\Release\csrc/pairwise/pairwise.obj -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\torch\csrc\api\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\TH -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -ID:\Anaconda3\envs\boxinstseg\include -ID:\Anaconda3\envs\boxinstseg\Include "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=pairwise_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --use-local-env
pairwise.cu
csrc/pairwise/pairwise.cu(162): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
argument types are: (long long, const long)
csrc/pairwise/pairwise.cu(162): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (
csrc/pairwise/pairwise.cu(162): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
argument types are: (long long, const long)
csrc/pairwise/pairwise.cu(162): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (
csrc/pairwise/pairwise.cu(187): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
argument types are: (long long, const long)
csrc/pairwise/pairwise.cu(187): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (
csrc/pairwise/pairwise.cu(187): error: no instance of function template "at::cuda::ATenCeilDiv" matches the argument list
argument types are: (long long, const long)
csrc/pairwise/pairwise.cu(187): error: no instance of overloaded function "std::min" matches the argument list
argument types are: (
8 errors detected in the compilation of "csrc/pairwise/pairwise.cu".
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc.exe' failed with exit code 1
"C:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -IE:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\torch\csrc\api\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\TH -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -ID:\Anaconda3\envs\boxinstseg\include -ID:\Anaconda3\envs\boxinstseg\Include "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" /EHsc /TpE:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src\mst\boruvka.cpp /Fobuild\temp.win-amd64-3.8\Release\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src\mst\boruvka.obj /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=tree_filter_cuda -D_GLIBCXX_USE_CXX11_ABI=0
cl : Command line warning D9002 : ignoring unknown option '-O3'
boruvka.cpp
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc" -c E:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src\mst\mst.cu -o build\temp.win-amd64-3.8\Release\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src\mst\mst.obj -IE:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\ops\tree_filter\src -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\torch\csrc\api\include -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\TH -ID:\Anaconda3\envs\boxinstseg\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -ID:\Anaconda3\envs\boxinstseg\include -ID:\Anaconda3\envs\boxinstseg\Include "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.6\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=tree_filter_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --use-local-env
mst.cu
E:/PycharmCode/cuda11.6/BoxInstSeg-main/mmdet/ops/tree_filter/src/mst/mst.cu(102): error: expression must have a constant value
E:/PycharmCode/cuda11.6/BoxInstSeg-main/mmdet/ops/tree_filter/src/mst/mst.cu(102): note: the value of variable "batch_size" (90): here cannot be used as a constant
1 error detected in the compilation of "E:/PycharmCode/cuda11.6/BoxInstSeg-main/mmdet/ops/tree_filter/src/mst/mst.cu".
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc.exe' failed with exit code 1
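Both failures in the logs above look like portability issues rather than CUDA problems. The `(long long, const long)` argument list in the pairwise.cu errors suggests a 64-bit count being combined with a `long` constant: with MSVC `long` is 32 bits and `int64_t` is `long long`, so a template that takes two arguments of one type `T` cannot deduce `T`, while on 64-bit Linux `int64_t` is typically `long` and the same call compiles. The mst.cu error (`batch_size` "cannot be used as a constant") is what MSVC reports when a runtime value is used as an array size, which GCC-based toolchains tolerate as an extension. The sketch below shows the usual shape of both workarounds in plain C++. It is an assumption about what lines 162/187 and 102 do, not the actual source: `ceil_div` is a hypothetical stand-in for `at::cuda::ATenCeilDiv`, and `grid_size`, `make_batch_offsets`, `threads`, `max_blocks`, and `stride` are illustrative names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for at::cuda::ATenCeilDiv: both parameters share one
// template type T, so the two argument types must match exactly.
template <typename T>
T ceil_div(T a, T b) {
  return (a + b - 1) / b;
}

// Workaround 1 (assumed pattern for pairwise.cu): declare the constants as
// std::int64_t instead of `long`, so every argument passed to ceil_div and
// std::min has the same type on both Windows and Linux.
std::int64_t grid_size(std::int64_t output_size) {
  const std::int64_t threads = 512;      // illustrative threads per block
  const std::int64_t max_blocks = 4096;  // illustrative upper bound on blocks
  return std::min(ceil_div(output_size, threads), max_blocks);
}

// Workaround 2 (assumed pattern for mst.cu): `int offsets[batch_size];` is a
// variable-length array, which MSVC rejects. A std::vector sized at run time
// is standard C++ and behaves the same everywhere.
std::vector<int> make_batch_offsets(int batch_size, int stride) {
  std::vector<int> offsets(static_cast<std::size_t>(batch_size));
  for (int i = 0; i < batch_size; ++i) {
    offsets[static_cast<std::size_t>(i)] = i * stride;
  }
  return offsets;
}
```

If the original code needs a raw pointer, for example to pass to a kernel launch or a CUDA API call, `offsets.data()` provides one without changing the rest of the logic.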
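@ZHIZIHUABU Hello, we have not yet tested building or running on Windows; we recommend using Ubuntu. You could try commenting out the operator-related code and using the discobox model to see whether it runs, since that model does not require compiling the operators.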
I later modified pairwise.cu and mst.cu, and the build now succeeds. However, training on my own dataset hangs, and GPU memory usage stays very low. What could be causing this?
D:\Anaconda3\envs\boxinstseg\python.exe E:/PycharmCode/cuda11.6/BoxInstSeg-main/tools/train.py
E:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\utils\setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
E:\PycharmCode\cuda11.6\BoxInstSeg-main\mmdet\utils\setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
2023-12-27 15:46:59,855 - mmdet - INFO - Distributed training: False
2023-12-27 15:47:00,150 - mmdet - INFO - Config:
dataset_type = 'CocoDataset' data_root = '../datasets/DuanMian/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadImageFromFile', to_float32=True), dict(type='LoadAnnotations', with_bbox=True, with_mask=False), dict(type='GenerateBoxMask'), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Resize', img_scale=(640, 640), ratio_range=(0.1, 2.0), multiscale_mode='range', keep_ratio=True), dict( type='RandomCrop', crop_size=(640, 640), crop_type='absolute', recompute_bbox=True, allow_negative_crop=True), dict( type='FilterAnnotations', min_gt_bbox_wh=(1e-05, 1e-05), keep_empty=True), dict( type='Pad', size=(640, 640), pad_val=dict(img=(128, 128, 128), masks=0, seg=255)), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='DefaultFormatBundle', img_to_float=True), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']) ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(800, 512), 
flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Pad', size_divisor=32, pad_val=dict(img=(128, 128, 128), masks=0, seg=255)), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ] data = dict( samples_per_gpu=4, workers_per_gpu=2, train=dict( type='CocoDataset', ann_file='../datasets/DuanMian/train.json', img_prefix='../datasets/DuanMian/train/JPEGImages/', pipeline=[ dict(type='LoadImageFromFile', to_float32=True), dict(type='LoadAnnotations', with_bbox=True, with_mask=False), dict(type='GenerateBoxMask'), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Resize', img_scale=(640, 640), ratio_range=(0.1, 2.0), multiscale_mode='range', keep_ratio=True), dict( type='RandomCrop', crop_size=(640, 640), crop_type='absolute', recompute_bbox=True, allow_negative_crop=True), dict( type='FilterAnnotations', min_gt_bbox_wh=(1e-05, 1e-05), keep_empty=True), dict( type='Pad', size=(640, 640), pad_val=dict(img=(128, 128, 128), masks=0, seg=255)), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='DefaultFormatBundle', img_to_float=True), dict( type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']) ]), val=dict( type='CocoDataset', ann_file='../datasets/DuanMian/val.json', img_prefix='../datasets/DuanMian/val/JPEGImages/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(800, 512), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Pad', size_divisor=32, pad_val=dict(img=(128, 128, 128), masks=0, seg=255)), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ]), test=dict( type='CocoDataset', 
ann_file='../datasets/DuanMian/val.json', img_prefix='../datasets/DuanMian/val/JPEGImages/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(800, 512), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Pad', size_divisor=32, pad_val=dict(img=(128, 128, 128), masks=0, seg=255)), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ])) evaluation = dict( interval=10, metric=['bbox', 'segm'], dynamic_intervals=[(5001, 5000)]) log_config = dict( interval=10, hooks=[ dict(type='TextLoggerHook', by_epoch=False), dict(type='TensorboardLoggerHook', by_epoch=False) ]) custom_hooks = [dict(type='NumClassCheckHook')] dist_params = dict(backend='nccl') log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 10)] opencv_num_threads = 0 mp_start_method = 'fork' auto_scale_lr = dict(enable=False, base_batch_size=16) num_thing_classes = 1 num_stuff_classes = 0 num_classes = 1 model = dict( type='Box2Mask', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=-1, norm_cfg=dict(type='BN', requires_grad=False), norm_eval=True, style='pytorch', init_cfg=dict( type='Pretrained', checkpoint= 'https://download.pytorch.org/models/resnet50-11ad3fa6.pth')), panoptic_head=dict( type='Box2MaskHead', in_channels=[256, 512, 1024, 2048], strides=[4, 8, 16, 32], feat_channels=256, out_channels=256, num_things_classes=1, num_stuff_classes=0, num_queries=100, num_transformer_feat_level=3, pixel_decoder=dict( type='MSDeformAttnPixelDecoder', num_outs=3, norm_cfg=dict(type='GN', num_groups=32), act_cfg=dict(type='ReLU'), encoder=dict( type='DetrTransformerEncoder', num_layers=6, transformerlayers=dict( type='BaseTransformerLayer', attn_cfgs=dict( type='MultiScaleDeformableAttention', embed_dims=256, num_heads=8, num_levels=3, 
num_points=4, im2col_step=64, dropout=0.0, batch_first=False, norm_cfg=None, init_cfg=None), ffn_cfgs=dict( type='FFN', embed_dims=256, feedforward_channels=1024, num_fcs=2, ffn_drop=0.0, act_cfg=dict(type='ReLU', inplace=True)), operation_order=('self_attn', 'norm', 'ffn', 'norm')), init_cfg=None), positional_encoding=dict( type='SinePositionalEncoding', num_feats=128, normalize=True), init_cfg=None), enforce_decoder_input_project=False, positional_encoding=dict( type='SinePositionalEncoding', num_feats=128, normalize=True), transformer_decoder=dict( type='DetrTransformerDecoder', return_intermediate=True, num_layers=9, transformerlayers=dict( type='DetrTransformerDecoderLayer', attn_cfgs=dict( type='MultiheadAttention', embed_dims=256, num_heads=8, attn_drop=0.0, proj_drop=0.0, dropout_layer=None, batch_first=False), ffn_cfgs=dict( embed_dims=256, feedforward_channels=2048, num_fcs=2, act_cfg=dict(type='ReLU', inplace=True), ffn_drop=0.0, dropout_layer=None, add_identity=True), feedforward_channels=2048, operation_order=('cross_attn', 'norm', 'self_attn', 'norm', 'ffn', 'norm')), init_cfg=None), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0, reduction='mean', class_weight=[1.0, 0.1]), loss_mask=dict(type='LevelsetLoss', loss_weight=1.0), loss_box=dict(type='BoxProjectionLoss', loss_weight=5.0)), panoptic_fusion_head=dict( type='MaskFormerFusionHead', num_things_classes=1, num_stuff_classes=0, loss_panoptic=None, init_cfg=None), train_cfg=dict( assigner=dict( type='MaskHungarianAssigner', cls_cost=dict(type='ClassificationCost', weight=2.0), dice_cost=dict( type='BoxMatchingCost', weight=5.0, pred_act=True, eps=1.0)), sampler=dict(type='MaskPseudoSampler')), test_cfg=dict( panoptic_on=False, semantic_on=False, instance_on=True, max_per_image=100, iou_thr=0.8, filter_low_score=True), init_cfg=None) image_size = (640, 640) pad_cfg = dict(img=(128, 128, 128), masks=0, seg=255) embed_multi = dict(lr_mult=1.0, decay_mult=0.0) optimizer = 
dict( type='AdamW', lr=0.0001, weight_decay=0.05, eps=1e-08, betas=(0.9, 0.999), paramwise_cfg=dict( custom_keys=dict( backbone=dict(lr_mult=0.1, decay_mult=1.0), query_embed=dict(lr_mult=1.0, decay_mult=0.0), query_feat=dict(lr_mult=1.0, decay_mult=0.0), level_embed=dict(lr_mult=1.0, decay_mult=0.0)), norm_decay_mult=0.0)) optimizer_config = dict(grad_clip=dict(max_norm=0.01, norm_type=2)) lr_config = dict( policy='step', gamma=0.1, by_epoch=False, step=[4000, 4500], warmup='linear', warmup_by_epoch=False, warmup_ratio=1.0, warmup_iters=10) max_iters = 5000 runner = dict(type='IterBasedRunner', max_iters=5000) interval = 10 checkpoint_config = dict( by_epoch=False, interval=10, save_last=True, max_keep_ckpts=3) dynamic_intervals = [(5001, 5000)] find_unused_parameters = True work_dir = './work_dirs/box2mask_duanmian_r50_50e/' auto_resume = False gpu_ids = [0]
2023-12-27 15:47:00,162 - mmdet - INFO - Set random seed to 1929929277, deterministic: False
2023-12-27 15:47:00,417 - mmdet - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'https://download.pytorch.org/models/resnet50-11ad3fa6.pth'}
2023-12-27 15:47:00,417 - mmcv - INFO - load model from: https://download.pytorch.org/models/resnet50-11ad3fa6.pth
2023-12-27 15:47:00,417 - mmcv - INFO - load checkpoint from http path: https://download.pytorch.org/models/resnet50-11ad3fa6.pth
2023-12-27 15:47:00,536 - mmcv - WARNING - The model and loaded state dict do not match exactly
unexpected key in source state_dict: fc.weight, fc.bias
loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
fatal: not a git repository (or any of the parent directories): .git
2023-12-27 15:47:02,145 - mmdet - INFO - Automatic scaling of learning rate (LR) has been disabled.
loading annotations into memory...
Done (t=0.15s)
creating index...
index created!
2023-12-27 15:47:02,473 - mmdet - INFO - Start running, host: gdwangzhi@GD-XS-770, work_dir: E:\PycharmCode\cuda11.6\BoxInstSeg-main\tools\work_dirs\box2mask_duanmian_r50_50e
2023-12-27 15:47:02,473 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(LOW ) IterTimerHook
(LOW ) EvalHook
after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(LOW ) IterTimerHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
after_train_epoch:
(NORMAL ) CheckpointHook
(LOW ) EvalHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_val_epoch:
(NORMAL ) NumClassCheckHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
before_val_iter: (LOW ) IterTimerHook
after_val_iter: (LOW ) IterTimerHook
after_val_epoch:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
after_run:
(VERY_LOW ) TextLoggerHook
(VERY_LOW ) TensorboardLoggerHook
2023-12-27 15:47:02,473 - mmdet - INFO - workflow: [('train', 10)], max: 5000 iters
2023-12-27 15:47:02,473 - mmdet - INFO - Checkpoints will be saved to E:\PycharmCode\cuda11.6\BoxInstSeg-main\tools\work_dirs\box2mask_duanmian_r50_50e by HardDiskBackend.
Hello, how did you modify the files? Could you share your changes?
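Windows 10. I set up the environment exactly as described in install.md, and the torch, mmcv, and Python versions all match, but compiling the .cu operators fails. Can this code run on Windows?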