Closed: JosephBChoi closed this issue 11 months ago.
Have you solved this?
I have the same issue.
Have you solved this?
Still having the same issue.
Seems to be a duplicate of https://github.com/open-mmlab/mmsegmentation/issues/3406
Same issue here, attaching current config and error trace:
Config:
```python
checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/segformer/mit_b5_20220624-658746d9.pth'
crop_size = (768, 768)
data_preprocessor = dict(
    bgr_to_rgb=True,
    mean=[123.675, 116.28, 103.53],
    pad_val=0,
    seg_pad_val=255,
    size=(512, 512),
    std=[58.395, 57.12, 57.375],
    type='SegDataPreProcessor')
data_root = '/content/gdrive/MyDrive/BCSS_dataset'
dataset_type = 'CancerDataset'
default_hooks = dict(
    checkpoint=dict(by_epoch=False, interval=0, type='CheckpointHook'),
    logger=dict(interval=100, log_metric_by_epoch=False, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(type='SegVisualizationHook'))
default_scope = 'mmseg'
env_cfg = dict(
    cudnn_benchmark=True,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
fp16 = dict(opt_level='O3')
img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
load_from = None
log_level = 'INFO'
log_processor = dict(by_epoch=False)
model = dict(
    backbone=dict(
        attn_drop_rate=0.0,
        drop_path_rate=0.1,
        drop_rate=0.0,
        embed_dims=64,
        in_channels=3,
        init_cfg=dict(
            checkpoint='https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/segformer/mit_b5_20220624-658746d9.pth',
            type='Pretrained'),
        mlp_ratio=4,
        num_heads=[1, 2, 5, 8],
        num_layers=[3, 6, 40, 3],
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        patch_sizes=[7, 3, 3, 3],
        qkv_bias=True,
        sr_ratios=[8, 4, 2, 1],
        type='MixVisionTransformer'),
    data_preprocessor=dict(
        bgr_to_rgb=True,
        mean=[123.675, 116.28, 103.53],
        pad_val=0,
        seg_pad_val=255,
        size=(512, 512),
        std=[58.395, 57.12, 57.375],
        type='SegDataPreProcessor'),
    decode_head=dict(
        align_corners=False,
        channels=256,
        dropout_ratio=0.1,
        ignore_index=0,
        in_channels=[64, 128, 320, 512],
        in_index=[0, 1, 2, 3],
        loss_decode=[
            dict(
                class_weight=[1, 1.1, 1, 1, 1, 1],
                loss_name='loss_ce',
                loss_weight=1.0,
                type='CrossEntropyLoss'),
        ],
        norm_cfg=dict(requires_grad=True, type='BN'),
        num_classes=6,
        type='SegformerHead'),
    pretrained=None,
    test_cfg=dict(mode='whole'),
    train_cfg=dict(),
    type='EncoderDecoder')
norm_cfg = dict(requires_grad=True, type='BN')
num_classes = 6
optim_wrapper = dict(
    optimizer=dict(betas=(0.9, 0.999), lr=0.001, type='AdamW', weight_decay=0.01),
    paramwise_cfg=dict(
        custom_keys=dict(
            head=dict(lr_mult=10.0),
            norm=dict(decay_mult=0.0),
            pos_block=dict(decay_mult=0.0))),
    type='AmpOptimWrapper')
optimizer = dict(lr=0.005, momentum=0.9, type='AdamW', weight_decay=0.0005)
param_scheduler = [
    dict(by_epoch=False, eta_max=0.001, three_phase=True, total_steps=500, type='OneCycleLR'),
]
resume = False
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='splits/val.txt',
        data_prefix=dict(
            img_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/images',
            seg_map_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/masks'),
        data_root='/content/gdrive/MyDrive/BCSS_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(type='PackSegInputs'),
        ],
        type='CancerDataset'),
    num_workers=4,
    persistent_workers=False,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(iou_metrics=['mDice'], type='IoUMetric')
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='PackSegInputs'),
]
train_cfg = dict(max_iters=500, type='IterBasedTrainLoop', val_interval=1000)
train_dataloader = dict(
    batch_size=2,
    dataset=dict(
        dataset=dict(
            ann_file='splits/train.txt',
            data_prefix=dict(
                img_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/images',
                seg_map_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/masks'),
            data_root='/content/gdrive/MyDrive/BCSS_dataset',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(reduce_zero_label=False, type='LoadAnnotations'),
                dict(type='PackSegInputs'),
            ],
            type='CancerDataset'),
        times=5,
        type='RepeatDataset'),
    num_workers=1,
    persistent_workers=False,
    sampler=dict(shuffle=True, type='InfiniteSampler'))
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(reduce_zero_label=False, type='LoadAnnotations'),
    dict(type='PackSegInputs'),
]
tta_model = None
tta_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(
        transforms=[
            [
                dict(keep_ratio=True, scale_factor=0.5, type='Resize'),
                dict(keep_ratio=True, scale_factor=0.75, type='Resize'),
                dict(keep_ratio=True, scale_factor=1.0, type='Resize'),
                dict(keep_ratio=True, scale_factor=1.25, type='Resize'),
                dict(keep_ratio=True, scale_factor=1.5, type='Resize'),
            ],
            [
                dict(direction='vertical', prob=0.5, type='RandomFlip'),
                dict(direction='horizontal', prob=0.5, type='RandomFlip'),
            ],
            [dict(type='LoadAnnotations')],
            [dict(type='PackSegInputs')],
        ],
        type='TestTimeAug'),
]
val_cfg = dict(type='ValLoop')
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='splits/val.txt',
        data_prefix=dict(
            img_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/images',
            seg_map_path='/content/gdrive/MyDrive/BCSS_dataset/patched_images_masks_2d_6_class/masks'),
        data_root='/content/gdrive/MyDrive/BCSS_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations'),
            dict(type='PackSegInputs'),
        ],
        type='CancerDataset'),
    num_workers=4,
    persistent_workers=False,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(iou_metrics=['mDice', 'mIoU'], type='IoUMetric')
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    name='visualizer',
    type='SegLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')])
work_dir = '/content/gdrive/MyDrive/BCSS_dataset/logs'
```
Error trace:
```
ValueError                                Traceback (most recent call last)
<ipython-input> in <cell line: 3>()
      1 #test one model make sure one model can be trained correctly
      2 runner = Runner.from_cfg(cfg)
----> 3 runner.train()

15 frames

/usr/local/lib/python3.10/dist-packages/mmengine/runner/runner.py in train(self)
   1775         self._maybe_compile('train_step')
   1776
-> 1777         model = self.train_loop.run()  # type: ignore
   1778         self.call_hook('after_run')
   1779         return model

/usr/local/lib/python3.10/dist-packages/mmengine/runner/loops.py in run(self)
    276
    277                 data_batch = next(self.dataloader_iterator)
--> 278                 self.run_iter(data_batch)
    279
    280             self._decide_current_val_interval()

/usr/local/lib/python3.10/dist-packages/mmengine/runner/loops.py in run_iter(self, data_batch)
    299         # synchronization during gradient accumulation process.
    300         # outputs should be a dict of loss.
--> 301         outputs = self.runner.model.train_step(
    302             data_batch, optim_wrapper=self.runner.optim_wrapper)
    303

/usr/local/lib/python3.10/dist-packages/mmengine/model/base_model/base_model.py in train_step(self, data, optim_wrapper)
    112         with optim_wrapper.optim_context(self):
    113             data = self.data_preprocessor(data, True)
--> 114             losses = self._run_forward(data, mode='loss')  # type: ignore
    115         parsed_losses, log_vars = self.parse_losses(losses)  # type: ignore
    116         optim_wrapper.update_params(parsed_losses)

/usr/local/lib/python3.10/dist-packages/mmengine/model/base_model/base_model.py in _run_forward(self, data, mode)
    344         """
    345         if isinstance(data, dict):
--> 346             results = self(**data, mode=mode)
    347         elif isinstance(data, (list, tuple)):
    348             results = self(*data, mode=mode)

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:

/content/mmsegmentation/mmseg/models/segmentors/base.py in forward(self, inputs, data_samples, mode)
     92         """
     93         if mode == 'loss':
---> 94             return self.loss(inputs, data_samples)
     95         elif mode == 'predict':
     96             return self.predict(inputs, data_samples)

/content/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py in loss(self, inputs, data_samples)
    176         losses = dict()
    177
--> 178         loss_decode = self._decode_head_forward_train(x, data_samples)
    179         losses.update(loss_decode)
    180

/content/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py in _decode_head_forward_train(self, inputs, data_samples)
    137         training."""
    138         losses = dict()
--> 139         loss_decode = self.decode_head.loss(inputs, data_samples,
    140                                             self.train_cfg)
    141

/content/mmsegmentation/mmseg/models/decode_heads/decode_head.py in loss(self, inputs, batch_data_samples, train_cfg)
    260         """
    261         seg_logits = self.forward(inputs)
--> 262         losses = self.loss_by_feat(seg_logits, batch_data_samples)
    263         return losses
    264

/content/mmsegmentation/mmseg/models/decode_heads/decode_head.py in loss_by_feat(self, seg_logits, batch_data_samples)
    322         for loss_decode in losses_decode:
    323             if loss_decode.loss_name not in loss:
--> 324                 loss[loss_decode.loss_name] = loss_decode(
    325                     seg_logits,
    326                     seg_label,

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:

/content/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py in forward(self, cls_score, label, weight, avg_factor, reduction_override, ignore_index, **kwargs)
    283             class_weight = None
    284         # Note: for BCE loss, label < 0 is invalid.
--> 285         loss_cls = self.loss_weight * self.cls_criterion(
    286             cls_score,
    287             label,

/content/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py in cross_entropy(pred, label, weight, class_weight, reduction, avg_factor, ignore_index, avg_non_ignore)
     64     else:
     65         # the average factor should take the class weights into account
---> 66         label_weights = torch.tensor([class_weight[cls] for cls in label],
     67                                      device=class_weight.device)
     68     if avg_non_ignore:

ValueError: only one element tensors can be converted to Python scalars
```
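For anyone debugging this: the failing line iterates over the whole label map, so each `cls` is an (H, W) tensor rather than a scalar, and `torch.tensor()` cannot convert multi-element tensors to Python scalars. A minimal sketch (mine, not from the thread; shapes are made up) that reproduces the error:

```python
import torch

# Segmentation labels are (N, H, W) maps, so iterating `label` yields
# (H, W) tensors, and `class_weight[cls]` is itself a multi-element tensor.
class_weight = torch.tensor([1.0, 1.1, 1.0, 1.0, 1.0, 1.0])
label = torch.randint(0, 6, (2, 4, 4))  # hypothetical (N, H, W) label map

try:
    # What the buggy line did: torch.tensor() must turn each list element
    # into a Python scalar, which fails for (H, W) tensors.
    torch.tensor([class_weight[cls] for cls in label])
except ValueError as e:
    print(e)  # only one element tensors can be converted to Python scalars

# torch.stack keeps the per-pixel weight maps intact instead, which is
# essentially what the later fix does.
weights = torch.stack([class_weight[cls] for cls in label])
print(weights.shape)  # torch.Size([2, 4, 4])
```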
Same issue here. It seems a bug was introduced in this PR.
If you downgrade to version 1.1.2, everything should work as expected.
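For reference, pinning the older release should look like `pip install mmsegmentation==1.1.2` (assuming a pip-based install; adjust accordingly if you installed via `mim` or from source).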
Will try to send a fix soon
I have the same error when training KNet with a weighted cross-entropy (WCE) loss. May I ask how you solved it? Thank you!
I sent a PR to fix the error, and it has already been merged: https://github.com/open-mmlab/mmsegmentation/pull/3457
@mmeendez8 I assume this issue has been resolved by the recent PR. If so, could I close this issue?
Yes, I think so
Closing this issue, as it is resolved by PR #3457.
I'm using the new code with class weights on the Cityscapes dataset and the PIDNet-S model, and I still get an error in this block:

```python
    else:
        # the average factor should take the class weights into account
        label_weights = torch.stack([class_weight[cls] for cls in label
                                     ]).to(device=class_weight.device)
        if avg_non_ignore:
            label_weights[label == ignore_index] = 0
        avg_factor = label_weights.sum()
```
@mmeendez8 @JosephBChoi
Still an error after the update:

```
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1180,0,0], thread: [64,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
```
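A guess at what is happening here (my interpretation, not confirmed in this thread): this device-side assert typically fires when the label map contains values outside `[0, num_classes - 1]`, e.g. an ignore value like 255, which then indexes the `class_weight` tensor out of bounds. A quick way to check your labels on CPU:

```python
import torch

# Hypothetical sanity check, not from the thread: look for label values
# that would index a num_classes-sized class_weight tensor out of bounds.
num_classes = 19  # Cityscapes has 19 trainable classes
label = torch.tensor([0, 5, 18, 255])  # stand-in for a real label map
bad = torch.unique(label[(label < 0) | (label >= num_classes)])
print(bad)  # tensor([255]) -> class_weight[255] would be out of bounds
```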
I ran into the same thing and solved it by following https://github.com/open-mmlab/mmsegmentation/issues/3568.
Describe the bug
Tried to train PSPNet with class weights [0.052, 9.23]. When computing the cross-entropy loss, it tries to create a tensor from a list of tensors, which makes PyTorch throw an error.
If I remove the class weights, it doesn't produce the error.
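For context, the failing code path is only taken when the loss carries per-class weights. A minimal sketch of the kind of `loss_decode` entry that triggers it (field names follow the config dump above; the weight values are the ones from this report):

```python
# Hypothetical decode-head loss config exercising the weighted
# cross-entropy averaging path; [0.052, 9.23] are this report's weights.
loss_decode = dict(
    type='CrossEntropyLoss',
    loss_name='loss_ce',
    loss_weight=1.0,
    class_weight=[0.052, 9.23])
```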