NVIDIA / semantic-segmentation

Nvidia Semantic Segmentation monorepo
BSD 3-Clause "New" or "Revised" License

AssertionError: expected 1.0 to be the target scale #105

Closed: SupriyaB1 closed this issue 3 years ago

SupriyaB1 commented 3 years ago

Hi, I want to run inference at 2.0 scale with this command:

```shell
python3 -m torch.distributed.launch --nproc_per_node=1 \
    /home/HRNet/semantic-segmentation/train.py \
    --dataset cityscapes --cv 0 --bs_val 1 \
    --n_scales "2.0" \
    --eval folder \
    --eval_folder '/home/HRNet/semantic-segmentation/large_data/data/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/test/' \
    --snapshot "ASSETS_PATH/seg_weights/cityscapes_ocrnet.HRNet_Mscale_outstanding-turtle.pth" \
    --arch ocrnet.HRNet_Mscale \
    --result_dir ./Test@2.0_Scale
```

but I am getting this error:
```
None Global Rank: 0 Local Rank: 0
Torch version: 1.7, 1.7.0+cu101
n scales [2.0]
dataset = cityscapes
ignore_label = 255
num_classes = 19
Found 1 folder imgs
cn num_classes 19
Using Cross Entropy Loss
Using Cross Entropy Loss
Loading weights from: checkpoint=/home/HRNet/semantic-segmentation/large_data/seg_weights/cityscapes_ocrnet.HRNet_Mscale_outstanding-turtle.pth
Warning: using Python fallback for SyncBatchNorm, possibly because apex was installed without --cuda_ext. The exception raised when attempting to import the cuda backend was: No module named 'syncbn'
=> init weights from normal distribution
=> loading pretrained model /home/HRNet/semantic-segmentation/large_data/seg_weights/hrnetv2_w48_imagenet_pretrained.pth
Trunk: hrnetv2
Model params = 72.1M
Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
Warning: apex was installed without --cpp_ext. Falling back to Python flatten and unflatten.
Traceback (most recent call last):
  File "/home/HRNet/semantic-segmentation/train.py", line 601, in <module>
    main()
  File "/home/HRNet/semantic-segmentation/train.py", line 426, in main
    dump_all_images=True)
  File "/home/HRNet/semantic-segmentation/train.py", line 574, in validate
    args, val_idx)
  File "/home/HRNet/semantic-segmentation/utils/trnval_utils.py", line 142, in eval_minibatch
    output_dict = net(inputs)
  File "/home/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/.local/lib/python3.6/site-packages/apex/parallel/distributed.py", line 560, in forward
    result = self.module(*inputs, **kwargs)
  File "/home/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/HRNet/semantic-segmentation/network/ocrnet.py", line 332, in forward
    return self.nscale_forward(inputs, cfg.MODEL.N_SCALES)
  File "/home/HRNet/semantic-segmentation/network/ocrnet.py", line 213, in nscale_forward
    assert 1.0 in scales, 'expected 1.0 to be the target scale'
AssertionError: expected 1.0 to be the target scale
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/.local/lib/python3.6/site-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/home/.local/lib/python3.6/site-packages/torch/distributed/launch.py", line 256, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-u', '/home/HRNet/semantic-segmentation/train.py', '--local_rank=0', '--dataset', 'cityscapes', '--cv', '0', '--bs_val', '1', '--n_scales', '2.0', '--eval', 'folder', '--eval_folder', '/home/HRNet/semantic-segmentation/large_data/data/cityscapes/leftImg8bit_trainvaltest/leftImg8bit/test/', '--snapshot', 'ASSETS_PATH/seg_weights/cityscapes_ocrnet.HRNet_Mscale_outstanding-turtle.pth', '--arch', 'ocrnet.HRNet_Mscale', '--result_dir', './Test@2.0_Scale']' returned non-zero exit status 1.
```

Can you please help me solve this error?

ajtao commented 3 years ago

The assert indicates the problem. You must include 1.0 in the list of scales.
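A minimal sketch of why the assertion fires, based on the traceback above (the `check_scales` helper is illustrative, not the repo's actual code): `--n_scales "2.0"` parses to `[2.0]`, which fails the check in `nscale_forward`, while a list that includes 1.0 (e.g. `--n_scales "1.0,2.0"`) passes, because the 1.0x prediction is the target that the other scales are fused against.

```python
def check_scales(scales):
    """Mimic the assertion at the top of nscale_forward (network/ocrnet.py).

    Multi-scale inference needs the 1.0x prediction as the base
    output, so 1.0 must appear in the list of scales.
    """
    assert 1.0 in scales, 'expected 1.0 to be the target scale'
    return scales

# --n_scales "2.0" parses to [2.0] and fails the assertion:
try:
    check_scales([2.0])
except AssertionError as e:
    print(e)

# Including 1.0, e.g. --n_scales "1.0,2.0", passes:
print(check_scales([1.0, 2.0]))
```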

SupriyaB1 commented 3 years ago

Thanks @ajtao, it worked.