mseg-dataset / mseg-semantic

An official repo of the CVPR 2020 paper "MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation"
MIT License

Testing time takes much longer than reported in the repo!! #31

Open ArghyaChatterjee opened 2 years ago

ArghyaChatterjee commented 2 years ago

It's taking 35-40 s to process the segmentation of a single frame. My test setup:

i. Ubuntu 18.04 LTS, Core i7, 24 GB RAM
ii. Graphics: NVIDIA 1070M (laptop version of the 1070 Ti)
iii. CUDA 10.2
iv. PyTorch version: 1.6.0+cu101
v. CUDA_HOME = /usr/local/cuda-10.2

This is the output from my terminal:

arghya@arghya-Erazer-X7849-MD60379:~$ python3 -u ~/mseg-semantic/mseg_semantic/tool/universal_demo.py --config=/home/arghya/mseg-semantic/mseg_semantic/config/test/default_config_720_ms.yaml model_name mseg-3m-720p model_path ~/Downloads/mseg-3m-720p.pth input_file ~/Downloads/Urban_3_fps.mp4
Namespace(config='/home/arghya/mseg-semantic/mseg_semantic/config/test/default_config_720_ms.yaml', file_save='default', opts=['model_name', 'mseg-3m-720p', 'model_path', '/home/arghya/Downloads/mseg-3m-720p.pth', 'input_file', '/home/arghya/Downloads/SubT_Urban_3_fps.mp4'])
arch: hrnet
base_size: 720
batch_size_val: 1
dataset: Urban_3_fps
has_prediction: False
ignore_label: 255
img_name_unique: False
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban_3_fps.mp4
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
network_name: None
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: val
test_gpu: [0]
test_h: 713
test_w: 713
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 02:50:46,742 INFO universal_demo.py line 59 11926] arch: hrnet
base_size: 720
batch_size_val: 1
dataset: Urban_3_fps
has_prediction: False
ignore_label: 255
img_name_unique: True
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban_3_fps.mp4
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
network_name: None
print_freq: 10
save_folder: default
scales: [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
small: True
split: test
test_gpu: [0]
test_h: 713
test_w: 713
u_classes: ['backpack', 'umbrella', 'bag', 'tie', 'suitcase', 'case', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'animal_other', 'microwave', 'radiator', 'oven', 'toaster', 'storage_tank', 'conveyor_belt', 'sink', 'refrigerator', 'washer_dryer', 'fan', 'dishwasher', 'toilet', 'bathtub', 'shower', 'tunnel', 'bridge', 'pier_wharf', 'tent', 'building', 'ceiling', 'laptop', 'keyboard', 'mouse', 'remote', 'cell phone', 'television', 'floor', 'stage', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'fruit_other', 'food_other', 'chair_other', 'armchair', 'swivel_chair', 'stool', 'seat', 'couch', 'trash_can', 'potted_plant', 'nightstand', 'bed', 'table', 'pool_table', 'barrel', 'desk', 'ottoman', 'wardrobe', 'crib', 'basket', 'chest_of_drawers', 'bookshelf', 'counter_other', 'bathroom_counter', 'kitchen_island', 'door', 'light_other', 'lamp', 'sconce', 'chandelier', 'mirror', 'whiteboard', 'shelf', 'stairs', 'escalator', 'cabinet', 'fireplace', 'stove', 'arcade_machine', 'gravel', 'platform', 'playingfield', 'railroad', 'road', 'snow', 'sidewalk_pavement', 'runway', 'terrain', 'book', 'box', 'clock', 'vase', 'scissors', 'plaything_other', 'teddy_bear', 'hair_dryer', 'toothbrush', 'painting', 'poster', 'bulletin_board', 'bottle', 'cup', 'wine_glass', 'knife', 'fork', 'spoon', 'bowl', 'tray', 'range_hood', 'plate', 'person', 'rider_other', 'bicyclist', 'motorcyclist', 'paper', 'streetlight', 'road_barrier', 'mailbox', 'cctv_camera', 'junction_box', 'traffic_sign', 'traffic_light', 'fire_hydrant', 'parking_meter', 'bench', 'bike_rack', 'billboard', 'sky', 'pole', 'fence', 'railing_banister', 'guard_rail', 'mountain_hill', 'rock', 'frisbee', 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 'net', 'base', 'sculpture', 'column', 'fountain', 'awning', 'apparel', 'banner', 'flag', 'blanket', 'curtain_other', 'shower_curtain', 'pillow', 'towel', 'rug_floormat', 'vegetation', 'bicycle', 'car', 'autorickshaw', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'trailer', 'boat_ship', 'slow_wheeled_object', 'river_lake', 'sea', 'water_other', 'swimming_pool', 'waterfall', 'wall', 'window', 'window_blind']
version: 4.0
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 02:50:46,743 INFO universal_demo.py line 60 11926] => creating model ...
[2021-07-29 02:50:49,912 INFO inference_task.py line 307 11926] => loading checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 02:50:50,433 INFO inference_task.py line 313 11926] => loaded checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 02:50:50,437 INFO inference_task.py line 326 11926] >>>>>>>>>>>>>> Start inference task >>>>>>>>>>>>>
[2021-07-29 02:50:50,440 INFO inference_task.py line 437 11926] Write video to /home/arghya/mseg-semantic/temp_files/SubT_Urban_3_fps_mseg-3m-720p_universal_scales_ms_base_sz_720.mp4
Video fps: 3.00 @ 720x1280 resolution.
[2021-07-29 02:50:50,451 INFO inference_task.py line 442 11926] On image 0/1312
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
[2021-07-29 02:51:15,910 INFO inference_task.py line 442 11926] On image 1/1312
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/arghya/.local/lib/python3.6/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
[2021-07-29 02:51:39,985 INFO inference_task.py line 442 11926] On image 2/1312
[2021-07-29 02:52:03,004 INFO inference_task.py line 442 11926] On image 3/1312
[2021-07-29 02:52:25,585 INFO inference_task.py line 442 11926] On image 4/1312
[2021-07-29 02:52:48,215 INFO inference_task.py line 442 11926] On image 5/1312
[2021-07-29 02:53:11,059 INFO inference_task.py line 442 11926] On image 6/1312
[2021-07-29 02:53:34,598 INFO inference_task.py line 442 11926] On image 7/1312
[2021-07-29 02:53:57,468 INFO inference_task.py line 442 11926] On image 8/1312
[2021-07-29 02:54:21,657 INFO inference_task.py line 442 11926] On image 9/1312
[2021-07-29 02:54:47,152 INFO inference_task.py line 442 11926] On image 10/1312
[2021-07-29 02:55:14,647 INFO inference_task.py line 442 11926] On image 11/1312
[2021-07-29 02:55:43,719 INFO inference_task.py line 442 11926] On image 12/1312
[2021-07-29 02:56:13,402 INFO inference_task.py line 442 11926] On image 13/1312
[2021-07-29 02:56:45,320 INFO inference_task.py line 442 11926] On image 14/1312
[2021-07-29 02:57:18,402 INFO inference_task.py line 442 11926] On image 15/1312
[2021-07-29 02:57:51,063 INFO inference_task.py line 442 11926] On image 16/1312
[2021-07-29 02:58:26,466 INFO inference_task.py line 442 11926] On image 17/1312
[2021-07-29 02:59:00,356 INFO inference_task.py line 442 11926] On image 18/1312
[2021-07-29 02:59:33,815 INFO inference_task.py line 442 11926] On image 19/1312
[... identical "On image N/1312" lines for frames 20-142 omitted; they continue at roughly 36 s per frame ...]
[2021-07-29 04:15:24,121 INFO inference_task.py line 442 11926] On image 143/1312

I think I am missing something. According to this repo, inference should run at around 16 fps on a Quadro P5000, and I would expect an NVIDIA GTX 1070 to be in a similar range, not 1/40 fps.

Can anybody help?

johnwlambert commented 2 years ago

Hi, thanks for your interest in our work.

Which of our pretrained models are you using? We have models that take in small, medium, or large crop sizes (see https://github.com/mseg-dataset/mseg-semantic#how-fast-can-your-models-run).

Are you using single-scale or multi-scale inference?

The numbers reported in the readme here are for single-scale prediction with our lightweight model. You would need to use the following script instead: https://github.com/mseg-dataset/mseg-semantic/blob/master/mseg_semantic/tool/universal_demo_batched.py
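
For example, a single-scale batched run would look roughly like this (a sketch only; the checkpoint path and the input directory of frames are placeholders for your own setup):

# single-scale inference over a directory of extracted frames (paths are illustrative)
python3 -u mseg_semantic/tool/universal_demo_batched.py \
  --config=mseg_semantic/config/test/720/default_config_batched_ss.yaml \
  model_name mseg-3m-720p \
  model_path ~/Downloads/mseg-3m-720p.pth \
  input_file ~/Downloads/my_frames_dir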

ArghyaChatterjee commented 2 years ago

Hi @johnwlambert, thanks for the reply. I will come back to the inference time later in this thread, but first I need help running the universal_demo_batched.py script. Here is what I am running and what I am getting:

python3 -u ~/mseg-semantic/mseg_semantic/tool/universal_demo_batched.py --config=/home/arghya/mseg-semantic/mseg_semantic/config/test/720/default_config_batched_ss.yaml model_name mseg-3m-720p model_path ~/Downloads/mseg-3m-720p.pth input_file ~/Downloads/Urban
Namespace(config='/home/arghya/mseg-semantic/mseg_semantic/config/test/720/default_config_batched_ss.yaml', file_save='default', opts=['model_name', 'mseg-3m-720p', 'model_path', '/home/arghya/Downloads/mseg-3m-720p.pth', 'input_file', '/home/arghya/Downloads/Urban'])
arch: hrnet
base_size: None
batch_size_val: 1
dataset: Urban
has_prediction: False
ignore_label: 255
img_name_unique: False
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
native_img_h: -1
native_img_w: -1
network_name: None
save_folder: default
scales: [1.0]
small: True
split: val
test_gpu: [0]
test_h: 593
test_w: 593
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 19:00:01,013 INFO universal_demo_batched.py line 68 3756] arch: hrnet
base_size: None
batch_size_val: 1
dataset: Urban
has_prediction: False
ignore_label: 255
img_name_unique: True
index_start: 0
index_step: 0
input_file: /home/arghya/Downloads/Urban
layers: 50
model_name: mseg-3m-720p
model_path: /home/arghya/Downloads/mseg-3m-720p.pth
native_img_h: -1
native_img_w: -1
network_name: None
print_freq: 10
save_folder: default
scales: [1.0]
small: True
split: test
test_gpu: [0]
test_h: 593
test_w: 593
u_classes: ['backpack', 'umbrella', 'bag', 'tie', 'suitcase', 'case', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'animal_other', 'microwave', 'radiator', 'oven', 'toaster', 'storage_tank', 'conveyor_belt', 'sink', 'refrigerator', 'washer_dryer', 'fan', 'dishwasher', 'toilet', 'bathtub', 'shower', 'tunnel', 'bridge', 'pier_wharf', 'tent', 'building', 'ceiling', 'laptop', 'keyboard', 'mouse', 'remote', 'cell phone', 'television', 'floor', 'stage', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot_dog', 'pizza', 'donut', 'cake', 'fruit_other', 'food_other', 'chair_other', 'armchair', 'swivel_chair', 'stool', 'seat', 'couch', 'trash_can', 'potted_plant', 'nightstand', 'bed', 'table', 'pool_table', 'barrel', 'desk', 'ottoman', 'wardrobe', 'crib', 'basket', 'chest_of_drawers', 'bookshelf', 'counter_other', 'bathroom_counter', 'kitchen_island', 'door', 'light_other', 'lamp', 'sconce', 'chandelier', 'mirror', 'whiteboard', 'shelf', 'stairs', 'escalator', 'cabinet', 'fireplace', 'stove', 'arcade_machine', 'gravel', 'platform', 'playingfield', 'railroad', 'road', 'snow', 'sidewalk_pavement', 'runway', 'terrain', 'book', 'box', 'clock', 'vase', 'scissors', 'plaything_other', 'teddy_bear', 'hair_dryer', 'toothbrush', 'painting', 'poster', 'bulletin_board', 'bottle', 'cup', 'wine_glass', 'knife', 'fork', 'spoon', 'bowl', 'tray', 'range_hood', 'plate', 'person', 'rider_other', 'bicyclist', 'motorcyclist', 'paper', 'streetlight', 'road_barrier', 'mailbox', 'cctv_camera', 'junction_box', 'traffic_sign', 'traffic_light', 'fire_hydrant', 'parking_meter', 'bench', 'bike_rack', 'billboard', 'sky', 'pole', 'fence', 'railing_banister', 'guard_rail', 'mountain_hill', 'rock', 'frisbee', 'skis', 'snowboard', 'sports_ball', 'kite', 'baseball_bat', 'baseball_glove', 'skateboard', 'surfboard', 'tennis_racket', 'net', 'base', 'sculpture', 'column', 'fountain', 'awning', 'apparel', 'banner', 'flag', 'blanket', 'curtain_other', 'shower_curtain', 'pillow', 'towel', 'rug_floormat', 'vegetation', 'bicycle', 'car', 'autorickshaw', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'trailer', 'boat_ship', 'slow_wheeled_object', 'river_lake', 'sea', 'water_other', 'swimming_pool', 'waterfall', 'wall', 'window', 'window_blind']
vis_freq: 20
workers: 16
zoom_factor: 8
[2021-07-29 19:00:01,013 INFO universal_demo_batched.py line 69 3756] => creating model ...
[2021-07-29 19:00:04,156 INFO inference_task.py line 307 3756] => loading checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 19:00:04,691 INFO inference_task.py line 313 3756] => loaded checkpoint '/home/arghya/Downloads/mseg-3m-720p.pth'
[2021-07-29 19:00:04,696 INFO batched_inference_task.py line 65 3756] >>>>>>>>>>>>>> Start inference task >>>>>>>>>>>>>
Totally 4 samples in test set.
Checking image&label pair test list done!
[2021-07-29 19:00:05,035 INFO batched_inference_task.py line 98 3756] On batch 0
Traceback (most recent call last):
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/universal_demo_batched.py", line 132, in <module>
    run_universal_demo_batched(args, use_gpu)
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/universal_demo_batched.py", line 87, in run_universal_demo_batched
    itask.execute()
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/batched_inference_task.py", line 73, in execute
    self.execute_on_dataloader_batched(test_loader)
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/batched_inference_task.py", line 101, in execute_on_dataloader_batched
    gray_batch = self.execute_on_batch(input)
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/batched_inference_task.py", line 128, in execute_on_batch
    logits = self.scale_process_cuda_batched(batch, self.args.native_img_h, self.args.native_img_w)
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/batched_inference_task.py", line 163, in scale_process_cuda_batched
    self.std
  File "/home/arghya/mseg-semantic/mseg_semantic/tool/batched_inference_task.py", line 54, in pad_to_crop_sz_batched
    padded_batch[:, :, start_h:end_h, start_w:end_w] = batch
RuntimeError: The expanded size of the tensor (593) must match the existing size (1054) at non-singleton dimension 3.  Target sizes: [1, 3, 593, 593].  Tensor sizes: [3, 593, 1054]

Looks like there is a tensor size mismatch. I have two questions:

  1. How do I run this script? Is the configuration file correct? Am I passing everything correctly, or am I missing something?
  2. I saw that universal_demo_batched.py does not take video input. Is there a way to feed it a video? Initially I was giving it 1280x720 @ 30 fps video input (.mp4), but since it does not support video, I extracted the 1280x720 frames, put them in a directory, and passed that directory as the input path (roughly as in the sketch below).
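
For reference, the frame extraction can be done with something like this (a sketch using ffmpeg, which is not mentioned above; any frame extractor works, and the output directory name is arbitrary):

# dump every frame of the 1280x720 video as numbered JPEGs for the batched demo
mkdir -p ~/Downloads/Urban_frames
ffmpeg -i ~/Downloads/Urban_3_fps.mp4 ~/Downloads/Urban_frames/frame_%06d.jpg
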
Chanka0 commented 1 year ago

I'm late on this, but I figured out the issue after debugging it myself. In the default .yaml (in my case config/test/480/default_config_batched_ss.yaml) two values need to be set:

native_img_h
native_img_w

By default they are -1, and apparently supplying them via the command line wasn't enough. Anyway, in the unlikely case this reply helps someone: make sure native_img_h and native_img_w are set properly so you don't get squished, portrait-mode outputs like I did. Whoops.
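
For example, something like this from the repo root (a sketch only; the sed edit assumes the keys appear once in the yaml, and the 1280x720 values are just illustrative, substitute your own frame size):

# hard-code the native frame size in the batched config (values assume 1280x720 input frames)
sed -i 's/^native_img_h: .*/native_img_h: 720/' mseg_semantic/config/test/480/default_config_batched_ss.yaml
sed -i 's/^native_img_w: .*/native_img_w: 1280/' mseg_semantic/config/test/480/default_config_batched_ss.yaml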

lpj20 commented 1 year ago

Nice work. @ArghyaChatterjee, I encountered the same problem as you. I run the MSeg network on an NVIDIA 3060, and the average time to process a frame is about 0.5 s (single-scale). Following the author's reply, I used universal_demo_batched.py instead of universal_demo.py and set the native height and width of the image. It is very fast!

luda1013 commented 1 year ago

Hello, I am currently working on implementing an image-enhancement paper and its algorithm. As part of that, we need to run MSeg segmentation on real and rendered images/datasets. I have about 50-60k images, but for now I want to test on the first 2500. The images are from Playing for Data, i.e. 1914 x 1052 RGB images (https://download.visinf.tu-darmstadt.de/data/from_games/).

The dependencies mseg-api and mseg-semantic are already installed. I tried the Google Colab first and then copied the commands so I could run the script on my Linux machine as well. The command is: python -u mseg_semantic/tool/universal_demo.py --config="default_config_360.yaml" model_name mseg-3m model_path mseg-3m.pth input_file /home/luda1013/PfD/image/try_images

The weights I used I downloaded from the Google Colab, i.e. mseg-3m-1080.pth.

But for me, it took 12 minutes for just one image... Do you know why, or where I should look further?

My setup: Ubuntu 20.04.6 LTS, Core i9-10980XE @ 3 GHz, NVIDIA RTX A5000 (48 GB, per nvidia-smi), CUDA 11.7, PyTorch 2.0.0, CUDA at /usr/local/cuda*

Thanks for your help. I cannot move forward if one image takes 12 minutes :( I already tried the free version of Colab, but I only got through about 140 images before my compute units dropped to 0 and I could no longer use CUDA/the GPU :(

lpj20 commented 1 year ago

Hi, @luda1013. I think you should use the batched script; you can find it at mseg-semantic/mseg_semantic/tool/universal_demo_batched.py. It seems the algorithm would use OpenCV instead of CUDA when executing universal_demo.py.
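
It may also be worth confirming first that PyTorch actually sees the GPU (just a quick sanity check; 12 minutes per image on an RTX A5000 usually means the model is silently running on the CPU):

# does this PyTorch build detect CUDA at all?
python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"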

luda1013 commented 1 year ago

So I got the yaml file from the location given in the description, with the same configuration: same model and same config.

I encountered this problem: MSeg-error_yaml-file. So it needs the width/height to be set? Each image is width x height = 1914 x 1052 pixels.

So I edited the yaml file and changed: native_img_h: 1052, native_img_w: 1914

according to the size of the images I have. I was testing with just 3 images, but all I got was grayscale output images?