DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.
Apache License 2.0
Cannot evaluate ViP-DeepLab on SemKITTI-DVPS #139
Hello, I have run into a problem: evaluating ViP-DeepLab on SemKITTI-DVPS does not produce correct results. During evaluation I see the following warning, which does not interrupt the program:
2022-10-18 11:02:24.020773: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0]
We can also see that the results (such as STQ) are not correct. Is this caused by the warning above? If so, how can I solve this problem? Thanks!

Best, Wei
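(Note: the colocation warning by itself is usually benign. The ops it lists, MutableHashTableV2 and the LookupTable* ops under ViPDeepLab/video_panoptic_prediction_stitcher, only have CPU kernels, so TensorFlow falls back to placing them on the CPU under soft device placement. Below is a minimal sketch showing the same CPU fallback for these table ops, assuming TF 2.x with one visible GPU; this is not DeepLab2 code.)

```python
# Minimal sketch (assumption: TF 2.x, one visible GPU; not DeepLab2 code).
# MutableHashTableV2 / LookupTable* ops only have CPU kernels, so requesting
# them under a GPU scope makes TensorFlow place them on the CPU instead;
# the ops still produce correct results, they just run on the CPU.
import tensorflow as tf

tf.config.set_soft_device_placement(True)  # default in TF2, made explicit here

with tf.device('/GPU:0'):
    table = tf.lookup.experimental.MutableHashTable(
        key_dtype=tf.int64, value_dtype=tf.int64, default_value=-1)
    table.insert(tf.constant([1, 2], dtype=tf.int64),
                 tf.constant([10, 20], dtype=tf.int64))
    print(table.lookup(tf.constant([1, 3], dtype=tf.int64)).numpy())  # [10 -1]
```

If the same fallback happens here, the warning alone would not explain the zeroed-out STQ.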
Env:
OS Ubuntu 20.04
Python 3.8.10
TensorFlow 2.6.0 (installed via pip, not built from source with Bazel)
CUDA 11.2
cuDNN 8.1.1
gcc/g++ 7.5.0
The complete information is as follows:
root@samsung-Precision-7920-Tower:/home/samsung/code/deeplab2/deeplab2# python trainer/train.py --config_file="configs/semkitti_dvps/vip_deeplab/resnet50_beta_os32_zhh.textproto" --mode='eval' --model_dir='/media/samsung/samsung/models/ViP-Deeplab_download/resnet50_beta_os32_vip_deeplab_semkitti_dvps_train/' --num_gpus=1
I1018 11:02:01.840494 140524474017600 train.py:65] Reading the config file.
I1018 11:02:01.846077 140524474017600 train.py:69] Starting the experiment.
2022-10-18 11:02:01.846814: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-18 11:02:03.238752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 22349 MB memory: -> device: 0, name: NVIDIA RTX A5000, pci bus id: 0000:17:00.0, compute capability: 8.6
2022-10-18 11:02:03.239956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 22000 MB memory: -> device: 1, name: NVIDIA RTX A5000, pci bus id: 0000:73:00.0, compute capability: 8.6
I1018 11:02:03.252338 140524474017600 train_lib.py:108] Using strategy <class 'tensorflow.python.distribute.one_device_strategy.OneDeviceStrategy'> with 1 replicas
I1018 11:02:03.838192 140524474017600 vip_deeplab.py:52] Synchronized Batchnorm is used.
I1018 11:02:03.839751 140524474017600 axial_resnet_instances.py:144] Axial-ResNet final config: {'num_blocks': [3, 4, 6, 3], 'backbone_layer_multiplier': 1.0, 'width_multiplier': 1.0, 'stem_width_multiplier': 1.0, 'output_stride': 32, 'classification_mode': True, 'backbone_type': 'resnet_beta', 'use_axial_beyond_stride': 0, 'backbone_use_transformer_beyond_stride': 0, 'extra_decoder_use_transformer_beyond_stride': 32, 'backbone_decoder_num_stacks': 0, 'backbone_decoder_blocks_per_stage': 1, 'extra_decoder_num_stacks': 0, 'extra_decoder_blocks_per_stage': 1, 'max_num_mask_slots': 128, 'num_mask_slots': 128, 'memory_channels': 256, 'base_transformer_expansion': 1.0, 'global_feed_forward_network_channels': 256, 'high_resolution_output_stride': 4, 'activation': 'relu', 'block_group_config': {'attention_bottleneck_expansion': 2, 'drop_path_keep_prob': 1.0, 'drop_path_beyond_stride': 16, 'drop_path_schedule': 'constant', 'positional_encoding_type': None, 'use_global_beyond_stride': 0, 'use_sac_beyond_stride': -1, 'use_squeeze_and_excite': False, 'conv_use_recompute_grad': False, 'axial_use_recompute_grad': True, 'recompute_within_stride': 0, 'transformer_use_recompute_grad': False, 'axial_layer_config': {'query_shape': (129, 129), 'key_expansion': 1, 'value_expansion': 2, 'memory_flange': (32, 32), 'double_global_attention': False, 'num_heads': 8, 'use_query_rpe_similarity': True, 'use_key_rpe_similarity': True, 'use_content_similarity': True, 'retrieve_value_rpe': True, 'retrieve_value_content': True, 'initialization_std_for_query_key_rpe': 1.0, 'initialization_std_for_value_rpe': 1.0, 'self_attention_activation': 'softmax'}, 'dual_path_transformer_layer_config': {'num_heads': 8, 'bottleneck_expansion': 2, 'key_expansion': 1, 'value_expansion': 2, 'feed_forward_network_channels': 2048, 'use_memory_self_attention': True, 'use_pixel2memory_feedback_attention': True, 'transformer_activation': 'softmax'}}, 'bn_layer': functools.partial(<class 'keras.layers.normalization.batch_normalization.SyncBatchNormalization'>, momentum=0.9900000095367432, epsilon=0.0010000000474974513), 'conv_kernel_weight_decay': 0.0}
I1018 11:02:04.133699 140524474017600 vip_deeplab.py:80] Setting pooling size to (13, 41)
I1018 11:02:04.133956 140524474017600 aspp.py:141] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:04.134074 140524474017600 aspp.py:141] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:04.134175 140524474017600 aspp.py:141] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
2022-10-18 11:02:08.122948: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
I1018 11:02:08.128028 140524474017600 controller.py:408] restoring or initializing model...
restoring or initializing model...
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/util.py:1359: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
W1018 11:02:08.152279 140524474017600 deprecation.py:339] From /usr/local/lib/python3.8/dist-packages/tensorflow/python/training/tracking/util.py:1359: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
I1018 11:02:08.170864 140524474017600 controller.py:414] initialized model.
initialized model.
I1018 11:02:08.171365 140524474017600 controller.py:297] eval | step: 0 | running complete evaluation...
eval | step: 0 | running complete evaluation...
2022-10-18 11:02:08.275621: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
I1018 11:02:09.759397 140524474017600 api.py:446] Eval with scales ListWrapper([1.0])
I1018 11:02:10.598743 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:10.624969 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:10.650889 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:10.675337 140524474017600 api.py:446] Eval scale 1.0; setting pooling size to [13, 41]
I1018 11:02:14.969927 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:14.995965 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:15.020637 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:15.051831 140524474017600 api.py:446] Eval with scales ListWrapper([1.0])
I1018 11:02:15.076245 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:15.100892 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:15.125504 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:15.148420 140524474017600 api.py:446] Eval scale 1.0; setting pooling size to [13, 41]
I1018 11:02:16.588450 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:16.614674 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:16.639269 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:19.770701 140524474017600 api.py:446] Eval with scales ListWrapper([1.0])
I1018 11:02:19.796750 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:19.822088 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:19.846855 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:19.870874 140524474017600 api.py:446] Eval scale 1.0; setting pooling size to [13, 41]
I1018 11:02:21.430940 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.456978 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.481885 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.513078 140524474017600 api.py:446] Eval with scales ListWrapper([1.0])
I1018 11:02:21.539165 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.564414 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.589395 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:21.612934 140524474017600 api.py:446] Eval scale 1.0; setting pooling size to [13, 41]
I1018 11:02:22.965541 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:22.991809 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
I1018 11:02:23.019295 140524474017600 api.py:446] Global average pooling in the ASPP pooling layer was replaced with tiled average pooling using the provided pool_size. Please make sure this behavior is intended.
2022-10-18 11:02:24.020369: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
LookupTableRemoveV2: CPU
LookupTableFindV2: CPU
LookupTableInsertV2: CPU
MutableHashTableV2: CPU
LookupTableExportV2: CPU
2022-10-18 11:02:24.020591: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
LookupTableRemoveV2: CPU
LookupTableFindV2: CPU
LookupTableInsertV2: CPU
MutableHashTableV2: CPU
LookupTableExportV2: CPU
2022-10-18 11:02:24.020773: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
LookupTableRemoveV2: CPU
LookupTableInsertV2: CPU
MutableHashTableV2: CPU
LookupTableExportV2: CPU
2022-10-18 11:02:24.020985: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
LookupTableRemoveV2: CPU
LookupTableInsertV2: CPU
MutableHashTableV2: CPU
LookupTableExportV2: CPU
2022-10-18 11:02:24.021158: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
/job:localhost/replica:0/task:0/device:CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=-1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
LookupTableRemoveV2: CPU
LookupTableFindV2: CPU
LookupTableInsertV2: CPU
MutableHashTableV2: CPU
LookupTableExportV2: CPU
Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values_2/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values_2/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
2022-10-18 11:02:31.094278: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8101
I1018 11:40:14.206324 140524474017600 loop_fns.py:81] The dataset iterator is exhausted after 4070 steps.
W1018 11:40:14.704017 140524474017600 api.py:446] No detections to evaluate.
I1018 11:40:14.990640 140524474017600 controller.py:310] eval | step: 0 | eval time: 2286.8 sec | output: {'evaluation/ap/AP_Mask': 0.0, 'evaluation/depth/AbsErrorRel': 3.429016, 'evaluation/depth/DepthInlier': 0.044548888, 'evaluation/depth/SILog': 57.931797, 'evaluation/depth/SqErrorRel': 17.738022, 'evaluation/iou/IoU': 1.5528065e-05, 'evaluation/pq/FN': 14165.5, 'evaluation/pq/FP': 2414.5, 'evaluation/pq/PQ': 0.0, 'evaluation/pq/RQ': 0.0, 'evaluation/pq/SQ': 0.0, 'evaluation/pq/TP': 0.0, 'evaluation/step/AQ': 0.0, 'evaluation/step/IoU': 0.0, 'evaluation/step/STQ': 0.0, 'evaluation/vpq_2frames/FN': 14505.0, 'evaluation/vpq_2frames/FP': 2479.0, 'evaluation/vpq_2frames/PQ': 0.0, 'evaluation/vpq_2frames/RQ': 0.0, 'evaluation/vpq_2frames/SQ': 0.0, 'evaluation/vpq_2frames/TP': 0.0, 'losses/eval_center_loss': 1.1949805, 'losses/eval_depth_loss': 0.44547412, 'losses/eval_next_regression_loss': 1.9722644, 'losses/eval_regression_loss': 1.9717673, 'losses/eval_semantic_loss': 3.3964477, 'losses/eval_total_loss': 8.980951}
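(Note: one detail worth double-checking is the checkpoint restore. The log prints "restoring or initializing model..." followed by "initialized model." rather than reporting a restored checkpoint, and the NameBasedSaverStatus warning indicates a name-based checkpoint is being matched, which TensorFlow itself describes as fragile. Below is a minimal, hypothetical sketch for checking what TensorFlow finds under --model_dir; this is plain TensorFlow, not the DeepLab2 API.)

```python
# Hypothetical sanity check (not part of DeepLab2): list what TensorFlow can
# find under the --model_dir passed to trainer/train.py. If
# tf.train.latest_checkpoint returns None, the evaluator would initialize the
# model from scratch instead of restoring pretrained weights.
import tensorflow as tf

model_dir = ('/media/samsung/samsung/models/ViP-Deeplab_download/'
             'resnet50_beta_os32_vip_deeplab_semkitti_dvps_train/')

latest = tf.train.latest_checkpoint(model_dir)
print('latest checkpoint:', latest)

if latest is not None:
    # Peek at a few variables stored in the checkpoint to see whether the
    # names look like the ViP-DeepLab model variables.
    for name, shape in tf.train.list_variables(latest)[:10]:
        print(name, shape)
```

If `latest` comes back as None, or the stored variable names do not match the model, an effectively uninitialized model could equally explain the near-zero metrics above, independently of the colocation warning.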