google-research / deeplab2

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.
Apache License 2.0

Can not do evaluation of Vip-Deeplab on SemKitti-DVPS #139

Open weimengchuan opened 2 years ago

weimengchuan commented 2 years ago

Issue

Hello, I have run into a problem: evaluating ViP-DeepLab on SemKITTI-DVPS does not produce correct results. During evaluation I see the warning below, which does not interrupt the program:

2022-10-18 11:02:24.020773: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0]

The results (such as STQ) are also incorrect. Is this caused by the warning above? If so, how can I solve it? Thanks!

Best, Wei

Info

Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0

2022-10-18 11:02:24.020591: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [ /job:localhost/replica:0/task:0/device:CPU:0]. See below for details of this colocation group: Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/job:localhost/replica:0/task:0/device:GPU:0' assigned_devicename='' resource_devicename='/job:localhost/replica:0/task:0/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] LookupTableRemoveV2: CPU LookupTableFindV2: CPU LookupTableInsertV2: CPU MutableHashTableV2: CPU LookupTableExportV2: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_1_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0

2022-10-18 11:02:24.020773: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [ /job:localhost/replica:0/task:0/device:CPU:0]. See below for details of this colocation group: Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/job:localhost/replica:0/task:0/device:GPU:0' assigned_devicename='' resource_devicename='/job:localhost/replica:0/task:0/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] LookupTableRemoveV2: CPU LookupTableInsertV2: CPU MutableHashTableV2: CPU LookupTableExportV2: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_export_values_2/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_2_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0

2022-10-18 11:02:24.020985: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [ /job:localhost/replica:0/task:0/device:CPU:0]. See below for details of this colocation group: Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/job:localhost/replica:0/task:0/device:GPU:0' assigned_devicename='' resource_devicename='/job:localhost/replica:0/task:0/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] LookupTableRemoveV2: CPU LookupTableInsertV2: CPU MutableHashTableV2: CPU LookupTableExportV2: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_export_values_2/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_3_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0

2022-10-18 11:02:24.021158: W tensorflow/core/common_runtime/colocation_graph.cc:1145] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [ /job:localhost/replica:0/task:0/device:CPU:0]. See below for details of this colocation group: Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/job:localhost/replica:0/task:0/device:GPU:0' assigned_devicename='' resource_devicename='/job:localhost/replica:0/task:0/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] LookupTableRemoveV2: CPU LookupTableFindV2: CPU LookupTableInsertV2: CPU MutableHashTableV2: CPU LookupTableExportV2: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4 (MutableHashTableV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_export_values/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_insert/LookupTableInsertV2 (LookupTableInsertV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_find/LookupTableFindV2 (LookupTableFindV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_export_values_1/LookupTableExportV2 (LookupTableExportV2) /job:localhost/replica:0/task:0/device:GPU:0 ViPDeepLab/video_panoptic_prediction_stitcher/MutableHashTable_4_lookup_table_remove/LookupTableRemoveV2 (LookupTableRemoveV2) /job:localhost/replica:0/task:0/device:GPU:0
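To make the repeated warnings easier to read, here is a minimal sketch that pulls the requested device and the per-op supported device types out of one "Root Member" entry. The excerpt is copied from the log above; the function name `summarize_colocation` and the regular expressions are illustrative assumptions, not part of TensorFlow or DeepLab2. It only restates what the log already says: the hash-table ops were requested on GPU but list CPU as their only supported device type. Whether that is related to the zero metrics is not established in this thread.

```python
import re

# Excerpt copied from one colocation warning in the log above.
WARNING_TEXT = (
    "Root Member(assigned_device_nameindex=-1 "
    "requested_devicename='/job:localhost/replica:0/task:0/device:GPU:0' "
    "assigned_devicename='' "
    "resource_devicename='/job:localhost/replica:0/task:0/device:GPU:0' "
    "supported_devicetypes=[CPU] possibledevices=[] "
    "LookupTableRemoveV2: CPU LookupTableFindV2: CPU "
    "LookupTableInsertV2: CPU MutableHashTableV2: CPU LookupTableExportV2: CPU"
)


def summarize_colocation(text):
    """Hypothetical helper: return (requested device type, {op type: supported type})."""
    # The requested device looks like '/job:.../device:GPU:0'; keep only 'GPU'.
    match = re.search(r"requested_devicename='[^']*device:(\w+):\d+'", text)
    requested = match.group(1) if match else None
    # Op entries look like 'LookupTableFindV2: CPU' (colon followed by a space).
    ops = dict(re.findall(r"(\w+): (CPU|GPU|TPU)\b", text))
    return requested, ops


requested, ops = summarize_colocation(WARNING_TEXT)
for op, device in sorted(ops.items()):
    note = "" if device == requested else f"  (requested {requested})"
    print(f"{op}: {device}{note}")
```

Each message is logged at warning level (the `W` prefix) and names CPU as the only candidate device for the group.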

2022-10-18 11:02:31.094278: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8101
I1018 11:40:14.206324 140524474017600 loop_fns.py:81] The dataset iterator is exhausted after 4070 steps.
W1018 11:40:14.704017 140524474017600 api.py:446] No detections to evaluate.
I1018 11:40:14.990640 140524474017600 controller.py:310] eval | step: 0 | eval time: 2286.8 sec | output: {'evaluation/ap/AP_Mask': 0.0, 'evaluation/depth/AbsErrorRel': 3.429016, 'evaluation/depth/DepthInlier': 0.044548888, 'evaluation/depth/SILog': 57.931797, 'evaluation/depth/SqErrorRel': 17.738022, 'evaluation/iou/IoU': 1.5528065e-05, 'evaluation/pq/FN': 14165.5, 'evaluation/pq/FP': 2414.5, 'evaluation/pq/PQ': 0.0, 'evaluation/pq/RQ': 0.0, 'evaluation/pq/SQ': 0.0, 'evaluation/pq/TP': 0.0, 'evaluation/step/AQ': 0.0, 'evaluation/step/IoU': 0.0, 'evaluation/step/STQ': 0.0, 'evaluation/vpq_2frames/FN': 14505.0, 'evaluation/vpq_2frames/FP': 2479.0, 'evaluation/vpq_2frames/PQ': 0.0, 'evaluation/vpq_2frames/RQ': 0.0, 'evaluation/vpq_2frames/SQ': 0.0, 'evaluation/vpq_2frames/TP': 0.0, 'losses/eval_center_loss': 1.1949805, 'losses/eval_depth_loss': 0.44547412, 'losses/eval_next_regression_loss': 1.9722644, 'losses/eval_regression_loss': 1.9717673, 'losses/eval_semantic_loss': 3.3964477, 'losses/eval_total_loss': 8.980951}
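For anyone triaging a similar run, a small sanity check on the eval output can flag the quality metrics that are effectively zero. This is only a sketch: the values are copied from the output above, while the helper name `degenerate_metrics`, the suffix list, and the `1e-3` threshold are assumptions chosen for illustration. It does not identify the cause. It only confirms that every higher-is-better score collapsed, which fits the `TP` count of 0.0 against thousands of `FN`.

```python
# A subset of the eval output, copied from the log above.
EVAL_OUTPUT = {
    "evaluation/iou/IoU": 1.5528065e-05,
    "evaluation/pq/PQ": 0.0,
    "evaluation/pq/TP": 0.0,
    "evaluation/pq/FN": 14165.5,
    "evaluation/step/STQ": 0.0,
    "evaluation/vpq_2frames/PQ": 0.0,
    "losses/eval_semantic_loss": 3.3964477,
}

# Metric name endings treated as "higher is better" (an illustrative choice).
QUALITY_SUFFIXES = ("/IoU", "/PQ", "/STQ", "/AQ", "/SQ", "/RQ", "/AP_Mask")


def degenerate_metrics(output, threshold=1e-3):
    """Hypothetical helper: names of quality metrics whose value is below threshold."""
    return sorted(
        name
        for name, value in output.items()
        if name.endswith(QUALITY_SUFFIXES) and abs(value) < threshold
    )


for name in degenerate_metrics(EVAL_OUTPUT):
    print(f"near zero: {name} = {EVAL_OUTPUT[name]}")
```

Counts such as `TP` and `FN`, and the loss terms, are deliberately left out of the check because they are not scores in the 0 to 1 range.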