ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0

Perception module not working #13391

Open Manoj-EK opened 3 years ago

Manoj-EK commented 3 years ago

System information

Steps to reproduce the issue:

ntutangyun commented 3 years ago

hi, please try to launch the perception module manually in a terminal and look for any errors in the console output.
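
For example, from the Apollo repository root on the host you can enter the dev container and run the perception DAG in the foreground with mainboard, so the log goes straight to the terminal. A minimal sketch, assuming a standard Apollo checkout (the dag path below is the one used elsewhere in this thread; adjust for your setup):

    # enter the running dev container if you are not already inside it
    bash docker/scripts/dev_into.sh
    # run the lidar perception dag in the foreground and watch for errors
    mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception.dag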

Manoj-EK commented 3 years ago

@ntutangyun I got the following errors in the console output

[perception] WARNING: Logging before InitGoogleLogging() is written to STDERR
[perception] I0201 15:12:54.318677 54397 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception.dag /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag -p perception -s CYBER_DEFAULT
[perception] I0201 15:12:54.318744 54397 module_argument.cc:120] []Found non-option ARGV-element "/apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag"
[perception] I0201 15:12:54.318749 54397 module_argument.cc:29] []Usage:
[perception]     mainboard [OPTION]...
[perception] Description:
[perception]     -h, --help : help information
[perception]     -d, --dag_conf=CONFIG_FILE : module dag config file
[perception]     -p, --process_group=process_group: the process namespace for running this module, default in manager process
[perception]     -s, --sched_name=sched_name: sched policy conf for hole process, sched_name should be conf in cyber.pb.conf
[perception] Example:
[perception]     mainboard -h
[perception]     mainboard -d dag_conf_file1 -d dag_conf_file2 -p process_group -s sched_name
[perception_trafficlights] WARNING: Logging before InitGoogleLogging() is written to STDERR
[perception_trafficlights] I0201 15:12:54.319600 54399 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag -p perception_trafficlights -s CYBER_DEFAULT
[perception_trafficlights] I0201 15:12:54.320253 54399 global_data.cc:153] []host ip: 192.168.0.103
[perception_trafficlights] I0201 15:12:54.320417 54399 module_argument.cc:57] []binaryname is mainboard, processgroup is perception_trafficlights, has 1 dag conf
[perception_trafficlights] I0201 15:12:54.320423 54399 module_argument.cc:60] []dag_conf: /apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag
[motion_service] WARNING: Logging before InitGoogleLogging() is written to STDERR
[motion_service] I0201 15:12:54.320730 54401 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_motion_service.dag -p motion_service -s CYBER_DEFAULT
[motion_service] I0201 15:12:54.321365 54401 global_data.cc:153] []host ip: 192.168.0.103
[motion_service] I0201 15:12:54.321576 54401 module_argument.cc:57] []binaryname is mainboard, processgroup is motion_service, has 1 dag conf
[motion_service] I0201 15:12:54.321581 54401 module_argument.cc:60] []dag_conf: /apollo/modules/perception/production/dag/dag_motion_service.dag
[motion_service] E0201 15:12:54.322608 54401 module_controller.cc:87] [mainboard]Path does not exist: /apollo/bazel-bin/modules/perception/camera/lib/motion_service/libmotion_service.so
[motion_service] E0201 15:12:54.322623 54401 module_controller.cc:67] [mainboard]Failed to load module: /apollo/modules/perception/production/dag/dag_motion_service.dag
[motion_service] E0201 15:12:54.322626 54401 mainboard.cc:39] [mainboard]module start error.
[motion_service]

ntutangyun commented 3 years ago
[motion_service] E0201 15:12:54.322608 54401 module_controller.cc:87] [mainboard]Path does not exist: /apollo/bazel-bin/modules/perception/camera/lib/motion_service/libmotion_service.so
[motion_service] E0201 15:12:54.322623 54401 module_controller.cc:67] [mainboard]Failed to load module: /apollo/modules/perception/production/dag/dag_motion_service.dag
[motion_service] E0201 15:12:54.322626 54401 mainboard.cc:39] [mainboard]module start error.
[motion_service]

It seems that the motion service is causing the error.

What command did you use to start the perception module?

Could you try mainboard -d modules/perception/production/dag/dag_streaming_perception.dag and see the console output?
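
As a side note, the "Path does not exist: /apollo/bazel-bin/.../libmotion_service.so" error usually just means that the library has not been built in your workspace. A rough sketch of a fix, assuming you build inside the dev container with the standard build script (exact build targets and flags differ between Apollo versions):

    # rebuild with GPU support so the perception / motion_service targets land under bazel-bin
    bash apollo.sh build_opt_gpu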

Manoj-EK commented 3 years ago

@ntutangyun After I entered the following command: mainboard -d modules/perception/production/dag/dag_streaming_perception.dag

I got the following console output

[manoj@in-dev-docker:/apollo]$ mainboard -d modules/perception/production/dag/dag_streaming_perception.dag
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0201 15:43:02.222260 60826 module_argument.cc:81] []command: mainboard -d modules/perception/production/dag/dag_streaming_perception.dag
I0201 15:43:02.222978 60826 global_data.cc:153] []host ip: 192.168.0.103
I0201 15:43:02.223189 60826 module_argument.cc:57] []binaryname is mainboard, processgroup is mainboard_default, has 1 dag conf
I0201 15:43:02.223198 60826 module_argument.cc:60] []dag_conf: modules/perception/production/dag/dag_streaming_perception.dag

Input filename: /apollo/modules/perception/production/data/perception/lidar/models/detection/point_pillars/rpn.onnx
ONNX IR version: 0.0.6
Opset version: 9
Producer name: pytorch
Producer version: 1.5
Domain:
Model version: 0
Doc string:

WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
(the warning above is repeated many more times)
WARNING: Can't fuse pad and convolution with same pad mode
WARNING: Can't fuse pad and convolution with caffe pad mode
(the two warnings above alternate several more times)
^CWARNING: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
E0201 15:43:23.009835 60826 cyber.cc:38] [mainboard]please initialize cyber firstly.
Segmentation fault (core dumped)

Manoj-EK commented 3 years ago

@ntutangyun When I checked, the perception module was turned on. But after connecting to the bridge, I got the following console output:

E0201 15:49:24.684509 60893 transform_wrapper.cc:254] [mainboard]Can not find transform. 1612190855.766881943 frame_id: novatel child_frame_id: velodyne128 Error info: canTransform: target_frame novatel does not exist. canTransform: source_frame velodyne128 does not exist.:timeout
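
Side note: this kind of "Can not find transform" error usually means nothing is publishing the transform tree between the novatel and velodyne128 frames yet, i.e. localization/transform data is not flowing from the bridge. A quick sanity check, assuming the Cyber RT command-line tools are available inside the container:

    # confirm the transform channels exist and are active
    cyber_channel list | grep tf
    # watch incoming transform messages, if any
    cyber_channel echo /tf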

Manoj-EK commented 3 years ago

@ntutangyun After 5 minutes of continuous error messages in the console output, it ended with the following messages:

E0201 15:52:39.678357 60886 detection_component.cc:126] [mainboard]Failed to get pose at time: 1612191050.666696072
E0201 15:52:39.748929 60889 transform_wrapper.cc:254] [mainboard]Can not find transform. 1612191050.726696014 frame_id: world child_frame_id: novatel Error info: Lookup would require extrapolation into the future. Requested time 1612191050726695936 but the latest data is at time 1612191050706695936, when looking up transform from frame [novatel] to frame [world]:timeout
E0201 15:52:39.748963 60889 detection_component.cc:126] [mainboard]Failed to get pose at time: 1612191050.726696014
terminate called after throwing an instance of 'std::runtime_error'
  what():  The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py", line 52, in forward
    mask0 = torch.unsqueeze(mask, -1)
    input0 = torch.mul(features0, mask0)
    _29 = torch.squeeze((_0).forward(input0, ))
    _30 = torch.contiguous(torch.permute(_29, [1, 0]), memory_format=0)
    return _30
  File "code/__torch__/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py", line 66, in forward
    input1 = torch.contiguous(_32, memory_format=0)
    _33 = torch.permute((_31).forward(input1, ), [0, 2, 1])
    input2 = torch.contiguous(_33, memory_format=0)
             ~~~~~~~~~~~~~~~~ <--- HERE
    x = torch.relu(input2)
    features, _34 = torch.max(x, 1, True)

Traceback of TorchScript, original code (most recent call last):
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py(58): forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(548): __call__
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py(353): forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(548): __call__
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/jit/__init__.py(1027): trace_module
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/jit/__init__.py(875): trace
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/outline_inference.py(132): convert_libtorch
/tmp/Bazel.runfiles_zcw24joq/runfiles/fuel/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.py(33): convert_libtorch
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/util.py(107): wrapper
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(860): processPartition
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(425): func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py(595): process
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py(605): main
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(74): worker
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(186): manager
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(211): <module>
/usr/local/miniconda/envs/fuel/lib/python3.6/runpy.py(85): _run_code
/usr/local/miniconda/envs/fuel/lib/python3.6/runpy.py(193): _run_module_as_main
RuntimeError: CUDA out of memory. Tried to allocate 440.00 MiB (GPU 0; 7.79 GiB total capacity; 1021.66 MiB already allocated; 175.88 MiB free; 1.39 GiB reserved in total by PyTorch)

Aborted (core dumped)

ntutangyun commented 3 years ago

It seems that you need a GPU with more graphics memory.

Manoj-EK commented 3 years ago

@ntutangyun Current GPU I am using is:

NVIDIA Corporation TU104BM [GeForce RTX 2080 Mobile]

Its memory is 8 GB. Isn't this enough?

ntutangyun commented 3 years ago

Based on this CUDA out-of-memory message, I guess it's not enough...
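
For reference, that message reports 7.79 GiB total capacity but only 175.88 MiB free, with only about 1 GiB allocated by PyTorch itself, so most of the memory appears to be held by other consumers (for example the TensorRT engines built earlier in the same process, the desktop session, or Dreamview). A quick way to see what is occupying it, assuming the NVIDIA driver tools are installed on the host:

    # show per-process GPU memory usage, refreshed every second
    watch -n 1 nvidia-smi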

Manoj-EK commented 3 years ago

I think it's due to this error, but is there any other way to overcome this perception problem?

Manoj-EK commented 3 years ago

@ntutangyun I am using third-party perception. The ego vehicle is able to follow the pre-defined path; the problem here is with object detection and tracking. I found that the ego vehicle does not detect or track any objects. I have provided a screenshot below:

Screenshot from 2021-02-02 09-46-50

Any ideas on how to overcome this issue?

ntutangyun commented 3 years ago

I'm not sure exactly what third-party perception you're using. I have only used the lidar perception provided by Apollo.

Autofoxsys commented 3 years ago

You state in your opening message that you're using Apollo v5.5, but in the debug messages I see something related to point pillars. I don't believe that the algorithm using point pillars was introduced until version 6.0, so I'm wondering if you've pulled in some updates from v6.0.

Check this folder: apollo/modules/perception/lidar/lib/. If you're on v5.5, then you shouldn't have a point pillars folder.
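
A quick way to check both the folder contents and which version your checkout is actually on, assuming you are at the repository root inside the dev container (git tags may be missing on a shallow clone):

    # list the lidar perception libraries; a point-pillars directory suggests a 6.0-era tree
    ls modules/perception/lidar/lib/
    # show which tag/commit the working tree is based on
    git describe --tags
    git log -1 --oneline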

Manoj-EK commented 3 years ago

@Autofoxsys Thank you, I will look into it.

lemketron commented 3 years ago

> @ntutangyun Current GPU I am using is:
>
> NVIDIA Corporation TU104BM [GeForce RTX 2080 Mobile]
>
> Its memory is 8 GB. Isn't this enough?

If I remember correctly, Apollo 5.5 does not include support for the RTX 2080. You need to use Apollo 6.0 or later, but in 6.0 and later, perception may not be working yet due to in-progress refactoring. So to use perception on 5.5, you'll need to use a GTX 1080 GPU.