Manoj-EK opened 3 years ago
Hi, please try launching the perception module manually in a terminal and look for any errors in the console output.
@ntutangyun I got the following errors in the console output:
[perception] WARNING: Logging before InitGoogleLogging() is written to STDERR
[perception] I0201 15:12:54.318677 54397 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception.dag /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag -p perception -s CYBER_DEFAULT
[perception] I0201 15:12:54.318744 54397 module_argument.cc:120] []Found non-option ARGV-element "/apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag"
[perception] I0201 15:12:54.318749 54397 module_argument.cc:29] []Usage:
[perception] mainboard [OPTION]...
[perception] Description:
[perception] -h, --help : help information
[perception] -d, --dag_conf=CONFIG_FILE : module dag config file
[perception] -p, --process_group=process_group: the process namespace for running this module, default in manager process
[perception] -s, --sched_name=sched_name: sched policy conf for hole process, sched_name should be conf in cyber.pb.conf
[perception] Example:
[perception] mainboard -h
[perception] mainboard -d dag_conf_file1 -d dag_conf_file2 -p process_group -s sched_name
[perception_trafficlights] WARNING: Logging before InitGoogleLogging() is written to STDERR
[perception_trafficlights] I0201 15:12:54.319600 54399 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag -p perception_trafficlights -s CYBER_DEFAULT
[perception_trafficlights] I0201 15:12:54.320253 54399 global_data.cc:153] []host ip: 192.168.0.103
[perception_trafficlights] I0201 15:12:54.320417 54399 module_argument.cc:57] []binaryname is mainboard, processgroup is perception_trafficlights, has 1 dag conf
[perception_trafficlights] I0201 15:12:54.320423 54399 module_argument.cc:60] []dag_conf: /apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag
[motion_service] WARNING: Logging before InitGoogleLogging() is written to STDERR
[motion_service] I0201 15:12:54.320730 54401 module_argument.cc:81] []command: mainboard -d /apollo/modules/perception/production/dag/dag_motion_service.dag -p motion_service -s CYBER_DEFAULT
[motion_service] I0201 15:12:54.321365 54401 global_data.cc:153] []host ip: 192.168.0.103
[motion_service] I0201 15:12:54.321576 54401 module_argument.cc:57] []binaryname is mainboard, processgroup is motion_service, has 1 dag conf
[motion_service] I0201 15:12:54.321581 54401 module_argument.cc:60] []dag_conf: /apollo/modules/perception/production/dag/dag_motion_service.dag
[motion_service] E0201 15:12:54.322608 54401 module_controller.cc:87] [mainboard]Path does not exist: /apollo/bazel-bin/modules/perception/camera/lib/motion_service/libmotion_service.so
[motion_service] E0201 15:12:54.322623 54401 module_controller.cc:67] [mainboard]Failed to load module: /apollo/modules/perception/production/dag/dag_motion_service.dag
[motion_service] E0201 15:12:54.322626 54401 mainboard.cc:39] [mainboard]module start error.
[motion_service]
It seems that the motion service is causing the error.
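The "Path does not exist" error in the log means mainboard could not load libmotion_service.so because it was never built under /apollo/bazel-bin. One way to pre-check this before launching is to scan a .dag file for the shared libraries it references and verify that each one exists on disk. This is only a sketch, not part of Apollo: check_dag_libs is a hypothetical helper, and it assumes the .dag file references libraries as quoted absolute paths ending in .so, as in the cyber dag format.

```python
import os
import re

def check_dag_libs(dag_path):
    """Return the shared libraries referenced by a cyber .dag file
    that do not exist on disk.

    Hypothetical pre-flight helper; assumes libraries appear in the
    dag as quoted absolute paths ending in .so."""
    with open(dag_path) as f:
        text = f.read()
    libs = re.findall(r'"(/[^"]*\.so)"', text)
    return [lib for lib in libs if not os.path.exists(lib)]

# Example usage (path taken from the log above):
# for lib in check_dag_libs("/apollo/modules/perception/production/dag/dag_motion_service.dag"):
#     print("MISSING:", lib)
```

If libmotion_service.so shows up as missing, rebuilding the perception module (for example via ./apollo.sh build inside the dev container) should regenerate it under bazel-bin.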
What command did you use to start the perception module?
Could you try mainboard -d modules/perception/production/dag/dag_streaming_perception.dag
and see the console output?
@ntutangyun Once I entered the following command: mainboard -d modules/perception/production/dag/dag_streaming_perception.dag
I got the following console output:
WARNING: onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
(the warning above is repeated 36 times)
WARNING: Can't fuse pad and convolution with same pad mode
WARNING: Can't fuse pad and convolution with caffe pad mode
(the pair of warnings above is repeated 6 times)
^C
WARNING: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
E0201 15:43:23.009835 60826 cyber.cc:38] [mainboard]please initialize cyber firstly.
Segmentation fault (core dumped)
@ntutangyun Once I checked, the perception module was turned on. But after connecting to the bridge, I got the following console output:
E0201 15:49:24.684509 60893 transform_wrapper.cc:254] [mainboard]Can not find transform. 1612190855.766881943 frame_id: novatel child_frame_id: velodyne128 Error info: canTransform: target_frame novatel does not exist. canTransform: source_frame velodyne128 does not exist.:timeout
@ntutangyun After 5 minutes of continuous error messages in the console output, it ended with the following messages:
E0201 15:52:39.678357 60886 detection_component.cc:126] [mainboard]Failed to get pose at time: 1612191050.666696072
E0201 15:52:39.748929 60889 transform_wrapper.cc:254] [mainboard]Can not find transform. 1612191050.726696014 frame_id: world child_frame_id: novatel Error info: Lookup would require extrapolation into the future. Requested time 1612191050726695936 but the latest data is at time 1612191050706695936, when looking up transform from frame [novatel] to frame [world]:timeout
E0201 15:52:39.748963 60889 detection_component.cc:126] [mainboard]Failed to get pose at time: 1612191050.726696014
terminate called after throwing an instance of 'std::runtime_error'
what(): The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py", line 52, in forward
mask0 = torch.unsqueeze(mask, -1)
input0 = torch.mul(features0, mask0)
_29 = torch.squeeze((_0).forward(input0, ))
_30 = torch.contiguous(torch.permute(_29, [1, 0]), memory_format=0)
return _30
File "code/__torch__/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py", line 66, in forward
input1 = torch.contiguous(_32, memory_format=0)
_33 = torch.permute((_31).forward(input1, ), [0, 2, 1])
input2 = torch.contiguous(_33, memory_format=0)
~~~~~~~~~~~~~~~~ <--- HERE
x = torch.relu(input2)
features, _34 = torch.max(x, 1, True)
Traceback of TorchScript, original code (most recent call last):
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py(58): forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(548): __call__
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/models/pointpillars.py(353): forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(534): _slow_forward
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/nn/modules/module.py(548): __call__
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/jit/__init__.py(1027): trace_module
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/torch/jit/__init__.py(875): trace
/home/chenjiahao/data/code/ApolloAuto/apollo-fuel/local/bazel_cache/7e10e8559c5568eba6bbbebc781c72fc/execroot/fuel/bazel-out/k8-fastbuild/bin/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.runfiles/fuel/fueling/perception/pointpillars/second/pytorch/outline_inference.py(132): convert_libtorch
/tmp/Bazel.runfiles_zcw24joq/runfiles/fuel/fueling/perception/pointpillars/pipelines/convert_libtorch_pipeline.py(33): convert_libtorch
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/util.py(107): wrapper
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(860): processPartition
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(425): func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/rdd.py(2596): pipeline_func
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py(595): process
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py(605): main
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(74): worker
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(186): manager
/usr/local/miniconda/envs/fuel/lib/python3.6/site-packages/pyspark/python/lib/pyspark.zip/pyspark/daemon.py(211): <module>
/usr/local/miniconda/envs/fuel/lib/python3.6/runpy.py(85): _run_code
/usr/local/miniconda/envs/fuel/lib/python3.6/runpy.py(193): _run_module_as_main
RuntimeError: CUDA out of memory. Tried to allocate 440.00 MiB (GPU 0; 7.79 GiB total capacity; 1021.66 MiB already allocated; 175.88 MiB free; 1.39 GiB reserved in total by PyTorch)
Aborted (core dumped)
It seems that you need a GPU with more graphics memory.
@ntutangyun The current GPU I am using is:
NVIDIA Corporation TU104BM [GeForce RTX 2080 Mobile]
Its memory is 8 GB... isn't this enough?
Based on this CUDA out of memory message, I guess it's not enough...
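One detail worth noting in the OOM message above: it reports only about 1 GiB already allocated and about 176 MiB free against 7.79 GiB total capacity, which suggests that other processes (for example the camera pipelines or TensorRT engines started alongside) were already holding most of the card's memory, rather than PointPillars alone exceeding 8 GB. A small sketch for pulling the figures out of such a message, to make that comparison explicit (parse_cuda_oom is a hypothetical helper; the regexes assume the PyTorch 1.x message format shown above):

```python
import re

def parse_cuda_oom(msg):
    """Extract the memory figures (normalized to MiB) from a PyTorch 1.x
    'CUDA out of memory' message. Hypothetical log-reading helper."""
    patterns = {
        "tried_to_allocate": r"Tried to allocate ([\d.]+) ([MG]iB)",
        "total_capacity": r"([\d.]+) ([MG]iB) total capacity",
        "already_allocated": r"([\d.]+) ([MG]iB) already allocated",
        "free": r"([\d.]+) ([MG]iB) free",
        "reserved": r"([\d.]+) ([MG]iB) reserved",
    }
    mib = {}
    for key, pat in patterns.items():
        m = re.search(pat, msg)
        if m:
            value = float(m.group(1))
            # Normalize GiB figures to MiB so everything is comparable.
            mib[key] = value * 1024.0 if m.group(2) == "GiB" else value
    return mib
```

Applied to the message above, this would show roughly 7977 MiB of total capacity but only ~176 MiB free, i.e. the shortfall is mostly memory held outside this process.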
I think it's due to that error... but is there any other way to overcome this perception problem?
@ntutangyun I am using a third-party perception module, and the ego vehicle is able to follow the pre-defined path. The problem here is with object detection and tracking: I found that the ego vehicle does not detect or track any objects. I have provided a screenshot below:
Any ideas to overcome the issue?
I'm not sure exactly what third-party perception you're using... I only know the Lidar Perception provided by Apollo.
You state in your opening message that you're using Apollo v5.5, but in the debug messages I see something related to PointPillars. I don't believe that the algorithm using PointPillars was introduced until version 6.0, so I'm wondering if you've pulled in some updates from v6.0.
Check this folder: apollo/modules/perception/lidar/lib/. If you're on v5.5, then you shouldn't have a PointPillars folder.
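The folder check above can be scripted, for example when auditing several checkouts. A minimal sketch, assuming has_pointpillars is a hypothetical helper and that a 6.0-style tree keeps a pointpillars entry under the lidar lib directory mentioned above:

```python
import os

def has_pointpillars(apollo_root):
    """Return True if the perception lidar lib directory contains a
    pointpillars folder, hinting at Apollo 6.0 code mixed into the tree.

    Hypothetical helper; apollo_root is the checkout directory."""
    lidar_lib = os.path.join(apollo_root, "modules", "perception", "lidar", "lib")
    if not os.path.isdir(lidar_lib):
        return False
    return any("pointpillars" in name.lower() for name in os.listdir(lidar_lib))
```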
@Autofoxsys Thank you ....will look into it
If I remember correctly, Apollo 5.5 does not include support for the RTX 2080. You need to use Apollo 6.0 or later, but in 6.0 and later, perception may not be working yet due to in-progress refactoring. So to use perception on 5.5, you'll need to use a GTX 1080 GPU.