ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0

Apollo5.0: How to Run Perception Module on Your Local Computer can't launch camera #10007

Open cxr1996 opened 5 years ago

cxr1996 commented 5 years ago

System information

- Ubuntu 14.04

Steps to reproduce the issue:

Running cyber_launch start /apollo/modules/perception/production/launch/perception_camera.launch fails.

Supporting materials (screenshots, command lines, code/script snippets):

[mainboard_default_6443] E1024 13:11:50.780318 6451 class_loader_utility.cc:220] [mainboard] poco LibraryLoadException: libcuda.so.1: cannot open shared object file: No such file or directory
[mainboard_default_6443] E1024 13:11:50.780418 6451 class_loader_utility.cc:236] [mainboard] poco shared library failed: /apollo/bazel-bin/modules/perception/onboard/component/libperception_component_camera.so
[mainboard_default_6443] E1024 13:11:50.780437 6451 class_loader_manager.h:70] [mainboard] Invalid class name: FusionCameraDetectionComponent
[mainboard_default_6443] E1024 13:11:50.780453 6451 module_controller.cc:59] [mainboard] Failed to load module: /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag
[mainboard_default_6443] E1024 13:11:50.780463 6451 class_loader_utility.cc:258] [mainboard] Attempt to UnloadLibrary lib, but can't find lib: /apollo/bazel-bin/modules/perception/onboard/component/libperception_component_camera.so
[mainboard_default_6443] E1024 13:11:50.780469 6451 mainboard.cc:43] [mainboard] module start error.
[mainboard_default_6443]
[cyber_launch_6443] ERROR Process [mainboard_default_6443] has finished. [pid 6451, cmd mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag /apollo/modules/perception/production/dag/dag_motion_service.dag -p mainboard_default_6443 -s CYBER_DEFAULT].
[cyber_launch_6443] INFO All processes has died.
[cyber_launch_6443] INFO Cyber exit.
[cyber_launch_6443] INFO All processes have been stopped.

christian-lanius commented 5 years ago

The loader is complaining that it cannot find the CUDA driver library (libcuda.so.1). Can you verify that the library is available inside your Docker container? Running ldconfig -p | grep libcuda.so should return a line like libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1. If the command returns nothing, can you check whether nvidia-smi produces the correct output inside the container?
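The check described here can also be scripted. The following is a minimal Python sketch, not part of Apollo: it pulls the missing library name out of a loader error message and looks that name up in the text printed by ldconfig -p. The function names are hypothetical, and ldconfig_cache() assumes that ldconfig is on the PATH inside the container.

```python
import re
import subprocess

# Matches the dynamic loader message seen in the log, e.g.
# "libcuda.so.1: cannot open shared object file".
MISSING_LIB = re.compile(r"(\S+\.so[.\d]*): cannot open shared object file")


def missing_libraries(log_text):
    """Return the library names the loader could not open, in order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def find_in_ldconfig(ldconfig_output, lib_name):
    """Return the path that `ldconfig -p` output lists for lib_name, or None."""
    for line in ldconfig_output.splitlines():
        # Cache lines look like: "libcuda.so.1 (libc6,x86-64) => /path/libcuda.so.1"
        left, sep, path = line.partition(" => ")
        parts = left.split()
        if sep and parts and parts[0] == lib_name:
            return path.strip()
    return None


def ldconfig_cache():
    """Run `ldconfig -p` and return its text output (assumes ldconfig is on PATH)."""
    return subprocess.run(
        ["ldconfig", "-p"], capture_output=True, text=True, check=True
    ).stdout
```

Calling find_in_ldconfig(ldconfig_cache(), name) for each name returned by missing_libraries() shows which libraries the linker cache does not know about; a result of None means the library named in the error is the one to look for.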

cxr1996 commented 5 years ago

Thank you very much for your answer. You are right. I have solved this problem, but there are other problems here.

[mainboard_default_15781] [NVBLAS] NVBLAS_CONFIG_FILE environment variable is NOT set : relying on default config filename 'nvblas.conf'
[mainboard_default_15781] [NVBLAS] Cannot open default config file 'nvblas.conf'
[mainboard_default_15781] [NVBLAS] Config parsed
[mainboard_default_15781] [NVBLAS] CPU Blas library need to be provided
[cyber_launch_15781] ERROR Process [mainboard_default_15781] has finished. [pid 15800, cmd mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag /apollo/modules/perception/production/dag/dag_motion_service.dag -p mainboard_default_15781 -s CYBER_DEFAULT].
[cyber_launch_15781] INFO All processes has died.
[cyber_launch_15781] INFO Cyber exit.
[cyber_launch_15781] INFO All processes have been stopped.

It can't open nvblas.conf.

christian-lanius commented 5 years ago

Normally, the nvblas.conf file is not needed to run perception. You can get rid of this message by downloading a default nvblas.conf from NVIDIA, but it is not related to the crash you are experiencing: https://docs.nvidia.com/cuda/nvblas/index.html#configuration_example
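For anyone who prefers to silence the message, here is a hedged Python sketch that writes a minimal nvblas.conf. The three keywords are the ones described in NVIDIA's NVBLAS documentation linked above; the helper names are hypothetical, and the CPU BLAS library path is an assumption you have to supply for your own system.

```python
from pathlib import Path


def render_nvblas_conf(cpu_blas_lib, gpu_list="ALL", logfile="nvblas.log"):
    """Build the text of a minimal nvblas.conf.

    cpu_blas_lib is the path to a CPU BLAS shared library. NVBLAS requires it,
    which is what the "CPU Blas library need to be provided" log line refers to.
    """
    lines = [
        f"NVBLAS_LOGFILE {logfile}",
        f"NVBLAS_CPU_BLAS_LIB {cpu_blas_lib}",
        f"NVBLAS_GPU_LIST {gpu_list}",
    ]
    return "\n".join(lines) + "\n"


def write_nvblas_conf(path, cpu_blas_lib):
    """Write the config to path and return the path as a string."""
    target = Path(path)
    target.write_text(render_nvblas_conf(cpu_blas_lib))
    return str(target)
```

After writing the file, point the NVBLAS_CONFIG_FILE environment variable (named in the log) at it, or place it in the working directory under the default name nvblas.conf.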

You can run gdb --args mainboard -d /apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag to start the program under a debugger (type run at the gdb prompt), and then use bt to look at the stack trace when your process crashes. This may help you identify issues with your setup.
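If you would rather capture the stack trace without typing at the gdb prompt, gdb's batch mode can do it in one go. This is a small Python sketch under the assumption that gdb and mainboard are both on the PATH inside the container; the function names are hypothetical.

```python
import subprocess

DAG = "/apollo/modules/perception/production/dag/dag_streaming_perception_camera.dag"


def gdb_backtrace_command(dag_path=DAG):
    """Build a gdb command line that runs mainboard and prints a backtrace when it stops."""
    return [
        "gdb", "-batch",
        "-ex", "run",  # start the program under the debugger
        "-ex", "bt",   # print the stack trace once the program stops
        "--args", "mainboard", "-d", dag_path,
    ]


def run_under_gdb(dag_path=DAG):
    """Run the command and return everything gdb printed (stdout followed by stderr)."""
    result = subprocess.run(
        gdb_backtrace_command(dag_path), capture_output=True, text=True
    )
    return result.stdout + result.stderr
```

If the process exits normally instead of crashing, gdb has no stack to print, so an empty backtrace is itself a hint that the failure is not a crash.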

cxr1996 commented 5 years ago

Thank you. Now I can see the specific error. When I start the camera, the captured image appears on screen. When I then start the perception module, the following is logged:

[mainboard_default_6060] E1029 15:18:06.564347 6092 transform_wrapper.cc:224] [mainboard] Can not find transform. 1572333486.486972 frame_id: novatel child_frame_id: front_6mm Error info: canTransform: target_frame novatel does not exist. canTransform: source_frame front_6mm does not exist. canTransform: target_frame novatel does not exist. canTransform: source_frame front_6mm does not exist. canTransform: target_frame novatel does not exist. canTransform: source_frame front_6mm does not exist. canTransform: target_frame novatel does not exist. canTransform: source_frame front_6mm does not exist. canTransform: target_frame novatel does not exist. canTransform: source_frame front_6mm does not exist.
[mainboard_default_6060] E1029 15:18:06.564409 6092 fusion_camera_detection_component.cc:684] [perception] failed to get camera to world pose, ts: 1572333486.486972 camera_name: front_6mm
[mainboard_default_6060] E1029 15:18:06.564422 6092 fusion_camera_detection_component.cc:299] [perception] InternalProc failed, error_code: 4001

xmyqsh commented 5 years ago

@cxr1996 If you have your own training data, or data from nuscenes-devkit, argoverse or waymo-open-dataset, you could configure the transform from the extrinsic parameters that the dataset provides. You may also want to disable traffic light detection: there was an HDMap definition mismatch when I tested it in August. It seems to be a feature still under development, and I'm not sure whether it has been updated since.

christian-lanius commented 5 years ago

The error you see appears because no static transform topic is launched: you have to publish the correct transforms to /static_transform. If you use the same setup as the reference vehicle (or don't care about the error this introduces), you can just launch it with the reference vehicle's extrinsics. In that case, just run cyber_launch start modules/transform/launch/static_transform.launch. Otherwise, create a new dag file in modules/transform/dag that points to a new config, which you also have to create yourself. In that config you list the extrinsics files, which contain the actual parameters. See /apollo/modules/transform/conf/static_transform_conf.pb.txt for an example of such a config.

That being said, this error normally just gets spammed into the log and does not make the program crash. So, for debugging, it might be good enough to publish the default transforms and check whether that actually fixes the problem.
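To see which frames still need a transform, you can collect the frame names from the canTransform messages. A minimal Python sketch, assuming the message wording shown in the log earlier in this thread; the function name is hypothetical.

```python
import re

# Matches messages such as "canTransform: target_frame novatel does not exist."
FRAME_ERROR = re.compile(
    r"canTransform: (target_frame|source_frame) (\S+) does not exist"
)


def missing_frames(log_text):
    """Return the sorted, de-duplicated frame names that canTransform reports as missing."""
    return sorted({name for _role, name in FRAME_ERROR.findall(log_text)})
```

For the log in this thread it yields front_6mm and novatel, which are the two frames the published transforms would have to cover. Once the static transforms are published, rerunning it on a fresh log should return an empty list.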

cxr1996 commented 5 years ago

OK, thank you. I also found that this log message does not stop the process: the camera-based perception module is actually running. Another question: what should I do if I want to see the detected object bounding boxes? So far my recording only shows the visualized lane results, which I have already set up according to the guide: "If you want to visualize lane results overlaid on the captured image and in bird view, mark enable_visualization: True in modules/perception/production/conf/perception/camera/lane_detection_component config before executing the above command. It will pop up when you play the recorded data in point 9."

cxr1996 commented 5 years ago

Ok, thank you very much for your answer.

zhouyapengzi commented 4 years ago

@cxr1996 Hi, I have the same issue when running perception on my local computer.

E0212 09:17:52.781256 26092 class_loader_utility.cc:220] [mainboard] poco LibraryLoadException: libcuda.so.1: cannot open shared object file: No such file or directory
E0212 09:17:52.781301 26092 class_loader_utility.cc:236] [mainboard] poco shared library failed: /apollo/bazel-bin/modules/perception/onboard/component/libperception_component_camera.so
E0212 09:17:52.781332 26092 class_loader_manager.h:70] [mainboard] Invalid class name: LaneDetectionComponent
E0212 09:17:52.781340 26092 module_controller.cc:59] [mainboard] Failed to load module: /apollo/./modules/perception/production/dag/dag_streaming_perception_lane.dag
E0212 09:17:52.781347 26092 class_loader_utility.cc:258] [mainboard] Attempt to UnloadLibrary lib, but can't find lib: /apollo/bazel-bin/modules/perception/onboard/component/libperception_component_camera.so
E0212 09:17:52.781353 26092 mainboard.cc:43] [mainboard] module start error.

Can you share how you solved the problem?

ideasplus commented 4 years ago

@xmyqsh Hi, can you use the external datasets you mentioned (nuscenes-devkit, argoverse or waymo-open-dataset) to test Apollo modules such as perception, prediction and planning? If so, can you tell me how to convert the external data into a format that Apollo understands? I don’t know how to get started. Thank you in advance.

RezaMehrabian commented 4 years ago

Hello @christian-lanius

I have a similar problem when launching v2x. I receive the error below:

[/apollo/bazel-bin/modules/v2x/v2x --flagfile=/apollo/modules/v2x/conf/v2x.conf] /apollo/bazel-bin/modules/v2x/v2x: error while loading shared libraries: libfastcdr.so.1: cannot open shared object file: No such file or directory

However, when I run "ldconfig -p | grep libcuda.so", the terminal shows:

libcuda.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so.1
libcuda.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcuda.so

Do you have any suggestions?

ZhxJia commented 4 years ago

Have you solved this problem? I'm facing the same problem and don't know how to solve it.