ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0

Nvidia orin support #15090

Open daohu527 opened 10 months ago

daohu527 commented 10 months ago

NVIDIA Orin is now supported and tested on the 9.x_alpha branch :rocket::rocket::rocket:. Questions about NVIDIA Orin can be discussed here, or you can open a new issue and link it here.

Why add this issue

We know that many developers care about running autonomous driving systems on embedded devices such as NVIDIA Orin. We have added beta support for it and hope to collect more issues. We hope this version helps you :hearts:!

duanchengwen commented 10 months ago

Apollo version: 7.0, platform: Orin. I followed https://github.com/ApolloAuto/apollo/issues/14858#issuecomment-1489733361 to adapt the code to TensorRT 8, and it compiles successfully. However, this error is reported when I run it:

I0811 15:46:10.485334 30293 rt_net.cc:37] 2: [pluginV2Runner.cpp::execute::265] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed. )

Here is the log: mainboard.INFO

daohu527 commented 10 months ago

@duanchengwen We focus on the latest version; 7.0 will not be supported.

CesarLiu commented 10 months ago

Apollo version: 9.x_alpha, platform: Jetson TX2. Problems with the camera driver:

  1. https://github.com/ApolloAuto/apollo/blob/34eaf82c6bd9cafde4d6dc6ed10406b05a9916e4/modules/drivers/camera/BUILD#L187: there is no rule or build for format:convert
  2. fatal error: immintrin.h: No such file or directory

CesarLiu commented 10 months ago

In the script that installs Bazel (https://github.com/ApolloAuto/apollo/blob/34eaf82c6bd9cafde4d6dc6ed10406b05a9916e4/docker/build/installers/install_bazel.sh#L92), do the lines that follow, which install or upgrade Bazel with apt, really work?

zsh4614 commented 10 months ago

How do I use CUDA on Orin?

The directory structure is as follows:

// kernel.h
#include <iostream>
#include <cuda.h>
#include <cuda_fp16.h>
#include <cuda_runtime_api.h>

...
// kernel.cu
#include "kernel.h"

...
// BAZEL
load("@rules_cc//cc:defs.bzl", "cc_binary")
load("//tools/install:install.bzl", "install")
load("//tools/install:install.bzl", "install", "install_src_files")
load("//tools:cpplint.bzl", "cpplint")
load("//third_party/gpus:common.bzl", "gpu_library", "if_cuda", "if_rocm")

package(default_visibility = ["//visibility:public"])

gpu_library(
    name="kernel",
    hdrs=["kernel.h"],
    srcs=["kernel.cu"],
    deps=[
        "@local_config_cuda//cuda:cublas",
        "@local_config_cuda//cuda:cuda_headers",
        "@local_config_cuda//cuda:cudart",
    ],
    alwayslink = True,
)

When I run:

sudo bazel build --config=gpu --config=nvidia  modules/perception/test/...

it reports this error:

/apollo/modules/perception/test/BUILD:10:12: error while parsing .d file: /apollo/.cache/bazel/540135163923dd7d5820f3ee4b306b32/execroot/apollo/bazel-out/aarch64-fastbuild/bin/modules/perception/test/_objs/test/test.pic.d (No such file or directory)

The same build works fine on x86. How do I build and compile GPU code with Bazel on Orin? Can someone help me? Thank you very much!

usrn1 commented 9 months ago

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --video --graphics --utility --require=cuda>=11.4 --pid=6318 /var/lib/docker/overlay2/432c72c884ab8b4a05b80e0511b8ff2a36424fffc0538521c2fa6c8a25efa0d8/merged] nvidia-container-cli: requirement error: unsatisfied condition: cuda >= 11.4: unknown.

Does this error mean I need to upgrade CUDA? My CUDA version is 10.2.
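The error text carries the requirement `cuda>=11.4`, and the reported version is 10.2, so the condition cannot be met as things stand. The comparison can be sketched as below. This is only an illustration of the version check, assuming plain dotted integer versions; `parse_version` and `satisfies` are hypothetical helpers, not part of nvidia-container-cli.

```python
import re

def parse_version(text):
    """Turn a dotted version such as '11.4' into a tuple of ints: (11, 4)."""
    return tuple(int(part) for part in text.strip().split("."))

def satisfies(installed, requirement):
    """Check an installed version against a 'name>=X.Y' requirement string.

    Hypothetical helper for illustration; only the '>=' operator is handled.
    """
    match = re.fullmatch(r"\s*\w+\s*>=\s*([\d.]+)\s*", requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    return parse_version(installed) >= parse_version(match.group(1))

# Version reported in the comment versus the requirement in the error message.
print(satisfies("10.2", "cuda>=11.4"))
```

Comparing tuples of integers avoids the trap of comparing version strings lexically, where "10.2" and "9.0" would sort in the wrong order.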

CesarLiu commented 9 months ago

Why is a customized version of libc6 needed in the Orin Docker image? See https://github.com/ApolloAuto/apollo/blob/33bc20e2721a82d88ba226032971feec271994fa/docker/build/dev.aarch64.orin.dockerfile#L65C1-L66C67. Because of it, running sudo apt upgrade inside the container always fails with unmet dependencies.

ScottDeng114514 commented 9 months ago

Apollo version: 9.x_alpha, device: Jetson Orin, JetPack r35.1. I successfully built Apollo on Orin, but when I launch the prediction module, errors like these occur:

[prediction]  E0914 16:45:48.394281 159793 plugin_manager.h:146] [prediction]plugin of class apollo::prediction::SemanticLstmPedestrianTensorrt have not been loaded
[prediction]  E0914 16:45:48.429389 159793 plugin_manager.h:146] [prediction]plugin of class apollo::prediction::SemanticLstmVehicleTensorrt have not been loaded
[prediction]  
[cyber_launch_159786] ERROR Process [prediction] has died [pid 159793, exit code -11, cmd mainboard -d /apollo/modules/prediction/dag/prediction.dag -p prediction -s CYBER_DEFAULT].

and the prediction.log.INFO:

I0914 16:28:59.371124 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmPedestrianTensorrt
W0914 16:28:59.371173 159073 plugin_manager.h:233] plugin of class apollo::prediction::SemanticLstmPedestrianTensorrt not found, please check if it's registered
E0914 16:28:59.371182 159073 plugin_manager.h:146] plugin of class apollo::prediction::SemanticLstmPedestrianTensorrt have not been loaded
I0914 16:28:59.371233 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmPedestrianGpuTorch
I0914 16:28:59.389387 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmPedestrianCpuTorch
I0914 16:28:59.406278 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmVehicleTensorrt
W0914 16:28:59.406319 159073 plugin_manager.h:233] plugin of class apollo::prediction::SemanticLstmVehicleTensorrt not found, please check if it's registered
E0914 16:28:59.406328 159073 plugin_manager.h:146] plugin of class apollo::prediction::SemanticLstmVehicleTensorrt have not been loaded
I0914 16:28:59.406366 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmVehicleGpuTorch
I0914 16:28:59.423386 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmVehicleCpuTorch

Any suggestions? Thanks!

ScottDeng114514 commented 9 months ago

I solved the prediction plugin errors reported in my previous comment by commenting out lines 27 to 40 of the file "modules/prediction/conf/prediction_conf.pb.txt".
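The prediction log names each plugin class that could not be found, which may help when deciding which configuration entries to disable. A minimal sketch for pulling those names out of prediction.log.INFO follows; `missing_plugins` is a hypothetical helper, and the sample lines are copied from the log in this thread.

```python
import re

# Matches the warning emitted when a plugin class cannot be found, e.g.
# "... plugin of class apollo::prediction::X not found, please check if it's registered"
NOT_FOUND = re.compile(r"plugin of class (\S+) not found")

def missing_plugins(log_text):
    """Return the plugin class names reported as not found, in first-seen order."""
    seen = []
    for name in NOT_FOUND.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen

sample = """\
W0914 16:28:59.371173 159073 plugin_manager.h:233] plugin of class apollo::prediction::SemanticLstmPedestrianTensorrt not found, please check if it's registered
I0914 16:28:59.371233 159073 plugin_manager.h:144] creating plugin instance of apollo::prediction::SemanticLstmPedestrianGpuTorch
W0914 16:28:59.406319 159073 plugin_manager.h:233] plugin of class apollo::prediction::SemanticLstmVehicleTensorrt not found, please check if it's registered
"""
print(missing_plugins(sample))
```

For the sample above, only the two Tensorrt classes are listed; the GpuTorch line is an informational message and is ignored.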

yueyihua commented 8 months ago

I hit an error while setting up the 9.x_alpha build environment. How can I resolve this?

nvidia@ubuntu:~/01_DevCode/apollo$ sudo bash docker/scripts/dev_start.sh
[INFO] Setup geolocation specific configurations for cn
[INFO] GeoLocation settings for Mainland China
[INFO] Start pulling docker image registry.baidubce.com/apolloauto/apollo:dev-aarch64-20.04-20230719_2137 ...
Error response from daemon: manifest for registry.baidubce.com/apolloauto/apollo:dev-aarch64-20.04-20230719_2137 not found: manifest unknown: manifest unknown
[ERROR] Failed to pull docker image : registry.baidubce.com/apolloauto/apollo:dev-aarch64-20.04-20230719_2137

ScottDeng114514 commented 8 months ago

@yueyihua Did you modify dev_start.sh? Line 23 reads DOCKER_REPO="apolloauto/apollo"; try not to change it. The image may not have been uploaded to registry.baidubce.com.

sainttelant commented 8 months ago

I couldn't find libperception_componet_camera.so in bazel-bin, bazel-out, or elsewhere. I suppose the 9.x_alpha version does not support the camera_decompression function yet. Is that true?

kuang2022 commented 7 months ago

Please tell me how to deploy the Apollo environment on Orin. Is there a detailed tutorial?

tangjiacai101 commented 7 months ago

I also get image pull failures. Many of the ARM images cannot be downloaded, or I cannot find them at all.

tangjiacai101 commented 7 months ago

Apollo version: 9.x_alpha, device: Jetson Orin

ERROR: /apollo/third_party/rtklib/BUILD:8:18: Compiling third_party/rtklib/rtcm3.c failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/aarch64-opt/bin/third_party/rtklib/_objs/librtklib.so/rtcm3.pic.d ... (remaining 35 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
aarch64-linux-gnu-gcc-9: error: unrecognized command line option ‘-mavx2’

Does anyone have a solution for this compilation error?
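-mavx2 is an x86 instruction-set flag, which is why the aarch64 compiler rejects it. One way to find where such flags are set is to scan the Bazel files in the source tree. The sketch below does that under stated assumptions: the regular expression covers only a few common x86 flag families, the list of file names is a guess at where copts usually live, and `find_x86_flags` is a hypothetical helper rather than an Apollo tool.

```python
import os
import re

# Assumption: this pattern covers common x86-only GCC flag families
# (-mavx*, -msse*, -mfma); extend it for other flags as needed.
X86_FLAG = re.compile(r"-m(?:avx|sse|fma)[\w.]*")

# Assumption: compiler options are set in these Bazel files.
BAZEL_FILE_NAMES = ("BUILD", "BUILD.bazel", ".bazelrc")

def find_x86_flags(root):
    """Yield (path, line_number, flag) for x86-only flags in Bazel files under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename not in BAZEL_FILE_NAMES and not filename.endswith(".bzl"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    for flag in X86_FLAG.findall(line):
                        yield path, line_number, flag

# Example usage (the directory is taken from the error message above):
#   for path, line_number, flag in find_x86_flags("third_party/rtklib"):
#       print(f"{path}:{line_number}: {flag}")
```

Any hits would then need to be guarded so that they apply only to x86 builds; how that is best done in Apollo's build configuration is not covered in this thread.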

ScottDeng114514 commented 7 months ago

Download the alpha version to the Orin and run bash docker/scripts/dev_into.sh.

kuang2022 commented 7 months ago

During compilation, I ran into a build failure:

(01:38:40) INFO: Found 1 target...
(01:39:00) ERROR: /apollo/modules/drivers/video/tools/decode_video/BUILD:6:17: Linking modules/drivers/video/tools/decode_video/video2jpg failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc @bazel-out/aarch64-opt/bin/modules/drivers/video/tools/decode_video/video2jpg-2.params

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
/usr/bin/ld:bazel-out/aarch64-opt/bin/_solib_local/_U@can_Ucard_Ulibrary_S_S_Chermes_Ucan_Ulib/libbcan.so: file format not recognized; treating as linker script
/usr/bin/ld:bazel-out/aarch64-opt/bin/_solib_local/_U@can_Ucard_Ulibrary_S_S_Chermes_Ucan_Ulib/libbcan.so:0: syntax error
collect2: error: ld returned 1 exit status
Target //modules/drivers/video:install failed to build
Use --verbose_failures to see the command lines of failed build steps.
(01:39:01) INFO: Elapsed time: 22.712s, Critical Path: 18.45s
(01:39:01) INFO: 205 processes: 192 internal, 13 linux-sandbox.
(01:39:01) FAILED: Build did NOT complete successfully
(01:39:01) FAILED: Build did NOT complete successfully
[buildtool] 2023-11-15 09:39:01 ERROR Encounter ErrCode.BazelErr
[buildtool] 2023-11-15 09:39:01 ERROR hint: Install package drivers-video source failed!
[buildtool] 2023-11-15 09:39:01 ERROR solution: Please checkout the build file by following bazel error hints
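The linker message says that ld could not recognize libbcan.so as something it can link. One possible cause, offered here only as an assumption, is that the prebuilt file targets a different architecture or is not a binary at all. The sketch below reads the ELF header to tell these cases apart; `describe_binary` is a hypothetical helper, and the machine ids 62 and 183 are the ELF `e_machine` values for x86-64 and AArch64.

```python
import struct

# e_machine values from the ELF specification.
ELF_MACHINES = {62: "x86-64", 183: "aarch64"}

def describe_binary(path):
    """Report whether a file is an ELF object and, if so, its target machine."""
    with open(path, "rb") as handle:
        header = handle.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    # EI_DATA (byte 5 of e_ident): 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"ELF, machine id {machine}")

# Example usage (path shortened; use the libbcan.so path from the error):
#   print(describe_binary("path/to/libbcan.so"))
```

A result other than "aarch64" on an Orin would point at the library itself rather than at the video driver sources.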

Hgsil commented 7 months ago

After successfully building the perception module, I tried to test the SMOKE model. I launched the modules with this command:

mainboard -d modules/transform/dag/static_transform.dag -d modules/perception/camera_detection_single_stage/dag/camera_detection_single_stage.dag -d modules/perception/camera_location_estimation/dag/camera_location_estimation.dag -d modules/perception/camera_location_refinement/dag/camera_location_refinement.dag -d modules/perception/camera_tracking/dag/camera_tracking.dag -d modules/perception/multi_sensor_fusion/dag/multi_sensor_fusion.dag

However, the "/apollo/perception/obstacles" topic never received any messages. The logs show that the pipeline is stuck in the tracker module: the detection, location_estimation, and location_refinement modules produce output normally, but the tracker module printed the following message only once: "I1116 16:05:16.286335 48529 camera_tracking_component.cc:71] Enter Tracking component, message timestamp: 1513807824.641768 current timestamp: 1700121916.286372." Could you please advise me on how to debug the camera tracker module?
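The two timestamps in that tracker log line are nearly six years apart: the message timestamp falls in 2017, while the current timestamp is the wall-clock time in 2023. Whether this gap has anything to do with the stall is not established in this thread; the sketch below only makes the numbers readable. `timestamp_gap` is a hypothetical helper, and the log line is copied from the comment above.

```python
import re
from datetime import datetime, timezone

LOG_LINE = (
    "I1116 16:05:16.286335 48529 camera_tracking_component.cc:71] "
    "Enter Tracking component, message timestamp: 1513807824.641768 "
    "current timestamp: 1700121916.286372."
)

def timestamp_gap(line):
    """Return (message_time, current_time, gap_in_seconds) parsed from the log line."""
    match = re.search(
        r"message timestamp: (\d+\.\d+) current timestamp: (\d+\.\d+)", line
    )
    if match is None:
        raise ValueError("no timestamps found in line")
    message_ts, current_ts = (float(value) for value in match.groups())
    return message_ts, current_ts, current_ts - message_ts

message_ts, current_ts, gap = timestamp_gap(LOG_LINE)
# Show both Unix timestamps as UTC dates, plus the gap in days.
print(datetime.fromtimestamp(message_ts, tz=timezone.utc).date())
print(datetime.fromtimestamp(current_ts, tz=timezone.utc).date())
print(f"gap: {gap / 86400:.0f} days")
```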

HandsomeAIccx commented 7 months ago

@Hgsil I am facing a similar issue. When I run the following launch, no messages appear in the /apollo/perception/obstacles channel. I have already enabled the transform module, set the vehicle model to perception_test_v1, and used sensor_rgb.record as the record file. I also used 'amodel' to install 3d-r4-half_caffe. Can someone please help? Thank you very much.

[image: a0d81be12ec71a8e12054dbe3be760b]

Hgsil commented 7 months ago

Sorry, I haven't resolved this issue yet.

Hgsil commented 7 months ago

I am printing the tracker's progress line by line to identify the problem. So far I have confirmed that the whole pipeline gets stuck in the "OMTObstacleTracker::Associate2D" method when it calls "OMTObstacleTracker::CreateNewTarget". With further output, I can see that the code stops at "auto &min_tmplt = kMinTemplateHWL.at(sub_type);". I have not found a solution yet. The log shows that "object_template_manager" was initialized successfully.

[image] [image]

zyxcambridge commented 6 months ago

Does this also work on a 4090?

wusx commented 6 months ago

@zsh4614 Do you have a solution for the "error while parsing .d file" failure when building CUDA code with Bazel on Orin?

seagap commented 2 months ago

@zsh4614 Hello, have you solved the "error while parsing .d file" build error?

Ybt-1 commented 2 months ago

@kuang2022 Hello, I also encountered a similar problem (the libbcan.so link failure when building modules/drivers/video). Did you solve it?

nuaawh commented 3 hours ago

Hello, has anyone solved the "error while parsing .d file" error when building CUDA code with Bazel on Orin?
