vitoplantamura / OnnxStream

Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but also Mistral 7B on desktops and servers. ARM, x86, WASM, RISC-V supported. Accelerated by XNNPACK.
https://yolo.vitoplantamura.com/

Work with Jetson Nano? #2

Open tranzmatt opened 1 year ago

tranzmatt commented 1 year ago

Do you know/think it would work on something like a Jetson Nano? I have one running Bionic 18.04 with CUDA 10.2 and would be curious if it could be adapted to this.

vitoplantamura commented 1 year ago

hi,

OnnxStream doesn't support CUDA, but it does support NEON (through XnnPack).

I guess it should work on a Jetson Nano but not to its full potential.

Can you do a test and let me know?

Thanks, Vito

atomicrajat commented 1 year ago

Hi, I gave it a try on a Jetson Nano with JetPack 4.6.3, Ubuntu 18.04 and CMake 3.27.2, but I get the following error while building XNNPACK, at 80%:

cc1: error: invalid feature modifier in ‘-march=armv8.2-a+bf16’
CMakeFiles/microkernels-all.dir/build.make:51049: recipe for target 'CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o' failed
make[2]: *** [CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o] Error 1
CMakeFiles/Makefile2:209: recipe for target 'CMakeFiles/microkernels-all.dir/all' failed
make[1]: *** [CMakeFiles/microkernels-all.dir/all] Error 2
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2

Are there any changes I need to make before building to avoid this?

vitoplantamura commented 1 year ago

hi,

I think there's little we can do... in any case, I'd try to build with the most recent version of the code: maybe it's a bug that has been fixed (git command: "git checkout master").

However please note that I'm pretty sure the most recent version of XnnPack is incompatible with OnnxStream.

thanks, Vito

ByerRA commented 1 year ago

I "think" I've solved this issue as I was able to get XNNPACK compiled without issue.

The "issue" is that Nvidia's JetPack for Jetson Nano comes with GCC 7 installed. CUDA v10.2, which is the last version the Jetson Nano supports, won't allow the use of GCC greater than v7 when compiling CUDA code (the NVCC compiler will throw an error about the version of GCC if it's greater than 7).

XNNPACK and OnnxStream don't use CUDA, but the GCC 7 that is bundled with Nvidia JetPack for Jetson Nano doesn't support all the dotprod extensions required by XNNPACK, so one has to install at least GCC 10 to get XNNPACK to compile.
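To see which side of that GCC 10 threshold a given compiler falls on, its major version can be compared against the minimum. A minimal sketch: `major_at_least` is a hypothetical helper name, and `-dumpversion` is the standard GCC option that prints the compiler version.

```shell
# major_at_least VERSION MIN
# Hypothetical helper: succeeds if the major component of VERSION
# (for example "10.3.0" or "7") is at least MIN.
major_at_least() {
    major=${1%%.*}          # drop everything from the first dot onward
    [ "$major" -ge "$2" ]
}

# Example use with whatever "gcc" currently resolves to:
#   major_at_least "$(gcc -dumpversion)" 10 || echo "default gcc is older than 10"
```

The helper only compares numbers, so it says nothing about whether the rest of the toolchain (assembler, linker) is recent enough.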

Now, you can JUST install GCC/G++ 10 with the following...

sudo apt update && sudo apt install gcc-10 g++-10

And XNNPACK will compile. But if you still need to compile code for CUDA on the Jetson Nano, installing GCC 10 will break it (again, as NVCC will complain about the GCC version).

So what we have to do is install GCC/G++ 10 and then set them up as alternatives alongside the existing GCC/G++ 7, using the following after installing GCC/G++ 10:

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 7
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 10

Then it's a matter of just doing the following to switch between GCC/G++ 7 and GCC/G++ 10 as the default compiler so you can compile XNNPACK and CUDA 10.2 code without issues.

sudo update-alternatives --config gcc
sudo update-alternatives --config g++

Doing this, XNNPACK compiled just fine on my Jetson Nano.
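As an alternative to switching the system-wide default, CMake reads the CC and CXX environment variables the first time a build directory is configured, so GCC 10 can be selected for the XNNPACK build alone. This is a different technique from the update-alternatives setup above, sketched under the assumption that the gcc-10 and g++-10 binaries are on PATH; `first_available` is a hypothetical helper, not part of any tool mentioned here.

```shell
# first_available NAME...
# Hypothetical helper: prints the first NAME that is found on PATH
# and fails if none of them is.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Example use, from a fresh and empty build directory
# (CMake only looks at CC and CXX on the first configure):
#   CC="$(first_available gcc-10 gcc)" CXX="$(first_available g++-10 g++)" cmake ..
```

With this approach the default gcc stays at version 7, so NVCC keeps working without switching back and forth.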

vitoplantamura commented 1 year ago

@ByerRA, very very helpful, thank you!

Vito

feiticeir0 commented 5 months ago

I tried your solution, but I'm still getting the same error:

[ 78%] Building C object CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-aarch64-neonfp16arith-expm1minus-rr1-p3h1ts-div.c.o
[ 78%] Building C object CMakeFiles/microkernels-all.dir/src/math/gen/f16-tanh-aarch64-neonfp16arith-expm1minus-rr1-p3h2ts-div.c.o
[ 78%] Building C object CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o
Assembler messages:
Error: unknown architectural extension `bf16'
Error: unrecognized option -march=armv8.2-a+bf16
CMakeFiles/microkernels-all.dir/build.make:51735: recipe for target 'CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o' failed
make[2]: *** [CMakeFiles/microkernels-all.dir/src/bf16-gemm/gen/bf16-gemm-1x4c8-minmax-neonbf16-bfdot.c.o] Error 1
CMakeFiles/Makefile2:209: recipe for target 'CMakeFiles/microkernels-all.dir/all' failed
make[1]: *** [CMakeFiles/microkernels-all.dir/all] Error 2
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2

My gcc and g++ versions:

feiticeir0@JetsonNano:~/XNNPACK/build$ g++ --version
g++ (Ubuntu 10.3.0-1ubuntu1~18.04~1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

feiticeir0@JetsonNano:~/XNNPACK/build$ gcc --version
gcc (Ubuntu 10.3.0-1ubuntu1~18.04~1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

feiticeir0@JetsonNano:~/XNNPACK/build$

I even did a make clean so XNNPACK could start fresh, but still, no luck... :(
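The "Assembler messages" heading in the log above suggests that this time the flag is rejected by the assembler rather than by the compiler itself. A minimal sketch to tell the two stages apart, assuming a GCC-style command line (`-S` stops before the assembler, `-c` runs it); `rejecting_stage` is a hypothetical helper name.

```shell
# rejecting_stage COMPILER FLAG
# Hypothetical helper: prints which stage rejects FLAG when building an
# empty C file: "compiler", "assembler", or "none".
rejecting_stage() {
    # -S stops after compilation proper, so the assembler is not run.
    if ! "$1" "$2" -x c -S /dev/null -o /dev/null >/dev/null 2>&1; then
        echo compiler
    # -c additionally runs the assembler on the generated code.
    elif ! "$1" "$2" -x c -c /dev/null -o /dev/null >/dev/null 2>&1; then
        echo assembler
    else
        echo none
    fi
}

# Example use:
#   rejecting_stage gcc -march=armv8.2-a+bf16
```

If this prints "assembler", `gcc -print-prog-name=as` shows which assembler GCC invokes, and running that program with `--version` shows how old it is.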