Open tiki-2 opened 1 year ago
hi,
I don't think memory would be an issue, as SD 1.5 requires 1.1GB of RAM in OnnxStream in its least conservative configuration.
I've tried the latest version of OnnxStream on my RPI Zero 2 and in Windows, and it works, so this isn't a problem introduced with the latest commit.
My guess is that it could be an incompatibility between XNNPACK and the Amlogic S905X.
Unfortunately I can do little without having the SBC in my hands.
This is a great guide that I have in my bookmarks:
https://jvns.ca/blog/2018/04/28/debugging-a-segfault-on-linux/
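In the spirit of that guide, here is a minimal sketch of how a backtrace could be captured for the crashing binary. It assumes `gdb` is installed and that `sd` was built with debug symbols; the helper name `sd_backtrace` and the `DRY_RUN` variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: run a command under gdb in batch mode and print a
# backtrace when it crashes. With DRY_RUN=1 it only prints the gdb command.
sd_backtrace() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "gdb -batch -ex run -ex bt --args $*"
    else
        # -batch: exit when done; -ex run: start the program;
        # -ex bt: print the backtrace after the program stops.
        gdb -batch -ex run -ex bt --args "$@"
    fi
}

# Example (not executed here):
#   sd_backtrace ./sd --rpi --steps 3 --rpi-lowmem
```

The backtrace should at least show whether the faulting frame is inside XNNPACK or inside OnnxStream itself.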
Let me know if you need help,
Thanks, Vito
I get the same (or a similar) segmentation fault with a fresh build as of 7th April 2024, on a Raspberry Pi 5. I don't believe it's a memory issue either. I did note that the build returned a number of warning messages.
free command shows:
               total        used        free      shared  buff/cache   available
Mem:         8241616      256128      426704        5616     7663600     7985488
Swap:         102384           0      102384
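For readers who want the figure in GiB, a small sketch that reads `free` output (KiB by default) and prints the "available" column of the `Mem:` line. The function name `mem_available_gib` is hypothetical, and it assumes the seven-column layout shown above.

```shell
# Hypothetical helper: read `free` output on stdin and print the
# "available" memory in GiB (field 7 of the Mem: line, given in KiB).
mem_available_gib() {
    LC_ALL=C awk '/^Mem:/ { printf "%.1f\n", $7 / 1048576 }'
}

# Example (not executed here):
#   free | mem_available_gib
```

For the values above this gives about 7.6 GiB available, which is consistent with memory not being the cause.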
time ./sd --turbo --rpi --models-path $baseDir/stable-diffusion-xl-turbo-1.0-onnxstream --prompt "An astronaut riding a horse" --steps 1 --output astronaut.png
----------------[start]------------------
positive_prompt: An astronaut riding a horse
SDXL turbo doesn't support negative_prompts
output_png_path: astronaut.png
steps: 1
seed: 526336
----------------[prompt]------------------
Token: "A"
Token: "n"
Token: "astronaut"
Token: "riding"
Token: "a"
Token: "horse"
----------------[diffusion]---------------
step:0 101016ms
----------------[decode]------------------
Segmentation fault

real 3m47.597s
user 9m8.696s
sys 0m33.962s
Warning messages shown during compilation:
/home/pi/onxx-stream/XNNPACK/build/cpuinfo-source/src/arm/linux/chipset.c: In function ‘cpuinfo_arm_linux_decode_chipset’:
/home/pi/onxx-stream/XNNPACK/build/cpuinfo-source/src/arm/linux/chipset.c:3931:25: warning: ‘cpuinfo_arm_fixup_raspberry_pi_chipset’ reading 64 bytes from a region of size 9 [-Wstringop-overread]
3931 | cpuinfo_arm_fixup_raspberry_pi_chipset(&chipset, revision);
| ^~~~~~~~~~~~~~
/home/pi/onxx-stream/XNNPACK/build/cpuinfo-source/src/arm/linux/chipset.c:3931:25: note: referencing argument 2 of type ‘const char[64]’
/home/pi/onxx-stream/XNNPACK/build/cpuinfo-source/src/arm/linux/chipset.c:3855:14: note: in a call to function ‘cpuinfo_arm_fixup_raspberry_pi_chipset’
3855 | void cpuinfo_arm_fixup_raspberry_pi_chipset(
| ^~~~~~~~~~
(the same warning and its two notes are repeated three more times in the build output)
hi,
can you try without the "--rpi" option? On the RPI 5 it shouldn't be necessary, but I haven't had the opportunity to test it.
Can you then do a second test by adding the "--not-tiled" option?
Thanks, Vito
Same error as before, this time without the --rpi parameter.
pi@raspberrypi:~/onxx-stream/OnnxStream/src/build $ time ./sd --turbo --models-path $baseDir/stable-diffusion-xl-turbo-1.0-onnxstream --prompt "An astronaut riding a horse on mars" --steps 1 --output astronaut.png
----------------[start]------------------
positive_prompt: An astronaut riding a horse on mars
SDXL turbo doesn't support negative_prompts
output_png_path: astronaut.png
steps: 1
seed: 0
----------------[prompt]------------------
Token: "A"
Token: "n"
Token: "astronaut"
Token: "riding"
Token: "a"
Token: "horse"
Token: "on"
Token: "mars"
----------------[diffusion]---------------
step:0 67418ms
----------------[decode]------------------
Segmentation fault

real 3m31.399s
user 7m33.962s
sys 0m22.208s
pi@raspberrypi:~/onxx-stream/OnnxStream/src/build $
I will try this on a PI Zero 2, as soon as I can find it
have you tried adding the "--not-tiled" option?
I just remembered that LivingLinux managed to run OnnxStream on the RPI5, without any problems:
https://www.youtube.com/watch?v=D0qG2OIpbUk
Vito
Let me try that. I found my Zero 2 and have been trying to get that working; different problems! I can't get the build to complete, never mind running the sd command. No matter, you learn more when it fails :) Thanks for your help.
Same error; same with and without --rpi parameter
pi@raspberrypi:~/onxx-stream/OnnxStream/src/build $ time ./sd --turbo --not-tiled --models-path $baseDir/stable-diffusion-xl-turbo-1.0-onnxstream --prompt "an astronaut riding a horse on mars" --steps 1 --output astronaut.png
----------------[start]------------------
positive_prompt: an astronaut riding a horse on mars
SDXL turbo doesn't support negative_prompts
output_png_path: astronaut.png
steps: 1
seed: 882688
----------------[prompt]------------------
Token: "an"
Token: "astronaut"
Token: "riding"
Token: "a"
Token: "horse"
Token: "on"
Token: "mars"
----------------[diffusion]---------------
step:0 35758ms
----------------[decode]------------------
Segmentation fault

real 2m3.277s
user 4m11.563s
sys 0m16.113s
pi@raspberrypi:~/onxx-stream/OnnxStream/src/build $
I also tried and failed to get it working on the Pi Zero 2 (see screenshot). It got so far into the build and then got stuck at filtering content 15% 406/2640; I left it running overnight and it did not progress further. I could not break out of it and had to pull the power.
I will take a break and follow the instructions you shared previously; the steps are a little different. I will keep everyone posted.
As for the RPI5, it could be a XNNPACK problem. We could try with the same version of OnnxStream used by LivingLinux in his video, which is commit 580cd677310a70fe35c8aecbffbbaa012ae54855. The XNNPACK version is different, so you should follow the instructions here: https://github.com/vitoplantamura/OnnxStream/tree/580cd677310a70fe35c8aecbffbbaa012ae54855
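A minimal sketch of pinning OnnxStream to that commit. The repository URL is taken from the link above; the function name `checkout_pinned_onnxstream`, the default directory name, and the `DRY_RUN` variable are hypothetical. This covers only the OnnxStream checkout: the matching XNNPACK version still has to be built following the instructions at the linked tree.

```shell
ONNXSTREAM_REPO="https://github.com/vitoplantamura/OnnxStream.git"
ONNXSTREAM_COMMIT="580cd677310a70fe35c8aecbffbbaa012ae54855"

# Hypothetical helper: clone OnnxStream into a fresh directory and check out
# the pinned commit. With DRY_RUN=1 it only prints the git commands.
checkout_pinned_onnxstream() {
    dest=${1:-OnnxStream-pinned}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "git clone $ONNXSTREAM_REPO $dest"
        echo "git -C $dest checkout $ONNXSTREAM_COMMIT"
    else
        git clone "$ONNXSTREAM_REPO" "$dest" &&
        git -C "$dest" checkout "$ONNXSTREAM_COMMIT"
    fi
}

# Example (not executed here):
#   checkout_pinned_onnxstream ~/onnxstream-580cd67
```

Cloning into a separate directory keeps the existing build untouched, so the two versions can be compared side by side.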
Regarding the RPI Zero 2, this problem happened to me too; it is caused by Git LFS when run on the RPI Zero 2. I solved it by adding 1GB of swap.
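The thread does not say how the swap was added. One generic Linux way, as a sketch: create a swap file and enable it. The function name `add_swapfile`, the `/swapfile` path, and the `DRY_RUN` variable are assumptions for illustration; the commands need root and the swap does not persist across reboots unless added to `/etc/fstab`.

```shell
# Hypothetical helper: create and enable a swap file of the given size in MiB.
# With DRY_RUN=1 it only prints the commands instead of running them.
add_swapfile() {
    size_mib=${1:-1024}
    path=${2:-/swapfile}
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
    }
    run sudo dd if=/dev/zero of="$path" bs=1M count="$size_mib" &&
    run sudo chmod 600 "$path" &&
    run sudo mkswap "$path" &&
    run sudo swapon "$path"
}

# Example (not executed here): 1 GiB of swap at /swapfile
#   add_swapfile 1024 /swapfile
```

Afterwards, `free` should show the additional swap in its `Swap:` line.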
Vito
Thanks Vito, I appreciate the feedback. I found myself going round in circles and each turn was a failure; I'm taking a breather and will try again over the weekend. For the LivingLinux example you shared, the website lists a command which downloaded a Windows package, which threw me somewhat when it failed. I'll do a clean build later to ensure I don't get any errors from the past installs.
Hi, I'm getting a segmentation fault when attempting to run. I'm using a SweetPotato (Raspberry Pi clone), running Raspbian Bullseye 64-bit OS. This device has 2GB of DDR4 memory.
I was able to compile XNNPACK successfully and without errors following your instructions.
I was able to compile your OnnxStream code; this gives an informational note/warning but appears to compile:
$ cmake --build . --config Release
[ 33%] Building CXX object CMakeFiles/sd.dir/sd.cpp.o
[ 66%] Building CXX object CMakeFiles/sd.dir/onnxstream.cpp.o
In file included from /usr/include/c++/10/bits/stl_algobase.h:64,
                 from /usr/include/c++/10/vector:60,
                 from /home/frenchfry/projects/XNNPACK/OnnxStream/src/onnxstream.h:4,
                 from /home/frenchfry/projects/XNNPACK/OnnxStream/src/onnxstream.cpp:1:
/usr/include/c++/10/bits/stl_pair.h: In instantiation of ‘constexpr std::pair<typename std::__strip_reference_wrapper<typename std::decay<_Tp>::type>::__type, typename std::__strip_reference_wrapper<typename std::decay<_Tp2>::type>::__type> std::make_pair(_T1&&, _T2&&) [with _T1 = float&; _T2 = float&; typename std::__strip_reference_wrapper<typename std::decay<_Tp2>::type>::__type = float; typename std::__strip_reference_wrapper<typename std::decay<_Tp>::type>::__type = float]’:
/home/frenchfry/projects/XNNPACK/OnnxStream/src/onnxstream.cpp:2085:60: required from here
/usr/include/c++/10/bits/stl_pair.h:567:5: note: parameter passing for argument of type ‘std::pair<float, float>’ when C++17 is enabled changed to match C++14 in GCC 10.1
  567 | make_pair(_T1&& __x, _T2&& __y)
      | ^~~~~~~~~
[100%] Linking CXX executable sd
[100%] Built target sd
But, when I run, I get a segmentation fault within a few seconds:
$ ./sd --rpi --steps 3 --rpi-lowmem
----------------[start]------------------
positive_prompt: a photo of an astronaut riding a horse on mars
negative_prompt: ugly, blurry
output_png_path: ./result.png
steps: 3
----------------[prompt]------------------
Segmentation fault
I added the command line switch to print operations and it shows the following:
` $ ./sd --rpi --steps 3 --rpi-lowmem --ops-printf
----------------[start]------------------
positive_prompt: a photo of an astronaut riding a horse on mars
negative_prompt: ugly, blurry
output_png_path: ./result.png
steps: 3
----------------[prompt]------------------
0) Reshape (Reshape_113)
1) Gather (Gather_114)
2) Add (Add_116)
3) ReduceMean (ReduceMean_123)
4) Sub (Sub_124)
5) Pow (Pow_126)
6) ReduceMean (ReduceMean_127)
7) Add (Add_129)
8) Sqrt (Sqrt_130)
9) Div (Div_131)
10) Mul (Mul_132)
11) Add (Add_133)
12) MatMul (MatMul_134)
13) Add (Add_135)
14) Mul (Mul_137)
15) MatMul (MatMul_138)
16) Add (Add_139)
17) Reshape (Reshape_140)
18) Transpose (Transpose_141)
19) MatMul (MatMul_142)
20) Add (Add_143)
21) Reshape (Reshape_144)
22) Transpose (Transpose_145)
23) Reshape (Reshape_146)
24) Transpose (Transpose_147)
25) Reshape (Reshape_148)
26) Reshape (Reshape_149)
27) Reshape (Reshape_150)
28) Transpose (Transpose_151)
29) MatMul (MatMul_152)
30) Reshape (Reshape_153)
31) Add (Add_154)
32) Reshape (Reshape_155)
33) Softmax (Softmax_156)
34) MatMul (MatMul_157)
35) Reshape (Reshape_158)
36) Transpose (Transpose_159)
37) Reshape (Reshape_160)
38) MatMul (MatMul_161)
39) Add (Add_162)
40) Add (Add_163)
41) ReduceMean (ReduceMean_164)
42) Sub (Sub_165)
43) Pow (Pow_167)
44) ReduceMean (ReduceMean_168)
45) Add (Add_170)
46) Sqrt (Sqrt_171)
47) Div (Div_172)
48) Mul (Mul_173)
49) Add (Add_174)
50) MatMul (MatMul_175)
51) Add (Add_176)
52) Mul (Mul_178)
53) Sigmoid (Sigmoid_179)
54) Mul (Mul_180)
55) MatMul (MatMul_181)
Segmentation fault `
I have tried with --rpi-lowmem set and unset (same results). I know --rpi-lowmem is meant for less than 500MB, but I'm not sure how much memory is required if this switch is not set.
I have tried re-compiling OnnxStream with -DMAX_SPEED=ON and -DMAX_SPEED=OFF (same results).
Any ideas on how to debug?