Open goetzrieger opened 1 month ago
Hi @goetzrieger, since the build is multi-threaded, the actual error may be higher up in the output, usually surrounded by ***. You could search for those markers, or search for Error: instead. If it's too far up in the terminal output, you can try redirecting the build output to a file so that you can grep through it as well. Something along the lines of:
# '> out.txt' redirects stdout to out.txt
# '2>&1' redirects stderr to stdout where stdout is writing to out.txt
$ ./build.py --backend onnxruntime -v > out.txt 2>&1
Thanks Kyle! I did that and had a better look at the output... but TBH this is not my area of expertise so I can't really find anything even if there are some more Errors.
Don't know if this makes sense but I'm sharing my output here, would be great if you could have a quick look.
As said we plan on using Triton on some kind of edge device, actually a Raspi 4 powered robot for event workshops. And the full image is just too big with 11GB or so.
Looking through your logs, it appears there is an issue converting a uint64_t to size_t.
/tmp/tritonbuild/tritonserver/build/_deps/repo-core-src/src/cache_entry.cc:177:34: error: cannot convert 'uint64_t*' {aka 'long long unsigned int*'} to 'size_t*' {aka 'unsigned int*'}
177 | output, base + position, &packed_output_byte_size));
| ^~~~~~~~~~~~~~~~~~~~~~~~
| |
| uint64_t* {aka long long unsigned int*}
This looks to be because your system defines size_t as an unsigned int while we have explicitly defined packed_output_byte_size as uint64_t. I know this is not the answer you want to hear, but this is a "system dependent" error.
@rmccorm4 I think this stems from the fact that we are not consistent with our use of size types in our code base. Probably some tech debt that needs to be addressed. Do you have any further suggestions?
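For readers following along, here is a minimal sketch of why the pointer conversion fails and one portable way around it. It is not the Triton code and not necessarily the fix the maintainers would choose: query_size is a hypothetical stand-in for any API that reports a size through a size_t*, and only the variable name packed_output_byte_size is borrowed from the log.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an API that writes a size through a size_t*.
// This is an assumption for illustration, not the function from cache_entry.cc.
inline void query_size(std::size_t* out_size) {
  *out_size = 42;
}

// Portable pattern: hand the API a real size_t, then copy the value over.
inline std::uint64_t packed_size_portable() {
  std::uint64_t packed_output_byte_size = 0;

  // query_size(&packed_output_byte_size);
  // The line above compiles only where size_t and uint64_t are the same type.
  // In the log, size_t is 'unsigned int' and uint64_t is
  // 'long long unsigned int', so the pointer types do not match.

  std::size_t tmp = 0;
  query_size(&tmp);
  // Converting the value (not the pointer) is lossless wherever size_t
  // is no wider than 64 bits.
  packed_output_byte_size = tmp;
  return packed_output_byte_size;
}
```

Calling packed_size_portable() from a main function returns 42 on both 32-bit and 64-bit targets, because only the value is converted and never the pointer.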
I'm quite happy you are looking into this in the first place.
What I don't really get is that the full 11GB Triton Arm image I can grab from the Nvidia registry works fine on the Raspi. And if I understand correctly, the whole build happens in a container with all deps anyway. So is this error due to trying to build on the Raspi?
Should I try to build on some other Arm system like an AWS instance?
I was curious about what is going on with the compiler, and I think there is something funny going on with the way size_t is getting sized (no pun intended). Unfortunately, I couldn't reproduce the error in my minimal godbolt example, although you can observe that y has a width of 4 bytes and x has a width of 8 bytes. I guess this also shows off the implementation-defined width of the standard types. Perhaps you can play around with my example and get something that reproduces? I'm not sure which compiler to use, so I used gcc 13 and gcc 11. I had MSVC just for fun to compare :)
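The godbolt example itself is not included in the thread, so the following is only a rough sketch of the same kind of probe, with all names (kSizeTBits, describe_size_types, and so on) made up for illustration. It records the widths of size_t and uint64_t on the target being compiled for, and whether a uint64_t* would implicitly convert to a size_t* there.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Widths in bits on the target this file is compiled for.
constexpr std::size_t kSizeTBits = sizeof(std::size_t) * CHAR_BIT;
constexpr std::size_t kUint64Bits = sizeof(std::uint64_t) * CHAR_BIT;

// True only where the two names refer to the very same type.
constexpr bool kSameType =
    std::is_same<std::size_t, std::uint64_t>::value;

// True only where passing a uint64_t* to a size_t* parameter would compile.
constexpr bool kPointerConvertible =
    std::is_convertible<std::uint64_t*, std::size_t*>::value;

// Human-readable summary, handy for printing from a test program.
inline std::string describe_size_types() {
  return "size_t: " + std::to_string(kSizeTBits) +
         " bits, uint64_t: " + std::to_string(kUint64Bits) +
         " bits, same type: " + (kSameType ? "yes" : "no");
}
```

Printing describe_size_types() inside the build container on the failing machine should make it clear whether the toolchain there targets a 32-bit size_t, which is what the error message suggests.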
I'm afraid this is way past my skills with cpp etc... ;)
I'm wondering how and where the Nvidia Arm Docker image that works on the Raspi was built?
Description: I'm trying to build a custom CPU-only Triton server for edge usage to limit image size.
Errors out with:
I'm by no means a developer. Is there anything obviously wrong in what I'm doing?
Thanks!