Closed cahlen closed 8 months ago
@cahlen onnx likes to build its own version of protobuf, which it provides as a submodule. Does it work if you omit protobuf from your build.sh command?
Thanks for your quick response @dusty-nv. Here is what I've tried after your comment:
- Ran docker system prune -a
- Re-cloned the repo without --depth=1 so that I could check out dev
- Ran pip3 install -r requirements.txt just in case
Then I ran the build without the protobuf layer, as you requested:
$ ./build.sh --name=my_container pytorch:2.1 opencv torchvision torchaudio tensorflow2 gstreamer deepstream
It made it pretty far. Here are the layers it ended up successfully building:
debian:~/jetson-containers$ docker images
REPOSITORY                  TAG                      IMAGE ID      CREATED         SIZE
my_container                r35.4.1                  47451fccb1c2  14 minutes ago  17.3GB
my_container                r35.4.1-deepstream       47451fccb1c2  14 minutes ago  17.3GB
my_container                r35.4.1-tritonserver     1e0a80400896  17 minutes ago  16.4GB
my_container                r35.4.1-gstreamer        20dbad91ec4a  23 minutes ago  13.9GB
my_container                r35.4.1-tensorflow2      f8a828531b22  24 minutes ago  13.5GB
my_container                r35.4.1-protobuf_cpp     0b357613a2b4  26 minutes ago  12GB
my_container                r35.4.1-torchaudio       5066715ea516  34 minutes ago  11.5GB
my_container                r35.4.1-torchvision      f02b96837586  41 minutes ago  11.4GB
my_container                r35.4.1-opencv           4c1258031f48  43 minutes ago  11.3GB
my_container                r35.4.1-pytorch_2.1      a2bfd988d82c  44 minutes ago  11GB
my_container                r35.4.1-onnx             2f3938cd6b30  45 minutes ago  10GB
my_container                r35.4.1-cmake            6b6aa3c435cf  50 minutes ago  9.96GB
my_container                r35.4.1-numpy            0a910d095e80  51 minutes ago  9.9GB
my_container                r35.4.1-python           a0b6c760ee6c  51 minutes ago  9.85GB
my_container                r35.4.1-tensorrt         a0b6c760ee6c  51 minutes ago  9.85GB
my_container                r35.4.1-build-essential  00409d96cad4  52 minutes ago  9.76GB
my_container                r35.4.1-cuda             00409d96cad4  52 minutes ago  9.76GB
my_container                r35.4.1-cudnn            00409d96cad4  52 minutes ago  9.76GB
nvcr.io/nvidia/l4t-jetpack  r35.4.1                  5c923ac521a3  5 months ago    9.71GB
So it looks like it built all the intermediate containers and the final container. However, it still fails on that ONNX test. Here is the last log:
-- Testing container my_container:r35.4.1 (onnx/test.py)
docker run -t --rm --runtime=nvidia --network=host \
--volume /home/debian/jetson-containers/packages/onnx:/test \
--volume /home/debian/jetson-containers/data:/data \
--workdir /test \
my_container:r35.4.1 \
/bin/bash -c 'python3 test.py' \
2>&1 | tee /home/debian/jetson-containers/logs/20231222_114450/test/my_container_r35.4.1_test.py.txt; exit ${PIPESTATUS[0]}
testing onnx...
/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/api_implementation.py:87: UserWarning: Selected implementation cpp is not available.
warnings.warn(
Traceback (most recent call last):
File "test.py", line 4, in <module>
import onnx
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 75, in <module>
from onnx import serialization
File "/usr/local/lib/python3.8/dist-packages/onnx/serialization.py", line 16, in <module>
import google.protobuf.json_format
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/json_format.py", line 30, in <module>
from google.protobuf.internal import type_checkers
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/type_checkers.py", line 28, in <module>
from google.protobuf.internal import decoder
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/decoder.py", line 64, in <module>
from google.protobuf.internal import encoder
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/encoder.py", line 48, in <module>
from google.protobuf.internal import wire_format
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/wire_format.py", line 13, in <module>
from google.protobuf import descriptor
File "/usr/local/lib/python3.8/dist-packages/google/protobuf/descriptor.py", line 28, in <module>
from google.protobuf.pyext import _message
ImportError: cannot import name '_message' from 'google.protobuf.pyext' (/usr/local/lib/python3.8/dist-packages/google/protobuf/pyext/__init__.py)
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/debian/jetson-containers/jetson_containers/build.py", line 102, in <module>
build_container(args.name, args.packages, args.base, args.build_flags, args.simulate, args.skip_tests, args.test_only, args.push, args.no_github_api)
File "/home/debian/jetson-containers/jetson_containers/container.py", line 160, in build_container
test_container(name, package, simulate)
File "/home/debian/jetson-containers/jetson_containers/container.py", line 320, in test_container
status = subprocess.run(cmd.replace(_NEWLINE_, ' '), executable='/bin/bash', shell=True, check=True)
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'docker run -t --rm --runtime=nvidia --network=host --volume /home/debian/jetson-containers/packages/onnx:/test --volume /home/debian/jetson-containers/data:/data --workdir /test my_container:r35.4.1 /bin/bash -c 'python3 test.py' 2>&1 | tee /home/debian/jetson-containers/logs/20231222_114450/test/my_container_r35.4.1_test.py.txt; exit ${PIPESTATUS[0]}' returned non-zero exit status 1.
Since the final container looks to have been built, even though the test failed, I'm going to try to use it. I'll update with results.
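For anyone hitting the same ImportError, here is a minimal diagnostic sketch that could be run with python3 inside the built container. It only reports what the traceback above already hints at: which backend was requested through the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable, which protobuf version is installed, and whether the compiled google.protobuf.pyext._message extension can be imported. The function name diagnose_protobuf is made up for this example and is not part of jetson-containers or protobuf.

```python
import importlib
import os


def diagnose_protobuf():
    """Collect facts about the protobuf Python install without raising.

    Hypothetical helper for illustration; not part of jetson-containers.
    """
    report = {
        # Environment variable protobuf reads to pick a backend
        # (for example "cpp" or "python"); None if it is unset.
        "requested_impl": os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"),
        "version": None,
        "cpp_extension": False,
        "error": None,
    }
    try:
        import google.protobuf
        report["version"] = google.protobuf.__version__
    except ImportError as exc:
        report["error"] = f"protobuf not importable: {exc}"
        return report
    try:
        # This is the module the traceback fails on.
        importlib.import_module("google.protobuf.pyext._message")
        report["cpp_extension"] = True
    except ImportError as exc:
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in diagnose_protobuf().items():
        print(f"{key}: {value}")
```

If requested_impl comes back as cpp while cpp_extension is False, that would match the "Selected implementation cpp is not available" warning in the log. It does not say which layer installed the mismatched protobuf package.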
It seems to be an ordering issue. I switched back to master and was able to successfully run this build:
$ ./build.sh --name=my_container deepstream pytorch:2.1 opencv torchaudio torchvision tensorflow2
Instead of putting deepstream at the end, I put it up front, and the container layers build and pass their tests in this order.
OK interesting, thanks @cahlen. It would appear the order in this case did in fact matter. The protobuf stuff is tricky to figure out sometimes. But hey if it works! Glad that you got it built 👍
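The reordering that worked in this thread can be sketched as a small helper that moves selected packages to the front of the list before assembling the build.sh command line. This is only an illustration: build_command and its first parameter are hypothetical names, not part of jetson-containers, and the sketch says nothing about why the order matters.

```python
def build_command(name, packages, first=("deepstream",)):
    """Return a build.sh command line with the `first` packages moved up front.

    Hypothetical helper for illustration; the remaining packages keep
    the order in which they were given.
    """
    front = [p for p in packages if p in first]
    rest = [p for p in packages if p not in first]
    return " ".join(["./build.sh", f"--name={name}", *front, *rest])


if __name__ == "__main__":
    packages = ["pytorch:2.1", "opencv", "torchaudio", "torchvision",
                "tensorflow2", "deepstream"]
    print(build_command("my_container", packages))
```

With deepstream listed last in the input, the helper produces the same package order as the build command that succeeded on master.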
I'm trying to build this container where deepstream depends on protobuf:protobuf_cpp, but for some reason it's erroring out on the ONNX container build because of the protobuf extension. I should note that I'm also building this from the dev branch because of a comment I saw in another issue thread. I'm unclear on how to fix this.