TangmereCottage opened this issue 1 month ago
Ah, so, the `docker: unknown server OS:` error was my fault - I forgot to run `newgrp docker` after `sudo usermod -aG docker $USER`. After that, `jetson-containers run $(autotag l4t-pytorch)` works on 12.6! Yay.
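For anyone else hitting the same `docker: unknown server OS:` symptom, here is the full sequence as a minimal sketch (standard Docker post-install steps, nothing jetson-containers specific):

```bash
# Add the current user to the docker group so the CLI can talk to the daemon socket
sudo usermod -aG docker $USER

# Refresh group membership in the current shell (or log out and back in)
newgrp docker

# Sanity check - should print server info instead of a permission / server OS error
docker info
```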
High-level skinny for everyone:
$ git pull # to get all the most recent bugfixes
$ CUDA_VERSION=12.6 jetson-containers build transformers
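After the build finishes, running the result follows the same pattern as above - a sketch, assuming `autotag` resolves to the freshly built local image:

```bash
# Run the locally built container; autotag picks the newest matching image on the system
jetson-containers run $(autotag transformers)
```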
Hi @TangmereCottage, for ROS I pushed dustynv/ros:humble-desktop-l4t-r36.4.0 a while ago - the ROS builds seem to be working fine after we got OpenCV squared away. For MLC I have pushed the wheels for 0.1.0 so far, so if you try to build that container, it should install them from my pip server as opposed to needing to compile it all.

NanoLLM I won't be able to fully check out until next week due to some other obligations I have - sorry about that, and thanks for your understanding and support. You could try building it and giving it a go; most things have been working fine so far. llama.cpp and ollama are up, and both those and MLC have OpenAI-compliant servers. Good luck!
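As a rough illustration of what "OpenAI-compliant server" buys you - the host, port, and model name below are assumptions and depend on how you launch the server:

```bash
# Hypothetical example: query an OpenAI-compatible chat completions endpoint
# (llama.cpp's built-in server defaults to port 8080; ollama listens on 11434)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "whatever-model-you-loaded",
        "messages": [{"role": "user", "content": "Hello from Jetson!"}]
      }'
```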
Thanks for the rapid rollout, Dustin. I suggest you add the `newgrp docker` information to the setup page, as I had the same issue and saw it resolved here.
@dusty-nv I echo the previous comments on the awesome work that you have been doing, and understand that it will take some time before everything normalizes. While creating the image for ROS 2 with VLM, I have been able to build the following images.

The build process stops at jetson-inference - see the attached build log. The error `opt/jetson-inference/c/tensorNet.cpp:29:10: fatal error: NvCaffeParser.h: No such file or directory` I presume is due to TensorRT 10: nano_llm_iron-r36.4.0-cu126-jetson-inference_main.txt
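For reference, a quick way to confirm the TensorRT version in the build environment and that the Caffe parser headers really are gone (the Caffe parser was dropped in TensorRT 10) - the header path below assumes the usual Jetson install location:

```bash
# Which TensorRT packages are installed
dpkg -l | grep -i tensorrt

# Python-side version check
python3 -c "import tensorrt; print(tensorrt.__version__)"

# The header jetson-inference is looking for; absent on TensorRT 10 installs
ls /usr/include/aarch64-linux-gnu/NvCaffeParser.h 2>/dev/null || echo "NvCaffeParser.h not found"
```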
Best
Yea, jetson-inference still needs to be updated for TRT10 - I am most of the way there with it (hopefully) but had to leave for a trip. Should be back on it later this week. In this case... jetson-inference itself doesn't actually get used in the NanoLLM software (but jetson_utils does). So you might be able to rig it to pass the build for now.
jetson-containers run $(autotag nano_llm) python3 -m nano_llm.vision.video --model Efficient-Large-Model/VILA1.5-3b
All `nano_llm` runs with VILA models seem to die in some version of this (`corrupted size vs. prev_size`):

Finish exporting to ...
corrupted size vs. prev_size
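`corrupted size vs. prev_size` is glibc's heap-corruption abort, so a backtrace at the abort point is usually the most useful next step. A hedged sketch, assuming gdb is available (or apt-installable) inside the container:

```bash
# Re-run the failing command under gdb and print a backtrace when it aborts
jetson-containers run $(autotag nano_llm) \
  gdb -ex run -ex bt --args \
  python3 -m nano_llm.vision.video --model Efficient-Large-Model/VILA1.5-3b
```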
> Yea, jetson-inference still needs to be updated for TRT10... jetson-inference itself doesn't actually get used in the NanoLLM software (but jetson_utils does). So you might be able to rig it to pass the build for now.
I tried building by changing line 33 in config.py, replacing jetson-inference with jetson-utils. The build process started and went on to build the transformers image, but the jetson-inference build process restarted. I presume there is more rigging to be done - I couldn't figure that out, though.
I will check it
@johnnynunez Any update on this?
I'm still checking
My next idea was to change the jetson-inference dockerfile to depend on `tensorrt:8.6` instead and see if that could work in the meantime... I just got back from a conference and will be looking into this soon but am underwater - sorry to keep you guys waiting.
I'm building the docker image with jetson-utils instead of jetson-inference, with CUDA 12.6. I'm checking if everything is working. I have problems with onnxruntime.
@dusty-nv onnxruntime-gpu is not compiling. I don't know if it is TensorRT or what - onnxruntime (1.19.2) is pointing to TensorRT 10.2.
@dusty-nv jax is also not building
@dusty-nv jetson-inference, torch2trt, onnxruntime are not building - I presume there is a connection with TensorRT
> @dusty-nv jax is also not building

This is fixed with hermetic CUDA.
> @dusty-nv jetson-inference, torch2trt, onnxruntime are not building - I presume there is a connection with TensorRT

jetson-inference, because it is still using TensorRT 8. torch2trt is building for me with the latest 36.4. onnxruntime, I have to check why - I think the Jetson now has a more recent CUDA stack than the onnxruntime stack, but onnxruntime is matching versions with Jetson in 1.20.0: https://onnxruntime.ai/roadmap
OK, should I start the build by deleting the previous containers? I have already deleted the cache.
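In case it helps, this is roughly the cleanup I would do before a clean rebuild (it removes stopped containers, unused images, and the build cache, so only run it if you don't need them):

```bash
# Remove stopped containers and any images not referenced by a container
docker container prune
docker image prune -a

# Drop the builder/BuildKit cache as well
docker builder prune
```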
Here is the initial build of the NanoLLM container for JP6.1 - let me know if it works for you: dustynv/nano_llm:r36.4.0
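For reference, pulling and running it should just be the following - a sketch, assuming `jetson-containers run` accepts an explicit image tag the same way it accepts an `autotag` result:

```bash
# Pull the prebuilt JetPack 6.1 / r36.4.0 NanoLLM image and start it interactively
docker pull dustynv/nano_llm:r36.4.0
jetson-containers run dustynv/nano_llm:r36.4.0
```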
Thanks, it does. I notice that ROS is not included - I presume I can use this as the base image and build on it - is that correct?
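A rough sketch of what that could look like - note the `--base` and `--name` flags are assumptions from memory, so check `jetson-containers build --help` before relying on them:

```bash
# Hypothetical: rebuild the ROS package chain on top of the prebuilt NanoLLM image
jetson-containers build --base=dustynv/nano_llm:r36.4.0 --name=nano_llm_ros ros:humble-desktop
```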
Struggling here with NanoLLM, MLC LLM, `torch`, and `torchvision` on CUDA 12.6 and r36.4.0.

Ask: I would be grateful for high-level status info - will 12.6 be broadly supported soon, or shall I downgrade to 12.2 (e.g.) and wait a few weeks for the dust to settle?

Main issue - trying out various "hello world" type commands across `jetson-containers` results in error messages and little else. BTW thanks @dusty-nv for everything you are doing and this heroic effort, especially during the CUDA 12.2->12.6 transition. Seems like step 1 is to pull the most recent bugfixes and rebuild.

Additional info - this is a super-boring, no-modifications, bare-metal fresh install using the Nvidia SDK, with everything Docker moved to the SSD (`mnt/docker`) per your instructions.
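In case anyone wants to compare setups, this is roughly what relocating Docker's storage to the SSD looks like - a sketch only, where the `/mnt/docker` mount point and the NVIDIA runtime block are assumptions based on the usual Jetson configuration:

```bash
# Point Docker's data-root at the SSD mount and keep the NVIDIA runtime as the default
sudo systemctl stop docker
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia",
    "data-root": "/mnt/docker"
}
EOF
sudo systemctl restart docker
docker info | grep "Docker Root Dir"
```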