doruksonmez opened 9 months ago
It looks like an issue generated by the TVM library under the hood. When I manually started the Python CLI and imported TVM, it throws CUDA_ERROR_NOT_INITIALIZED as soon as I try to query the CUDA device details.
$ python3
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tvm
>>> device = tvm.runtime.cuda(0)
>>> print(device.device_name)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/dist-packages/tvm/_ffi/runtime_ctypes.py", line 403, in device_name
return self._GetDeviceAttr(self.device_type, self.device_id, 5)
File "/usr/local/lib/python3.8/dist-packages/tvm/_ffi/runtime_ctypes.py", line 303, in _GetDeviceAttr
return tvm.runtime._ffi_api.GetDeviceAttr(device_type, device_id, attr_id)
File "tvm/_ffi/_cython/./packed_func.pxi", line 332, in tvm._ffi._cy3.core.PackedFuncBase.__call__
File "tvm/_ffi/_cython/./packed_func.pxi", line 263, in tvm._ffi._cy3.core.FuncCall
File "tvm/_ffi/_cython/./packed_func.pxi", line 252, in tvm._ffi._cy3.core.FuncCall3
File "tvm/_ffi/_cython/./base.pxi", line 182, in tvm._ffi._cy3.core.CHECK_CALL
File "/usr/local/lib/python3.8/dist-packages/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
raise py_err
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (4) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(TVMFuncCall+0x64) [0xffff7d24af54]
[bt] (3) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(+0x3537044) [0xffff7d24c044]
[bt] (2) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::CUDADeviceAPI::GetAttr(DLDevice, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue*)+0x12e4) [0xffff7d394f3c]
[bt] (1) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x78) [0xffff7aef6f58]
[bt] (0) /usr/local/lib/python3.8/dist-packages/tvm/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x30) [0xffff7d29f6f0]
File "/opt/mlc-llm/3rdparty/tvm/src/runtime/cuda/cuda_device_api.cc", line 72
CUDAError: cuDeviceGetName(&name[0], name.size(), dev.device_id) failed with error: CUDA_ERROR_NOT_INITIALIZED
Further test results:
I extracted the code that generates the error into a separate script:
import tvm
target = tvm.target.cuda(arch='sm_87')
device = tvm.runtime.cuda(0)
assert(device.exist)
print(device.device_name)
When I run:
$ python3 test_tvm.py
> Orin
However, it still throws the same error when I run:
$ python3 -m local_llm.agents.video_query --api=mlc --model liuhaotian/llava-v1.5-7b --max-new-tokens 32 --video-input /dev/video0 --video-output display://0 --prompt "How many fingers am I holding up?"
I have also changed the --model value as follows, but I don't think it is relevant anyway:
$ python3 -m local_llm.agents.video_query --api=mlc --model /data/models/huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965/ --max-new-tokens 32 --video-input /dev/video0 --video-output display://0 --prompt "How many fingers am I holding up?"
/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
12:50:37 | INFO | loading /data/models/huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965/ with MLC
12:50:37 | INFO | running MLC quantization:
python3 -m mlc_llm.build --model /data/models/mlc/dist/models//llava-v1.5-7b --quantization q4f16_ft --target cuda --use-cuda-graph --use-flash-attn-mqa --sep-embed --max-seq-len 4096 --artifact-path /data/models/mlc/dist
Using path "/data/models/mlc/dist/models/llava-v1.5-7b" for model "llava-v1.5-7b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Load cached module from /data/models/mlc/dist/llava-v1.5-7b-q4f16_ft/mod_cache_before_build.pkl and skip tracing. You can use --use-cache=0 to retrace
Finish exporting to /data/models/mlc/dist/llava-v1.5-7b-q4f16_ft/llava-v1.5-7b-q4f16_ft-cuda.so
SET TARGET CUDA
SET RUNTIME CUDA
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/local_llm/local_llm/agents/video_query.py", line 115, in <module>
agent = VideoQuery(**vars(args)).run()
File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in __init__
self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 31, in __init__
raise RuntimeError(f"subprocess has an invalid initialization status ({init_msg['status']})")
RuntimeError: subprocess has an invalid initialization status (<class 'AssertionError'>)
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 62, in run_process
raise error
File "/opt/local_llm/local_llm/plugins/process_proxy.py", line 59, in run_process
plugin = factory(**kwargs)
File "/opt/local_llm/local_llm/agents/video_query.py", line 22, in <lambda>
self.llm = ProcessProxy((lambda **kwargs: ChatQuery(model, drop_inputs=True, **kwargs)), **kwargs)
File "/opt/local_llm/local_llm/plugins/chat_query.py", line 63, in __init__
self.model = LocalLM.from_pretrained(model, **kwargs)
File "/opt/local_llm/local_llm/local_llm.py", line 72, in from_pretrained
model = MLCModel(model_path, **kwargs)
File "/opt/local_llm/local_llm/models/mlc.py", line 67, in __init__
assert(self.device.exist) # this is needed to initialize CUDA?
AssertionError
Hi @doruksonmez, are you running on a Jetson Orin device? MLC requires SM_80 or newer
Hi @dusty-nv,
Yes, I'm running on a Jetson AGX Orin Dev Kit. I actually specified arch='sm_87' in tvm.target.cuda(arch='sm_87') as well.
I also tried ./build.sh local_llm for a local build, with no luck. It still throws the same error on the local_llm:r35.4.1 image build.
OK gotcha - let's back up a sec and try some more basic usage of MLC to see if you can get that running. Are you able to run any of these?
If not, have you been able to use other GPU stuff in containers on your JetPack build (like PyTorch, etc.)? If you are still having problems, I might recommend updating to JetPack 6 to get the latest.
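(For example, a quick check along these lines inside the container - assuming it ships a CUDA-enabled PyTorch build - should print the device name:)
import torch
# minimal sanity check that CUDA is usable inside the container
# (assumes the installed PyTorch wheel was built with CUDA support)
print(torch.cuda.is_available())      # expected: True
print(torch.cuda.get_device_name(0))  # expected: something like "Orin"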
Of course! Here are the results for both tests:
$ ./run.sh $(./autotag mlc)
# python3 -m mlc_llm.build --model Llama-2-7b-hf --quantization q4f16_ft --artifact-path /data/models/mlc/dist --max-seq-len 4096 --target cuda --use-cuda-graph --use-flash-attn-mqa
Using path "/data/models/mlc/dist/models/Llama-2-7b-hf" for model "Llama-2-7b-hf"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Load cached module from /data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/mod_cache_before_build.pkl and skip tracing. You can use --use-cache=0 to retrace
Finish exporting to /data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/Llama-2-7b-hf-q4f16_ft-cuda.so
I also tested it with llava-v1.5-7b, which is the main subject of the actual live demo:
# python3 -m mlc_llm.build --model llava-v1.5-7b --quantization q4f16_ft --artifact-path /data/models/mlc/dist --max-seq-len 4096 --target cuda --use-cuda-graph --use-flash-attn-mqa
Using path "/data/models/mlc/dist/models/llava-v1.5-7b" for model "llava-v1.5-7b"
Target configured: cuda -keys=cuda,gpu -arch=sm_87 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Load cached module from /data/models/mlc/dist/llava-v1.5-7b-q4f16_ft/mod_cache_before_build.pkl and skip tracing. You can use --use-cache=0 to retrace
Finish exporting to /data/models/mlc/dist/llava-v1.5-7b-q4f16_ft/llava-v1.5-7b-q4f16_ft-cuda.so
Finally, the benchmark with Llama-2-7b-hf:
# python3 /opt/mlc-llm/benchmark.py --model /data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/params --prompt /data/prompts/completion_16.json --max-new-tokens 128
Namespace(chat=False, max_new_tokens=128, max_num_prompts=None, model='/data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/params', prompt=['/data/prompts/completion_16.json'], save='', streaming=False)
-- loading /data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/params
PROMPT: Once upon a time, there was a little girl who loved to read.
февруари 2015 г.
Another fun read. I'm a huge fan of this author.
A heartwarming story of hope, redemption, and love.
...
AVERAGE OVER 9 RUNS, input=16, output=128
/data/models/mlc/dist/Llama-2-7b-hf-q4f16_ft/params: prefill_time 0.027 sec, prefill_rate 589.1 tokens/sec, decode_time 2.767 sec, decode_rate 46.3 tokens/sec
Peak memory usage: 654.12 MB
I logged in to my HuggingFace account as follows:
$ ./run.sh $(./autotag local_llm)
# huggingface-cli login
...
Token has not been saved to git credential helper.
Your token has been saved to /data/models/huggingface/token
Login successful
# python3 -m local_llm --api=mlc --model=meta-llama/Llama-2-7b-chat-hf --prompt 'hi, how are you?' --prompt 'whats the square root of 900?' --prompt 'whats the previous answer times 4?' --prompt 'can I get a recipie for french onion soup?'
...
Cannot access gated repo for url https://huggingface.co/api/models/meta-llama/Llama-2-7b-chat-hf/revision/main.
Access to model meta-llama/Llama-2-7b-chat-hf is restricted and you are not in the authorized list. Visit https://huggingface.co/meta-llama/Llama-2-7b-chat-hf to ask for access.
So, I don't think I have access to that model's repo, but earlier today I was able to run your other text- and vision-based demos successfully. So whatever the problem is, it must be in the way MLC or TVM is used in that particular Live LLaVA demo.
OK, interesting - in that Live Llava demo, I had to run MLC/TVM in a subprocess (hence those exceptions about ProcessProxy, which is a wrapper that forwards/receives requests to/from that subprocess) in order to get everything, like the continuous video stream and the VLM, running smoothly at the same time. I've mostly migrated to JP6 at this point and haven't tested it on JP5 - I would recommend either disabling that ProcessProxy stuff in the agent (you can mount your local jetson-containers/local_llm tree into the container for easier editing), or trying JetPack 6 on it.
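(For context, a highly simplified sketch of that subprocess-proxy pattern is shown below. This is illustrative only, not the actual local_llm ProcessProxy implementation, and the names here are made up.)
import multiprocessing as mp

def _worker(conn, factory):
    # build the plugin (e.g. the MLC-backed ChatQuery) inside the subprocess
    try:
        plugin = factory()
        conn.send({'status': 'ok'})
    except Exception as err:
        # report the failure type back to the parent, which then raises
        # "subprocess has an invalid initialization status (...)"
        conn.send({'status': type(err)})
        return
    while True:
        request = conn.recv()
        if request is None:             # None acts as the shutdown signal
            break
        conn.send(plugin(request))      # run the request, return the result

class SimpleProxy:
    def __init__(self, factory):
        self.conn, child_conn = mp.Pipe()
        self.process = mp.Process(target=_worker, args=(child_conn, factory))
        self.process.start()
        init_msg = self.conn.recv()
        if init_msg['status'] != 'ok':
            raise RuntimeError(f"subprocess has an invalid initialization status ({init_msg['status']})")

    def __call__(self, request):
        self.conn.send(request)
        return self.conn.recv()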
Got the same problem on an Orin NX 16 GB with JP 5.1.2 and can't upgrade to JP6 so far. How would you disable the ProcessProxy in video_query.py?
@leon-seidel @doruksonmez try changing this line to the following:
self.llm = ChatQuery(model, drop_inputs=True, **kwargs)
And when you start the container, mount your local copy of the code into the container like so:
./run.sh \
-v /mnt/NVME/jetson-containers/packages/llm/local_llm:/opt/local_llm/local_llm \
$(./autotag local_llm)
(then any code changes you make to the local_llm package will be reflected inside the container without needing to rebuild it)
@dusty-nv I think it is working now, but there is an issue related to the X display, as far as I understand from the logs:
....
13:04:37 | INFO | loading mm_projector weights from /data/models/huggingface/models--liuhaotian--llava-v1.5-7b/snapshots/12e054b30e8e061f423c7264bc97d4248232e965/mm_projector.bin
mm_projector Sequential(
(0): Linear(in_features=1024, out_features=4096, bias=True)
(1): GELU(approximate='none')
(2): Linear(in_features=4096, out_features=4096, bias=True)
)
┌─────────────┬───────────────────┐
│ name │ llava-v1.5-7b │
├─────────────┼───────────────────┤
│ api │ mlc │
├─────────────┼───────────────────┤
│ quant │ q4f16_ft │
├─────────────┼───────────────────┤
│ type │ llama │
├─────────────┼───────────────────┤
│ max_length │ 4096 │
├─────────────┼───────────────────┤
│ vocab_size │ 32000 │
├─────────────┼───────────────────┤
│ load_time │ 9.100941032986157 │
├─────────────┼───────────────────┤
│ params_size │ 3232.7265625 │
└─────────────┴───────────────────┘
13:04:37 | INFO | using chat template 'llava-v1' for model llava-v1.5-7b
13:04:37 | DEBUG | connected PrintStream to on_eos on channel=0
13:04:37 | DEBUG | connected ChatQuery to PrintStream on channel=0
13:04:37 | DEBUG | processing chat entry 0 role='system' template='${MESSAGE}\n\n' open_user_prompt=False cached=false text='A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.'
13:04:37 | DEBUG | embedding text (1, 32, 4096) float16 -> ```A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.\n\n```
13:04:37 | DEBUG | processing chat entry 1 role='user' template='USER: ${MESSAGE}\n' open_user_prompt=False cached=false text='What is 2+2?'
13:04:37 | DEBUG | embedding text (1, 11, 4096) float16 -> ```USER: What is 2+2?\n```
2+2 is 4.
...
The model initialized successfully, but right after that it fails to open the camera display.
2+2 is 4.
[gstreamer] initialized gstreamer, version 1.16.3.0
[gstreamer] gstCamera -- attempting to create device v4l2:///dev/video0
[gstreamer] gstCamera -- found v4l2 device: C505e HD Webcam
[gstreamer] v4l2-proplist, device.path=(string)/dev/video0, udev-probed=(boolean)false, device.api=(string)v4l2, v4l2.device.driver=(string)uvcvideo, v4l2.device.card=(string)"C505e\ HD\ Webcam", v4l2.device.bus_info=(string)usb-3610000.xhci-4.4, v4l2.device.version=(uint)330360, v4l2.device.capabilities=(uint)2225078273, v4l2.device.device_caps=(uint)69206017;
[gstreamer] gstCamera -- found 38 caps for v4l2 device /dev/video0
[gstreamer] [0] video/x-raw, format=(string)YUY2, width=(int)1280, height=(int)960, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/2, 5/1 };
[gstreamer] [1] video/x-raw, format=(string)YUY2, width=(int)1280, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/2, 5/1 };
[gstreamer] [2] video/x-raw, format=(string)YUY2, width=(int)1184, height=(int)656, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 };
[gstreamer] [3] video/x-raw, format=(string)YUY2, width=(int)960, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 };
[gstreamer] [4] video/x-raw, format=(string)YUY2, width=(int)1024, height=(int)576, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 10/1, 5/1 };
[gstreamer] [5] video/x-raw, format=(string)YUY2, width=(int)960, height=(int)544, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 15/1, 10/1, 5/1 };
[gstreamer] [6] video/x-raw, format=(string)YUY2, width=(int)800, height=(int)600, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [7] video/x-raw, format=(string)YUY2, width=(int)864, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [8] video/x-raw, format=(string)YUY2, width=(int)800, height=(int)448, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [9] video/x-raw, format=(string)YUY2, width=(int)752, height=(int)416, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [10] video/x-raw, format=(string)YUY2, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [11] video/x-raw, format=(string)YUY2, width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [12] video/x-raw, format=(string)YUY2, width=(int)544, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [13] video/x-raw, format=(string)YUY2, width=(int)432, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [14] video/x-raw, format=(string)YUY2, width=(int)352, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [15] video/x-raw, format=(string)YUY2, width=(int)320, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [16] video/x-raw, format=(string)YUY2, width=(int)320, height=(int)176, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [17] video/x-raw, format=(string)YUY2, width=(int)176, height=(int)144, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [18] video/x-raw, format=(string)YUY2, width=(int)160, height=(int)120, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [19] image/jpeg, width=(int)1280, height=(int)960, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [20] image/jpeg, width=(int)1280, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [21] image/jpeg, width=(int)1184, height=(int)656, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [22] image/jpeg, width=(int)960, height=(int)720, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [23] image/jpeg, width=(int)1024, height=(int)576, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [24] image/jpeg, width=(int)960, height=(int)544, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [25] image/jpeg, width=(int)800, height=(int)600, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [26] image/jpeg, width=(int)864, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [27] image/jpeg, width=(int)800, height=(int)448, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [28] image/jpeg, width=(int)752, height=(int)416, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [29] image/jpeg, width=(int)640, height=(int)480, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [30] image/jpeg, width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [31] image/jpeg, width=(int)544, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [32] image/jpeg, width=(int)432, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [33] image/jpeg, width=(int)352, height=(int)288, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [34] image/jpeg, width=(int)320, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [35] image/jpeg, width=(int)320, height=(int)176, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [36] image/jpeg, width=(int)176, height=(int)144, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] [37] image/jpeg, width=(int)160, height=(int)120, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction){ 30/1, 25/1, 20/1, 15/1, 10/1, 5/1 };
[gstreamer] gstCamera -- selected device profile: codec=MJPEG format=unknown width=1280 height=720 framerate=30
[gstreamer] gstCamera pipeline string:
[gstreamer] v4l2src device=/dev/video0 do-timestamp=true ! image/jpeg, width=(int)1280, height=(int)720, framerate=30/1 ! jpegdec name=decoder ! video/x-raw ! appsink name=mysink sync=false
[gstreamer] gstCamera successfully created device v4l2:///dev/video0
[video] created gstCamera from v4l2:///dev/video0
------------------------------------------------
gstCamera video options:
------------------------------------------------
-- URI: v4l2:///dev/video0
- protocol: v4l2
- location: /dev/video0
-- deviceType: v4l2
-- ioType: input
-- codec: MJPEG
-- codecType: cpu
-- width: 1280
-- height: 720
-- frameRate: 30
-- numBuffers: 4
-- zeroCopy: true
-- flipMethod: none
------------------------------------------------
[OpenGL] glDisplay -- X screen 0 resolution: 1920x1080
[OpenGL] glDisplay -- X window resolution: 1920x1080
[OpenGL] failed to create X11 Window.
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/local_llm/local_llm/agents/video_query.py", line 116, in <module>
agent = VideoQuery(**vars(args)).run()
File "/opt/local_llm/local_llm/agents/video_query.py", line 39, in __init__
self.video_output = VideoOutput(**kwargs)
File "/opt/local_llm/local_llm/plugins/video.py", line 102, in __init__
self.stream = videoOutput(video_output, options=options)
Exception: jetson.utils -- failed to create videoOutput device
I actually tested whether I can get display output from inside the container using the test script below:
import numpy as np
import cv2 as cv
cap = cv.VideoCapture(0)
if not cap.isOpened():
print("Cannot open camera")
exit()
while True:
# Capture frame-by-frame
ret, frame = cap.read()
# if frame is read correctly ret is True
if not ret:
print("Can't receive frame (stream end?). Exiting ...")
break
# Display the resulting frame
cv.imshow('frame', frame)
if cv.waitKey(1) == ord('q'):
break
# When everything done, release the capture
cap.release()
cv.destroyAllWindows()
This is my command to run the container:
xhost + && sudo docker run --runtime nvidia -it --rm --network host --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /mnt/orin/JetsonGenAI/jetson-containers/data:/data -v /mnt/orin/JetsonGenAI/jetson-containers/packages/llm/local_llm:/opt/local_llm/local_llm --device /dev/snd --device /dev/bus/usb -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix --device=/dev/video0 --device=/dev/video1 dustynv/local_llm:r35.3.1
Hi @doruksonmez, are you able to run video-viewer.py /dev/video0 display://0 inside the container?
If so, can you try running this next:
python3 -m local_llm.agents.video_stream \
--video-input /dev/video0 \
--video-output display://0
Hi @dusty-nv, sorry for the late responses due to time zones.
I'm able to run video-viewer.py /dev/video0 display://0, but the other one gives the same result.
OK thanks for letting me know @doruksonmez - you are on JetPack 5.1.2 / L4T R35.4.1 right?
Yes, that is correct. I don't think it would be the cause, but I'm also using your Docker image r35.3.1 on it.
I'm having the same issue (cannot see the video via WebRTC or X). However, I was able to work around it using video-viewer.
container video output:
--video-output rtsp://@:1234/output \
on the host:
video-viewer.py rtsp://localhost:1234/output display://0
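(For example, presumably the earlier video_query command can be launched in the container with the RTSP output in place of the display, along the lines of:)
python3 -m local_llm.agents.video_query --api=mlc --model liuhaotian/llava-v1.5-7b --max-new-tokens 32 --video-input /dev/video0 --video-output rtsp://@:1234/output
(and then viewed from the host with the video-viewer command above.)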
Hi,
I'm just trying to test out Live LLaVA using the following command:
However, it throws the following error from the local_llm.agents.video_query module about an assertion error. What would be the reason for this error? Thanks.