dusty-nv / NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
https://dusty-nv.github.io/NanoLLM/
MIT License

Using RTSP as --video-input #10

Open zw303 opened 5 months ago

zw303 commented 5 months ago

In the Live LLaVA, NanoVLM, and NanoDB demos, we have been using video files as the video input, since we don't have a V4L2 USB webcam at the moment. Based on the demo descriptions, network streams (like RTSP) also seem to be supported. We have IP cameras that can stream live RTSP video: the main stream, encoded as H264. VLC plays the stream correctly. We tried to use it in the demo but got the error messages below. Is the command-line format wrong? Thanks!

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.agents.video_query --api=mlc \
    --model Efficient-Large-Model/VILA-2.7b \
    --max-context-len 256 \
    --max-new-tokens 32 \
    --video-input rtsp://admin:labtest1@10.51.170.55/stream1 \
    --video-output webrtc://@:8554/output \
    --nanodb /data/nanodb/coco/2017

Namespace(packages=['nano_llm'], prefer=['local', 'registry', 'build'], disable=[''], user='dustynv', output='/tmp/autotag', quiet=False, verbose=False)
-- L4T_VERSION=36.2.0 JETPACK_VERSION=6.0 CUDA_VERSION=12.2
-- Finding compatible container image for ['nano_llm']

dustynv/nano_llm:r36.2.0

===SKIP some logs in the middle===

│ name               │ VILA-2.7b         │
├────────────────────┼───────────────────┤
│ api                │ mlc               │
├────────────────────┼───────────────────┤
│ quant              │ q4f16_ft          │
├────────────────────┼───────────────────┤
│ type               │ llama             │
├────────────────────┼───────────────────┤
│ max_length         │ 256               │
├────────────────────┼───────────────────┤
│ prefill_chunk_size │ -1                │
├────────────────────┼───────────────────┤
│ load_time          │ 6.525660961000085 │
├────────────────────┼───────────────────┤
│ params_size        │ 1300.8330078125   │
└────────────────────┴───────────────────┘

15:28:28 | INFO | using chat template 'vicuna-v1' for model VILA-2.7b
15:28:28 | INFO | model 'VILA-2.7b', chat template 'vicuna-v1' stop tokens: [''] -> [2]
15:28:28 | INFO | ProcessProxy initialized, output_channels=5
15:28:28 | INFO | subprocess output could not be pickled (<class 'nano_llm.chat.stream.StreamingResponse'>), disabling channel 3

The answer is 4
URI -- missing/invalid IP port from rtsp://admin:LabSys101@10.51.170.55/stream1, default to port 554

(gst-plugin-scanner:84): GLib-GObject-WARNING **: 15:28:30.355: cannot register existing type 'GstRtpSrc'

(gst-plugin-scanner:84): GLib-GObject-CRITICAL **: 15:28:30.355: g_type_add_interface_static: assertion 'G_TYPE_IS_INSTANTIATABLE (instance_type)' failed

(gst-plugin-scanner:84): GLib-CRITICAL **: 15:28:30.355: g_once_init_leave: assertion 'result != 0' failed

(gst-plugin-scanner:84): GStreamer-CRITICAL **: 15:28:30.355: gst_element_register: assertion 'g_type_is_a (type, GST_TYPE_ELEMENT)' failed

(gst-plugin-scanner:84): GLib-GObject-WARNING **: 15:28:30.355: cannot register existing type 'GstRtpSink'

(gst-plugin-scanner:84): GLib-GObject-CRITICAL **: 15:28:30.355: g_type_add_interface_static: assertion 'G_TYPE_IS_INSTANTIATABLE (instance_type)' failed

(gst-plugin-scanner:84): GLib-CRITICAL **: 15:28:30.355: g_once_init_leave: assertion 'result != 0' failed

(gst-plugin-scanner:84): GStreamer-CRITICAL **: 15:28:30.355: gst_element_register: assertion 'g_type_is_a (type, GST_TYPE_ELEMENT)' failed
(Argus) Error FileOperationFailed: Connecting to nvargus-daemon failed: Connection refused (in src/rpc/socket/client/SocketClientDispatch.cpp, function openSocketConnection(), line 204)
(Argus) Error FileOperationFailed: Cannot create camera provider (in src/rpc/socket/client/SocketClientDispatch.cpp, function createCameraProvider(), line 106)
sh: 1: lsmod: not found
sh: 1: modprobe: not found
[gstreamer] initialized gstreamer, version 1.20.3.0
[gstreamer] gstDecoder -- creating decoder for admin
sh: 1: lsmod: not found
sh: 1: modprobe: not found
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NvMMLiteBlockCreate : Block : BlockType = 261

(python3:1): GStreamer-CRITICAL **: 15:28:30.888: gst_debug_log_valist: assertion 'category != NULL' failed

(python3:1): GStreamer-CRITICAL **: 15:28:30.888: gst_debug_log_valist: assertion 'category != NULL' failed

(python3:1): GStreamer-CRITICAL **: 15:28:30.888: gst_debug_log_valist: assertion 'category != NULL' failed

(python3:1): GStreamer-CRITICAL **: 15:28:30.888: gst_debug_log_valist: assertion 'category != NULL' failed
[gstreamer] gstDecoder -- failed to discover stream info
[gstreamer] gstDecoder -- resource discovery and auto-negotiation failed
[gstreamer] gstDecoder -- try manually setting the codec with the --input-codec option
[gstreamer] gstDecoder -- failed to create decoder for rtsp://admin:LabSys101@10.51.170.55/stream1
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/agents/video_query.py", line 358, in <module>
    agent = VideoQuery(**vars(args)).run()
  File "/opt/NanoLLM/nano_llm/agents/video_query.py", line 59, in __init__
    self.video_source = VideoSource(**kwargs) #: The video source plugin
  File "/opt/NanoLLM/nano_llm/plugins/video.py", line 52, in __init__
    self.stream = videoSource(video_input, options=options)
Exception: jetson.utils -- failed to create videoSource device
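
Two messages in the log point at things that could be tried: the URI had no port (so 554 was assumed), and the decoder asks for the codec to be set manually. The sketch below is only a possible starting point, not a confirmed fix. It rewrites an RTSP URI with an explicit port and builds the video arguments with a codec. The helper names `with_explicit_port` and `video_query_args` are hypothetical, and the flag `--video-input-codec` is an assumption: the log only names `--input-codec`, so the exact spelling should be checked against the `--help` output of `nano_llm.agents.video_query`.

```python
import shlex
from urllib.parse import urlsplit, urlunsplit


def with_explicit_port(uri: str, default_port: int = 554) -> str:
    """Return `uri` with an explicit port; a URI that already has one is unchanged."""
    parts = urlsplit(uri)
    if parts.scheme != "rtsp" or not parts.hostname:
        raise ValueError(f"not a usable RTSP URI: {uri!r}")
    if parts.port is not None:
        return uri
    # Keep any user:password@ prefix as-is and append the port to the host.
    netloc = f"{parts.netloc.rstrip(':')}:{default_port}"
    return urlunsplit(parts._replace(netloc=netloc))


def video_query_args(uri: str, codec: str = "h264") -> list[str]:
    """Build the video-input arguments (hypothetical helper, not part of NanoLLM)."""
    return [
        "--video-input", with_explicit_port(uri),
        # ASSUMPTION: flag name not confirmed by the log, which only
        # mentions --input-codec; verify with --help before relying on it.
        "--video-input-codec", codec,
    ]


if __name__ == "__main__":
    # Placeholder credentials and address; substitute the camera's own values.
    args = video_query_args("rtsp://user:password@192.0.2.10/stream1")
    print(shlex.join(args))
```

The printed arguments would replace the `--video-input ...` line of the command above. Whether this resolves the `failed to discover stream info` error is not established by the log.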