Grounded SAM 2 for streaming video tracking using natural language queries.
This system is comprised of three components:
conda create -n sam2 python=3.10 -y
conda activate sam2
pip install -e .
cd checkpoints
./download_ckpts.sh
cd gdino_checkpoints
./download_ckpts.sh
or huggingface version (recommend)
cd gdino_checkpoints
huggingface-cli download IDEA-Research/grounding-dino-tiny --local-dir grounding-dino-tiny
4.1 GPT4-o (recommend)
cd llm
touch .env
past your API_KEY or API_BASE (Azure only)
API_KEY="xxx"
API_BASE = "xxx"
4.2 Qwen2
cd llm_checkpoints
huggingface-cli download Qwen/Qwen2-7B-Instruct-AWQ --local-dir Qwen2-7B-Instruct-AWQ
install the corresponding packages
Step-1: Check available camera
python cam_detect.py
If a camera is detected, modify it in demo.py
.
Step-2: run demo
currently available model: Qwen2-7B-Instruct-AWQ
, gpt-4o-2024-05-13
python demo.py --model gpt-4o-2024-05-13