An open source chatbot architecture for voice, vision, and multimodal assistants that can run locally or remotely. If you run achatbot yourself, you can learn more; star and fork the repository to contribute.
BSD 3-Clause "New" or "Revised" License
15 stars, 2 forks
feat: add cv2 capture of videos and images for vision LLM inference #60
feat:
LLM_TAG=llm_transformers_manual_vision_qwen LLM_DEVICE=cuda \
    LLM_MODEL_NAME_OR_PATH="./models/Qwen/Qwen2-VL-2B-Instruct" \
    LLM_CHAT_HISTORY_SIZE=0 \
    python -m unittest test.integration.processors.test_vision_processor.TestVisionProcessor
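The command above configures the test run entirely through environment variables. As a rough illustration of how such settings could be collected on the Python side, here is a minimal sketch. The `VisionLLMConfig` class, the `load_config` helper, and the fallback defaults are hypothetical and are not taken from the achatbot code base; only the variable names come from the command.

```python
import os
from dataclasses import dataclass


@dataclass
class VisionLLMConfig:
    # Hypothetical container; fields mirror the env vars used in the command.
    tag: str
    device: str
    model_name_or_path: str
    chat_history_size: int


def load_config(env=None) -> VisionLLMConfig:
    """Build a config from a mapping of env vars (defaults to os.environ)."""
    env = os.environ if env is None else env
    return VisionLLMConfig(
        # Fallback values below are assumptions chosen for illustration only.
        tag=env.get("LLM_TAG", "llm_transformers_manual_vision_qwen"),
        device=env.get("LLM_DEVICE", "cpu"),
        model_name_or_path=env.get(
            "LLM_MODEL_NAME_OR_PATH", "./models/Qwen/Qwen2-VL-2B-Instruct"
        ),
        # Env vars are always strings, so the history size is parsed as an int.
        chat_history_size=int(env.get("LLM_CHAT_HISTORY_SIZE", "0")),
    )
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to exercise in a unit test without touching the real process environment.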