Closed ZackBradshaw closed 18 hours ago
Support for Video and Image understanding in search. I'm looking to use this with VILA. I'd love to use lmdeploy, but since VILA is not supported, I'm wondering how feasible it would be to swap out the inference engine for something like TinyChat.
nvm, found https://huggingface.co/OpenGVLab/InternVL2-40B. Still wondering whether this works with an MLLM.