Closed by whisper-bye 2 weeks ago
Unfortunately, `llama.cpp` currently only supports fast inference for generative models and does not yet support embedding models, so VisRAG-Ret cannot use `llama.cpp`. As for VisRAG-Gen, the models we used in the paper (MiniCPM-V-2, MiniCPM-Llama3-V-2_5, and MiniCPM-V-2_6) are also not supported by `llama.cpp`. However, we're pleased to share that the newly released MiniCPM3-4B does support `llama.cpp`, and we welcome you to try it out!