Open lyj157175 opened 1 week ago
@lyj157175 Which models (huggingface link) are you talking about?
I think https://huggingface.co/openbmb/MiniCPM-V-2 is a text-generation (chat) model. For multimodal chat, I would recommend using https://github.com/vllm-project/vllm. This repo is for multimodal embeddings and reranking.
I mean these two models, which are embedding and reranker models. Can vllm only load chat models? https://huggingface.co/openbmb/MiniCPM-Embedding https://huggingface.co/openbmb/MiniCPM-Reranker
Yes, via pip install infinity_emb[all] flash-attn
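For anyone who wants to try this, here is a rough client-side sketch using only the Python standard library. It assumes an infinity_emb server has already been started separately with the two MiniCPM models loaded, that it listens on http://localhost:7997, and that it exposes POST /embeddings and POST /rerank; the helper names (`build_embedding_request`, `build_rerank_request`, `post_json`) are made up for illustration and are not part of infinity_emb. Adjust the URL and paths to match your deployment.

```python
import json
import urllib.request

# Assumption: a local infinity_emb server is already running at this address.
BASE_URL = "http://localhost:7997"


def build_embedding_request(texts, model="openbmb/MiniCPM-Embedding"):
    """Build the JSON body for an OpenAI-style embeddings request."""
    return {"model": model, "input": list(texts)}


def build_rerank_request(query, documents, model="openbmb/MiniCPM-Reranker"):
    """Build the JSON body for a rerank request (one query, many documents)."""
    return {"model": model, "query": query, "documents": list(documents)}


def post_json(path, payload, base_url=BASE_URL):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed endpoint paths; these calls need the server to be reachable.
    print(post_json("/embeddings", build_embedding_request(["hello world"])))
    print(post_json("/rerank", build_rerank_request(
        "what is a reranker?",
        ["A reranker scores documents against a query.", "Unrelated text."],
    )))
```

The request builders are kept separate from the network call so the payloads can be inspected without a running server.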
Model description
Can MiniCPM3's embedding and reranker models be supported?
Open source status
pip install infinity_emb[all] --upgrade
Provide useful links for the implementation
No response