Open mevince opened 3 months ago
Yep - we would welcome a PR
Sounds fun! I'm on this. Any notes or hints on this one? @DarkLight1337 @robertgshaw2-neuralmagic
Thanks!
I'm also interested in this task. Following #3734, I used transformers.BertModel to implement a BertEmbeddingModel class (https://github.com/vllm-project/vllm/compare/main...laishzh:vllm:feat/bert). The code is at a very early stage: it outputs an embedding, but I think the values are wrong, possibly because the weights are not loaded correctly. This is my first contribution, so I'm not sure whether this is the right approach or whether BertModel needs to be reimplemented. Suggestions and collaboration are welcome.
@Etelis Also hope it helps.
The main thing you have to do is implement the `BERTModel` or `XLMRobertaModel` in the `vllm/model_executor/models` directory using the layers in `vllm/model_executor/layers`, and then register the model in the Registry.
You can look at how llama and others are implemented in that directory as inspiration.
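To make the "register the model" step concrete, here is a minimal, self-contained sketch of the general registry pattern. It is not vLLM's actual code: `ToyModelRegistry`, `ToyBertEmbeddingModel`, and their methods are hypothetical names made up for illustration, and the real registry in vLLM may look quite different. The only outside assumption is that a Hugging Face `config.json` lists architecture names such as `"BertModel"` under `architectures`.

```python
from typing import Dict, List


class ToyModelRegistry:
    """Hypothetical stand-in for a model registry (not vLLM's class)."""

    def __init__(self) -> None:
        # Maps an architecture name (as found in config.json) to a model class.
        self._models: Dict[str, type] = {}

    def register(self, architecture: str, model_cls: type) -> None:
        if architecture in self._models:
            raise ValueError(f"{architecture} is already registered")
        self._models[architecture] = model_cls

    def resolve(self, architectures: List[str]) -> type:
        # Return the class for the first architecture name we know about.
        for arch in architectures:
            if arch in self._models:
                return self._models[arch]
        raise ValueError(f"No supported architecture in {architectures}")


class ToyBertEmbeddingModel:
    """Placeholder model; a real one would be built from the shared layers."""

    def __init__(self, hidden_size: int) -> None:
        self.hidden_size = hidden_size

    def embed(self, token_ids: List[int]) -> List[float]:
        # Placeholder output: a zero vector of the right size, not a real embedding.
        return [0.0] * self.hidden_size


registry = ToyModelRegistry()
registry.register("BertModel", ToyBertEmbeddingModel)

# The list mirrors the "architectures" field of a Hugging Face config.json.
model_cls = registry.resolve(["BertModel"])
model = model_cls(hidden_size=4)
embedding = model.embed([101, 7592, 102])
```

The point of the pattern is that the engine never imports a model class directly: it looks the class up by the architecture name found in the checkpoint's config, so adding BERT support means adding one class and one registry entry.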
Now that embedding support has been introduced in https://github.com/vllm-project/vllm/pull/3734, are there plans on the roadmap to support BERT models?