mlc-ai / web-llm

High-performance In-browser LLM Inference Engine
https://webllm.mlc.ai
Apache License 2.0
11.87k stars 747 forks

[embeddings] Any plans for adding other transformer models like sentence-transformers? #67

Open jlia0 opened 1 year ago

jlia0 commented 1 year ago

It would be really nice to have WebGPU support for running other transformer models, such as SBERT and other embedding models. For example, see Transformers.js.

Thanks!

@jinhongyii

jinhongyii commented 1 year ago

Thank you for the advice. We are happy to see more and more models supported in web-llm. There are already open PRs for ChatGLM and Dolly model support. If you are interested in bringing embedding models in, you can use them as a reference and open a new PR.

jlia0 commented 1 year ago

Thank you! Are there any specific modifications needed for embedding models? For example, does the TensorIR technique still apply to them?

jinhongyii commented 1 year ago

Yes, of course, embedding models can be represented in TensorIR. Basically, what you need to do is translate the model (the PyTorch implementation) into the corresponding Relax operators. If there is no direct translation for an operator, write the TensorIR manually, or write TE (tensor expression) code that can be converted to TensorIR.
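
To make the translation task concrete: sentence-transformers style models commonly finish with masked mean pooling over the token embeddings, followed by L2 normalization. That is the sort of step that would need a Relax or TensorIR counterpart. The sketch below is a plain-Python reference for that computation only. It is not TVM code, and the function name `masked_mean_pool` is hypothetical, chosen for illustration; a real port would express the same arithmetic with Relax operators or a TE expression.

```python
import math
from typing import List


def masked_mean_pool(
    token_embeddings: List[List[float]],
    attention_mask: List[int],
) -> List[float]:
    """Average the token vectors whose mask is 1, then L2-normalize.

    Hypothetical reference helper: it mirrors the pooling step commonly
    used by sentence-embedding models, not any web-llm or TVM API.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                summed[i] += value

    # Guard against an all-zero mask to avoid dividing by zero.
    kept = max(kept, 1)
    mean = [s / kept for s in summed]

    # L2-normalize so that cosine similarity reduces to a dot product.
    norm = math.sqrt(sum(m * m for m in mean)) or 1.0
    return [m / norm for m in mean]


# Three tokens, the last one is padding and must be ignored.
tokens = [[1.0, 0.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = masked_mean_pool(tokens, mask)
```

A reference like this can also serve as a numerical check: run the same inputs through the compiled operator and compare the outputs within a small tolerance.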