Open jlia0 opened 1 year ago
Thank you for the advice. We are happy to see more and more models supported in web-llm. There are already open PRs for ChatGLM and Dolly model support. If you are interested in bringing embedding models in, you can use them as a reference and open a new PR.
Thank you! Are there any specific modifications needed for embedding models? For example, does the TensorIR technique still apply to embedding models?
Yes, of course, embedding models can be represented in TensorIR. Basically, what you need to do is translate the model (the PyTorch implementation) into the corresponding Relax operators. If there is no direct translation for an operation, write the TensorIR manually, or write TE expressions that can be converted to TensorIR.
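To make the translation step a bit more concrete, here is a rough sketch in plain Python (no TVM involved) of the pieces specific to a sentence-embedding model: the token lookup, masked mean pooling, and L2 normalization. Each of these would need either a matching Relax operator or a hand-written TensorIR/TE kernel. The function names, the mean-pooling choice, and the `eps` value are assumptions for illustration, not anything taken from web-llm or a particular model, and the transformer layers that normally sit between lookup and pooling are left out.

```python
import math
from typing import List

Matrix = List[List[float]]


def gather_rows(table: Matrix, token_ids: List[int]) -> Matrix:
    """Embedding lookup: select one row of the table per token id."""
    return [table[t] for t in token_ids]


def masked_mean(hidden: Matrix, mask: List[int]) -> List[float]:
    """Mean-pool token vectors, skipping padded positions (mask == 0)."""
    count = sum(mask)
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    dim = len(hidden[0])
    return [
        sum(row[j] for row, m in zip(hidden, mask) if m) / count
        for j in range(dim)
    ]


def l2_normalize(vec: List[float], eps: float = 1e-12) -> List[float]:
    """Scale the vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def embed(table: Matrix, token_ids: List[int], mask: List[int]) -> List[float]:
    """Hypothetical end-to-end path: lookup -> pooling -> normalization."""
    hidden = gather_rows(table, token_ids)  # transformer layers omitted here
    return l2_normalize(masked_mean(hidden, mask))
```

Calling `embed(table, token_ids, mask)` with a small table is an easy way to get reference values to compare a translated version against. The lookup is a gather and the other two steps are reductions plus elementwise math, so they are the kind of operations that may already have direct equivalents before any manual TensorIR is needed.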
It would be really nice to have WebGPU support for running other transformer models, such as SBERT and embedding models. For example, here's transformer.js
Thanks!
@jinhongyii