Closed: xiamengzhou closed this issue 2 months ago
Hey there! You're correct. We currently only support generating answers to the prompts via API endpoints. However, you can set up such an endpoint locally using vLLM; the process should be fairly simple.
@xiamengzhou would a vLLM example in README help?
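For reference, here's a minimal sketch of standing up an OpenAI-compatible endpoint with vLLM and querying it. The model name and port below are placeholders, not something this repo prescribes — substitute your own local or Hugging Face model:

```shell
# Serve a model (local path or Hugging Face ID) behind an
# OpenAI-compatible API on port 8000 (placeholder values)
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000

# In another terminal, query the standard chat-completions route:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

Assuming the benchmark accepts a configurable API base URL, pointing it at `http://localhost:8000/v1` should let the existing API-based generation path work with your local model unchanged.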
Hi! Thanks for releasing this awesome benchmark :)
I was interested in evaluating this benchmark with local models that I have trained, or with models available on Hugging Face. From what I understand, it appears I would need to build the generation pipeline myself, possibly using tools like vLLM or similar services. Am I missing anything here?