JohnSnowLabs / langtest

Deliver safe & effective language models
http://langtest.org/
Apache License 2.0

Make integration with API-based LLMs generic and extendable by users #953

Closed by dcecchini 5 months ago

dcecchini commented 8 months ago

Some users rely on OpenAI or Hugging Face inference, which we currently support as hubs, but others may be working on a custom LLM served from a different hub. We could make the API integration generic so that the user can specify the parameters needed to connect to any API and how to parse the results.

For example, the user could define the URL and parameters of their API, plus a function to parse the results, as sketched below. In this way, any API system could be supported by LangTest.
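
A minimal sketch of what such a user-defined connector could look like (all names here, such as GenericAPIModel and parse_fn, are hypothetical and not part of the current langtest API):

import requests

class GenericAPIModel:
    """Hypothetical user-defined connector: the user supplies the endpoint
    URL, request parameters, and a function that parses the raw response."""

    def __init__(self, url, headers=None, parse_fn=None, **params):
        self.url = url
        self.headers = headers or {}
        self.params = params  # e.g. model, temperature, max_tokens
        self.parse_fn = parse_fn or (lambda r: r["text"])

    def __call__(self, prompt: str) -> str:
        payload = {"prompt": prompt, **self.params}
        response = requests.post(self.url, json=payload, headers=self.headers)
        response.raise_for_status()
        return self.parse_fn(response.json())

# Example: a vLLM completions endpoint with a user-supplied parser
model = GenericAPIModel(
    url="http://localhost:8000/v1/completions",
    parse_fn=lambda r: r["choices"][0]["text"],
    model="vllm_model_name",
    max_tokens=64,
)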

Two tools to keep in mind: vLLM and Hugging Face Text Generation Inference (TGI).

HCTsai commented 7 months ago

I found a trick that lets langtest run against a vLLM HTTP endpoint.

langtest uses the default settings of the OpenAI Python library, and the library first checks the OPENAI_BASE_URL environment variable to determine the connection endpoint.

So you can control langtest and the underlying OpenAI client behavior by setting OS environment variables.
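
For reference, here is the mechanism in isolation (a minimal sketch, assuming a vLLM server with its OpenAI-compatible API running locally on port 8000; the model name is a placeholder):

import os
from openai import OpenAI

# The client reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment
os.environ["OPENAI_API_KEY"] = "EMPTY"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"

client = OpenAI()
resp = client.chat.completions.create(
    model="vllm_model_name",  # whatever model the vLLM server is serving
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)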

The following sample code works for me (you can change OPENAI_BASE_URL or model_name based on your needs):

import os
from langtest import Harness

# Point the OpenAI client at the vLLM endpoint instead of api.openai.com
os.environ["OPENAI_API_KEY"] = "EMPTY"
os.environ["OPENAI_BASE_URL"] = "your.vllm.endpoint"
model_name = "vllm_model_name"

h = Harness(task="question-answering",
            model={"model": model_name, "hub": "openai"},
            data={"data_source": "BoolQ", "split": "test-tiny"},
            config="./test.yml")

h.generate()
h.run()
h.report(format="html", save_dir="./report.html")

I don't know much about Hugging Face TGI. Maybe the same trick would work; see the sketch below.
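
Recent TGI versions expose an OpenAI-compatible Messages API under /v1, so the same environment-variable trick should apply. An untested sketch, assuming TGI's default port 8080:

import os
from langtest import Harness

# TGI serves an OpenAI-compatible Messages API under /v1 (recent versions)
os.environ["OPENAI_API_KEY"] = "EMPTY"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080/v1"

h = Harness(task="question-answering",
            model={"model": "tgi_model_name", "hub": "openai"},
            data={"data_source": "BoolQ", "split": "test-tiny"},
            config="./test.yml")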

dcecchini commented 7 months ago

That's very helpful, @HCTsai. We will start building a generic endpoint implementation with that logic.

Prikshit7766 commented 7 months ago

DeepInfra:
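
DeepInfra also exposes an OpenAI-compatible API, so the same trick should carry over (a sketch, assuming the base URL https://api.deepinfra.com/v1/openai from DeepInfra's documentation and a placeholder model name):

import os
from langtest import Harness

# DeepInfra's OpenAI-compatible endpoint (verify against current docs)
os.environ["OPENAI_API_KEY"] = "your_deepinfra_api_key"
os.environ["OPENAI_BASE_URL"] = "https://api.deepinfra.com/v1/openai"

h = Harness(task="question-answering",
            model={"model": "deepinfra_model_name", "hub": "openai"},
            data={"data_source": "BoolQ", "split": "test-tiny"},
            config="./test.yml")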