Adapt with huggingface_hub change in ChatCompletionInputMessage
Fix sth in tgi_model
Fix tiny bugs
Adapt integration test to new Pipeline
Adapt PR to new PromptManager
Hi there!
This PR addresses the need for evaluating endpoint models on chat-completion tasks, i.e. with chat templating applied. BaseModel and NanotronModel
already support this through FewshotManager.fewshot_context(), which applies the chat template to the few-shot and query examples. For endpoint models we could use either
the generic InferenceClient.text_generation() or the native InferenceClient.chat_completion() API. This PR uses the latter.
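To illustrate the difference between the two APIs, here is a minimal sketch. The prompt-rendering helper and its template are made up for illustration; the huggingface_hub calls shown in the comments follow the library's documented usage but are not run here, since they require a live endpoint:

```python
# Sketch: the same few-shot conversation fed to the two APIs.
# With text_generation() the caller must render the chat template itself;
# with chat_completion() the endpoint applies the model's own template.

messages = [
    {"role": "user", "content": "Q: 2 + 2 = ?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "Q: 3 + 5 = ?"},
]

def render_prompt(messages):
    """Naively render a conversation to a single prompt string, as one
    would have to do before calling text_generation(). The real chat
    template is model-specific; this format is illustrative only."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages) + "\nassistant:"

prompt = render_prompt(messages)

# With chat_completion(), `messages` is passed directly instead, e.g.:
#   from huggingface_hub import InferenceClient
#   client = InferenceClient(model=...)  # endpoint URL or model id
#   out = client.chat_completion(messages, max_tokens=16)
#   answer = out.choices[0].message.content
```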
More generally, could it be fruitful for Lighteval to make extensive use of huggingface_hub types? At the least, GenerativeResponse's result attribute could be of type ChatCompletionOutput | TextGenerationOutput, with metrics accepting inputs of these types as well, so that we could easily evaluate function calling and tools. Or GreedyUntilRequest's context attribute could be of type Conversation : TypeAlias = List[ChatCompletionInputMessage], to be able to feed tools params.
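A minimal sketch of the proposed alias, using a simplified stand-in for the hub type (the real ChatCompletionInputMessage also carries fields for tool calls, names, etc.):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ChatCompletionInputMessage:
    """Simplified stand-in for huggingface_hub's ChatCompletionInputMessage."""
    role: str
    content: Optional[str] = None

# The proposed type for GreedyUntilRequest.context:
Conversation = List[ChatCompletionInputMessage]

ctx: Conversation = [
    ChatCompletionInputMessage(role="user", content="What is the capital of France?"),
]
```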