microsoft / TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.
https://microsoft.github.io/TaskWeaver/
MIT License

Inference time is too long #269

Closed bala1802 closed 7 months ago

bala1802 commented 8 months ago

I am currently using the Sessions - taskweaver_as_a_lib to execute my request.


The inference time is almost 2 minutes for a simple query like `What is the Revenue of XYZ?`. The task is to extract data from the uploaded Excel sheets.

My Config:

```json
{
  "llm.api_base": "<API BASE>",
  "llm.api_key": "<API KEY>",
  "llm.api_type": "<API TYPE>",
  "llm.auth_mode": "api-key",
  "llm.model": "<MODEL>",
  "session.roles": ["planner", "code_interpreter"]
}
```

I am aware that configuring `session.roles: ["code_interpreter"]` (Code Interpreter only) would reduce the processing time, but my use case requires both the Planner and the Code Interpreter.
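To see which round of the conversation dominates the latency, one option is to wrap the session call with a simple timer. A minimal sketch; the `timed_call` helper is hypothetical, and `session.send_message` is assumed from the taskweaver-as-a-lib docs (verify the exact API there):

```python
import time

def timed_call(fn, *args, **kwargs):
    """Hypothetical helper: run fn and return (result, seconds elapsed)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# In real use, pass the TaskWeaver session call (API assumed, check
# the taskweaver_as_a_lib docs), e.g.:
#   answer, seconds = timed_call(session.send_message, "What is the Revenue of XYZ?")
# Stand-in demonstration with a dummy function:
answer, seconds = timed_call(lambda q: f"echo: {q}", "What is the Revenue of XYZ?")
print(answer)
print(f"took {seconds:.3f}s")
```

Timing each `send_message` call separately makes it easier to tell whether the Planner round or the Code Interpreter round is the bottleneck.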

Device Specification

[screenshot: device specification]

Any suggestions or best practices to follow?

Thanks in advance

liqul commented 8 months ago

One idea is to configure a different model for each role; please refer to https://microsoft.github.io/TaskWeaver/docs/llms/multi-llm. That way, you can configure a faster model for the Planner.
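Following the multi-LLM docs linked above, the per-role setup can be sketched roughly as below. This is an illustrative fragment only: the key names (`ext_llms.llm_configs`, `planner.llm_alias`) and the model values are assumptions to be verified against the linked page.

```json
{
  "llm.api_base": "<API BASE>",
  "llm.api_key": "<API KEY>",
  "llm.api_type": "<API TYPE>",
  "llm.model": "<STRONG MODEL for code generation>",
  "ext_llms.llm_configs": {
    "fast_llm": {
      "llm.api_type": "<API TYPE>",
      "llm.api_key": "<API KEY>",
      "llm.model": "<FASTER, CHEAPER MODEL>"
    }
  },
  "planner.llm_alias": "fast_llm",
  "session.roles": ["planner", "code_interpreter"]
}
```

The idea is that planning is often a simpler task than code generation, so routing the Planner to a lower-latency model can cut overall round-trip time without changing the Code Interpreter's quality.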

liqul commented 7 months ago

Closing due to inactivity.