Open benrito opened 2 months ago
One way to make the program more responsive is to start the next LLM call while the current agent is still speaking, so the next reply is ready (or nearly ready) when playback ends.
This could be tricky, but let's investigate...
Could be related to #9
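
To make the idea concrete, here is a rough sketch of overlapping generation with speech using `asyncio`. It is not taken from this project's code: `generate_reply` and `speak` are hypothetical stand-ins for the real LLM and text-to-speech calls, and the round-robin agent order is an assumption.

```python
import asyncio


async def generate_reply(agent: str, history: list[str]) -> str:
    """Hypothetical stand-in for the LLM call (assumption, not the project's API)."""
    await asyncio.sleep(0.05)  # simulate network/model latency
    return f"{agent} reply {len(history) + 1}"


async def speak(text: str) -> None:
    """Hypothetical stand-in for text-to-speech playback."""
    await asyncio.sleep(0.05)  # simulate audio playback time


async def run_conversation(agents: list[str], turns: int) -> list[str]:
    history: list[str] = []
    if turns <= 0:
        return history

    # Kick off the first generation; nothing is being spoken yet.
    pending = asyncio.create_task(generate_reply(agents[0], list(history)))

    for turn in range(turns):
        text = await pending
        history.append(text)

        if turn + 1 < turns:
            # Start the next LLM call before speaking, so it runs during playback.
            # A snapshot of the history is passed so the prefetch already
            # includes the line that is about to be spoken.
            next_agent = agents[(turn + 1) % len(agents)]
            pending = asyncio.create_task(generate_reply(next_agent, list(history)))

        await speak(text)

    return history


if __name__ == "__main__":
    print(asyncio.run(run_conversation(["A", "B"], 3)))
```

The tricky part is probably invalidation: if something changes the conversation while an agent is speaking (for example, a user interruption), the prefetched reply is stale. In this sketch that would mean calling `pending.cancel()` and starting a new task with the updated history.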