sxflynn / TeacherGPT

A proposed GPT chatbot for teachers that uses retrieval-augmentation to answer questions about their students.

Adopted more async logic for better parallel processing #5

Closed · sxflynn closed this 6 months ago

sxflynn commented 6 months ago

When part of the program generates a list of API tasks, it is important to run them concurrently to save time.

In chatlogic/src/orchestrator.py you can see that each _handle_call, which makes a gql query, runs in sequence, consuming a lot of time when there are many tasks to perform:

```python
api_task_list = self._prompt_for_apis()
for api_call in api_task_list:
    await self._handle_call(api_call)
```

This PR implements new LLMClient and LLMPrompt constructors that allow you to select an async mode, which makes them instantiate the AsyncOpenAI class. Combined with the use of async in all potentially blocking LLM prompt calls, this updated asyncio code has decreased the execution time:

```python
api_task_list = await self._prompt_for_apis()
await asyncio.gather(*(self._handle_call(api_call) for api_call in api_task_list))
```
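For reference, here is a minimal sketch of the kind of mode-switching constructor described above; the `mode` parameter name and the internals are assumptions for illustration, not the actual PR code:

```python
from openai import AsyncOpenAI, OpenAI

class LLMClient:
    # Hypothetical sketch: pick the non-blocking SDK client at
    # construction time so prompt calls can be awaited and gathered.
    def __init__(self, mode: str = "sync"):
        if mode == "async":
            self.client = AsyncOpenAI()  # non-blocking; calls are awaited
        else:
            self.client = OpenAI()       # original synchronous behavior
```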

These api_call coroutines now run concurrently. There is still a risk of hitting TransportAlreadyConnected when concurrent calls share one gql transport, so more work will have to be done to implement gql over WebSockets on the Spring server.
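Until the WebSocket transport lands, one way to sidestep TransportAlreadyConnected is to give every concurrent call its own transport and session; a sketch, assuming gql's AIOHTTPTransport and a placeholder endpoint URL:

```python
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport

async def run_query(query_str: str, variables: dict | None = None):
    # TransportAlreadyConnected is raised when one connected transport
    # is reused by concurrent execute() calls, so build a fresh
    # transport per call. The URL below is a placeholder.
    transport = AIOHTTPTransport(url="http://localhost:8080/graphql")
    async with Client(transport=transport) as session:
        return await session.execute(gql(query_str), variable_values=variables)
```

The trade-off is one new HTTP connection per query, which is tolerable as a stopgap until persistent WebSocket sessions are available.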

Performance benchmark

This is not comprehensive, but I ran 10 manual tests both before and after this PR.
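For anyone who wants to reproduce the numbers, a simple harness along these lines gives comparable before/after timings; the entry-point name is a placeholder, not the real orchestrator API:

```python
import time

async def timed_run(orchestrator, question: str) -> float:
    # Time one end-to-end request; run this N times before and after
    # the change and compare the averages.
    start = time.perf_counter()
    await orchestrator.run(question)  # placeholder entry point
    return time.perf_counter() - start
```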

Averaged over those runs, this is an 18% decrease in average processing time, and the gain should be even more dramatic as queries become more complex and teachers ask general questions that require looking up data from 5+ APIs.