run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Feature Request]: improve support for async in llama_index #6591

Closed jjmachan closed 1 year ago

jjmachan commented 1 year ago

Description

Currently, LlamaIndex has support for asynchronous query requests, but it is not fully implemented: the Retriever has no asynchronous retrieve method, and the Synthesiser, even though it has an async version, leaves it unimplemented in a few cases.

The following components have to be made async, since, depending on the implementation, they might be called externally.

Todos

Motivation

I'm working on revamping the Playground module so that users can quickly and cheaply prototype different LlamaIndex configurations and evaluate which ones work best. However, since many of the calls are blocking (especially in synthesis), the UX is poor.

There are two ways to fix this:

  1. Batching: allow users to send a batch of queries in the playground. This means changing the internals to support batching, but it would offer significant speedups, especially at the synthesis stage. However, it is a more significant internal change and might be harder to justify for just this evaluation use case.
  2. Async: truly optimised async is as good as batching, since the requests would be sent out in parallel (see the sketch below). We have partial implementations of this already, so personally I think it would be easier to finish the async path.
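To make the async option concrete, here is a minimal sketch of how a batch of evaluation queries could be dispatched concurrently, assuming a fully async aquery() on the query engine (run_batch is a hypothetical helper, not an existing API):

```python
import asyncio

async def run_batch(query_engine, questions):
    # asyncio.gather() fires all requests at once, so total latency
    # approaches the slowest single call rather than the sum of all calls.
    return await asyncio.gather(
        *(query_engine.aquery(q) for q in questions)
    )

# Usage, assuming `engine` is any LlamaIndex query engine:
# responses = asyncio.run(run_batch(engine, ["q1", "q2", "q3"]))
```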

Having complete async functionality would also bring optimisations for users hosting LlamaIndex, since ASGI servers could be leveraged to increase throughput.
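For illustration, a rough sketch of what that hosting story could look like on an ASGI framework (FastAPI is used here only as an example; query_engine is assumed to be built elsewhere):

```python
from fastapi import FastAPI

app = FastAPI()
query_engine = ...  # placeholder: construct at startup, e.g. index.as_query_engine()

@app.get("/query")
async def query_route(q: str):
    # With a non-blocking aquery(), the ASGI worker can serve other
    # requests while this one awaits the LLM, instead of holding a thread.
    response = await query_engine.aquery(q)
    return {"answer": str(response)}
```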

Value of Feature

I'll let the numbers speak for themselves (although it's a crude implementation), using the default vector_store and doc_store:

with #6587: (benchmark screenshot)

with #6590 (this has the most bang for the buck, and I'll get it merged asap): (benchmark screenshot)

Concerns

A solution I had in mind was to make async methods the default internally while exposing something that feels synchronous to users: query() would call aquery() and execute the coroutine on the user's behalf.

The neat thing here is that all we have to do is make the non-async functions async, even if they remain blocking during the implementation stage; the developer is then free to implement either blocking or truly async methods.
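As a rough sketch of that facade idea (a hypothetical Retriever shape, not the actual class):

```python
import asyncio

class Retriever:
    async def aretrieve(self, query: str):
        # The real implementation lives here; during the transition it may
        # still block internally, which is fine.
        ...

    def retrieve(self, query: str):
        # Synchronous facade: drive the coroutine to completion so callers
        # who never touch asyncio still see a plain blocking call.
        return asyncio.run(self.aretrieve(query))
```

One wrinkle: asyncio.run() raises if an event loop is already running (e.g. in Jupyter), so the facade would need a fallback such as nest_asyncio or scheduling the coroutine on the existing loop.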

jon-chuang commented 1 year ago

Definitely agreed. For instance, one aspect of this is

jon-chuang commented 1 year ago

A downstream benefit of async support is better webserver performance. Currently, tracing and callbacks would completely fail in such async serving use cases.

jjmachan commented 1 year ago

Exactly @jon-chuang, I feel the same. I'm happy to help contribute some of the work required, but I guess we need the team's opinion on this.

logan-markewich commented 1 year ago

@jjmachan you are free to modify and propose as you please! If it's a useful feature, it will be merged :)

dosubot[bot] commented 1 year ago

Hi, @jjmachan! I'm Dosu, and I'm helping the LlamaIndex team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, this issue is a feature request to improve support for async functionality in the LlamaIndex library. The author wants to revamp the Playground module to allow users to prototype different LlamaIndex configurations, but the current blocking calls make the user experience bad. Implementing async functionality would improve the speed and throughput of the library. @jon-chuang agrees and mentions a downstream benefit of better webserver performance.

It seems like there has been some discussion on this issue, and you have expressed your willingness to contribute and have asked for the team's opinion. @logan-markewich has encouraged you to modify and propose the feature.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LlamaIndex repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LlamaIndex project!