run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Feature Request]: Add progress bar feature for query pipelines #10217

Closed xjw1001002 closed 7 months ago

xjw1001002 commented 7 months ago

Feature Description

Add a progress bar feature for query pipelines, or for the summarizer when it calls the LLM multiple times or uses tools.

Reason

Currently, a query pipeline (especially one using a summarizer) can take a long time to produce output, since it calls the LLM several times internally.

Value of Feature

Lets the user know the pipeline is working, instead of waiting a long time and worrying that it has hung.

dosubot[bot] commented 7 months ago

🤖

Thank you for your feature request. LlamaIndex already supports showing a progress bar, controlled by the show_progress field on the QueryPipeline class. It defaults to False; set it to True to display the bar. Note, however, that the progress bar is currently only available for asynchronous operations.

Here is the relevant code:

class QueryPipeline(QueryComponent):
    """A query pipeline that can allow arbitrary chaining of different modules.

    A pipeline itself is a query component, and can be used as a module in another pipeline.

    """

    show_progress: bool = Field(
        default=False,
        description="Whether to show progress bar (currently async only).",
    )

To enable the progress bar, you can set the show_progress field to True when you create an instance of the QueryPipeline class. For example:

pipeline = QueryPipeline(show_progress=True)

Please note that the progress bar is currently only available for asynchronous operations. If you want to use it for synchronous operations, you would need to modify the code to support this. You can find more details in the llama_index/query_pipeline/query.py file.

The current implementation does provide real-time updates: the arun method passes show_progress=self.show_progress to run_jobs, so when show_progress is True the progress bar advances as the pipeline's tasks complete.

Here is the relevant code:

# run tasks
output_dicts = await run_jobs(
    tasks, show_progress=self.show_progress, workers=self.num_workers
)

You can find more details in the llama_index/query_pipeline/query.py file.

I hope this information helps. If you have any other questions or need further clarification, please let me know.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

xjw1001002 commented 7 months ago

@dosu-bot If I use the pipeline in a streamlit app, how can I show the progress bar?

Current code:

def run(self, history: Optional[str] = None, *args: Any, **kwargs: Any) -> Any:
    """Run the pipeline."""
    if history:
        kwargs["history"] = history
    else:
        kwargs["history"] = ""
    return self.query_pipeline.run(*args, **kwargs)

In APP:

pack = RAGFusionPipelinePack(
    path=os.path.join(T_path_base, T_doc_path),
    llm=llm,
    llm_new=llm_for_app,
    embed_model=embed_model,
    summarizer=summarizer,
)

st.session_state.response_raw = st.session_state.pack.run(
    history=str(st.session_state.history_working),
    query=st.session_state.query_working,
)

dosubot[bot] commented 7 months ago

🤖

Thank you for your feature request. I understand that you would like a progress bar for query pipelines in LlamaIndex, particularly for the summarizer, so that users of your Streamlit app can see that the pipeline is working.

The LlamaIndex codebase already supports a progress bar for query pipelines, via the show_progress field in the QueryPipeline class. Here is the relevant code:

class QueryPipeline(QueryComponent):
    """A query pipeline that can allow arbitrary chaining of different modules.

    A pipeline itself is a query component, and can be used as a module in another pipeline.

    """

    show_progress: bool = Field(
        default=False,
        description="Whether to show progress bar (currently async only).",
    )

The show_progress field is a boolean that defaults to False. If set to True, it will show a progress bar. However, the description indicates that this feature is currently only available for asynchronous operations.

To enable the progress bar, you need to set show_progress to True when you create an instance of the QueryPipeline class. Here is an example of how you can create an instance of the QueryPipeline class with the progress bar enabled:

pipeline = QueryPipeline(show_progress=True)

However, the provided context does not include the implementation of how the progress bar is displayed when show_progress is set to True. Therefore, you might need to implement this functionality yourself.

For example, you could modify the _run_multi and _arun_multi methods in the QueryPipeline class to update the progress bar after each module in the pipeline is processed. You could use a library like tqdm to create the progress bar.

Please note that the progress bar might not be visible in a Streamlit app if you simply print it to the console. You might need to use Streamlit's built-in progress bar functionality to display the progress bar in the app. You can do this by calling st.progress in Streamlit and updating the progress value as the pipeline processes the modules.

Note that the provided context does not include enough information for a complete Streamlit solution; you may need to consult the Streamlit documentation or other resources for details on st.progress.
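To make the st.progress idea concrete, here is a hypothetical sketch. The run_with_progress wrapper and the on_module_done callback are illustrative assumptions: QueryPipeline does not expose a per-module callback today, so you would have to add one (for example, inside _run_multi / _arun_multi as suggested above):

```python
def progress_fraction(modules_done: int, total_modules: int) -> float:
    """Clamp a completed-module count to the 0.0-1.0 range st.progress expects."""
    if total_modules <= 0:
        return 1.0
    return min(max(modules_done / total_modules, 0.0), 1.0)

def run_with_progress(pipeline, total_modules: int, **kwargs):
    """Run a pipeline in a Streamlit app, advancing st.progress per module (sketch)."""
    import streamlit as st  # imported here so progress_fraction stays framework-free

    bar = st.progress(0.0, text="Running query pipeline...")

    def on_module_done(done: int) -> None:
        # Hypothetical hook: you would need to patch the pipeline to call this
        # after each module completes.
        bar.progress(progress_fraction(done, total_modules))

    result = pipeline.run(**kwargs)
    bar.progress(1.0, text="Done")
    return result
```

The fraction helper is kept separate from the Streamlit calls so the progress math can be reasoned about (and tested) independently of the UI.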

I hope this information is helpful. If you have any further questions or need more clarification, please let me know.
