run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Value Prop of LlamaIndex Workflows over LangGraph? #15234

Closed brycecf closed 1 month ago

brycecf commented 1 month ago

Question

Both are quite verbose and seem to be converging in relative complexity, especially compared to the existing query pipelines.

Given that and as a non-LlamaParse user, what's the benefit of using Workflows over LangGraph (given LangChain's overall momentum)?

Particularly since the current examples seem largely to reimplement existing functionality, just with more code...

dosubot[bot] commented 1 month ago

LlamaIndex Workflows offer several unique features and functionalities that differentiate them from LangGraph:

  1. Data Connectors: Tools to ingest data from various native sources and formats, such as APIs, PDFs, and SQL databases.
  2. Data Indexes: Structures data in intermediate representations that are easy and performant for LLMs to consume.
  3. Engines: Different types of engines for natural language access to data:
    • Query Engines: For question-answering interfaces (e.g., RAG pipelines).
    • Chat Engines: For conversational interfaces with multi-message interactions.
  4. Agents: LLM-powered knowledge workers augmented by tools, including simple helper functions and API integrations. The ReActAgent implementation allows for defining a set of tools, which can be Python functions or LlamaIndex query engines, to perform automated reasoning over data.
  5. Observability/Evaluation: Integrations for rigorous experimentation, evaluation, and monitoring of applications.
  6. LlamaCloud: Managed services for data parsing, ingestion, indexing, and retrieval, including LlamaParse, a state-of-the-art document parsing solution.
  7. Community and Ecosystem: Strong community presence and related projects like LlamaHub for custom data connectors and create-llama for quickly scaffolding projects.
  8. Integration Flexibility: Allows for both starter and customized builds. Users can start with the llama-index package for a quick setup or use llama-index-core to add specific integrations from LlamaHub, which offers over 300 integration packages.
  9. Advanced Retrieval/Query Interface: Provides an advanced interface for feeding LLM input prompts and retrieving context and knowledge-augmented outputs.
  10. Ease of Use for Beginners and Advanced Users: High-level APIs for beginners to ingest and query data in just a few lines of code, while also providing lower-level APIs for advanced users to customize and extend modules.
  11. Agentic Components: Core modules capable of automated reasoning for different use cases over your data, making them essentially agents. Examples include SubQuestionQueryEngine for multi-document analysis, query transformations, routing, and LLM reranking.
  12. Native OpenAIAgent: Includes an OpenAIAgent implementation built on the OpenAI API for function calling, allowing rapid agent development.
  13. Integration with Other Frameworks: Can be used as a tool within other agent frameworks like LangChain and ChatGPT, providing deep integrations and additional functionalities [1][2][3].

These features collectively make LlamaIndex a comprehensive framework for building context-augmented generative AI applications with LLMs.


logan-markewich commented 1 month ago

@brycecf Keep in mind this might be biased because I (and the other members of our fantastic team) worked on this, but!

Graph/DAG-based approaches like langgraph (and our previous query pipelines) have to do a ton of heavy lifting under the hood to figure out "what runs next, what are the inputs, etc.". All this logic comes with tons of edge cases, which was becoming very evident in our increasingly complicated query pipeline codebase.

Workflows (and event-driven programming in general) offer a more robust solution. The code under the hood is much simpler, because it's now entirely in the user's hands which steps run next and how to debug things, and it removes a lot of the "black box" issues that other approaches like our query pipelines have.

Workflows have one job. If an event gets emitted, that event triggers any step that handles it.

This directly improves the dev-UX as well.

I believe event-driven programming offers an alternative to the DAG/graph/pipeline-based approaches out there, with the mixture of flexibility, interpretability, and debuggability that agentic applications need today.
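The "one job" described above can be illustrated with a minimal sketch in plain Python. This is only an illustration of the event-driven dispatch principle, not LlamaIndex's actual Workflow implementation; all class and step names here are hypothetical:

```python
# Minimal sketch of event-driven dispatch: each step registers the event
# type it handles, and emitting an event triggers whichever step handles it.
# Hypothetical names -- NOT the real llama_index.core.workflow API.

class Event:
    def __init__(self, **data):
        self.data = data

class StartEvent(Event): ...
class RetrieveEvent(Event): ...
class StopEvent(Event): ...

class MiniWorkflow:
    def __init__(self):
        self._steps = {}  # maps event type -> step function

    def step(self, event_type):
        """Decorator that registers a step as the handler for one event type."""
        def register(fn):
            self._steps[event_type] = fn
            return fn
        return register

    def run(self, start: Event):
        # Keep dispatching until some step emits a StopEvent. There is no
        # graph to traverse: the emitted event itself decides what runs next.
        ev = start
        while not isinstance(ev, StopEvent):
            ev = self._steps[type(ev)](ev)
        return ev.data["result"]

wf = MiniWorkflow()

@wf.step(StartEvent)
def parse_query(ev):
    # Normalize the query, then hand off by emitting the next event.
    return RetrieveEvent(query=ev.data["query"].lower())

@wf.step(RetrieveEvent)
def retrieve(ev):
    # Stand-in for retrieval; emitting StopEvent ends the run.
    return StopEvent(result=f"docs for '{ev.data['query']}'")

print(wf.run(StartEvent(query="What is RAG?")))  # → docs for 'what is rag?'
```

Note there is no precomputed graph of edges: control flow falls out of which events each step emits, which is what keeps the dispatch loop small and the execution path easy to trace.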

brycecf commented 1 month ago

@logan-markewich Thank you for the thorough reply. I'm appeased :)