run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
35.55k stars 5.02k forks

[Bug]: Workflows "hasattr" not functioning as intended. #16063

Open gsikes0501 opened 4 days ago

gsikes0501 commented 4 days ago

Bug Description

I was working through the "Maintaining state between events" section of the Workflows Cookbook. When I run the code from the website, the check for the data attribute never passes.

The data variable is set within the context variable in the setup step. This data is not recognized as an attribute when checked for within the query step.

I cannot get the script to recognize the attribute and print the part 3 output from the example.

part 1

part 2

Running step query
Step query produced event StopEvent
Running step setup
Step setup produced event StopEvent
Got the data Yes you can

Version

0.11.6

Steps to Reproduce

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "data"):
            await ctx.set("data", ev.data)

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "query"):
            # do we have any data?
            if hasattr(self, "data"):
                data = await ctx.get("data")
                return StopEvent(result=f"Got the data {data}")
            else:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

Running Code

w = WaitExampleFlow(verbose=True)
result = await w.run(query="Can I kick it?")
if result is None:
    print("No you can't")
print("---")
result = await w.run(data="Yes you can")
print("---")

result = await w.run(query="Can I kick it?")
print(result)

Output

Running step query
Step query produced no event
Running step setup
Step setup produced event StopEvent
No you can't

Running step query
Step query produced no event
Running step setup
Step setup produced event StopEvent

Running step query
Step query produced no event
Running step setup
Step setup produced event StopEvent
None

Relevant Logs/Tracebacks

No response

dosubot[bot] commented 4 days ago

Hey @gsikes0501! I'm here to help you with any issues you're facing. I can assist with bugs, answer questions, and even guide you on becoming a contributor. Let me know how I can assist you!

The hasattr check for the data attribute is not functioning as intended because the data attribute is being set in the context (ctx), not as an attribute of the workflow instance (self). Therefore, the hasattr(self, "data") check will always return False.
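The distinction described above can be reproduced without LlamaIndex at all. The following is a minimal sketch using only the standard library; `FakeContext` and `FakeFlow` are hypothetical stand-ins invented for illustration, not the library's `Context` or `Workflow` classes. It only shows that writing a value into a separate store does not create an attribute on the object that holds the steps.

```python
import asyncio


class FakeContext:
    """Hypothetical stand-in for a workflow context: a dict behind async get/set."""

    def __init__(self):
        self._store = {}

    async def set(self, key, value):
        self._store[key] = value

    async def get(self, key):
        if key not in self._store:
            raise ValueError(f"Key '{key}' not found in Context")
        return self._store[key]


class FakeFlow:
    """Hypothetical stand-in for a workflow instance (the `self` in a step)."""


async def demo():
    flow, ctx = FakeFlow(), FakeContext()
    # Store the value in the context, as the setup step does.
    await ctx.set("data", "Yes you can")
    # The write went into the context's store, so the instance has no such attribute.
    on_instance = hasattr(flow, "data")
    in_context = await ctx.get("data")
    return on_instance, in_context


result = asyncio.run(demo())
print(result)
```

In this sketch the value is only reachable through the context object, which is why a check on `self` cannot see it.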

To fix this, you should retrieve the data from the context (ctx) instead of checking for it as an attribute of the workflow instance. Here is the corrected code:

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if "data" in ev:
            await ctx.set("data", ev["data"])

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if "query" in ev:
            # do we have any data?
            data = await ctx.get("data", None)
            if data is not None:
                return StopEvent(result=f"Got the data {data}")
            else:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

In this corrected version, ev["data"] is used to check if the data exists in the event's _data dictionary. If it does, it retrieves the data; otherwise, it returns None [1].

To continue talking to Dosu, mention @dosu.

gsikes0501 commented 4 days ago

The suggested code errors out before ever reaching the second run, where a value for data is passed. It does not find the data key and raises a ValueError.

w = WaitExampleFlow(verbose=True)
result = await w.run(query="Can I kick it?")
if result is None:
    print("No you can't")
print("---")
result = await w.run(data="Yes you can")
print("---")
result = await w.run(query="Can I kick it?")
print(result)

Output:
Running step query
Running step setup
Step setup produced event StopEvent

ValueError: Key 'data' not found in Context

logan-markewich commented 4 days ago

Yea this should be updated, seems like a bug?

Here's what it should look like

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "data"):
            self.data = data  # <- notice that this gets set on self

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "query"):
            # do we have any data?
            if hasattr(self, "data"):
                data = self.data
                return StopEvent(result=f"Got the data {data}")
            else:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

In the latest version v0.11.10 we also made this possible to do without using self

Here's how I might do this in the latest release, reusing the existing context between runs

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "data"):
            await ctx.set("data", data)

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "query"):
            # do we have any data?
            data = await ctx.get("data", default=None)
            if data is not None:
                return StopEvent(result=f"Got the data {data}")
            else:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

w = WaitExampleFlow(verbose=True)
handler = w.run(query="Can I kick it?")
result = await handler
if result is None:
    print("No you can't")
print("---")
handler = w.run(ctx=handler.ctx, data="Yes you can")
_ = await handler
print("---")

result = await w.run(ctx=handler.ctx, query="Can I kick it?")
print(result)
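The idea of reusing the existing context between runs can be sketched with the standard library alone. Everything below is an assumption made for illustration: `FakeContext` and `run_once` are hypothetical stand-ins, not LlamaIndex APIs, and the sketch assumes that a run started without a context begins with an empty one. Note that the stand-in reads the value from the incoming keyword argument (the analogue of `ev.data`) rather than from a bare `data` name.

```python
import asyncio

_MISSING = object()  # sentinel so that default=None is a usable default


class FakeContext:
    """Hypothetical stand-in for a workflow context: a dict behind async get/set."""

    def __init__(self):
        self._store = {}

    async def set(self, key, value):
        self._store[key] = value

    async def get(self, key, default=_MISSING):
        if key in self._store:
            return self._store[key]
        if default is _MISSING:
            raise ValueError(f"Key '{key}' not found in Context")
        return default


async def run_once(ctx=None, **kwargs):
    """Hypothetical runner: builds a fresh context unless one is passed in."""
    ctx = ctx if ctx is not None else FakeContext()
    if "data" in kwargs:
        # Take the value from the event payload, not from an undefined local name.
        await ctx.set("data", kwargs["data"])
    result = None
    if "query" in kwargs:
        data = await ctx.get("data", default=None)
        if data is not None:
            result = f"Got the data {data}"
    return ctx, result


async def demo():
    # Separate runs, separate contexts: the stored value is not visible later.
    await run_once(data="Yes you can")
    _, lost = await run_once(query="Can I kick it?")
    # Same context passed to both runs: the stored value survives.
    ctx, _ = await run_once(data="Yes you can")
    _, kept = await run_once(ctx=ctx, query="Can I kick it?")
    return lost, kept


lost, kept = asyncio.run(demo())
print(lost, kept)
```

Under these assumptions, the query only sees the data when the second run receives the same context object that the first run wrote to.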

dosubot[bot] commented 4 days ago

The ValueError: Key 'data' not found in Context occurs because the query step is checking for the attribute self.data instead of retrieving the data from the context using ctx.get("data"). The corrected code should retrieve the data from the context.

Here is the corrected WaitExampleFlow class:

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "data"):
            await ctx.set("data", ev.data)

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "query"):
            # do we have any data?
            if await ctx.has("data"):
                data = await ctx.get("data")
                return StopEvent(result=f"Got the data {data}")
            else:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

In this corrected version, the query step uses await ctx.has("data") to check if the data exists in the context and then retrieves it using await ctx.get("data"). This ensures that the data is correctly retrieved from the context, avoiding the ValueError [1][2].

gsikes0501 commented 4 days ago

@dosu AttributeError: 'Context' object has no attribute 'has'

dosubot[bot] commented 4 days ago

To fix the AttributeError: 'Context' object has no attribute 'has' in the WaitExampleFlow class, you need to replace the hasattr(self, "data") check with a check that uses the ctx object to see if the data exists. The Context object should be used to manage the state.

Here is the corrected code:

class WaitExampleFlow(Workflow):
    @step
    async def setup(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "data"):
            await ctx.set("data", ev.data)

        return StopEvent(result=None)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> StopEvent:
        if hasattr(ev, "query"):
            # do we have any data?
            try:
                data = await ctx.get("data")
                return StopEvent(result=f"Got the data {data}")
            except ValueError:
                # there's no data yet
                return None
        else:
            # this isn't a query
            return None

In this corrected version, await ctx.get("data") is used to check if the data exists in the context. If the data does not exist, a ValueError is raised, which is caught to handle the case where there is no data yet [1][2].

gsikes0501 commented 4 days ago

@logan-markewich I tried the first version. When I run the line that passes data I get: NameError: name 'data' is not defined

I upgraded my llama-index version to v0.11.10 and tried the second suggestion, but received the following: ValueError: Key 'data' not found in Context