wandb / weave

Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
https://wandb.me/weave
Apache License 2.0
659 stars, 49 forks

Model `predict` input format changed in `v0.50.15` #2255

Open naingthet opened 2 weeks ago

naingthet commented 2 weeks ago

I had code in weave v0.50.14 that successfully passed data into models through the Evaluation class. The dataset rows were passed into the model's predict method as dict-like objects, allowing usage like so:

model = FindingsSearchModel(...)
evaluation = weave.Evaluation(
    name=f"{finding_type}-evaluation", dataset=payloads, scorers=[LLMJudge(...)]
)
results = await evaluation.evaluate(model)

Where model looks like:

class Model(weave.Model):
    @weave.op()
    def predict(self, query: dict[str, Any]) -> QueryResult:
        arg1 = query["arg1"] # This works in v0.50.14
        ...

However, in v0.50.15, this code no longer works and I receive an error message indicating that the query is now of type BoxedStr. I'm not sure how to handle this since it does not appear to be a dict-like or object-like type. Any help would be greatly appreciated.

jamie-rasmussen commented 2 weeks ago

Thank you for reporting this, we're investigating this regression. Downgrading to v0.50.14 would be the best short term workaround.
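If you pin the older release (for example with `pip install weave==0.50.14`), a small check like the one below can confirm which version is actually installed in the environment. This is only a sketch: the helper names `installed_weave_version` and `matches_pin` are made up for illustration, and it assumes the package is distributed under the name `weave`.

```python
from importlib import metadata
from typing import Optional

# Version suggested as the short-term workaround in this thread.
PINNED_VERSION = "0.50.14"


def installed_weave_version() -> Optional[str]:
    """Return the installed weave version, or None if weave is not installed."""
    try:
        return metadata.version("weave")
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed: Optional[str], pinned: str = PINNED_VERSION) -> bool:
    """True only when the installed version string equals the pinned one."""
    return installed is not None and installed.strip() == pinned


current = installed_weave_version()
if not matches_pin(current):
    print(f"weave version is {current!r}, expected {PINNED_VERSION!r}")
```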

tssweeney commented 2 weeks ago

@naingthet can you share the shape of payloads?

naingthet commented 2 weeks ago

> @naingthet can you share the shape of payloads?

Thanks for investigating. Loving weave so far and excited to see where it goes!

payloads is of type list[dict[str, Any]].
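In case a more detailed shape is useful, here is a rough sketch of how the keys and value types of each row could be summarized. The helper `describe_shape` and the `example_payloads` data are hypothetical and stand in for the real payloads, which are not shown in this thread.

```python
from typing import Any


def describe_shape(payloads: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each key seen in the rows to the set of value type names found under it."""
    shape: dict[str, set[str]] = {}
    for row in payloads:
        for key, value in row.items():
            shape.setdefault(key, set()).add(type(value).__name__)
    return shape


# Hypothetical data: one row holds a nested dict, the other a plain string.
example_payloads = [
    {"query": {"arg1": 1}},
    {"query": "plain text"},
]

shape = describe_shape(example_payloads)
print(shape)
```

A key whose values are `str` rather than `dict` would be consistent with predict receiving a string-like object instead of something subscriptable by key.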

andrewtruong commented 2 weeks ago

Hey @naingthet, does this repro in 0.51.0? If yes, can you help me repro this? I've tried a few cases but can't seem to get the behaviour you're seeing. Here's a minimal example that seems to work fine:

from typing import Any
import weave

class Model(weave.Model):
    @weave.op()
    def predict(self, query: dict[str, Any]):
        arg1 = query["arg1"]
        return arg1 + 1

m = Model()

payloads = [
    {"query": {"arg1": 1}},
    {"query": {"arg1": 2}},
    {"query": {"arg1": 3}},
    {"query": {"arg1": 4}},
    {"query": {"arg1": 5}},
]

evaluation = weave.Evaluation(name="evaluation", dataset=payloads)
# Top-level await works in a notebook; in a script, wrap the call in asyncio.run(...).
res = await evaluation.evaluate(m)
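The example above works because each row has a `query` key whose value is itself a dict, and that key lines up with the `query` parameter of predict. The sketch below imitates that kind of key-to-parameter matching in plain Python to show how the row shape changes what predict receives. It is not weave's implementation: `call_with_row` is a hypothetical helper, and the flat row is an assumed shape, not the reporter's actual data.

```python
import inspect
from typing import Any, Callable


def call_with_row(fn: Callable[..., Any], row: dict[str, Any]) -> Any:
    """Call fn with only the row entries whose keys match fn's parameter names."""
    params = inspect.signature(fn).parameters
    kwargs = {name: row[name] for name in params if name in row}
    return fn(**kwargs)


def predict(query: dict[str, Any]) -> int:
    # Expects the value under "query" to be dict-like.
    return query["arg1"] + 1


nested_row = {"query": {"arg1": 1}}  # same shape as the rows in the example above
flat_row = {"query": "some text"}    # assumed shape: "query" holds a plain string

result = call_with_row(predict, nested_row)  # predict receives {"arg1": 1}

try:
    call_with_row(predict, flat_row)  # predict receives a str, so ["arg1"] fails
    flat_error = None
except TypeError as exc:
    flat_error = exc
```

If the real payloads look more like `flat_row`, the value arriving in predict would be string-like, which would fit the BoxedStr error described in the original report.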