Open TeoZosa opened 1 week ago
Hi @TeoZosa, I can raise this to the team tomorrow.
@jwlee64 sounds good, thanks for the prompt reply 👍
FWIW for feedback, I'm working around the issue by vendoring the code with this change:
-    @weave.op()
+    @weave.op(postprocess_output=lambda output: output[0])
     async def evaluate(self, model: Union[Callable, Model]) -> dict:
         eval_results = await self.get_eval_results(model)
         summary = await self.summarize(eval_results)
         print("Evaluation summary", summary)
-        return summary
+        return summary, eval_results
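To make the intent of that diff concrete, here is a minimal, self-contained sketch of the pattern: the caller receives the full `(summary, eval_results)` tuple while only the first element is recorded. It does not import Weave. `traced_op` and `LOGGED_OUTPUTS` are hypothetical stand-ins, and the sketch assumes `postprocess_output` changes only the recorded output and not the value returned to the caller, which is what the workaround relies on.

```python
import asyncio
import functools

# Hypothetical stand-in for a tracing backend; not part of Weave.
LOGGED_OUTPUTS = []


def traced_op(postprocess_output=None):
    """Hypothetical tracing decorator that mimics the assumed behaviour."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            output = await fn(*args, **kwargs)
            # Only the recorded copy is post-processed;
            # the caller still receives the raw output.
            if postprocess_output is not None:
                logged = postprocess_output(output)
            else:
                logged = output
            LOGGED_OUTPUTS.append(logged)
            return output
        return wrapper
    return decorator


@traced_op(postprocess_output=lambda output: output[0])
async def evaluate(rows):
    # Toy per-row results: even inputs score 1.0, odd inputs score 0.0.
    eval_results = [{"input": r, "score": float(r % 2 == 0)} for r in rows]
    total = sum(r["score"] for r in eval_results)
    summary = {"mean_score": total / len(eval_results)}
    return summary, eval_results


summary, eval_results = asyncio.run(evaluate([1, 2, 3, 4]))
```

Under these assumptions, `summary` and `eval_results` are both available for downstream rendering, while `LOGGED_OUTPUTS` holds only the summary.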
Hi @TeoZosa, we are going to have someone tackle this in the next week or so, or at minimum document a better way to get the eval_results with the current API.
Got it. Thanks for the update @jwlee64, keep me posted! 🙏
For this code: https://github.com/wandb/weave/blob/3eea8df108274ff22e4a9a382c66cc1e54f7213d/weave/flow/eval.py#L493-L510
Only `summary` is returned. For our use case, we also want to grab `eval_results` for downstream rendering and storage (to present user-friendly results to non-technical stakeholders; the Weave UI is information overload for those folks). Would this change make sense?

As a workaround, I can call `get_eval_results` and `summarize` separately, but lose eval tracking in Evaluations since only `Evaluation.evaluate` Calls are picked up.