Arize-ai / phoenix

AI Observability & Evaluation
https://docs.arize.com/phoenix

[ENHANCEMENT] Checkpoint (persist) evals results #2149

Closed: trevor-laviale-arize closed this issue 3 weeks ago

trevor-laviale-arize commented 6 months ago

Is your feature request related to a problem? Please describe.
When running evals across a large dataframe, it would be useful to checkpoint (persist) the results as they are produced, so that if something goes wrong midway (e.g., the connection is closed or an error is raised) you don't have to rerun the evals across the entire dataframe.

Describe the solution you'd like
Add native functionality for checkpointing evals results when calling llm_classify.

Describe alternatives you've considered
Using the snippet below:

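# Assumes input_df, model, and rails are already defined, and that llm_classify
# and RAG_RELEVANCY_PROMPT_TEMPLATE have already been imported from Phoenix.

# Initialize an empty result DataFrame that shares input_df's index
result = input_df.loc[:, []]
result[["label", "explanation"]] = None

# Classify only the rows that do not have a label yet
missing = result.label.isna()
result[missing] = llm_classify(
    dataframe=input_df[missing],
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    model=model,
    rails=rails,
    provide_explanation=True,
)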
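
The workaround above still loses everything if the single llm_classify call fails partway through. One possible way to get the requested behavior without native support is to evaluate the dataframe in chunks and write the accumulated results to disk after each chunk. The following is only a minimal sketch, not Phoenix functionality: classify_with_checkpoints, classify_fn, checkpoint_path, and chunk_size are hypothetical names, and classify_fn stands in for a thin wrapper around llm_classify. It assumes the wrapper returns "label" and "explanation" columns in the same row order as its input, that input_df has a unique index, and that the same input_df is passed on every rerun.

```python
from pathlib import Path
from typing import Callable

import pandas as pd


def classify_with_checkpoints(
    input_df: pd.DataFrame,
    classify_fn: Callable[[pd.DataFrame], pd.DataFrame],
    checkpoint_path: str,
    chunk_size: int = 100,
) -> pd.DataFrame:
    """Run classify_fn chunk by chunk, saving results to disk after each chunk.

    classify_fn is a hypothetical wrapper around the real eval call, e.g. one
    that forwards its dataframe argument to llm_classify.
    """
    columns = ["label", "explanation"]
    path = Path(checkpoint_path)
    if path.exists():
        # Resume from whatever a previous (possibly interrupted) run saved
        result = pd.read_pickle(path)
    else:
        result = pd.DataFrame(index=input_df.index, columns=columns)

    # Only rows without a label still need to be evaluated
    pending = result.index[result["label"].isna()]
    for start in range(0, len(pending), chunk_size):
        chunk_index = pending[start : start + chunk_size]
        chunk_result = classify_fn(input_df.loc[chunk_index])
        # Positional assignment: assumes rows come back in input order
        result.loc[chunk_index, columns] = chunk_result[columns].to_numpy()
        # Persist after every chunk so a failure loses at most one chunk
        result.to_pickle(path)
    return result
```

Rerunning the same call with the same checkpoint_path after a failure skips the rows that already have a label. Rows for which the eval legitimately returned no label would be retried as well, which may or may not be what you want.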
dosubot[bot] commented 6 months ago

🤖

Hey @trevor-laviale-arize, good to bump into you again here in the digital realm. I see you've come up with an enhancement idea for Arize Phoenix. Hang tight, I'm diving into this one and will get back to you shortly with a more comprehensive response.

dosubot[bot] commented 6 months ago

🤖

It seems like you've got this under control. If you want help or have specific questions, let me know what I can do for you!