ai-evaluation Search Results

manisnesan/til #98

Inspect ai evaluation

https://ukgovernmentbeis.github.io/inspect_ai/ via [Hamel post](https://x.com/hamelhusain/status/1788936691576975382?s=46&t=aOEVGBVv9ICQLUYL4fQHlQ)

manisnesan updated 6 months ago

elastic/kibana #200064

[Observability AI Assistant] [Root Cause Analysis] Create an…

As we begin to evaluate LLM-assisted root cause analysis, we need a way to evaluate the validity and usefulness of the results. Historically, our process for evaluating these results has …

dominiqueclarke updated 3 days ago

jacob-ai-bot/jacob #313

Implement 'Evaluation' Component in TodoDetails to Display …

## Description Enhance the `TodoDetails` page within the dashboard by introducing a new `Evaluation` component. This component will display data generated from `evaluateIssue.ts`, providing users wit…

kleneway updated 3 days ago

ErinvanderVeen/OOOO #19

Evaluation of AI

We should see if there is a good way to evaluate the chess engine.

ErinvanderVeen updated 5 years ago

jacob-ai-bot/jacob #322

Enhance Evaluation Component UI in Todo Section of Dashboar…

## Description The evaluation component within the todo section of the dashboard currently provides the necessary functionality but lacks visual polish. To improve user engagement and make the compon…

kleneway updated 2 days ago

All-Hands-AI/OpenHands #4042

[Bug]: While running remote runtime evaluation, I encountere…

### Is there an existing issue for the same bug? - [X] I have checked the troubleshooting document at https://docs.all-hands.dev/modules/usage/troubleshooting - [X] I have checked the existing issues…

aglassoforange updated 2 weeks ago

run-llama/llama_index #16735

[Documentation]: Unexpected keyword argument 'base_query_eng…

### Documentation Issue Description I encountered an error when trying to use RetryGuidelineQueryEngine in my code with the keyword argument base_query_engine. The error message is as follows: __ini…

MatMaxMatrix updated 1 week ago

microsoft/teams-ai #2119

[Feature Request]: Add observability to model input and outp…

### Scenario It is important to add **observability** to an AI bot built with teams-ai SDK since the AI may have non-deterministic behaviors. Currently it is quite hard to evaluate against an AI bot …

dooriya updated 2 weeks ago

serratus-bio/open-virome #142

[LLM] Define a falsifiable, measurable hypothesis

### Task 3: Define a falsifiable, measurable hypothesis. > Our first hypothesis questions the validity of using an AI model for querying a database > at all, and whether an LLM can effectively retrie…

ababaian updated 3 days ago

angrycaptain19/Tic-tac-toe-2 #25

Computer move decisions creates an infinite loop and freezes

Anytime I play against a computer it gets stuck in the "thinking" state and the page freezes. Need to fix to allow the computer to correctly calculate and make its move. We also need to correc…

angrycaptain19 updated 1 week ago

1000+ results for ai-evaluation
