Azure-Samples / ai-rag-chat-evaluator

Tools for evaluation of RAG Chat Apps using Azure AI Evaluate SDK and OpenAI
MIT License

Review tool errors "No such file or directory: 'my_results/experiment1705604697/parameters.json'" #26

Closed diberry closed 5 months ago

diberry commented 5 months ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Create evaluations in the my_results directory.
  2. Run python3 -m review_tools summary my_results
  3. Notice that the error looks for a different file than the one the tool created: parameters.json versus evaluate_parameters.json.
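
A quick way to confirm the mismatch is to list which of the two file names each experiment folder actually contains. This is a minimal sketch, not part of the repository: the helper name find_parameter_files is hypothetical, and the two file names are the ones mentioned in the steps above.

```python
from pathlib import Path

# File names taken from this report: the one the summary tool opens
# and the one the evaluation run wrote.
CANDIDATES = ("parameters.json", "evaluate_parameters.json")


def find_parameter_files(results_dir):
    """Map each experiment folder name to the candidate files it contains.

    Hypothetical diagnostic helper; it only inspects the file system.
    """
    found = {}
    for folder in sorted(Path(results_dir).iterdir()):
        if folder.is_dir():
            found[folder.name] = [
                name for name in CANDIDATES if (folder / name).is_file()
            ]
    return found
```

Calling find_parameter_files("my_results") and printing the result would show, per experiment folder, whether parameters.json, evaluate_parameters.json, both, or neither is present.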

Any log messages given by the failure

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /workspaces/ai-rag-chat-evaluator/review_tools/cli.py:20 in summary                              │
│                                                                                                  │
│   17                                                                                             │
│   18 @app.command()                                                                              │
│   19 def summary(results_dir: Path = typer.Argument(exists=True, dir_okay=True, file_okay=Fal    │
│ ❱ 20 │   summary_app.main(results_dir)                                                           │
│   21                                                                                             │
│   22                                                                                             │
│   23 def cli():                                                                                  │
│                                                                                                  │
│ ╭─────────────── locals ────────────────╮                                                        │
│ │ results_dir = PosixPath('my_results') │                                                        │
│ ╰───────────────────────────────────────╯                                                        │
│                                                                                                  │
│ /workspaces/ai-rag-chat-evaluator/review_tools/summary_app.py:78 in main                         │
│                                                                                                  │
│   75                                                                                             │
│   76                                                                                             │
│   77 def main(directory: Path):                                                                  │
│ ❱ 78 │   app = TableApp(directory)                                                               │
│   79 │   app.run()                                                                               │
│   80                                                                                             │
│                                                                                                  │
│ ╭────────────── locals ───────────────╮                                                          │
│ │ directory = PosixPath('my_results') │                                                          │
│ ╰─────────────────────────────────────╯                                                          │
│                                                                                                  │
│ /workspaces/ai-rag-chat-evaluator/review_tools/summary_app.py:59 in __init__                     │
│                                                                                                  │
│   56 │   │   │   │   │   │   summary.get("answer_length", {}).get("mean", "Unknown"),            │
│   57 │   │   │   │   │   )                                                                       │
│   58 │   │   │   │   )                                                                           │
│ ❱ 59 │   │   │   with open(Path(results_dir) / folder / "parameters.json") as f:                 │
│   60 │   │   │   │   self.row_parameters[folder] = json.load(f)                                  │
│   61 │                                                                                           │
│   62 │   def compose(self) -> ComposeResult:                                                     │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │     citation = 1.0                                                                           │ │
│ │    coherence = {'mean_rating': 5.0, 'pass_count': 14, 'pass_rate': 1.0}                      │ │
│ │            f = <_io.TextIOWrapper name='my_results/experiment1705604697/summary.json'        │ │
│ │                mode='r' encoding='UTF-8'>                                                    │ │
│ │       folder = 'experiment1705604697'                                                        │ │
│ │      folders = ['experiment1705604697', 'experiment1705605065', 'experiment1705605215']      │ │
│ │ groundedness = {'mean_rating': 5.0, 'pass_count': 14, 'pass_rate': 1.0}                      │ │
│ │    relevance = {'mean_rating': 5.0, 'pass_count': 14, 'pass_rate': 1.0}                      │ │
│ │  results_dir = PosixPath('my_results')                                                       │ │
│ │         self = TableApp(title='TableApp', classes={'-dark-mode'})                            │ │
│ │      summary = {                                                                             │ │
│ │                │   'gpt_coherence': {                                                        │ │
│ │                │   │   'mean_rating': 5.0,                                                   │ │
│ │                │   │   'pass_count': 14,                                                     │ │
│ │                │   │   'pass_rate': 1.0                                                      │ │
│ │                │   },                                                                        │ │
│ │                │   'gpt_relevance': {                                                        │ │
│ │                │   │   'mean_rating': 5.0,                                                   │ │
│ │                │   │   'pass_count': 14,                                                     │ │
│ │                │   │   'pass_rate': 1.0                                                      │ │
│ │                │   },                                                                        │ │
│ │                │   'gpt_groundedness': {                                                     │ │
│ │                │   │   'mean_rating': 5.0,                                                   │ │
│ │                │   │   'pass_count': 14,                                                     │ │
│ │                │   │   'pass_rate': 1.0                                                      │ │
│ │                │   },                                                                        │ │
│ │                │   'answer_length': {                                                        │ │
│ │                │   │   'total': 22932,                                                       │ │
│ │                │   │   'mean': 1638.0,                                                       │ │
│ │                │   │   'max': 2615,                                                          │ │
│ │                │   │   'min': 705                                                            │ │
│ │                │   },                                                                        │ │
│ │                │   'answer_has_citation': {'total': 14, 'rate': 1.0}                         │ │
│ │                }                                                                             │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: 'my_results/experiment1705604697/parameters.json'
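
The traceback shows summary_app.py line 59 opening parameters.json unconditionally. As a local workaround, the lookup could try both file names and fall back to an empty dict. This is a hedged sketch only: load_parameters and PARAMETER_FILENAMES are hypothetical names, and this is not necessarily how the repository resolved the problem.

```python
import json
from pathlib import Path

# Assumption: either name may exist, depending on which script wrote the folder.
PARAMETER_FILENAMES = ("parameters.json", "evaluate_parameters.json")


def load_parameters(results_dir, folder):
    """Return the parameters for one experiment folder.

    Tries each known file name in order and returns the first one found.
    Returns an empty dict when neither file exists, instead of raising
    FileNotFoundError.
    """
    folder_path = Path(results_dir) / folder
    for name in PARAMETER_FILENAMES:
        candidate = folder_path / name
        if candidate.is_file():
            with open(candidate, encoding="utf-8") as f:
                return json.load(f)
    return {}
```

In the code shown in the traceback, the assignment to self.row_parameters[folder] could then use load_parameters(results_dir, folder) in place of the direct open call.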

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

@pamelafox - love the new output for errors - much easier to figure out.

pamelafox commented 5 months ago

Oh oops, I'll rename it in one of the places (either the existing examples or the evaluate script).

That's actually the output from any Textual app; I think the errors come from Rich. They're really pretty, though sometimes I think they're overkill, with too many levels.

diberry commented 5 months ago

@pamelafox can I PR this fix in? I need it for the doc.

pamelafox commented 5 months ago

Merged fix in #30