Keyword Matching - Githubissues

This PR introduces a new flag in the evaluate command called eval-mode. It adds support for keyword matching, where we try to match rows that have text content which has at least 3 rows.

You can run the text search with:

rag-app evaluate from-jsonl --input-file-path ./output.jsonl --db-path ./db --table-name pg --eval-mode fts

When the LLM generates bad keywords, we get far fewer rows (fewer than 25), since our query only returns rows that have at least one match. As a result, some queries return 0 or 1 rows, which gives an NDCG of 'N/A' (0 rows) or of either 0 or 1 (1 row).
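To make the behaviour above concrete, here is a minimal sketch of "at least one match" keyword filtering. It is not the query used in this PR: the function name `keyword_search`, the `chunk_id` and `text` field names, the case-insensitive substring matching, and the ranking by number of matched keywords are all assumptions for illustration. The default limit of 25 mirrors the retrieved_size seen in the results.

```python
from typing import Dict, List


def keyword_search(
    rows: List[Dict[str, str]], keywords: List[str], limit: int = 25
) -> List[Dict[str, str]]:
    """Return rows whose text contains at least one keyword (hypothetical helper).

    Rows are ranked by how many distinct keywords they contain. Rows with
    no matching keyword are dropped, so the result can be shorter than
    `limit`, or empty.
    """
    lowered = [kw.lower() for kw in keywords if kw.strip()]
    scored = []
    for row in rows:
        text = row["text"].lower()
        hits = sum(1 for kw in lowered if kw in text)
        if hits > 0:  # only keep rows with at least one match
            scored.append((hits, row))
    # Stable sort: rows with equal hit counts keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:limit]]


if __name__ == "__main__":
    rows = [
        {"chunk_id": "a", "text": "LanceDB supports full text search"},
        {"chunk_id": "b", "text": "Embedding search uses vectors"},
    ]
    # Good keywords return matching rows; bad keywords return nothing at all.
    print(keyword_search(rows, ["lancedb"]))
    print(keyword_search(rows, ["unrelated"]))
```

With poorly chosen keywords the filter can discard every row, which is how a query ends up with 0 or 1 retrieved rows instead of 25.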

I'm not sure if this is the behaviour we want, so I will explore other queries.

	:rocket: This PR description was created by Ellipsis for commit 184dcaf9c9f424c195672d97aa51ca0cbd156029.

Summary:

This PR introduces a new fts mode to the evaluate command in rag_app for keyword matching, updates the output.jsonl file, and modifies the calculate_ndcg function in rag_app/src/metrics.py to handle cases with 0 or 1 predictions.

Key points:

Introduced a new flag eval-mode in the evaluate command of rag_app to support keyword matching.
Added fts (Full Text Search) mode that generates keywords for questions and matches chunks with these keywords.
Updated output.jsonl file.
Modified calculate_ndcg function in rag_app/src/metrics.py to handle cases with 0 or 1 predictions.
Added uncertainty about the behavior of the new mode and plans to explore other queries.
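The handling of 0 or 1 predictions mentioned above can be sketched as follows. This is not the actual calculate_ndcg from rag_app/src/metrics.py: the function names `ndcg_at_k` and `reciprocal_rank_at_k`, the use of `None` for 'N/A', and the assumption of exactly one relevant chunk per question with binary relevance are all illustrative. Under that assumption the ideal DCG is 1, so NDCG@k reduces to 1 / log2(rank + 1) when the relevant chunk appears within the top k.

```python
import math
from typing import Optional, Sequence


def reciprocal_rank_at_k(predictions: Sequence[str], target: str, k: int) -> float:
    """Reciprocal rank of `target` within the top-k predictions, else 0.0."""
    for rank, chunk_id in enumerate(predictions[:k], start=1):
        if chunk_id == target:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(predictions: Sequence[str], target: str, k: int) -> Optional[float]:
    """NDCG@k for a single relevant chunk with binary relevance (hypothetical).

    Returns None (reported as 'N/A') when nothing was retrieved. With a
    single prediction the result is either 0.0 or 1.0.
    """
    if len(predictions) == 0:
        return None  # nothing retrieved, so the metric is undefined
    for rank, chunk_id in enumerate(predictions[:k], start=1):
        if chunk_id == target:
            # DCG = 1 / log2(rank + 1); the ideal DCG is 1 / log2(2) = 1.
            return 1.0 / math.log2(rank + 1)
    return 0.0


if __name__ == "__main__":
    print(ndcg_at_k([], "a", 3))          # no rows retrieved
    print(ndcg_at_k(["b"], "a", 3))       # one row, not the relevant chunk
    print(ndcg_at_k(["a"], "a", 3))       # one row, the relevant chunk
    print(ndcg_at_k(["b", "a"], "a", 3))  # relevant chunk at rank 2
```

Returning None instead of 0 for an empty result keeps "nothing retrieved" distinguishable from "retrieved but wrong", though it also means the mean NDCG has to decide whether to skip those queries or count them as 0.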


Seems like the Python lib also supports FTS using an index through Tantivy (https://lancedb.github.io/lancedb/fts/#index-multiple-columns)

Will look into this

Current FTS results

                                  MRR@3  MRR@5  MRR@10  MRR@20 NDCG@3 NDCG@5 NDCG@10 NDCG@20  retrieved_size
chunk_id                                                                                                    
09369eb77f4c743034d01d13744faf6d    1.0    1.0     1.0    1.00    1.0    1.0     1.0     1.0               7
b6bb3ecf14a22bc3144b3c8bb101a1e8    1.0    1.0     1.0    1.00    1.0    1.0     1.0     1.0              25
65d0dc53f68d4e5a9b95e6268673cc09    1.0    1.0     1.0    1.00    1.0    1.0     1.0     1.0              25
f8270d09dace076cb5dbd79309d08fa4    0.0    0.0     0.0    0.05    0.0    0.0     0.0    0.23              25
de494946e19340c348b991e5845cd8c4    1.0    1.0     1.0    1.00    1.0    1.0     1.0     1.0               4
afc4b95c2537f788fe5711c9835b58bf    0.0    0.0     0.0    0.00      0      0       0       0               1
66288d8fe7e8f36b7f4c2bf4d5af7b18    1.0    1.0     1.0    1.00    1.0    1.0     1.0     1.0               9
3bb3e61a043cd396acc7669d021ab532    0.0    0.0     0.0    0.00    N/A    N/A     N/A     N/A               0
ddd2c319f6b2bae1f9583b497bc615e4    0.0    0.0     0.0    0.00      0      0       0       0               1
965e5abc2b1409f099518c51d13d7a5a    0.0    0.0     0.0    0.00      0      0       0       0               1

              Mean Values              
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Metric         ┃ Value              ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ MRR@3          │ 0.5                │
│ MRR@5          │ 0.5                │
│ MRR@10         │ 0.5                │
│ MRR@20         │ 0.505              │
│ NDCG@3         │ 0.5555555555555556 │
│ NDCG@5         │ 0.5555555555555556 │
│ NDCG@10        │ 0.5555555555555556 │
│ NDCG@20        │ 0.5811111111111111 │
│ retrieved_size │ 9.8                │

These are the corresponding semantic search results

                                  MRR@3  MRR@5  MRR@10  MRR@20  NDCG@3  NDCG@5  NDCG@10  NDCG@20  retrieved_size
chunk_id                                                                                                        
09369eb77f4c743034d01d13744faf6d    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
b6bb3ecf14a22bc3144b3c8bb101a1e8    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
65d0dc53f68d4e5a9b95e6268673cc09    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
f8270d09dace076cb5dbd79309d08fa4    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
de494946e19340c348b991e5845cd8c4    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
afc4b95c2537f788fe5711c9835b58bf    0.5    0.5     0.5     0.5    0.63    0.63     0.63     0.63              25
66288d8fe7e8f36b7f4c2bf4d5af7b18    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
3bb3e61a043cd396acc7669d021ab532    1.0    1.0     1.0     1.0    1.00    1.00     1.00     1.00              25
ddd2c319f6b2bae1f9583b497bc615e4    0.0    0.2     0.2     0.2    0.00    0.39     0.39     0.39              25
965e5abc2b1409f099518c51d13d7a5a    0.5    0.5     0.5     0.5    0.63    0.63     0.63     0.63              25

       Mean Values        
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric         ┃ Value ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ MRR@3          │ 0.8   │
│ MRR@5          │ 0.82  │
│ MRR@10         │ 0.82  │
│ MRR@20         │ 0.82  │
│ NDCG@3         │ 0.826 │
│ NDCG@5         │ 0.865 │
│ NDCG@10        │ 0.865 │
│ NDCG@20        │ 0.865 │
│ retrieved_size │ 25.0  │

Implemented the BM25 search! It's a bit slower than embedding search but performs quite well tbh

       Mean Values        
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric         ┃ Value ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ MRR@3          │ 0.8   │
│ MRR@5          │ 0.81  │
│ MRR@10         │ 0.81  │
│ MRR@20         │ 0.82  │
│ NDCG@3         │ 0.83  │
│ NDCG@5         │ 0.85  │
│ NDCG@10        │ 0.85  │
│ NDCG@20        │ 0.86  │
│ retrieved_size │ 25.0  │
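For readers unfamiliar with BM25, here is a rough, self-contained sketch of Okapi BM25 scoring. It is not the implementation added in this PR: the function names, the whitespace tokenizer, and the parameter defaults k1 = 1.5 and b = 0.75 are assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Naive lowercase whitespace tokenizer (an assumption for this sketch)."""
    return text.lower().split()


def bm25_scores(
    query: str, docs: List[str], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq: Counter = Counter()
    for doc in tokenized:
        doc_freq.update(set(doc))

    query_terms = tokenize(query)
    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term-frequency saturation with document length normalization.
            denom = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "dogs chase cats", "a quick brown fox"]
    scores = bm25_scores("cat mat", docs)
    # Rank document indices from best to worst score.
    ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    print(scores, ranking)
```

Because every document receives a score, including 0 for documents with no matching term, a ranking like this can always be cut to a fixed number of results rather than shrinking when the keywords are poor.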

jxnl / n-levels-of-rag

Keyword Matching #12

Summary: