run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

[Question]: Testing of Text to SQL #13317

Open sreenivasanm6 opened 1 month ago

sreenivasanm6 commented 1 month ago

Question

Hi Team,

I am looking for evaluation criteria for text-to-SQL conversion. I know LlamaIndex provides a few criteria for evaluating ordinary RAG pipelines, given an input and an output, as described here: https://docs.llamaindex.ai/en/stable/optimizing/evaluation/evaluation/

In the text-to-SQL case, the input is text (in plain English) and the output is an SQL command.

dosubot[bot] commented 1 month ago

For evaluating Text to SQL conversion tasks, consider these key criteria:

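  1. Accuracy of SQL Generation: The primary measure is whether the generated SQL reflects the intent of the natural language question. Two common ways to check this are comparing the generated query's result set with that of a reference query (execution accuracy) and comparing the query text itself with the reference (exact match).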

  2. Complex Query Handling: The system's capability to process complex queries involving multiple tables, joins, subqueries, and advanced functions is essential for real-world applicability.

  3. Security: It's crucial to ensure the system guards against SQL injection attacks and does not compromise database integrity or expose sensitive data.

  4. Performance: Evaluate the system's response time and efficiency in generating SQL queries from natural language inputs.

  5. Scalability: The system should effectively scale with increasing data sizes and query complexity, maintaining performance without degradation.

  6. Usability: For user-facing applications, the ease with which users can formulate queries and understand results is important.

  7. Adaptability: The system's flexibility in handling different SQL dialects, database schemas, and domain-specific languages is vital for broad application.

  8. Error Handling and Feedback: The ability to provide meaningful feedback for unprocessable queries or those yielding no results is crucial for enhancing user experience.

These criteria are foundational for assessing a Text to SQL conversion system's effectiveness and readiness for real-world scenarios.
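As a rough illustration of the first criterion, the sketch below scores a text-to-SQL component by execution accuracy: each predicted query is run against a test database and its rows are compared with those of a hand-written reference query. It uses only Python's standard `sqlite3` module. The `city` table, the example questions, and the `fake_generate_sql` stand-in are all hypothetical; they are not part of LlamaIndex, and in practice `generate_sql` would wrap whatever text-to-SQL component is under test.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run sql and return its rows as a multiset (row order ignored), or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)


def execution_accuracy(conn, examples, generate_sql):
    """Fraction of (question, gold_sql) pairs whose predicted SQL returns the gold rows."""
    if not examples:
        return 0.0
    correct = 0
    for question, gold_sql in examples:
        gold = run_query(conn, gold_sql)
        pred = run_query(conn, generate_sql(question))
        if gold is not None and pred == gold:
            correct += 1
    return correct / len(examples)


# Hypothetical test database, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO city VALUES (?, ?)",
    [("Toronto", 2930000), ("Tokyo", 13960000), ("Berlin", 3645000)],
)

# Each example pairs a question with a hand-written reference ("gold") query.
examples = [
    ("Which city has the largest population?",
     "SELECT name FROM city ORDER BY population DESC LIMIT 1"),
    ("How many cities are there?",
     "SELECT COUNT(*) FROM city"),
    ("List cities with more than 3 million people.",
     "SELECT name FROM city WHERE population > 3000000"),
]

# Stand-in for the system under test: canned predictions keyed by question.
canned = {
    examples[0][0]: "SELECT name FROM city WHERE population = "
                    "(SELECT MAX(population) FROM city)",  # different text, same rows
    examples[1][0]: "SELECT COUNT(name) FROM city",         # same rows
    examples[2][0]: "SELECT name FROM cities WHERE population > 3000000",  # bad table name
}


def fake_generate_sql(question):
    return canned[question]


score = execution_accuracy(conn, examples, fake_generate_sql)
print(f"execution accuracy: {score:.2f}")
```

Comparing result sets rather than query text means that two differently written but equivalent queries both count as correct, while a query that fails to execute counts as wrong. The trade-off is that a wrong query can occasionally return the right rows on a small test database, so the test data should be varied enough to expose such cases.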

#### Details

**Sources**

- [llama-index-core/llama_index/core/indices/struct_store/sql_query.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/indices/struct_store/sql_query.py)
- [docs/docs/examples/index_structs/struct_indices/SQLIndexDemo.ipynb](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/index_structs/struct_indices/SQLIndexDemo.ipynb)
- [docs/docs/examples/query_engine/SQLJoinQueryEngine.ipynb](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/query_engine/SQLJoinQueryEngine.ipynb)
- [llama-index-core/llama_index/core/query_engine/sql_join_query_engine.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/query_engine/sql_join_query_engine.py)
- [benchmarks/struct_indices/spider/evaluate.py](https://github.com/run-llama/llama_index/blob/main/benchmarks/struct_indices/spider/evaluate.py)
- [llama-index-legacy/llama_index/legacy/indices/struct_store/sql_query.py](https://github.com/run-llama/llama_index/blob/main/llama-index-legacy/llama_index/legacy/indices/struct_store/sql_query.py)
- [llama-index-core/llama_index/core/utilities/sql_wrapper.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/utilities/sql_wrapper.py)
- [llama-index-core/llama_index/core/query_engine/jsonalyze_query_engine.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/query_engine/jsonalyze_query_engine.py)
- [benchmarks/struct_indices/spider/README.md](https://github.com/run-llama/llama_index/blob/main/benchmarks/struct_indices/spider/README.md)
- [llama-index-core/llama_index/core/langchain_helpers/agents/tools.py](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/langchain_helpers/agents/tools.py)