Dynamically add table aliases without an LLM + de-duplicate columns in pandas

This automatically generates relevant table aliases and appends them to the prompt, which moves the work of creating table aliases away from the LLM. We may have to retrain our LLM on prompts of this form so that it handles a more varied range of inputs.
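As a rough illustration of the idea (not the code in this PR), aliases can be derived deterministically from the table names and appended to the prompt as one extra instruction. The function names `generate_aliases` and `append_aliases`, the initials-based naming rule, and the wording of the instruction are all assumptions made for this sketch.

```python
import re


def generate_aliases(table_names):
    """Map each table name to a short alias built from word initials.

    Hypothetical helper: the naming rule here is an assumption for
    illustration, not necessarily the rule used in this PR.
    """
    aliases = {}
    used = set()
    for name in table_names:
        # Drop any schema prefix, e.g. "public.orders" -> "orders".
        base = name.split(".")[-1]
        # Split on underscores and other non-word characters.
        words = [w for w in re.split(r"[_\W]+", base.lower()) if w]
        alias = "".join(w[0] for w in words) or "t"
        # Resolve collisions by appending a numeric suffix.
        candidate = alias
        suffix = 2
        while candidate in used:
            candidate = f"{alias}{suffix}"
            suffix += 1
        used.add(candidate)
        aliases[name] = candidate
    return aliases


def append_aliases(prompt, table_names):
    """Append a table-alias instruction to the end of a prompt string."""
    aliases = generate_aliases(table_names)
    pairs = ", ".join(f"{table} AS {alias}" for table, alias in aliases.items())
    return prompt.rstrip() + "\n\nUse these table aliases: " + pairs


if __name__ == "__main__":
    tables = ["customers", "customer_orders", "order_items", "companies"]
    print(append_aliases("Generate a SQL query for the question.", tables))
```

Because the aliases depend only on the table names, the same schema always yields the same aliases. A fuller version would likely also need to skip aliases that collide with SQL reserved words (for example `or` or `as`), which this sketch does not do.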

Here's an example of how to run this:

python main.py \
-db postgres \
-q "data/questions_gen_postgres.csv" "data/instruct_basic_postgres.csv" "data/instruct_advanced_postgres.csv" "data/idk.csv" \
-o results/classic_new_reprompt.csv results/basic_new_reprompt.csv results/advanced_new_reprompt.csv results/idk_new_reprompt.csv \
-g api \
-b 1 \
-f prompts/prompt_cot.md \
--api_url "YOUR_API_ENDPOINT" \
--api_type "vllm" \
-p 10 \
-c 0 --logprobs --cot_table_alias

defog-ai / sql-eval #155