Chat App doesn't give correct answer

barbarian23 commented 2 months ago

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy ChatApp
Prepare an Excel file containing the population by year for a list of cities, with data like:

| City | Population (2011) | Population (2001) |
| --- | --- | --- |
| Mumbai | 12,442,373 | 11,978,450 |
| Delhi | 11,034,555 | 9,879,172 |
| Bangalore | 8,443,675 | 5,682,293 |
| Hyderabad | 6,993,262 | 5,496,960 |

Index the documents with the command:
```
scripts/prepdocs.ps1
```

Ask ChatApp some questions, such as:

Population of Mumbai
Which cities has population than 9,000,000 in 2011?

For some questions, such as "Which cities has population than 9,000,000 in 2011?", the answer includes cities whose population is less than 9,000,000, as in the response below:


Based on the information provided in the city_india.pdf file, the cities in India with a population of more than 9,000,000 in 2011 are:

Mumbai: 12,442,373 (correct answer)
Delhi: 11,034,555 (correct answer)
Bangalore: 8,443,675 (wrong answer)

Any log messages given by the failure

Expected/desired behavior

I need ChatApp to give the correct answer.

OS and Version?

Windows 10

azd version?

1.7.0

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

pamelafox commented 2 months ago

This repository is currently optimized for RAG on documents with fairly unstructured queries, like the example ones. It looks like you are hoping to be able to ask very structured queries, with strict constraints like comparisons and numeric quantities. In that case, I think you'll want to use a different approach, such as:

1) Asking the LLM to generate Python code (like pandas code) and executing that. You'd need to be careful about executing arbitrary code, however.
2) Asking the LLM to generate search queries with filters, and storing the data in an Azure AI Search index with additional fields or in a database table with columns. You could even try storing it in a local read-only SQLite database for small amounts of data. That's similar to option 1, generating pandas code, but more secure, since you can limit how much a SQL query can do.
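As a rough illustration of the SQLite variant, here is a minimal sketch using only the Python standard library. The table name, column names, and the `run_generated_sql` helper are hypothetical, and the SQL string is hard-coded as a stand-in for what the LLM would generate. It assumes the spreadsheet rows have already been loaded into the table.

```python
import sqlite3

# Rows copied from the table in this issue.
# Table and column names below are illustrative assumptions.
ROWS = [
    ("Mumbai", 12_442_373, 11_978_450),
    ("Delhi", 11_034_555, 9_879_172),
    ("Bangalore", 8_443_675, 5_682_293),
    ("Hyderabad", 6_993_262, 5_496_960),
]


def build_database() -> sqlite3.Connection:
    """Load the rows into an in-memory database, then make it read-only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cities ("
        "city TEXT PRIMARY KEY, "
        "population_2011 INTEGER, "
        "population_2001 INTEGER)"
    )
    conn.executemany("INSERT INTO cities VALUES (?, ?, ?)", ROWS)
    conn.commit()
    # After this pragma, statements that would modify the database fail.
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Run a query that, in a real app, would come from the LLM."""
    return conn.execute(sql).fetchall()


conn = build_database()

# Stand-in for LLM output; a real app would ask the model to write this.
generated_sql = (
    "SELECT city, population_2011 FROM cities "
    "WHERE population_2011 > 9000000 "
    "ORDER BY population_2011 DESC"
)
rows = run_generated_sql(conn, generated_sql)
print(rows)
```

Because the comparison is done by the database rather than by the model, Bangalore cannot slip into the result. The `query_only` pragma is one way to limit what a generated query can do; opening a database file in read-only mode would be another.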

For both those options, you'd likely want to use OpenAI function calling, which we already use to some extent for query rewriting. I have more details on how to use function calling in this blog post: https://blog.pamelafox.org/2024/03/rag-techniques-using-function-calling.html
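To make the function calling idea concrete, here is a hedged sketch of what a tool definition and its local handler could look like. The function name `filter_cities_by_population`, its parameters, and the `handle_tool_call` dispatcher are hypothetical and are not taken from this repository. The arguments string is hard-coded in place of a real model response, so no API call is made.

```python
import json

# Data from the table in this issue, keyed by city and then census year.
POPULATION = {
    "Mumbai": {2011: 12_442_373, 2001: 11_978_450},
    "Delhi": {2011: 11_034_555, 2001: 9_879_172},
    "Bangalore": {2011: 8_443_675, 2001: 5_682_293},
    "Hyderabad": {2011: 6_993_262, 2001: 5_496_960},
}

# Tool definition in the shape used for OpenAI chat completions tools.
# The name and parameters are illustrative assumptions.
FILTER_TOOL = {
    "type": "function",
    "function": {
        "name": "filter_cities_by_population",
        "description": "List cities whose population in a year exceeds a threshold.",
        "parameters": {
            "type": "object",
            "properties": {
                "year": {"type": "integer", "enum": [2001, 2011]},
                "min_population": {
                    "type": "integer",
                    "description": "Exclusive lower bound on population.",
                },
            },
            "required": ["year", "min_population"],
        },
    },
}


def filter_cities_by_population(year: int, min_population: int) -> list:
    """Return matching city names, largest population first."""
    matching = [c for c, pops in POPULATION.items() if pops[year] > min_population]
    return sorted(matching, key=lambda c: POPULATION[c][year], reverse=True)


def handle_tool_call(name: str, arguments_json: str) -> list:
    """Dispatch a tool call: the model supplies a name and JSON arguments."""
    if name != FILTER_TOOL["function"]["name"]:
        raise ValueError(f"Unknown tool: {name}")
    return filter_cities_by_population(**json.loads(arguments_json))


# Stand-in for the arguments a model might return for the 2011 question.
example_arguments = '{"year": 2011, "min_population": 9000000}'
matches = handle_tool_call("filter_cities_by_population", example_arguments)
print(matches)
```

In this setup the model only has to extract the year and the threshold from the question; the numeric comparison itself runs in ordinary code.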

barbarian23 commented 2 months ago

Hello, thank you for your suggestion. It helps me understand more about LLMs. Could you help me ask the LLM to generate Python code (like pandas code)? I really need an example to follow.

Thank you.

evogelpohl commented 2 months ago

Dig into https://github.com/Sinaptik-AI/pandas-ai. It's one of the best question-to-Python/pandas solutions out there. Specifically, it builds a local .log file of everything sent to the LLM (the data plus the question) and what the LLM returns, so you can see exactly what's happening.
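For reference, the kind of pandas code a question-to-pandas tool would need to produce for the 2011 question looks roughly like the sketch below. It is hand-written for illustration, not output from pandas-ai, and it assumes the spreadsheet has already been loaded into a DataFrame with the column names from the issue.

```python
import pandas as pd

# Data from the table in this issue, typed in by hand for the sketch.
df = pd.DataFrame(
    {
        "City": ["Mumbai", "Delhi", "Bangalore", "Hyderabad"],
        "Population (2011)": [12_442_373, 11_034_555, 8_443_675, 6_993_262],
        "Population (2001)": [11_978_450, 9_879_172, 5_682_293, 5_496_960],
    }
)

# Keep only rows where the 2011 population is strictly above 9,000,000.
over_9m = df.loc[df["Population (2011)"] > 9_000_000, ["City", "Population (2011)"]]
print(over_9m.to_string(index=False))
```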
