Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License
5.88k stars, 4.03k forks

Chunking and Tokens - General Help #410

Open TroyHostetter opened 1 year ago

TroyHostetter commented 1 year ago


This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

General Question

The chunking code appears to split our PDFs into single-page chunks, and gpt-35-turbo can find the section or page we ask for. However, our References section is split across 3 chunks, and when we ask for all of the references, the model returns only about the first 2.5 of them. Should we be splitting the documents into smaller chunks? Should we increase the token count? Can you suggest which areas of the cookbook we should focus on? Lastly, would gpt-35-turbo-16k help, or maybe even gpt-4?
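To make the token question concrete, here is a minimal sketch of how one might check whether all three chunks fit into the prompt at all. It is not the sample's actual code: `estimate_tokens` and `fit_chunks` are hypothetical helpers, the 4-characters-per-token ratio is only a rough rule of thumb (a real tokenizer would be more accurate), and the context sizes (4,096 and 16,384 tokens) and the 1,024 tokens reserved for the answer are assumed values for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (assumption, not a real tokenizer)."""
    return (len(text) + 3) // 4  # ceiling division


def fit_chunks(chunks: list[str], context_limit: int, reserved_for_answer: int):
    """Keep chunks in order until the prompt budget is exhausted.

    Returns the chunks that fit and the estimated tokens they use.
    """
    budget = context_limit - reserved_for_answer
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # this chunk and everything after it would be dropped
        kept.append(chunk)
        used += cost
    return kept, used


# Hypothetical stand-ins for three page-sized chunks of a References section.
reference_chunks = ["x" * 6000, "y" * 6000, "z" * 6000]

# Assumed context sizes, with 1,024 tokens reserved for the model's answer.
for name, limit in [("4k context", 4096), ("16k context", 16384)]:
    kept, used = fit_chunks(reference_chunks, limit, reserved_for_answer=1024)
    print(f"{name}: {len(kept)} of {len(reference_chunks)} chunks fit (~{used} tokens)")
```

If a check like this shows the third chunk being dropped, smaller chunks or a larger context window may help. If all three fit, the limit on the length of the response is another place to look.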

OS and Version?

Windows 11

azd version?

azd version 1.0.2 (commit 145e046b1ea9394bd4e1b1d539eb32e860d692fb)



github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.