Azure-Samples / azure-search-openai-demo-csharp

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
MIT License

not only documents as source? #227

Closed PhilippTP closed 6 months ago

PhilippTP commented 8 months ago

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Mention any other details that might be useful

Right now, the default version only supports the uploaded documents. Is it possible to also get information from the normal ChatGPT database that is not included in my uploaded documents, so that it can provide the answers ChatGPT itself could give me?

If so, how do citations work in this scenario?



luisquintanilla commented 6 months ago

Hi @PhilippTP,

There is no ChatGPT database that I'm aware of. However, if you mean more generally the ability to tap into other data sources, you should be able to do that by querying the other data source and adding the relevant documents from its results to your prompt.

https://github.com/Azure-Samples/azure-search-openai-demo-csharp/blob/64a4bcd0d7fa4c64565727cd74577b466a378e6d/app/backend/Services/ReadRetrieveReadChatService.cs#L80

You might need to tweak your prompts as well to account for these other data sources.

Citations should work in a similar way as they do today, except now you'd be referencing the other data sources, not just the single Azure Search index shown in this application.
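
As a rough illustration of the idea above, here is a minimal, language-neutral sketch in Python (not the C# code from this repository). It assumes each data source returns passages that can be tagged with a source label and a document name. The `Passage` type, the `[source/name]` citation format, and the prompt wording are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical source label, e.g. "search-index" or "wiki"
    name: str    # document or page identifier used in the citation
    text: str    # passage content returned by that source


def build_context(passages: list[Passage]) -> str:
    """Format every passage as '[source/name]: text' so it can be cited."""
    return "\n".join(
        f"[{p.source}/{p.name}]: {p.text.strip()}" for p in passages
    )


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Combine passages from all sources into one grounded prompt."""
    context = build_context(passages)
    return (
        "Answer using only the sources below. "
        "Cite each fact with its bracketed source label.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because every passage carries its own label, the model can cite a second data source the same way it cites the search index. The instruction text would likely need tuning for a real deployment.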

Note, though, that the more data sources you add to your prompt as context, the fewer tokens you have left for your conversation. You might want to collate the results from the various data sources and summarize them further before submitting them as context, so that you maximize the information in your context without using a lot of tokens.
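
To make the token trade-off concrete, here is a small Python sketch of one simple way to keep collated context within a budget. It swaps in plain truncation rather than summarization, and it assumes a very rough token estimate of one token per whitespace-separated word. A real implementation would use the model's actual tokenizer. The function names and the budget value are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Very rough proxy (assumption): one token per whitespace-separated word."""
    return len(text.split())


def fit_to_budget(snippets: list[str], max_tokens: int) -> list[str]:
    """Keep snippets in order until the next one would exceed the budget."""
    kept: list[str] = []
    used = 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > max_tokens:
            break  # stop at the first snippet that does not fit
        kept.append(snippet)
        used += cost
    return kept
```

If the snippets are ordered by relevance before calling `fit_to_budget`, the least relevant ones are dropped first. Summarizing the dropped snippets instead of discarding them would be the next refinement.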