microsoft / PubSec-Info-Assistant

Information Assistant, built with Azure OpenAI Service, Industry Accelerator
MIT License
319 stars 677 forks

Governance Infused Ingestion, Embedding and RAG #734

Open agonzalez2014 opened 4 months ago

agonzalez2014 commented 4 months ago

Is your feature request related to a problem? Please describe.

I work with the Microsoft ISE Government industry. A common ask from customers is that any copilot, agent, or assistant has data governance awareness built into it. Many customers have their own data governance policies, which define data access based on each individual's profile: which sub-organization they work in, what their title/role is, which projects they are working on, and which custom scopes/permissions they have been granted for custom secured data sources.

Describe the solution you'd like

A platform that performs any RAG operation (ingestion, embedding, retrieval) and integrates it with an existing data governance platform (such as Purview) in which every data source has been registered and conforms to the enterprise's set of policies.

Whenever a user sends a query to the copilot / agent / assistant, the solution should first take that user's profile into account to determine which embeddings the user is authorized to consume, and then use the most relevant of those to augment the prompt sent to an LLM.
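To make the request concrete, here is a minimal sketch of the retrieval step: filter by authorization first, then rank by relevance. It is an illustration only, not how Information Assistant or Purview work. The names `UserProfile`, `Chunk`, `is_authorized`, and `retrieve` are hypothetical, and the sketch assumes governance labels were copied onto each chunk at ingestion time and that an empty label set means "no restriction".

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class UserProfile:
    # Hypothetical profile fields mirroring the dimensions listed in this issue.
    sub_org: str
    role: str
    projects: FrozenSet[str]
    scopes: FrozenSet[str]


@dataclass(frozen=True)
class Chunk:
    text: str
    embedding: List[float]
    # Assumption: labels come from the governance catalog at ingestion time.
    # An empty set means that dimension is unrestricted.
    allowed_sub_orgs: FrozenSet[str] = frozenset()
    allowed_roles: FrozenSet[str] = frozenset()
    allowed_projects: FrozenSet[str] = frozenset()
    required_scopes: FrozenSet[str] = frozenset()


def is_authorized(user: UserProfile, chunk: Chunk) -> bool:
    """Return True only if the user satisfies every restriction on the chunk."""
    return (
        (not chunk.allowed_sub_orgs or user.sub_org in chunk.allowed_sub_orgs)
        and (not chunk.allowed_roles or user.role in chunk.allowed_roles)
        and (not chunk.allowed_projects or bool(user.projects & chunk.allowed_projects))
        and chunk.required_scopes <= user.scopes
    )


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(user: UserProfile, query_embedding: List[float],
             chunks: List[Chunk], top_k: int = 3) -> List[Chunk]:
    """Drop chunks the user may not see, then rank the rest by similarity."""
    permitted = [c for c in chunks if is_authorized(user, c)]
    permitted.sort(key=lambda c: cosine(query_embedding, c.embedding), reverse=True)
    return permitted[:top_k]


# Toy data: the HR chunk is highly similar to the query but must not be returned.
chunks = [
    Chunk("Public onboarding guide", [1.0, 0.0]),
    Chunk("Project Falcon budget", [0.9, 0.1],
          allowed_projects=frozenset({"falcon"})),
    Chunk("HR salary bands", [0.95, 0.05],
          allowed_roles=frozenset({"hr_manager"}),
          required_scopes=frozenset({"hr:read"})),
]
analyst = UserProfile(sub_org="finance", role="analyst",
                      projects=frozenset({"falcon"}), scopes=frozenset())
results = retrieve(analyst, [1.0, 0.0], chunks)
```

Filtering before ranking means an unauthorized chunk can never reach the prompt, no matter how relevant it scores. In a real deployment the filter would more likely be pushed down into the vector store as a metadata filter rather than applied in application code.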

Describe alternatives you've considered

An alternative architecture is one that embodies a Data Mesh, where each Data Domain has its own platform to ingest, embed, store, and retrieve data for copilot, agent, and assistant solutions.
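A rough sketch of how that Data Mesh alternative could look from the caller's side, assuming each domain exposes its own retriever and a thin router fans a query out only to the domains a user may access. `DomainRouter` and the lambda retrievers are hypothetical placeholders, not an existing API.

```python
from typing import Callable, Dict, List, Set, Tuple

# Assumption: a domain retriever maps a question to (score, text) pairs.
Retriever = Callable[[str], List[Tuple[float, str]]]


class DomainRouter:
    """Fans a query out to per-domain retrievers and merges the results."""

    def __init__(self) -> None:
        self._domains: Dict[str, Retriever] = {}

    def register(self, name: str, retriever: Retriever) -> None:
        self._domains[name] = retriever

    def query(self, user_domains: Set[str], question: str,
              top_k: int = 3) -> List[Tuple[float, str, str]]:
        hits: List[Tuple[float, str, str]] = []
        # Only domains the user is entitled to are ever queried.
        for name in sorted(user_domains):
            if name in self._domains:
                for score, text in self._domains[name](question):
                    hits.append((score, name, text))
        hits.sort(key=lambda h: h[0], reverse=True)
        return hits[:top_k]


router = DomainRouter()
router.register("finance", lambda q: [(0.8, "Q3 forecast")])
router.register("hr", lambda q: [(0.9, "Salary bands")])
merged = router.query({"finance"}, "budget outlook")
```

Here each domain stays responsible for its own ingestion, embedding, and storage. The router only decides which domains to ask.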

With possible synergy with architecture patterns such as:

Additional context NA

dayland commented 3 months ago

@agonzalez2014, thanks for the write-up. This is on our backlog, but rather than developing a custom implementation, our goal is to leverage Microsoft Fabric as the backbone for data access and acquisition going forward. Since Fabric is the future of Microsoft's data platform, it should include those integrations out of the box. There were several announcements at Build related to RAG + Microsoft Fabric that we are watching closely.