Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License
5.69k stars 3.84k forks

Insufficient Permissions for Enabling Document Level Access Control #1787

Open DSOTM-RSA opened 3 weeks ago

DSOTM-RSA commented 3 weeks ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Successfully set up the optional login functionality. Successfully used ADLS Gen2 (Data Lake Storage) to process additional documents based on a high-water mark (custom functionality), using AZURE_ADLS_GEN2_FILESYSTEM, AZURE_ADLS_GEN2_STORAGE_ACCOUNT, etc.

Attempted to set up document-level access control, in order to then enable the recent user data upload functionality, by running azd auth login and then adlsgen2setup.ps1.

Any log messages given by the failure

INFO:root:Creating groups...
INFO:azure.identity.aio._internal.decorators:AzureDeveloperCliCredential.get_token succeeded
INFO:root:Searching for group GPTKB_AdminTest...
INFO:root:Could not find group GPTKB_AdminTest, creating...
Traceback (most recent call last):
  File "C:\Users\ddgray\Downloads\DEVELOPMEMT\ACTIVE\intern-search\scripts\adlsgen2setup.py", line 193, in <module>
    asyncio.run(main(args))
  File "C:\Users\ddgray\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Users\ddgray\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 649, in run_until_complete
    return future.result()
  File "C:\Users\ddgray\Downloads\DEVELOPMEMT\ACTIVE\intern-search\scripts\adlsgen2setup.py", line 164, in main
    await command.run()
  File "C:\Users\ddgray\Downloads\DEVELOPMEMT\ACTIVE\intern-search\scripts\adlsgen2setup.py", line 65, in run
    group_id = await self.create_or_get_group(group)
  File "C:\Users\ddgray\Downloads\DEVELOPMEMT\ACTIVE\intern-search\scripts\adlsgen2setup.py", line 146, in create_or_get_group
    raise Exception(content)
Exception: {'error': {'code': 'Authorization_RequestDenied', 'message': 'Insufficient privileges to complete the operation.', 'innerError': {'date': '2024-07-04T17:50:25', 'request-id': 'c502f5d1-6262-4a5b-99e0-5fc588278f9e', 'client-request-id': 'c502f5d1-6262-4a5b-99e0-5fc588278f9e'}}}
PS C:\Users\ddgray\Downloads\DEVELOPMEMT\ACTIVE\intern-search>
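The exception above is raised with the raw error payload, which makes the cause hard to spot. A minimal sketch of how such a payload could be turned into a more readable message follows. `explain_graph_error` is a hypothetical helper, not part of adlsgen2setup.py, and the hint text is an assumption about the likely cause (group creation is governed by directory permissions rather than by Azure resource roles), not a confirmed diagnosis.

```python
def explain_graph_error(content: dict) -> str:
    """Summarize an error payload shaped like the one in the log above.

    Hypothetical helper for illustration; not part of adlsgen2setup.py.
    """
    error = content.get("error", {})
    code = error.get("code", "Unknown")
    message = error.get("message", "")
    request_id = error.get("innerError", {}).get("request-id", "n/a")

    text = f"Graph request failed [{code}]: {message} (request-id: {request_id})"
    if code == "Authorization_RequestDenied":
        # Assumption: the signed-in identity lacks directory permission to
        # create groups. Roles such as Owner or Storage Blob Data Owner apply
        # to Azure resources and may not cover directory operations.
        text += (
            "\nHint (assumption): the signed-in identity may lack directory "
            "permission to create security groups; check with a tenant admin."
        )
    return text


# Payload copied from the log message in this report.
payload = {
    "error": {
        "code": "Authorization_RequestDenied",
        "message": "Insufficient privileges to complete the operation.",
        "innerError": {
            "date": "2024-07-04T17:50:25",
            "request-id": "c502f5d1-6262-4a5b-99e0-5fc588278f9e",
            "client-request-id": "c502f5d1-6262-4a5b-99e0-5fc588278f9e",
        },
    }
}
print(explain_graph_error(payload))
```

Keeping the request-id in the message is deliberate, since it is the value a tenant administrator or support engineer would need to trace the denied call.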

Expected/desired behavior

I expect to be able to create groups, since the access permissions are in place: Owner on the subscription and Storage Blob Data Owner on the ADLS storage account.

OS and Version?

Windows 10, local

azd version?

1.94

Versions

Tag: 2024-05-29

Mention any other details that might be useful

Microsoft documentation suggests that other privileges are needed to create security groups. Is this the case?


Side note: running the document access control scripts in my usual Dev Container environment fails with the error 'AZURE_ADLS_GEN2_STORAGE_ACCOUNT must be set to continue', so I was forced to run them on the local machine as shown above. This happens even though I have completed dozens of successful deployments of the solution from the Dev Container environment, using this and all other environment variables from .env.
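To narrow down the 'must be set to continue' failure, it can help to check which of the expected variables are actually visible to the process before the setup script runs. The following is a minimal sketch, assuming the two variable names mentioned in this report are the ones the script needs. `missing_vars` is a hypothetical helper and not part of the repository.

```python
import os

# Variable names taken from this report; the real script may require others.
REQUIRED_VARS = ("AZURE_ADLS_GEN2_STORAGE_ACCOUNT", "AZURE_ADLS_GEN2_FILESYSTEM")


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are unset or blank in the given mapping."""
    return [name for name in required if not str(env.get(name, "")).strip()]


# Check the environment of the current process (for example, the Dev Container shell).
missing = missing_vars(os.environ)
if missing:
    print("Not set in this environment: " + ", ".join(missing))
else:
    print("All required ADLS Gen2 variables are set.")
```

If a name shows up as missing here while it is present in .env, the value is probably not being loaded into the shell that launches the script, which is a different problem from the permissions error.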

cforce commented 2 weeks ago

This additional env var does not make sense to me. It would be better to assume the user wants to use the same storage that Bicep creates for Gen2 as well, and therefore to use the same env var that Bicep already exports for blob storage. That way, just loading the env would satisfy the requirement. What is missing to put this if statement for Gen2 into the Bicep as well, instead of leaving it up to the user to create an additional dedicated storage account themselves? Why not always use Gen2? Doesn't it handle all the requirements anyway?
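The fallback suggested here could be sketched roughly as follows. This is only an illustration: `resolve_storage_account` is a hypothetical helper, and `AZURE_STORAGE_ACCOUNT` is an assumed name for the variable exported by the Bicep deployment. It also assumes the fallback account is usable as ADLS Gen2, which requires the hierarchical namespace to be enabled on it.

```python
import os


def resolve_storage_account(env):
    """Pick the storage account name, preferring the Gen2-specific variable.

    Hypothetical helper; AZURE_STORAGE_ACCOUNT is an assumed fallback name.
    """
    for name in ("AZURE_ADLS_GEN2_STORAGE_ACCOUNT", "AZURE_STORAGE_ACCOUNT"):
        value = str(env.get(name, "")).strip()
        if value:
            return value
    raise ValueError(
        "Set AZURE_ADLS_GEN2_STORAGE_ACCOUNT or AZURE_STORAGE_ACCOUNT to continue"
    )


try:
    print("Using storage account:", resolve_storage_account(os.environ))
except ValueError as exc:
    print(exc)
```

Preferring the dedicated variable keeps existing setups working, while the fallback means that loading the deployed environment alone would be enough in the common case.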

DSOTM-RSA commented 2 weeks ago

> This additional env var does not make sense to me. It would be better to assume the user wants to use the same storage that Bicep creates for Gen2 as well, and therefore to use the same env var that Bicep already exports for blob storage. That way, just loading the env would satisfy the requirement. What is missing to put this if statement for Gen2 into the Bicep as well, instead of leaving it up to the user to create an additional dedicated storage account themselves? Why not always use Gen2? Doesn't it handle all the requirements anyway?

My primary issue is getting a clear idea of the permissions needed to deploy additional functionality in the solution, such as User Data Upload. The general permissions documented by the product team have met all my past requirements, but in this case they seem to be insufficient. I would just like clarity on which permissions I need, rather than trying to set appropriate permissions ad hoc.

mattgotteiner commented 2 weeks ago

Thanks, this is good feedback. Agreed that the Bicep can provision and set permissions on the ADLS Gen2 storage account.