microsoft / chat-copilot

Issue with AI search (No index created) #877

Closed Riya-Chaudhary-dev closed 1 month ago

Riya-Chaudhary-dev commented 4 months ago

No index (chatmemory) is being created in Azure AI Search, so citations/references aren't working. I'm using Azure Blob for storage and Cosmos DB for chat sessions, and files upload successfully. Indexing worked in an earlier version of the code. Manually creating the index doesn't work either. Please help.

ag-advania commented 4 months ago

Hi, we have the exact same problem and same configuration here:

The index 'chatmemory' for service 'acs-xxxxxx' was not found. Status: 404 (Not Found)

Update: It's working if I upload a txt file instead of a PDF

gtewksbury commented 4 months ago

We're having this same issue. The document information is uploaded to Cosmos, but the index is never created.

ag-advania commented 4 months ago

We resolved it by adding "DocumentMemory:XXXX" app settings, now it's working great

Dmw789 commented 4 months ago

I'm having the same issue: would you mind elaborating on what you mean?

> We resolved it by adding "DocumentMemory:XXXX" app settings, now it's working great

I have DocumentMemory being set in the web-api appsettings.

ag-advania commented 4 months ago

> I'm having the same issue: would you mind elaborating on what you mean?
>
> > We resolved it by adding "DocumentMemory:XXXX" app settings, now it's working great
>
> I have DocumentMemory being set in the web-api appsettings.

What we did was upload a txt file instead of a PDF and add these 4 app settings:

"DocumentMemory:DocumentChunkMaxTokens" = "1000"
"DocumentMemory:DocumentLineSplitMaxTokens" = "150"
"DocumentMemory:FileCountLimit" = "100"
"DocumentMemory:FileSizeLimit" = "500000000"

We also restarted all the web apps. I'm not sure which of these changes solved the problem.

Riya-Chaudhary-dev commented 4 months ago

Did you replace

"DocumentMemory": {
  "DocumentLineSplitMaxTokens": 72,
  "DocumentChunkMaxTokens": 512,
  "FileSizeLimit": 40000000,
  "FileCountLimit": 10
},

with

"DocumentMemory:DocumentChunkMaxTokens" : "1000",
"DocumentMemory:DocumentLineSplitMaxTokens" : "150",
"DocumentMemory:FileCountLimit" : "100",
"DocumentMemory:FileSizeLimit" : "500000000",

Can you please confirm?

ag-advania commented 4 months ago

> Did you replace
>
> "DocumentMemory": {
>   "DocumentLineSplitMaxTokens": 72,
>   "DocumentChunkMaxTokens": 512,
>   "FileSizeLimit": 40000000,
>   "FileCountLimit": 10
> },
>
> with
>
> "DocumentMemory:DocumentChunkMaxTokens" : "1000",
> "DocumentMemory:DocumentLineSplitMaxTokens" : "150",
> "DocumentMemory:FileCountLimit" : "100",
> "DocumentMemory:FileSizeLimit" : "500000000",
>
> Can you please confirm?

Yes, we've used the inline app settings format to be compatible with the Web App environment variables in Azure.
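
For readers comparing the two formats quoted above: the nested JSON section and the colon-delimited keys are two spellings of the same hierarchy in .NET configuration. Here is a minimal Python sketch of that mapping, using the default values quoted in this thread. The helper name `flatten_settings` is hypothetical and is not part of chat-copilot.

```python
import json


def flatten_settings(section, prefix=""):
    """Flatten nested settings into colon-delimited keys with string values.

    Hypothetical helper for illustration only. Lists are flattened by index
    (Key:0, Key:1, ...), which is how .NET configuration addresses array items.
    """
    flat = {}
    items = enumerate(section) if isinstance(section, list) else section.items()
    for key, value in items:
        path = f"{prefix}:{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            flat.update(flatten_settings(value, path))
        else:
            flat[path] = str(value)
    return flat


# Default values as quoted earlier in this thread.
nested = json.loads("""
{
  "DocumentMemory": {
    "DocumentLineSplitMaxTokens": 72,
    "DocumentChunkMaxTokens": 512,
    "FileSizeLimit": 40000000,
    "FileCountLimit": 10
  }
}
""")

app_settings = flatten_settings(nested)
for name, value in app_settings.items():
    print(f"{name}={value}")
```

On Linux-hosted App Service plans, environment variable names usually need a double underscore (`DocumentMemory__FileCountLimit`) in place of the colon, so the separator may need adjusting for your hosting environment.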

Riya-Chaudhary-dev commented 4 months ago

I tried testing it locally without redeploying, and it doesn't work. Was that the case for you too?

ag-advania commented 4 months ago

Have you tried using a txt file instead of a PDF to see if the indexes get created?

Riya-Chaudhary-dev commented 4 months ago

Yes, tried with text files, doesn't work. Did you pull the new changes?

ag-advania commented 4 months ago

We are running everything from the "main" branch pulled recently so it should be the latest version.

Riya-Chaudhary-dev commented 4 months ago

[image]

Riya-Chaudhary-dev commented 4 months ago

did you do anything else?

glahaye commented 4 months ago

@crickman Any insight on this?

aaronba commented 4 months ago

I did get this to work with AzureAISearch and Azure OpenAI

First, pull down a fresh copy of appsettings.json from main.

Edits:
Line 186: Replace "SimpleVectorDb" with "AzureAISearch" in the MemoryDbTypes array.
Line 198: Change MemoryDbType to "AzureAISearch".
Lines 277-280: Update the AzureAISearch connection info.
Lines 301-306: Update the AzureOpenAIText connection info.
Lines 315-320: Update the AzureOpenAIEmbedding connection info.

The thing that I missed was making sure that the "DataIngestion" section matched the "Retrieval" section (specifically the MemoryDbTypes array on Line 186)
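
The mismatch described above (ingestion writing to one memory store while retrieval reads from another) can be caught with a quick check. This is a rough Python sketch that assumes you have already loaded the section containing "DataIngestion" and "Retrieval" into a dict. The function name and sample dicts are hypothetical. Note that the real appsettings.json contains comments, so a strict JSON parser may reject it as-is.

```python
def check_memory_db_consistency(memory_config):
    """Return a list of problems found between ingestion and retrieval settings.

    `memory_config` is assumed to be the parsed settings section that holds
    the "DataIngestion" and "Retrieval" subsections mentioned in this thread.
    """
    ingestion_types = memory_config.get("DataIngestion", {}).get("MemoryDbTypes", [])
    retrieval_type = memory_config.get("Retrieval", {}).get("MemoryDbType")

    problems = []
    if retrieval_type is None:
        problems.append("Retrieval:MemoryDbType is not set")
    elif retrieval_type not in ingestion_types:
        problems.append(
            f"Retrieval reads from {retrieval_type!r}, "
            f"but DataIngestion only writes to {ingestion_types!r}"
        )
    return problems


# Only the retrieval side was switched: the situation described above.
stale = {
    "DataIngestion": {"MemoryDbTypes": ["SimpleVectorDb"]},
    "Retrieval": {"MemoryDbType": "AzureAISearch"},
}

# Both sides point at Azure AI Search.
fixed = {
    "DataIngestion": {"MemoryDbTypes": ["AzureAISearch"]},
    "Retrieval": {"MemoryDbType": "AzureAISearch"},
}

print(check_memory_db_consistency(stale))
print(check_memory_db_consistency(fixed))
```

If the first case applies, documents are ingested into a store that retrieval never queries, which would be consistent with the index never being created in Azure AI Search.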

Riya-Chaudhary-dev commented 4 months ago

Thank you, I can run it locally now, but it only retrieves information from the document for the first few prompts. After that it keeps responding that it doesn't have access to the specific content, so it makes up the answers. How do I solve this? I tried increasing the long memory and it performs much better, but for some sections of the document it still doesn't answer completely.

Riya-Chaudhary-dev commented 4 months ago

AI search doesn't work when I deploy it. I noticed this message in the PowerShell output: App settings have been redacted. Use az webapp/logicapp/functionapp config appsettings list to view.

Riya-Chaudhary-dev commented 4 months ago

Are these supposed to be null? [image] [image]

aaronba commented 4 months ago

@Riya-Chaudhary-dev , are you still setting these other variables?

(from above)

"DocumentMemory:DocumentChunkMaxTokens" : "1000",
"DocumentMemory:DocumentLineSplitMaxTokens" : "150",
"DocumentMemory:FileCountLimit" : "100",
"DocumentMemory:FileSizeLimit" : "500000000",

Maybe you are exceeding some limits. Try setting them back to the defaults.

Also, I highly suggest that you run the deploy.ps1 script vs. the ARM template. Last time I checked, the ARM template wasn't updated with the latest config values (like those for AISearch)

Riya-Chaudhary-dev commented 4 months ago

@aaronba I have kept these values at their defaults. I've only updated the connection string in the appsettings file. I'm following the deployment instructions from this page: https://github.com/microsoft/chat-copilot/blob/main/scripts/deploy/README.md, specifically the following steps:

./deploy-azure.ps1 -Subscription {YOUR_SUBSCRIPTION_ID} -DeploymentName {YOUR_DEPLOYMENT_NAME} -AIService {AzureOpenAI or OpenAI} -AIApiKey {YOUR_AI_KEY} -AIEndpoint {YOUR_AZURE_OPENAI_ENDPOINT} -BackendClientId {YOUR_BACKEND_APPLICATION_ID} -FrontendClientId {YOUR_FRONTEND_APPLICATION_ID} -TenantId {YOUR_TENANT_ID}

./package-webapi.ps1

./deploy-webapi.ps1 -Subscription {YOUR_SUBSCRIPTION_ID} -ResourceGroupName {YOUR_RESOURCE_GROUP_NAME} -DeploymentName {YOUR_DEPLOYMENT_NAME}

Services are being deployed in the resource group and file upload works, but AI search doesn't.

adamruderman commented 4 months ago

@Riya-Chaudhary-dev ,

Package and deploy the memory pipeline as well. The deploy-azure step pulls the packages from the repo's releases, but they are not up to date with the code. So, if you are using "Distributed", the memory pipeline service is likely not working properly. Ensure the app settings for the newly deployed memory pipeline service point to the same storage account/containers and AI Search instance; they should by default. Hit the service URL and you should get a simple message saying the memory pipeline is running.
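
The last step above (requesting the service URL and looking for the status message) can also be scripted. Below is a minimal Python sketch using only the standard library. The function name is hypothetical, the URL in the comment is a placeholder for your own deployment, and the exact text of the status message is not specified in this thread, so the sketch just returns whatever the service sends back.

```python
import urllib.error
import urllib.request


def probe_service(url, timeout=10.0):
    """Send a GET request and return (status_code, body_text).

    Returns (None, error_description) if the service cannot be reached.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 404 or 503).
        return err.code, str(err.reason)
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, and similar.
        return None, str(err)


# Example call (placeholder URL, replace with your memory pipeline's address):
# status, text = probe_service("https://<your-memorypipeline-app>.azurewebsites.net/")
# print(status, text)
```

A 200 status with a short text body would match the behaviour described above. Any other result suggests the memory pipeline app is not running or was not deployed from the current code.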

Riya-Chaudhary-dev commented 3 months ago

@adamruderman I deployed the memory pipeline as well and it works for me. The index was created and populated with the uploaded documents, but it didn't work for other users. I couldn't figure out why that would happen. Also, it doesn't retrieve information for some PDFs.

Riya-Chaudhary-dev commented 3 months ago

[image] The article appears in the references, but the response says it doesn't have access to it.

glahaye commented 2 months ago

All the packages (webapi, memorypipeline, websearcher) have been updated for deployments. You can give it another spin and see whether things now work for you out of the box.

glahaye commented 1 month ago

It has been one month since the fix was provided, with no further reports of this problem. Closing.