Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Issue with Migrating from Form Recognizer SDK to Document Intelligence SDK: "404 Resources Not Found" Error #1976

Open Anguschang582 opened 1 month ago

Anguschang582 commented 1 month ago

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Hi, all

Background

I am currently developing a web app that lets users upload their files and ask questions based on the uploaded data. At the moment, I am using an older version of the code (tag rel071723) that has a Flask backend and, for personal reasons, I am not able to update it to the newest version (or even to the one with the Quart backend).

Issue

I am now working on supporting additional file types (.docx, .xlsx, .pptx) by switching from the Form Recognizer SDK to the Document Intelligence SDK. However, when I make this change, I get a "404 resources not found" error.

I have already checked the endpoint, API version, and the code itself, and have also checked the migration guide, but the issue still remains.

Code

Old code (which works perfectly fine)

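import io

from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
# === other imports ===

service = "cog-fr-xxxxxx"
azure_credential = DefaultAzureCredential()

client = DocumentAnalysisClient(
    endpoint=f"https://{service}.cognitiveservices.azure.com/",
    credential=AzureKeyCredential(THE_KEY),
    headers={"x-ms-useragent": "azuredemo/1.0"},
)

# The data is downloaded from Azure Blob Storage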
fraw = io.BytesIO(BlobServiceClient(f"https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net", azure_credential)
                  .get_blob_client(container = source_container, blob = source)
                  .download_blob().readall())

poller = client.begin_analyze_document(model_id="prebuilt-layout", document=fraw)
results = poller.result()

New code (which throws a "404 resource not found" error)

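import io

from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
# === other imports ===

service = "cog-fr-xxxxxx"
azure_credential = DefaultAzureCredential()

client = DocumentIntelligenceClient(
    endpoint=f"https://{service}.cognitiveservices.azure.com/",
    credential=AzureKeyCredential(THE_KEY),
)

# The data is downloaded from Azure Blob Storage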
fraw = io.BytesIO(BlobServiceClient(f"https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net", azure_credential)
                  .get_blob_client(container = source_container, blob = source)
                  .download_blob().readall())

poller = client.begin_analyze_document(
    model_id="prebuilt-layout", analyze_request=fraw, content_type="application/octet-stream"
)
results = poller.result()

Note that I am using the exact same endpoint and key in both versions of the code.

Versions

azure-ai-formrecognizer: 3.2.1
azure-ai-documentintelligence: 1.0.0b2

Questions

  1. From my understanding, the Form Recognizer and Document Intelligence services are basically the same (using the same endpoint). Therefore, there should be no need to change the endpoint, the key, or any configuration in the Azure portal when migrating to the Document Intelligence SDK. Is this correct?
  2. If this is not the cause of the error, what other potential issues could be triggering the "404 resources not found" error?

Any help would be greatly appreciated. Thank you!

pamelafox commented 3 weeks ago

Here's the PR where I switched to Doc Intelligence: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1224/files#diff-7ef659fc9cf6968e718894d300490b14ea7a52091e7d4bcffae3a5029ac721d4

It doesn't look like I changed the endpoint. However, region availability for the new Document Intelligence is limited. Are you sure yours is in a supported region?

Anguschang582 commented 3 weeks ago

Hi @pamelafox, thank you so much for the reply and for pointing that out. Unfortunately, it seems that the new Document Intelligence service still does not support resources deployed in Japan East. I also tried re-creating two separate services, one in East US and one in Japan East, and only the former works.
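
For anyone hitting the same 404, here is a minimal sketch of a pre-flight check that decides which SDK to call based on the resource region and the file type. It is only an illustration: `choose_sdk`, `PREVIEW_REGIONS`, and `OFFICE_EXTENSIONS` are hypothetical names, not part of either SDK, and the region set contains only East US because that is the only region confirmed to work in this thread. Check the current Azure documentation for the actual list.

```python
from pathlib import Path

# Assumption: regions where the new Document Intelligence API responds.
# Only East US is confirmed in this thread; Japan East returned 404.
PREVIEW_REGIONS = {"eastus"}

# Office formats that motivated the migration in this thread.
OFFICE_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def choose_sdk(filename: str, region: str, preview_regions=PREVIEW_REGIONS) -> str:
    """Return the name of the SDK to use for this file and resource region.

    Raises ValueError when the file needs Document Intelligence but the
    resource sits in a region where it is not available.
    """
    # Accept both display names ("Japan East") and short names ("japaneast").
    normalized = region.replace(" ", "").lower()
    if normalized in preview_regions:
        return "documentintelligence"
    if Path(filename).suffix.lower() in OFFICE_EXTENSIONS:
        raise ValueError(
            f"{filename} needs Document Intelligence, "
            f"which is not available in region '{region}'"
        )
    # PDFs and images can still go through the older Form Recognizer SDK.
    return "formrecognizer"


print(choose_sdk("report.docx", "East US"))
print(choose_sdk("scan.pdf", "Japan East"))
```

With a check like this, a resource in an unsupported region fails early with a clear message for Office files instead of surfacing as a 404 from the service, while other files keep using the existing Form Recognizer code path.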