Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

Migration from formrecognizer to DocumentIntelligence : unexpected 400 error #36626

Open preste-thsou opened 1 month ago

preste-thsou commented 1 month ago

Describe the bug I'm trying to migrate from formrecognizer to DocumentIntelligence. I successfully updated my environment with the new package, but I get a 400 error (invalid argument) when running begin_analyze_document with the prebuilt-receipt model. The same code and the same documents worked fine with the previous library. I'm using a new resource created in the West Europe region to be able to use the v4.0 model, and I updated the API endpoint and key accordingly.

I tried an endpoint URL both with and without a trailing '/', since I noticed that the POST URL contains a double slash ('//'), but the result is the same in both cases. With a trailing '/', the logs show the following POST:
https://REDACTED_DOMAIN.cognitiveservices.azure.com//documentintelligence/documentModels/prebuilt-receipt:analyze?api-version=REDACTED&locale=REDACTED
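The double slash in the logged URL is what you get when a base URL ending in '/' is concatenated with a path starting with '/'. A minimal sketch of joining the two with exactly one separator is shown here; `join_endpoint` and the `example` host name are hypothetical and say nothing about how the SDK builds its URLs internally.

```python
def join_endpoint(endpoint: str, path: str) -> str:
    """Join a base endpoint and a path with exactly one '/' between them."""
    return endpoint.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical resource name, used only for illustration.
PATH = "/documentintelligence/documentModels/prebuilt-receipt:analyze"
with_slash = join_endpoint("https://example.cognitiveservices.azure.com/", PATH)
without_slash = join_endpoint("https://example.cognitiveservices.azure.com", PATH)

print(with_slash)
print(without_slash)
```

Both calls produce the same URL, so the trailing '/' in the configured endpoint no longer matters.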

To Reproduce Steps to reproduce the behavior:

  1. create a resource in West Europe for Document Intelligence
  2. pip install azure-ai-documentintelligence==1.0.0b2 (or 1.0.0b3)
  3. from azure.ai.documentintelligence.aio import DocumentIntelligenceClient
  4. put valid endpoint & key credentials in ENV variables MS_ENDPOINT and MS_KEY
  5. run the following code:

         document_client = DocumentIntelligenceClient(endpoint=os.getenv("MS_ENDPOINT", None), credential=AzureKeyCredential(os.getenv("MS_KEY", None)))
         async with document_client:
             poller = await document_client.begin_analyze_document(analyze_mode, page, locale=self._locale)
             processed_documents = await poller.result()
         await document_client.close()
         return processed_documents

Expected behavior The Azure API processes my document, as it did with formrecognizer.

Screenshots

    Request method: 'POST'
    Request headers:
        'content-type': 'application/json'
        'x-ms-client-request-id': '82483726-4a65-11ef-ad80-0dd1140f25ff'
        'User-Agent': 'azsdk-python-ai-documentintelligence/1.0.0b3 Python/3.10.12 (Linux-6.5.0-44-generic-x86_64-with-glibc2.35)'
        'Ocp-Apim-Subscription-Key': 'REDACTED'
    A body is sent with the request
    07/25/2024 11:08:51 AM Response status: 400
    Response headers:
        'Content-Length': '172'
        'Content-Type': 'application/json; charset=utf-8'
        'ms-azure-ai-errorcode': 'REDACTED'
        'x-ms-error-code': 'InvalidArgument'
        'apim-request-id': 'REDACTED'
        'Strict-Transport-Security': 'REDACTED'
        'x-content-type-options': 'REDACTED'
        'x-ms-region': 'REDACTED'
        'Date': 'Thu, 25 Jul 2024 09:08:51 GMT'
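The logged request carries 'content-type': 'application/json'. For comparison, the service's REST analyze operation accepts either a JSON body that references the document (for example via a `base64Source` field) or the raw document bytes sent with a binary content type. The sketch below builds both shapes with the standard library only. `build_json_request` and `build_binary_request` are hypothetical helpers, not SDK functions, and this thread does not establish whether a content-type mismatch is what causes the 400.

```python
import base64
import json


def build_json_request(document: bytes) -> tuple[dict, bytes]:
    """JSON shape: the document travels base64-encoded inside a JSON object."""
    headers = {"Content-Type": "application/json"}
    payload = {"base64Source": base64.b64encode(document).decode("ascii")}
    return headers, json.dumps(payload).encode("utf-8")


def build_binary_request(document: bytes) -> tuple[dict, bytes]:
    """Binary shape: the raw bytes are the body, with a binary content type."""
    headers = {"Content-Type": "application/octet-stream"}
    return headers, document


# Stand-in payload for illustration; not a real PDF.
sample = b"%PDF-1.4 stand-in"
json_headers, json_body = build_json_request(sample)
binary_headers, binary_body = build_binary_request(sample)
```

Comparing the headers and body actually sent against these two shapes is one way to narrow down an InvalidArgument response.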

github-actions[bot] commented 1 month ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

swathipil commented 1 month ago

Hi @preste-thsou - Thanks for opening an issue! We'll take a look asap.

YalinLi0312 commented 1 month ago

Hi @preste-thsou , can you share how you prepare analyze_mode and page in your sample code so that we can reproduce the issue? Also, you can follow these samples for analyzing document on prebuilt-receipt model: this is for analyzing local file, this is for analyzing remote file.

github-actions[bot] commented 1 month ago

Hi @preste-thsou. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions[bot] commented 4 weeks ago

Hi @preste-thsou, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

preste-thsou commented 2 weeks ago

Hello, thanks for your reply. To answer your question, analyze_mode is defined with the following code:

    if self._page_class == 'receipt':
        analyze_mode = 'prebuilt-receipt'
    else:
        analyze_mode = 'prebuilt-invoice'

The value is chosen based on the result of a classification task that is performed independently. In the error case, the result of the classification is "receipt". I don't remember whether I have already tested the case with invoices.
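For reference, the same selection can be written as a lookup with a default, which makes it easy to log or test the chosen model ID on its own. This is only a sketch: `select_model`, `MODEL_BY_PAGE_CLASS` and `DEFAULT_MODEL` are hypothetical names, not part of the original code.

```python
# Maps a classification result to a prebuilt model ID.
MODEL_BY_PAGE_CLASS = {"receipt": "prebuilt-receipt"}

# Mirrors the else-branch of the original code.
DEFAULT_MODEL = "prebuilt-invoice"


def select_model(page_class: str) -> str:
    """Return the model ID for a page class, falling back to the invoice model."""
    return MODEL_BY_PAGE_CLASS.get(page_class, DEFAULT_MODEL)


print(select_model("receipt"))
print(select_model("invoice"))
```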

The page object is a single-page in-memory PDF document, created using BytesIO and PyPDF2.PdfWriter(). The code is a bit complex because it adapts to a variety of incoming formats and converts to PDF when the initial input is an image, but it always ends with:

        tmp = BytesIO()
        pdf_page.write(tmp)
        tmp.seek(0)

where pdf_page is a valid PyPDF2.PdfWriter() object. This tmp is then stored in a class attribute, _page, and passed to the Azure API call as page = copy.deepcopy(self._page).
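A quick way to confirm what actually reaches the API call is to inspect the copied stream just before sending it. The sketch below uses a stand-in byte string in place of the PyPDF2 output, and `describe_stream` is a hypothetical helper. It checks that the deep copy is a separate object, is positioned at offset 0, and starts with a PDF header.

```python
import copy
from io import BytesIO


def describe_stream(stream: BytesIO) -> dict:
    """Report cursor position, size and header check without moving the cursor."""
    data = stream.getvalue()  # getvalue() does not change the read position
    return {
        "position": stream.tell(),
        "size": len(data),
        "starts_with_pdf_header": data.startswith(b"%PDF-"),
    }


# Stand-in for pdf_page.write(tmp); the payload is not a real PDF.
tmp = BytesIO()
tmp.write(b"%PDF-1.4\n% stand-in payload\n")
tmp.seek(0)

# Same hand-off as described above: the API call receives a deep copy.
page = copy.deepcopy(tmp)
info = describe_stream(page)
print(info)
```

If the position is not 0 or the size is 0 at this point, the request body would be empty or truncated regardless of which client library is used.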

This code works without any issue with formrecognizer.