Azure-Samples / azure-ai-vision-sdk

SDK for Microsoft's Azure AI Vision
MIT License

ImageAnalysisErrorReason.CONNECTION_FAILURE, Error code: 3 on Linux Machines #64

Closed MaurusGubser closed 9 months ago

MaurusGubser commented 10 months ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

When using the azure-ai-vision package to connect to a vision resource, the connection to the resource fails. However, running the same Python script on a Windows machine works. Moreover, when using the REST API directly, the connection can be established. I am aware of the tickets https://github.com/Azure-Samples/azure-ai-vision-sdk/issues/40 and https://github.com/Azure-Samples/azure-ai-vision-sdk/issues/43. Also, the steps described in https://github.com/Azure-Samples/azure-ai-vision-sdk/blob/main/docs/ubuntu2204-notes.md were performed and had resolved the issue until today (2023-12-05).

Minimal steps to reproduce

Run the quickstart.py example from: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/image-analysis-client-library-40?tabs=visual-studio%2Clinux&pivots=programming-language-python#tabpanel_1_linux

Any log messages given by the failure

 Analysis failed.
   Error reason: ImageAnalysisErrorReason.CONNECTION_FAILURE
   Error code: 3
   Error message: Failed with error: HTTPAPI_OPEN_REQUEST_FAILED [0x3 | 2550]
POST https://ocr-gasmeter.cognitiveservices.azure.com/computervision/imageanalysis:analyze?api-version=2023-02-01-preview&features=caption%2Cread&gender-neutral-caption=true&language=en

Expected/desired behavior

Some JSON containing the extracted information, something like:

{"captionResult":{"text":"a man pointing at a screen","confidence":0.7767596244812012},"readResult":{"stringIndexType":"TextElements","content":"9:35 AM\nE Conference room 154584354\n#: 555-173-4547\nTown Hall\n9:00 AM - 10:00 AM\nAaron Buaion\nDaily SCRUM\n10:00 AM 11:00 AM\nChurlette de Crum\nQuarterly NI Hands\n11.00 AM-12:00 PM\nBebek Shaman\nWeekly stand up\n12:00 PM-1:00 PM\nDelle Marckre\nProduct review","pages":...}

OS and Version?

Failure on:
- Ubuntu 22.04.3 LTS on Intel® Core™ i7-10510U CPU @ 1.80GHz × 8
- Debian 11 on Intel® Core™ i7-10510U CPU @ 1.80GHz × 8

Success on: Windows 10

Versions

Tested with the following versions:
- azure-ai-vision 0.15.1b1
- azure-ai-vision 0.13.0b1
- Python 3.10.12

Mention any other details that might be useful


Thanks! We'll be in touch soon.

dargilco commented 10 months ago

@MaurusGubser thank you for reporting this! Can you please enable SDK logs, run your scenario, and share the log file? To enable logs, define these two environment variables (choose your own log file name) and then run your Python script:

export AZAC_DIAGNOSTICS_LOG=file
export AZAC_DIAGNOSTICS_LOG_FILE=vision-sdk-log.txt
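The same two variables can also be set from Python when launching the script. This is only a minimal sketch: the helper names `diagnostics_env` and `run_with_diagnostics` are made up for illustration, and it assumes the SDK picks the variables up from the process environment, which is why the script is started as a child process.

```python
import os
import subprocess
import sys


def diagnostics_env(log_file, base_env=None):
    """Return a copy of the environment with SDK file logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable names as given in the comment above.
    env["AZAC_DIAGNOSTICS_LOG"] = "file"
    env["AZAC_DIAGNOSTICS_LOG_FILE"] = log_file
    return env


def run_with_diagnostics(script, log_file="vision-sdk-log.txt"):
    """Run a Python script in a child process with the logging variables set."""
    return subprocess.run([sys.executable, script], env=diagnostics_env(log_file))
```

For example, `run_with_diagnostics("quickstart.py")` would run the quickstart script with both variables set.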

MaurusGubser commented 10 months ago

@dargilco Yes of course. Here's the log file:

vision-sdk-log.txt

NB: Don't know if this helps, but I looked at my apt upgrade history on my Ubuntu machine and Debian VM.

dargilco commented 10 months ago

@MaurusGubser these are the relevant log lines:

[32455]: 118ms SPX_TRACE_INFO: AZ_LOG_INFO: httpapi_compact.c:837 Waiting for TLS connection
[32455]: 128ms SPX_TRACE_INFO: AZ_LOG_INFO: httpapi_compact.c:837 Waiting for TLS connection
[32455]: 140ms SPX_TRACE_INFO: AZ_LOG_INFO: tlsio_openssl.c:1441 Not using CRL cache directory.
[32455]: 185ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:1015 Error loading CRL from http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl
[32455]: 228ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:1015 Error loading CRL from http://crl4.digicert.com/DigicertSHA2SecureServerCA-1.crl
[32455]: 228ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:1601 Unable to retrieve CRL, CRL check may fail.
[32455]: 228ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:691 error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed
[32455]: 228ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:2441 FORCE-Closing tlsio instance.
[32455]: 228ms SPX_TRACE_INFO: AZ_LOG_INFO: httpapi_compact.c:837 Waiting for TLS connection
[32455]: 238ms SPX_TRACE_ERROR: AZ_LOG_ERROR: httpapi_compact.c:1322 Open HTTP connection failed (result = HTTPAPI_OPEN_REQUEST_FAILED, error = 2550)
[32455]: 239ms SPX_TRACE_ERROR: default_http_error_handler.cpp:106 Failed with error: HTTPAPI_OPEN_REQUEST_FAILED [0x3 | 2550]
POST https://ocr-gasmeter.cognitiveservices.azure.com/computervision/imageanalysis:analyze?api-version=2023-02-01-preview&features=caption%2Cread&gender-neutral-caption=true&language=en
[32455]: 239ms SPX_TRACE_SCOPE_ENTER: compact_http_adapter.cpp:108 CloseHttpConnection
[32455]: 239ms SPX_TRACE_INFO: AZ_LOG_INFO: tlsio_openssl.c:2410 Closing tlsio from a state other than TLSIO_STATE_EXT_OPEN or TLSIO_STATE_EXT_ERROR
[32455]: 239ms SPX_TRACE_ERROR: AZ_LOG_ERROR: tlsio_openssl.c:2441 FORCE-Closing tlsio instance.
[32455]: 239ms SPX_TRACE_SCOPE_EXIT: compact_http_adapter.cpp:108 CloseHttpConnection
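To pull lines like these out of a longer log, a small filter can help. The sketch below assumes the log file holds one entry per line in the `[pid]: <n>ms SPX_TRACE_<LEVEL>: message` shape seen above; the function name `error_lines` is made up for illustration, and continuation lines (such as the `POST ...` line) are skipped.

```python
import re

# Matches entries such as "[32455]: 185ms SPX_TRACE_ERROR: ..."
LINE_RE = re.compile(r"^\[(\d+)\]:\s+(\d+)ms\s+(SPX_TRACE_\w+):\s*(.*)$")


def error_lines(log_text):
    """Return (elapsed_ms, message) for every SPX_TRACE_ERROR entry."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == "SPX_TRACE_ERROR":
            found.append((int(match.group(2)), match.group(4)))
    return found
```

Reading the log file into a string and passing it to `error_lines` would give the error entries in order, with their timestamps in milliseconds.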

There is an issue accessing the Certificate Revocation List (CRL) at http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl. I'm not sure if that's the source of the error, but please check whether you have access to this URL on your Linux machines. Does your environment have firewall rules that allow outbound calls to only specific CRL URLs?

This SDK shares the same networking stack as the Microsoft Azure Speech SDK. See this issue for Speech SDK: https://learn.microsoft.com/en-us/answers/questions/1192205/connection-failed-error-with-speech-sdk .

Also see this section of the document https://learn.microsoft.com/en-us/azure/security/fundamentals/tls-certificate-changes:

If you have an environment where firewall rules are set to allow outbound calls to only specific Certificate Revocation List (CRL) download and/or Online Certificate Status Protocol (OCSP) verification locations, you'll need to allow the following CRL and OCSP URLs. For a complete list of CRL and OCSP URLs used in Azure, see the Azure CA details article.

http://crl3.digicert.com
http://crl4.digicert.com
http://ocsp.digicert.com
http://crl.microsoft.com
http://oneocsp.microsoft.com
http://ocsp.msocsp.com
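One way to check whether the two CRL URLs from the log are reachable is a short script. This is a sketch only: `http_fetch` and `check_urls` are illustrative names, and the check goes through Python's own HTTP stack, so a success here does not prove that the SDK's networking library can reach the same URLs.

```python
import urllib.request

# The two CRL URLs that appear in the SDK log above.
CRL_URLS = [
    "http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl",
    "http://crl4.digicert.com/DigicertSHA2SecureServerCA-1.crl",
]


def http_fetch(url, timeout=10):
    """Download a URL and return the number of bytes received."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())


def check_urls(urls, fetch=http_fetch):
    """Map each URL to a short status string; connection errors are reported, not raised."""
    results = {}
    for url in urls:
        try:
            results[url] = f"ok ({fetch(url)} bytes)"
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            results[url] = f"failed ({exc})"
    return results
```

Calling `check_urls(CRL_URLS)` on the affected machine returns one status string per URL.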

MaurusGubser commented 9 months ago

Thanks for the clarification and your suggestions.

  1. I can download the CRL http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl on my machines. I tried it via browser and via curl, and both worked.
  2. Azure Speech SDK, see below.
  3. I have a firewall on my Ubuntu machine; however, when I check, its status is "inactive". So I think it should not interfere with the SDK. The same goes for the Debian machine.

There seems to be a problem like the one described in the Azure Speech SDK documentation (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-openssl-linux?pivots=programming-language-python). On my Ubuntu machine, I get:

$ openssl version -d
OPENSSLDIR: "/usr/local/ssl"

I should get OPENSSLDIR: "/usr/lib/ssl". The directory /usr/local/ssl/certs/ contains no files, while the directory /usr/lib/ssl/certs/ contains a lot of .pem files, which is desired. I tried to solve the problem by setting

$ export OPENSSLDIR=/usr/lib/ssl/certs
$ printenv | grep SSL
SSL_CERT_DIR=/etc/ssl/certs
OPENSSLDIR=/usr/lib/ssl/certs

as described in the documentation linked above. However, when running the quickstart.py script, I get the same error

 Analysis failed.
   Error reason: ImageAnalysisErrorReason.CONNECTION_FAILURE
   Error code: 3
   Error message: Failed with error: HTTPAPI_OPEN_REQUEST_FAILED [0x3 | 2550]

I'm not sure that setting the environment variable OPENSSLDIR=/usr/lib/ssl/certs is the correct solution, because when I check the openssl version afterwards, I get:

$ openssl version -d
OPENSSLDIR: "/usr/local/ssl"

Am I doing this wrong (fixing the OPENSSLDIR problem)?

NB: On the Debian 11 VM, the OPENSSLDIR is set correctly:

$ openssl version -d
OPENSSLDIR: "/usr/lib/ssl"

The directory /usr/lib/ssl/certs/ does contain lots of .pem files, but the quickstart.py also fails.
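The directory comparison above can be repeated with a few lines of Python. This is a rough sketch with illustrative helper names; it only counts files by extension, and the `ssl.get_default_verify_paths()` call reports the defaults of the OpenSSL build linked into the Python interpreter, which may differ from both the `openssl` command and whatever the SDK uses.

```python
import os
import ssl


def count_cert_files(directory):
    """Count entries ending in .pem or .crt in a directory; 0 if it cannot be read."""
    try:
        names = os.listdir(directory)
    except OSError:
        return 0
    return sum(1 for name in names if name.endswith((".pem", ".crt")))


def report_cert_dirs(directories):
    """Return {directory: number of certificate files} for each candidate."""
    return {directory: count_cert_files(directory) for directory in directories}


if __name__ == "__main__":
    # Defaults of the OpenSSL build linked into this Python interpreter.
    print(ssl.get_default_verify_paths())
    print(report_cert_dirs(["/usr/local/ssl/certs", "/usr/lib/ssl/certs", "/etc/ssl/certs"]))
```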

dargilco commented 9 months ago

@MaurusGubser Based on the documentation, it seems like you should be setting:
OPENSSLDIR=/usr/lib/ssl
And you need to clear this environment variable:
SSL_CERT_DIR=
since by default, certs are assumed to be in the certs folder under OPENSSLDIR. Have you tried that?

MaurusGubser commented 9 months ago

Sorry, maybe it is unclear what I did. In short, when I follow your suggestion, unset SSL_CERT_DIR, and export OPENSSLDIR=/usr/lib/ssl, the script fails with the same error. Three remarks:

  1. SSL_CERT_DIR is set based on the document https://github.com/Azure-Samples/azure-ai-vision-sdk/blob/main/docs/ubuntu2204-notes.md There, it is suggested to set
$ export SSL_CERT_DIR=/etc/ssl/certs
  2. I don't understand the OPENSSLDIR variable: is this just a normal environment variable? When I check with printenv, I get
$ printenv | grep SSL
OPENSSLDIR=/usr/lib/ssl

However, when I check with openssl, I get

$ openssl version -d
OPENSSLDIR: "/usr/local/ssl"
  3. I tried to solve the problem by creating a symbolic link from /usr/local/ssl/certs (which is empty) to /usr/lib/ssl/certs (where the .pem files are). But still, the execution of the script fails.
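As far as I know, `openssl version -d` prints the directory that was compiled into that particular `openssl` binary, so exporting a shell variable named OPENSSLDIR would not be expected to change its output. The sketch below, with illustrative function names, shows which `openssl` binary is found on PATH and what directory it reports. The binary's location can be informative, since an `openssl` installed outside the distribution's usual location may have been built with a different default directory.

```python
import re
import shutil
import subprocess


def parse_openssldir(output):
    """Extract the path from output such as: OPENSSLDIR: "/usr/local/ssl"."""
    match = re.search(r'OPENSSLDIR:\s*"([^"]*)"', output)
    return match.group(1) if match else None


def openssl_binary_and_dir():
    """Return (path of the openssl found on PATH, its built-in OPENSSLDIR)."""
    binary = shutil.which("openssl")
    if binary is None:
        return None, None
    result = subprocess.run([binary, "version", "-d"], capture_output=True, text=True)
    return binary, parse_openssldir(result.stdout)
```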

I'm a bit lost :)

dargilco commented 9 months ago

@MaurusGubser the next version of the SDK (due out in January or early February) will be based on a new networking library that should handle your issues better. For now, I suggest using a Python script to call the REST API directly instead of using the SDK, and parsing the resulting JSON response. You can adapt the example code below. Let me know if that works for you.

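import os

import requests

# Assumption: the endpoint and key are read from environment variables with
# these names; the original snippet left both values undefined.
azure_endpoint = os.environ["VISION_ENDPOINT"].rstrip("/")
azure_subscription_key = os.environ["VISION_KEY"]


def azure_image_desc(image_url):
    """Return a caption for the image at image_url, or the error body on failure."""
    url = f"{azure_endpoint}/computervision/imageanalysis:analyze?api-version=2023-10-01&features=caption"

    headers = {
        "Ocp-Apim-Subscription-Key": azure_subscription_key,
        "Content-Type": "application/json"
    }

    data = {
        "url": image_url
    }

    response = requests.post(url, headers=headers, json=data, timeout=30)

    if response.status_code == 200:
        # The caption is nested under captionResult in the JSON response.
        return response.json()['captionResult']['text']
    else:
        return response.text
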
MaurusGubser commented 9 months ago

@dargilco OK, thank you. Yes, the solution via the REST API works for me. I'm looking forward to the new version of the library.
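For parsing the JSON response returned by the REST API, something like the following could serve as a starting point. It is a sketch: the function name `summarize_analysis` is illustrative, and the field names (`captionResult`, `readResult`, `content`) are taken from the sample JSON in the original report, which used the 2023-02-01-preview API version; other API versions may structure the read result differently.

```python
def summarize_analysis(result):
    """Pull the caption and the recognized text lines out of a response dict."""
    caption = result.get("captionResult", {})
    read = result.get("readResult", {})
    return {
        "caption": caption.get("text"),
        "confidence": caption.get("confidence"),
        # The sample response joins recognized lines with "\n" in "content".
        "lines": read.get("content", "").splitlines(),
    }
```

Missing sections simply yield `None` or an empty list, so the same function can be used whether or not the read feature was requested.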