aws / aws-iot-device-sdk-cpp-v2

Next generation AWS IoT Client SDK for C++ using the AWS Common Runtime
Apache License 2.0

AWS_ERROR_SYS_CALL_FAILURE when building the MqttClientConnectionConfigBuilder #598

Closed sweta98 closed 4 weeks ago

sweta98 commented 1 year ago

Describe the bug

Hi, I'm upgrading our thing's SDK version from 1 to 2 and trying to set up the MQTT Client connection similar to the pub sub sample. We are getting the client config ready with the following:

Aws::Iot::MqttClientConnectionConfigBuilder::Build()

On debugging and stepping into the Build function, we found that it errors out here:

auto tlsContext = Crt::Io::TlsContext(m_contextOptions, Crt::Io::TlsMode::CLIENT, m_allocator);
if (!tlsContext)
{
    return MqttClientConnectionConfig::CreateInvalid(tlsContext.GetInitializationError());
}

We don't use BYO_CRYPTO, and metrics collection is set to false. We stepped into TLSOptions.cpp to find the exact error, and this is where we get AWS_ERROR_SYS_CALL_FAILURE:

if (mode == TlsMode::CLIENT)
{
    aws_tls_ctx *underlying_tls_ctx = aws_tls_client_ctx_new(allocator, &options.m_options);
    if (underlying_tls_ctx != nullptr)
    {
        m_ctx.reset(underlying_tls_ctx, aws_tls_ctx_release);
    }
}
else
{
    aws_tls_ctx *underlying_tls_ctx = aws_tls_server_ctx_new(allocator, &options.m_options);
    if (underlying_tls_ctx != nullptr)
    {
        m_ctx.reset(underlying_tls_ctx, aws_tls_ctx_release);
    }
}
if (!m_ctx)
{
    m_initializationError = Aws::Crt::LastErrorOrUnknown();
}

The mode is CLIENT, but execution never reaches m_ctx.reset(); instead we get error code 46 in the final if block that checks m_ctx. Can you please advise? Thanks.

Expected Behavior

I expected the MqttClientConnectionConfigBuilder to build successfully and establish a TLS connection.

Current Behavior

On building the MqttClientConnectionConfigBuilder using Build(), it errors out with AWS_ERROR_SYS_CALL_FAILURE.

Reproduction Steps

Here's the code snippet for the MQTT connection for our device:

IMqttClient::ResponseCode MqttClient::Connect(const string &endpoint, const vector<uint8_t> &privateKey,
   const vector<uint8_t> &certificate, const string &clientId, const boost::shared_ptr<LastWill_t> &lastWill,
   const Time &timeout)
{
   // Do the global initialization for the API move to MQTTInitilization
   ApiHandle apiHandle;
   ResponseCode result = MQTT_SUCCESS;
   Aws::Crt::String cert_pem = convertDerToPem(certificate);
   Aws::Crt::String key_pem = convertPrivateKeyDerToPem(privateKey);
   // convert certificate using ByteCursorFromString
   ByteCursor cert_b = ByteCursorFromString(cert_pem);
   ByteCursor key_b = ByteCursorFromString(key_pem);
   ByteCursor root_b = ByteCursorFromCString(ROOT_CERTIFICATES);
   auto stdTimeout = milliseconds(timeout.Convert(Time::MILLISECONDS));
      if (!m_certificate->Exists() || !m_key->Exists() || !m_root_certificates->Exists())
   {
      return MQTT_FAILURE;
   }

   // Create the MQTT builder and populate it
   Aws::Iot::MqttClientConnectionConfigBuilder clientConfigBuilder(cert_b, key_b);
   clientConfigBuilder.WithEndpoint(endpoint.c_str());
   clientConfigBuilder.WithMetricsCollection(false);
   // override port
  // clientConfigBuilder.WithPortOverride(443);
   clientConfigBuilder.WithCertificateAuthority(root_b);
   boost::optional<Aws::Iot::MqttClientConnectionConfig> clientConfig = clientConfigBuilder.Build();

   if (!clientConfig.get())
   {
      fprintf(
         stderr,
         "Client Configuration initialization failed with error %s\n",
         Aws::Crt::ErrorDebugString(clientConfig.get().LastError()));
      return MQTT_FAILURE;
   }
   Aws::Iot::MqttClient client = Aws::Iot::MqttClient();
   auto connection = client.NewConnection(clientConfig.get());
   if (!*connection)
   {
      fprintf(stderr, "Connection creation failed with error %s\n", Aws::Crt::ErrorDebugString(connection->LastError()));
      return MQTT_FAILURE;
   }        

   if(lastWill != nullptr)
   {

      bool retained = lastWill->retained;
      IOT::IMqttClient::QoS qos = lastWill->qos;
      const char* topic = lastWill->topic.c_str();
      Aws::Crt::ByteBuf payload = Aws::Crt::ByteBufFromArray(lastWill->payload.data(), lastWill->payload.size() );
      connection->SetWill(topic, ConvertFromQos(qos), retained, payload);
   }
   std::promise<bool> connectionCompletedPromise;
   // Invoke when MQTT connection was interrupted
   auto onInterrupted = [&](Aws::Crt::Mqtt::MqttConnection &, int error) {
      //fprintf(stdout, "Connection interrupted with error %s\n", Aws::Crt::ErrorDebugString(error));
      HandleDisconnect(error);
   };
   // Invoked when a MQTT connection was interrupted/lost, but then reconnected successfully
   auto onResumed = [&](Aws::Crt::Mqtt::MqttConnection &, Aws::Crt::Mqtt::ReturnCode, bool) {
      result = MQTT_SUCCESS;
   };

 // Invoked when a MQTT connect has completed or failed
 auto onConnectionCompleted =
        [&](Aws::Crt::Mqtt::MqttConnection &, int errorCode, Aws::Crt::Mqtt::ReturnCode returnCode, bool) {
            if (errorCode == AWS_ERROR_SUCCESS && returnCode == AWS_MQTT_CONNECT_ACCEPTED)
            {

                connectionCompletedPromise.set_value(true);
            }
            else
            {

                connectionCompletedPromise.set_value(false);
            }
        };
   // Assign callbacks
   connection->OnConnectionInterrupted = std::move(onInterrupted);
   connection->OnConnectionResumed = std::move(onResumed);
   connection->OnConnectionCompleted = std::move(onConnectionCompleted);

   // connect
   if(!connection->Connect(clientId.c_str(), true, uint16_t(stdTimeout.count())))
   {
      printf("Failed to connect\n");
      return MQTT_FAILURE;
   }

   // Wait for the OnConnectionCompleted callback to fire, which sets connectionCompletedPromise.
   // get_future() may be called only once per promise, so keep the future in a local variable.
   auto connectionCompletedFuture = connectionCompletedPromise.get_future();
   if (connectionCompletedFuture.wait_for(stdTimeout) != std::future_status::ready)
   {
      printf("Timed out waiting for connection\n");
      return MQTT_FAILURE;
   }
   if (!connectionCompletedFuture.get())
   {
      fprintf(stderr, "Connection failed\n");
      return MQTT_FAILURE;
   }
   if (result != MQTT_SUCCESS)
   {
      connection->Disconnect();
   }

   return result;
}

Error is thrown in this segment:

boost::optional<Aws::Iot::MqttClientConnectionConfig> clientConfig = clientConfigBuilder.Build();
   if (!clientConfig.get())
   {
      fprintf(
         stderr,
         "Client Configuration initialization failed with error %s\n",
         Aws::Crt::ErrorDebugString(clientConfig.get().LastError()));
      return MQTT_FAILURE;
   }

Possible Solution

No response

Additional Information/Context

No response

SDK version used

2

Environment details (OS name and version, etc.)

Windows

yasminetalby commented 1 year ago

Hello @sweta98 ,

Thank you very much for your submission.

As a first guess, I think this might be an issue with the way you are providing the certificate.

Could you please provide the following information:

  1. The log associated with the behavior. To enable the log, you will need to call InitializeLogging. You can look at our integration test for a good example of how this function works.

  2. SDK version being used (I am assuming that you are using the latest version of aws-iot-device-sdk-cpp-v2 released as of today: v1.24.3).

In the meantime, I'll attempt to reproduce the issue.

Thank you very much for your time and collaboration. Sincerely, Best regards,

Yasmine

sweta98 commented 1 year ago

Hi Yasmine,

  1. Please find the logs:

    [INFO] [2023-07-06T15:52:26Z] [00001700] [pki-utils] - static: loading certificate chain with 1 certificates.
    Loaded 'C:\Windows\SysWOW64\ncryptprov.dll'. 
    Loaded 'C:\Windows\SysWOW64\profapi.dll'. 
    [ERROR] [2023-07-06T15:52:26Z] [00001700] [pki-utils] - static: failed to import ecc key with status -2146893783, last error 0
    [ERROR] [2023-07-06T15:52:26Z] [00001700] [tls-handler] - static: failed to import certificate and private key with error 46.
    Client Configuration initialization failed with error aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: loading custom CA file.
    [INFO] [2023-07-06T15:52:28Z] [00001700] [pki-utils] - static: loading 3 certificates in cert chain for use as a CA
    [DEBUG] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: certificate and key have been set, setting them up now.
    [INFO] [2023-07-06T15:52:28Z] [00001700] [pki-utils] - static: loading certificate chain with 1 certificates.
    Client Configuration initialization failed with error aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure
    [ERROR] [2023-07-06T15:52:28Z] [00001700] [pki-utils] - static: failed to import ecc key with status -2146893783, last error 0
    [ERROR] [2023-07-06T15:52:28Z] [00001700] [tls-handler] - static: failed to import certificate and private key with error 46.
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: loading custom CA file.
    [INFO] [2023-07-06T15:52:32Z] [00001700] [pki-utils] - static: loading 3 certificates in cert chain for use as a CA
    [DEBUG] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: certificate and key have been set, setting them up now.
    [INFO] [2023-07-06T15:52:32Z] [00001700] [pki-utils] - static: loading certificate chain with 1 certificates.
    Client Configuration initialization failed with error aws-c-common: AWS_ERROR_SYS_CALL_FAILURE, System call failure
    [ERROR] [2023-07-06T15:52:32Z] [00001700] [pki-utils] - static: failed to import ecc key with status -2146893783, last error 0
    [ERROR] [2023-07-06T15:52:32Z] [00001700] [tls-handler] - static: failed to import certificate and private key with error 46.
    [DEBUG] [2023-07-06T15:52:40Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:40Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:40Z] [00001700] [tls-handler] - static: This library was built with Windows 8.1 or later, probing OS to see what we're actually running on.
    [DEBUG] [2023-07-06T15:52:40Z] [00001700] [tls-handler] - static: We're running on Windows 8.1 or later. ALPN is available.
    [DEBUG] [2023-07-06T15:52:40Z] [00001700] [tls-handler] - static: loading custom CA file.

    It's an issue with the certificate and key. We have our certificate and key as const vectors, and they are in DER format. I am using Mbed TLS to convert them to PEM. Here are the helper functions:

    #include <mbedtls/pem.h>
    #include <mbedtls/pk.h>
    #include <mbedtls/x509_crt.h>
    #include <vector>

    Aws::Crt::String convertPrivateKeyDerToPem(const std::vector<uint8_t>& derPrivateKey)
    {
        mbedtls_pk_context pk;
        mbedtls_pk_init(&pk);

        // Parse the DER-encoded key (no password). This is the five-argument Mbed TLS 2.x signature.
        int ret = mbedtls_pk_parse_key(&pk, derPrivateKey.data(), derPrivateKey.size(), nullptr, 0);
        if (ret != 0)
        {
            mbedtls_pk_free(&pk);  // release the context on every exit path
            return "";
        }

        // std::vector releases its storage automatically, so no delete[] is needed on any path.
        std::vector<unsigned char> buffer(8192);
        ret = mbedtls_pk_write_key_pem(&pk, buffer.data(), buffer.size());
        mbedtls_pk_free(&pk);
        if (ret != 0)
        {
            return "";
        }

        // mbedtls_pk_write_key_pem NUL-terminates its output, so it can be read as a C string.
        return Aws::Crt::String(reinterpret_cast<char*>(buffer.data()));
    }

    Aws::Crt::String convertDerToPem(const std::vector<uint8_t>& derCertificate)
    {
        // Parse only to validate that the input is a well-formed DER certificate.
        mbedtls_x509_crt crt;
        mbedtls_x509_crt_init(&crt);
        int ret = mbedtls_x509_crt_parse_der(&crt, derCertificate.data(), derCertificate.size());
        mbedtls_x509_crt_free(&crt);
        if (ret != 0)
        {
            return "";
        }

        unsigned char buffer[8192];  // Buffer size can be adjusted as needed
        size_t len = 0;
        if (mbedtls_pem_write_buffer("-----BEGIN CERTIFICATE-----\n", "-----END CERTIFICATE-----\n",
                derCertificate.data(), derCertificate.size(), buffer, sizeof(buffer), &len) != 0)
        {
            return "";
        }

        // len counts the terminating '\0' written by mbedtls_pem_write_buffer; exclude it
        // so the returned string holds only the PEM text.
        return Aws::Crt::String(reinterpret_cast<char*>(buffer), len > 0 ? len - 1 : 0);
    }
  2. The SDK version is v1.24.3

Can you advise what might be going wrong and how we can fix it? Version 1 had Mbed TLS support; does version 2 have it too? Thank you in advance!
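
Because the PEM text here is generated in code rather than read from a file, one cheap check before handing it to the builder is that each string starts with the expected header, ends with the matching footer, and contains no embedded NUL bytes (these are easy to introduce when a string is constructed from a buffer plus an explicit length). The following is a minimal sketch using only the standard library. DescribePemProblem is a hypothetical helper, not part of the SDK or Mbed TLS, and the label "EC PRIVATE KEY" assumes a SEC1 key as written by mbedtls_pk_write_key_pem.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an SDK or Mbed TLS function).
// Returns an empty string if `pem` looks structurally sound,
// otherwise a short description of the first problem found.
std::string DescribePemProblem(const std::string& pem, const std::string& label) {
    const std::string header = "-----BEGIN " + label + "-----";
    const std::string footer = "-----END " + label + "-----";

    // A NUL inside the string usually means a length that included the terminator.
    if (pem.find('\0') != std::string::npos) {
        return "embedded NUL byte";
    }
    if (pem.compare(0, header.size(), header) != 0) {
        return "missing header";
    }
    const std::size_t end = pem.rfind(footer);
    if (end == std::string::npos) {
        return "missing footer";
    }
    // Only line endings may follow the footer.
    if (pem.find_first_not_of("\r\n", end + footer.size()) != std::string::npos) {
        return "trailing data after footer";
    }
    return "";
}
```

This only checks the framing; it does not validate the base64 body or the key itself.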

yasminetalby commented 1 year ago

Hello @sweta98 ,

Thank you very much for this information and for your collaboration. Would you be able to step all the way into the failing call?

Best regards,

Yasmine

sweta98 commented 1 year ago

Hello @yasminetalby, Thank you for your response! I was able to step in all the way through, this is where the error occurs - aws-iot-device-sdk-cpp-v2\src\crt\aws-crt-cpp\crt\aws-c-io\source\windows\secure_channel_tls_handler.c

int err = aws_import_key_pair_to_cert_context(
    alloc,
    &cert_chain_cur,
    &pk_cur,
    is_client_mode,
    &secure_channel_ctx->cert_store,
    &secure_channel_ctx->pcerts,
    &secure_channel_ctx->crypto_provider,
    &secure_channel_ctx->private_key);

if (err) {
    AWS_LOGF_ERROR(
        AWS_LS_IO_TLS, "static: failed to import certificate and private key with error %d.", aws_last_error());
    goto clean_up;
}

Stepping into the aws_import_key_pair_to_cert_context function: certificates seem to get added correctly, and the private key type is AWS_CT_X509_ECC. The error comes from the following call in the function s_cert_context_import_ecc_private_key, in the file aws-iot-device-sdk-cpp-v2\src\crt\aws-crt-cpp\crt\aws-c-io\source\windows\windows_pki_utils.c:

status = NCryptImportKey(
    crypto_prov,
    0,
    BCRYPT_ECCPRIVATE_BLOB,
    &ncBufDesc,
    &h_key,
    (BYTE *)key_blob,
    key_blob_size,
    NCRYPT_OVERWRITE_KEY_FLAG);

if (status != ERROR_SUCCESS) {
    AWS_LOGF_ERROR(
        AWS_LS_IO_PKI,
        "static: failed to import ecc key with status %d, last error %d",
        status,
        (int)GetLastError());
    aws_raise_error(AWS_ERROR_SYS_CALL_FAILURE);
    goto done;
}

Just wanted to make sure that there is support for Windows ECC key import.
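
The status in the log above is printed with %d, which makes it hard to look up. Reinterpreted as an unsigned 32-bit value, -2146893783 is 0x80090029, which winerror.h defines as NTE_NOT_SUPPORTED ("The requested operation is not supported"). That points at the provider rejecting the key blob it was given, although the thread does not establish why. A small sketch of the conversion, using only the standard library (StatusToHex is a hypothetical helper name):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a signed status code (as logged with %d) as the 32-bit hex
// value used in Windows headers such as winerror.h.
std::string StatusToHex(std::int32_t status) {
    char buf[11];  // "0x" + 8 hex digits + terminating NUL
    // Converting a negative int32_t to uint32_t is well-defined (modulo 2^32).
    const unsigned int bits = static_cast<unsigned int>(static_cast<std::uint32_t>(status));
    std::snprintf(buf, sizeof(buf), "0x%08X", bits);
    return std::string(buf);
}
```

For example, StatusToHex(-2146893783) yields "0x80090029".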

yasminetalby commented 1 year ago

Hello @sweta98 ,

Thank you very much for providing this information. We do support Windows ECC key import (see PR).

I'll bring this up to the team, this is unexpected behavior. Thank you very much for your time and collaboration.

Sincerely,

Yasmine

sweta98 commented 1 year ago

Hi @yasminetalby Just wanted to follow up on the issue above. I was wondering if there's any progress or if I can provide any additional information to help move it forward.

yasminetalby commented 1 year ago

Hello @sweta98 ,

Thank you very much for reaching out. We don't currently need more information. While this issue is being investigated, I would like to suggest some potential workarounds:

  1. Create an AWS IoT certificate using the AWS IoT console (See)
  2. Create an AWS IoT certificate using the CLI (See)
  3. Create your own client certificates (See developer guide documentation and JITP blog post)

We do provide documentation for certificate generation in the IoT Core developer guide : https://docs.aws.amazon.com/iot/latest/developerguide/x509-client-certs.html

Best regards,

Yasmine

sweta98 commented 1 year ago

Hi @yasminetalby Thanks for providing the suggestions. We created a new certificate and key using the AWS Console. The key was an RSA key, and that worked: we were able to connect successfully. It seems the issue happens when the private key is an ECC key. But given the nature of our devices, we need to stick with ECC, as it's more efficient.

Hope this provides more context to the team.

Thanks and regards Sweta

xiazhvera commented 1 year ago

Hi @sweta98, would you mind providing a test ECC cert/key that you are using, or describing how you generate the key? I tested with our ECC key (generated using openssl commands), and it seems to work. However, we didn't test with mbedtls, which could be related. How about trying a PEM-format ECC key directly? That would help eliminate any potential issues introduced by mbedtls, so it's worth exploring as an alternative approach.

sweta98 commented 1 year ago

@xiazhvera Here's a test certificate and key:

CERTIFICATE : "MIICjjCCAXagAwIBAgIUSu8ioQgKL3zLEggWtF5xyGzyQq4wDQYJKoZIhvcNAQELBQAwTTFLMEkGA1UECwxCQW1hem9uIFdlYiBTZXJ2aWNlcyBPPUFtYXpvbi5jb20gSW5jLiBMPVNlYXR0bGUgU1Q9V2FzaGluZ3RvbiBDPVVTMB4XDTIzMDcxMjE2NTMyMVoXDTQ5MTIzMTIzNTk1OVowHjEcMBoGA1UEAxMTQVdTIElvVCBDZXJ0aWZpY2F0ZTBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABE1HvTjFlRcFsP+z3plhxgeaktvJUyiIssIDFWiFjAKIU+Z608oMqG7MH8v3PVQO9pwRHheU5kCQHUW2yKivCsajYDBeMB8GA1UdIwQYMBaAFHpUFtUUHy8lmsQNDKXVDbkvwhNIMB0GA1UdDgQWBBQ25aSILhXp8QRCI4xPTUnI/TEGYjAMBgNVHRMBAf8EAjAAMA4GA1UdDwEB/wQEAwIHgDANBgkqhkiG9w0BAQsFAAOCAQEAHq6ehtXuszQew3oIXQ8cj58Uet8IQroe33ySWDZBk8kgwwJsPo7ed5vjSO1pag+mZNJnqna7r3Ri2zsGEXRIMGVmeMuaU2fgsfLYf99J3JxVy3X4CfrEIJtTc1T6QJaVcGoc1CT9ErGA/+3rzQ+1bdTlIFdrpe8rMflxrILX0sNTdv6Hmkyliz8VHcqRGsbnH5NWapjHl/CgaQo+E6RA7cLaOt//jVvMfmeX1Ct2oQJ/qA2esxUv91XgjW5ldPhHlNe3s8q/f/nRiYZBvegw0Tr4CMvsmCAHaJ2nxJ3QEXArvR6onfVXWr4n8wHnVhqnwoCCFJVgT297nD0ifjcEmA=="

PRIVATE_KEY : "MHgCAQEEIQC+e/hEsd6woNRNkZhwmA8kbP3UvwUWJioFJDcLdMZu2KAKBggqhkjOPQMBB6FEA0IABE1HvTjFlRcFsP+z3plhxgeaktvJUyiIssIDFWiFjAKIU+Z608oMqG7MH8v3PVQO9pwRHheU5kCQHUW2yKivCsY="

We need to store the key and certificate in DER format, so I have a helper function that fetches the files and converts them to PEM before passing them to the MqttClientConnectionConfigBuilder.
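
One detail of the PRIVATE_KEY above can be checked by hand. Its base64 decodes to bytes beginning 30 78 02 01 01 04 21 00 BE ..., that is: a SEQUENCE, INTEGER version 1, then an OCTET STRING of length 0x21 = 33 whose first byte is 00. RFC 5915 specifies the privateKey octet string as ceil(log2(n)/8) octets, which is 32 for P-256 (the curve the embedded parameters appear to name), so this encoding carries one extra leading zero byte. Whether that is what NCryptImportKey rejects is not established in this thread; it is offered only as something to compare against a key that imports successfully. A minimal sketch of the length check, assuming a SEC1 ECPrivateKey in DER (Sec1PrivateKeyLength and ReadDerLength are hypothetical helper names, not SDK or Mbed TLS functions):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a DER length field at `pos` and advances `pos` past it.
// Returns false if the input is truncated or the length is unsupported.
static bool ReadDerLength(const std::vector<std::uint8_t>& der, std::size_t& pos, std::size_t& len) {
    if (pos >= der.size()) {
        return false;
    }
    const std::uint8_t first = der[pos++];
    if ((first & 0x80) == 0) {  // short form: the byte is the length
        len = first;
        return true;
    }
    const std::size_t count = first & 0x7F;  // long form: number of length bytes
    if (count == 0 || count > sizeof(std::size_t) || pos + count > der.size()) {
        return false;
    }
    len = 0;
    for (std::size_t i = 0; i < count; ++i) {
        len = (len << 8) | der[pos++];
    }
    return true;
}

// Returns the declared length of the privateKey OCTET STRING in a SEC1
// ECPrivateKey (RFC 5915), or -1 if the header does not match that layout.
// Only the header is inspected; the key bytes themselves are not validated.
long Sec1PrivateKeyLength(const std::vector<std::uint8_t>& der) {
    std::size_t pos = 0;
    std::size_t len = 0;
    if (der.empty() || der[pos++] != 0x30) {  // outer SEQUENCE
        return -1;
    }
    if (!ReadDerLength(der, pos, len)) {
        return -1;
    }
    // version INTEGER, which must be 1 (ecPrivkeyVer1)
    if (pos + 3 > der.size() || der[pos] != 0x02 || der[pos + 1] != 0x01 || der[pos + 2] != 0x01) {
        return -1;
    }
    pos += 3;
    if (pos >= der.size() || der[pos++] != 0x04) {  // privateKey OCTET STRING
        return -1;
    }
    if (!ReadDerLength(der, pos, len)) {
        return -1;
    }
    return static_cast<long>(len);
}
```

Fed the decoded key above, this returns 33; a P-256 key encoded per RFC 5915 would return 32.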

xiazhvera commented 1 year ago

Hi @sweta98, I tested with the key you provided, but the OS failed to import the credentials. It seems that the key was not converted correctly. From a quick search online, it seems that you need to set up the key as ECC when parsing it. Check out the post here: https://stackoverflow.com/questions/54764933/parse-a-ecc-private-key-buffer. As I'm not familiar with the mbedtls library, let me know if it helps. Meanwhile, it would help if you could tell us which tool/library you used to generate the DER credentials so that we can try it out on our side.

sweta98 commented 1 year ago

Hi @xiazhvera Apologies for the late response. Thank you for providing the link, will try that and get back to you on this. The credentials are generated using the CreateCertificateFromCsrRequest() from the SDK

sweta98 commented 1 year ago

@xiazhvera We are already parsing the key the way the Stack Overflow post describes, and we have validated the ECC key. Following up on the discussion above, I hope the information I provided is helpful. Let me know if there is anything else I need to provide.

jmklix commented 1 month ago

@sweta98 can you provide a quick summary of how you are creating your ECC key? We've been able to successfully create a working ECC key, so I would like to make sure we're trying to reproduce this error the same way you are triggering it.

github-actions[bot] commented 1 month ago

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.