Infineon / amazon-freertos

IoT operating system for microcontrollers. https://aws.amazon.com/freertos/
MIT License

Optiga Trust-M Reconnect memory leak #10

Open eschastlivtsev opened 4 years ago

eschastlivtsev commented 4 years ago

Description Hi, we ran into an issue while implementing an MQTT reconnection feature with TLS enabled (client role) and Optiga Trust-M. When we detect an MQTT disconnection, we delete the MQTT Agent, create a new one, and try to connect it. This works for the first few connection losses, but at some point mbedtls_ssl_handshake() (\libraries\3rdparty\mbedtls\library\ssl_tls.c) starts returning -0x4E80 (MBEDTLS_ERR_ECP_FEATURE_UNAVAILABLE), and the console shows "ERROR: TLS_Connect failed with error code -20096".

We dug into it and found that the problem occurs at the MBEDTLS_SSL_CLIENT_KEY_EXCHANGE step (mbedtls_ssl_handshake_client_step() in \libraries\3rdparty\mbedtls\library\ssl_cli.c). In the call chain ssl_write_client_key_exchange -> mbedtls_ecdh_make_public -> ecdh_make_public_internal -> mbedtls_ecdh_gen_public -> optiga_crypt_create -> optiga_cmd_create, the optiga_cmd_queue_get_count_of() call starts returning 0, so the check

if (0 == optiga_cmd_queue_get_count_of(g_optiga_list[optiga_instance_id], OPTIGA_CMD_QUEUE_SLOT_STATE, OPTIGA_CMD_QUEUE_NOT_ASSIGNED))

succeeds and the function exits with an error, without creating a new cmd.

This means that after some number of reconnections, the cmd array p_optiga->optiga_cmd_execution_queue[index] fills up. Our guess is that the memory allocated for commands is not freed when connections are destroyed. However, we do not see an API in the Trust-M library to do that. If one exists, please point us to it.
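To make the suspected failure mode concrete, here is a minimal sketch of a fixed-size slot queue that is only ever claimed and never released. All `demo_` names and the slot count are hypothetical stand-ins chosen for illustration; they are not the Optiga library's definitions, and the real queue may behave differently in detail.

```c
#include <stddef.h>

/* Hypothetical slot count; the real queue size is not stated in this issue. */
#define DEMO_QUEUE_SLOTS 6

typedef enum {
    DEMO_SLOT_NOT_ASSIGNED = 0,
    DEMO_SLOT_ASSIGNED
} demo_slot_state_t;

typedef struct {
    demo_slot_state_t slot[DEMO_QUEUE_SLOTS];
} demo_queue_t;

/* Count the slots that are currently in the given state. */
size_t demo_queue_count_of(const demo_queue_t *q, demo_slot_state_t state)
{
    size_t count = 0;
    for (size_t i = 0; i < DEMO_QUEUE_SLOTS; i++) {
        if (q->slot[i] == state) {
            count++;
        }
    }
    return count;
}

/* Claim a free slot. Returns its index, or -1 when no slot is unassigned. */
int demo_cmd_create(demo_queue_t *q)
{
    if (0 == demo_queue_count_of(q, DEMO_SLOT_NOT_ASSIGNED)) {
        return -1; /* same shape as the early exit described above */
    }
    for (size_t i = 0; i < DEMO_QUEUE_SLOTS; i++) {
        if (q->slot[i] == DEMO_SLOT_NOT_ASSIGNED) {
            q->slot[i] = DEMO_SLOT_ASSIGNED;
            return (int)i;
        }
    }
    return -1;
}

/* Number of successful creates before exhaustion when nothing is released. */
int demo_creates_until_exhausted(void)
{
    demo_queue_t q = { { DEMO_SLOT_NOT_ASSIGNED } };
    int ok = 0;
    while (demo_cmd_create(&q) >= 0) {
        ok++;
    }
    return ok;
}
```

In this model, creation succeeds exactly as many times as there are slots and then fails on every later call, which matches the "works a few times, then always fails" pattern we observe.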

We also have an idea of how to fix the problem.

Possible Solution We noticed some non-obvious behavior in the mbedtls_ecdh_gen_public() function in the "vendors\infineon\secure_elements\optiga_trust_m\examples\mbedtls_port\trustm_ecdh.c" file. Memory is allocated there with me = optiga_crypt_create(0, optiga_crypt_event_completed, NULL); (which also creates a cmd). The me pointer is not freed in mbedtls_ecdh_gen_public(), and it is not saved to any global variable either, so once the function returns it is no longer possible to free this memory or to destroy the cmd.

The same behavior can be seen in the mbedtls_ecdh_compute_shared() function in the same file.

We also looked at the neighboring file, "vendors\infineon\secure_elements\optiga_trust_m\examples\mbedtls_port\trustm_ecdsa.c". There, in what looks like a similar situation (we do not know the library in depth, so this is our assumption), the functions mbedtls_ecdsa_sign() and mbedtls_ecdsa_verify() destroy the instance in their cleanup section before returning:

    if (me != NULL)
    {
        optiga_crypt_destroy(me);
    }

We added the same cleanup to the mbedtls_ecdh_gen_public() and mbedtls_ecdh_compute_shared() functions, and it fixed our issue. We did not notice any side effects. With this fix, we can now perform any number of reconnections.
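The shape of that change can be sketched as follows. This is not the actual trustm_ecdh.c code: demo_crypt_create() and demo_crypt_destroy() are hypothetical stand-ins for optiga_crypt_create() and optiga_crypt_destroy(), backed by a small mock pool with an assumed size, and the simulate_failure parameter only exists to exercise the error path.

```c
#include <stddef.h>

#define DEMO_POOL_SIZE 4 /* assumed pool size, for illustration only */

typedef struct {
    int in_use;
} demo_crypt_t;

/* Mock instance pool standing in for the secure element's command slots. */
static demo_crypt_t demo_pool[DEMO_POOL_SIZE];

/* Hypothetical stand-in for optiga_crypt_create(): NULL when the pool is full. */
demo_crypt_t *demo_crypt_create(void)
{
    for (size_t i = 0; i < DEMO_POOL_SIZE; i++) {
        if (!demo_pool[i].in_use) {
            demo_pool[i].in_use = 1;
            return &demo_pool[i];
        }
    }
    return NULL;
}

/* Hypothetical stand-in for optiga_crypt_destroy(): releases the slot. */
void demo_crypt_destroy(demo_crypt_t *me)
{
    me->in_use = 0;
}

/* Count how many pool entries are currently held. */
int demo_slots_in_use(void)
{
    int used = 0;
    for (size_t i = 0; i < DEMO_POOL_SIZE; i++) {
        used += demo_pool[i].in_use;
    }
    return used;
}

/* Every exit path, success or failure, passes through the cleanup label. */
int demo_gen_public(int simulate_failure)
{
    int ret = -1;
    demo_crypt_t *me = demo_crypt_create();

    if (me == NULL) {
        goto cleanup;
    }
    if (simulate_failure) {
        goto cleanup; /* error path still releases the instance */
    }
    ret = 0;

cleanup:
    if (me != NULL) {
        demo_crypt_destroy(me);
    }
    return ret;
}
```

With the cleanup in place, calling demo_gen_public() far more often than DEMO_POOL_SIZE leaves no slots held, which mirrors what we see after the fix.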

Please tell us if we are wrong and simply missed some config or API that frees the memory allocated for Optiga after the TLS connection is destroyed.

System information

Our code for reconnection:

    if (xMQTTHandle != NULL){
        /* Disconnect from the MQTT broker */
        xMqttRC = MQTT_AGENT_Disconnect(xMQTTHandle, mqtttaskMQTT_TIMEOUT);

        if( xMqttRC == eMQTTAgentSuccess ) {
            xMqttRC = MQTT_AGENT_Delete(xMQTTHandle);
        }

        if( xMqttRC == eMQTTAgentSuccess ) {
            xMQTTHandle = NULL;
        }
    }

    if(xMqttRC == eMQTTAgentSuccess) {

        BaseType_t xStatus = prvMqttAgentStartAndConnect();
    }

The prvMqttAgentStartAndConnect() function calls MQTT_AGENT_Create() and MQTT_AGENT_Connect(). The issue occurs inside MQTT_AGENT_Connect().

Thank you!