aws / amazon-freertos

DEPRECATED - See README.md
https://aws.amazon.com/freertos/
MIT License

[BUG] pPublishInfo->payloadLength changes after call to sendPacket #3522

Closed taherrera closed 1 year ago

taherrera commented 1 year ago

Hi,

I am currently porting MQTT Core to an EC25 modem. I can connect to a Mosquitto MQTT broker without issues, but I get this error when publishing:

Transport send failed for PUBLISH payload.

While debugging, I noticed that my transport send interface sends exactly what it is told to, reports no errors, and returns the number of bytes sent.

So I went to core_mqtt.c and added two printf calls, one before and one after the call to sendPacket:

            printf(  "pPublishInfo->payloadLength = %d ..\n" , pPublishInfo->payloadLength );
            bytesSent = sendPacket( pContext,
                                    pPublishInfo->pPayload,
                                    pPublishInfo->payloadLength );

            printf(  "pPublishInfo->payloadLength = %d ..\n" , pPublishInfo->payloadLength );

And the output of the program is:

pPublishInfo->payloadLength = 5 ..
pPublishInfo->payloadLength = 6 ..

The function I am calling is:

int aws_mqtt_publish(const char * msg, const char * topic){
  printf("Atempt to send : \"%s\", len=%d\n", msg, strlen(msg));
  MQTTPublishInfo_t pub = {.qos = MQTTQoS1, .retain = 0, .dup = 0, .pTopicName=topic, .topicNameLength=strlen(topic), .pPayload=msg, .payloadLength=strlen(msg)};
  MQTTStatus_t mqtt_status = MQTT_Publish( &pContext, &pub, MQTT_GetPacketId(&pContext) );
  if (mqtt_status != MQTTSuccess){
    console_printf("[E] mqtt_conn.c aws_mqtt_publish: error: %d\n", mqtt_status);
    // Process the error.. disconnect maybe ?
    return -1;
  }
  return 0;
}

And I am calling it with these arguments:

aws_mqtt_publish("test", "tecnocal/sensor")

Printf output: Atempt to send : "test", len=4

This is the function I am using to send data over the modem:

static int32_t transport_send(NetworkContext_t *pNetworkContext, const void *pBuffer, size_t bytesToSend){
  pNetworkContext->tx_num++;
  int res = ec25_tcp_send(pBuffer, bytesToSend);
  if (res){
    console_printf("[E] platform.c platform_init: Unable to send tcp err %d\n", res);
    return -1;
  }

  printf("sent %d bytes\n", bytesToSend);
  return bytesToSend;
}

Output of printf() : sent 5 bytes

Expected behavior

pPublishInfo->payloadLength should not change.

Screenshots or console output

Atempt to send : "test", len=4
sent 21 bytes
pPublishInfo->payloadLength = 5
sent 5 bytes
pPublishInfo->payloadLength = 6
5 2703 [main] [core_mqtt.c:1435] [ERROR] [MQTT] Transport send failed for PUBLISH payload.

ravibhagavandas commented 1 year ago

Since MQTTPublishInfo_t is allocated on the stack, it is possible that an unbounded write somewhere is corrupting the stack at that location. Could you try

taherrera commented 1 year ago

Thanks,

I moved MQTTPublishInfo_t pub off the stack by declaring it static. Now this is what I get:

Guru Meditation Error: Core  0 panic'ed (Double exception). 

Core  0 register dump:
PS      : 0x00050936  A0      : 0x800d1b67  A1      : 0x3fffff9c  
A2      : 0x4000000c  A3      : 0x00000000  A4      : 0x003ffbdb  A5      : 0x00403860  
A6      : 0x1049c500  A7      : 0xabffffff  A8      : 0x800d1cd5  A9      : 0x49f53049  
A10     : 0x00000000  A11     : 0x3ffbdb60  A12     : 0x1049c500  A13     : 0xe52049d5  
A14     : 0x00000000  A15     : 0xff000000  SAR     : 0x00000016  EXCCAUSE: 0x00000002  
EXCVADDR: 0x40000050  LBEG    : 0x400da114  LEND    : 0x400da131  LCOUNT  : 0x00000009  
0x400da114: findInRecord at /home/tom/Documents/SLT/tecnocal/instacrop/fw/instacrop_v1/build/../amazon-freertos/libraries/coreMQTT/source/core_mqtt_state.c:434
 (inlined by) MQTT_UpdateStatePublish at /home/tom/Documents/SLT/tecnocal/instacrop/fw/instacrop_v1/build/../amazon-freertos/libraries/coreMQTT/source/core_mqtt_state.c:875

0x400da131: MQTT_UpdateStatePublish at /home/tom/Documents/SLT/tecnocal/instacrop/fw/instacrop_v1/build/../amazon-freertos/libraries/coreMQTT/source/core_mqtt_state.c:849

Backtrace:0x400803bd:0x3fffff9c |<-CORRUPTED
0x400803bd: _UserExceptionVector at ??:?

I also lowered the buffer sizes in my functions that are being called by the MQTT API (send, receive), but this error remains.

I also added uxTaskGetStackHighWaterMark(NULL) to the function that exchanges data with the modem via uart_write_bytes and uart_read_bytes, and it always returns 1552 (I suspect these are the most stack-intensive calls). If I increase the size of the buffers on the stack, I can see that value decrease.

Also, how do I set the value of configCHECK_FOR_STACK_OVERFLOW? I have used FreeRTOS before, where it is easy to do via idf.py menuconfig → Component config → FreeRTOS → Check for stack overflow, but I do not know how to do this in amazon-freertos.

I also changed the debug level, but it does not show any relevant information as far as I can tell.

taherrera commented 1 year ago

OK, I think it is solved. The issue was that my NetworkContext_t was a local (stack) variable inside a function rather than a static one, so its stack memory was released when that function returned.

Thanks for the help !
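
For readers who hit the same symptom, here is a minimal sketch of the lifetime problem described above. The types are hypothetical stand-ins, not the real coreMQTT definitions, and transport_setup and demo_send are illustrative names that do not appear in the original code. The assumption is that the library keeps only a pointer to the network context, so the context must outlive the function that sets it up. If the context were a plain local, a later tx_num++ through the stale pointer would write into whatever now occupies that stack slot, which would be consistent with payloadLength growing by one on each send.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the application/library types (illustration only). */
typedef struct NetworkContext
{
    uint32_t tx_num; /* mirrors the counter incremented in transport_send above */
} NetworkContext_t;

typedef struct TransportInterface
{
    NetworkContext_t * pNetworkContext; /* stored by pointer, never copied */
} TransportInterface_t;

/* BROKEN pattern (shown as a comment on purpose, do not use):
 *
 *   void transport_setup_broken( TransportInterface_t * pTransport )
 *   {
 *       NetworkContext_t ctx = { 0 };        // automatic storage
 *       pTransport->pNetworkContext = &ctx;  // dangling once we return
 *   }
 */

/* Working pattern: static storage duration keeps the context alive
 * for the whole program, so the stored pointer stays valid. */
void transport_setup( TransportInterface_t * pTransport )
{
    static NetworkContext_t ctx = { 0 };

    pTransport->pNetworkContext = &ctx;
}

/* Hypothetical send routine that writes through the stored pointer,
 * as the transport_send in this thread does with tx_num++. */
int32_t demo_send( NetworkContext_t * pNetworkContext,
                   size_t bytesToSend )
{
    pNetworkContext->tx_num++;
    return ( int32_t ) bytesToSend;
}
```

A file-scope variable or a heap allocation that is never freed while the connection is in use would work equally well; the only requirement in this sketch is that the object outlives every call that dereferences the pointer.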

taherrera commented 1 year ago

Still, I would like to know how to set configCHECK_FOR_STACK_OVERFLOW; it could come in handy later.

n9wxu commented 1 year ago

It appears that you are using the ESP32, based on the error dump you provided above. ESP-IDF puts the kernel configuration options in sdkconfig, which is configured according to the instructions here: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/kconfig.html
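
For reference, this is roughly what the setting looks like in a plain FreeRTOS port, where it lives in FreeRTOSConfig.h. It is only a configuration fragment for orientation. On an ESP-IDF based build the macro is normally derived from sdkconfig (the menuconfig "Check for stack overflow" option mentioned earlier) rather than edited by hand, and whether the application or the platform supplies the hook depends on the port.

```c
/* FreeRTOSConfig.h (fragment, generic FreeRTOS port assumed).
 *
 * 0: stack overflow checking disabled.
 * 1: check the task's stack pointer when it is switched out.
 * 2: additionally check a known pattern at the end of the task's stack. */
#define configCHECK_FOR_STACK_OVERFLOW    2

/* When the value is non-zero, the kernel calls this hook on detection:
 *
 *   void vApplicationStackOverflowHook( TaskHandle_t xTask,
 *                                       char * pcTaskName );
 */
```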

n9wxu commented 1 year ago

It appears you have resolved the original issue. Feel free to ask general configuration questions in the forums.