h2zero / NimBLE-Arduino

A fork of the NimBLE library structured for compilation with Arduino, for use with ESP32, nRF5x.
https://h2zero.github.io/NimBLE-Arduino/
Apache License 2.0
710 stars 147 forks source link

Strange data corruption with ESP32-C3 and getValue() #413

Closed paulhamsh closed 2 years ago

paulhamsh commented 2 years ago

NimBLE v1.3.8 on EPSP32 and ESP32-C3

Hi - I have a program which runs fine on an ESP32 Dev Kit, but on an ESP32-C3 it seems to corrupt some of the data read via getValue(). It doesn't happen with data of length 15 bytes or less, but does for 16 and over.

My test program is below, and when run with LightBlue (or nRF Connect) sending

11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff 11 (spaces added to make it readable)

it gives the strange result on the ESP32-C3.

64 32 C9 3F 64 32 C9 3F 99 AA BB CC DD EE FF 11

This only happens if I set a pointer to the c_str() of the data, or then strncpy that, and index those char * arrays. If I access the std::string directly, or strnpcy it, it works fine.

Also it works fine with a constructed std::string - only the one received by getValue() seems impacted.

Am I doing something wrong setting the pointer and indexing the c_str()?

I see the getValue() approach changes in v1.4.0, so hopefully that will be available for ESP32 Arduino soon - but I still wonder what is going on with my code!!

Thanks

Output

CONSTRUCTED STRING FROM 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

Using const char *c = std::string s(bytes1, sizeof(bytes1))
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

Using strcpy(s.c_str())
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

WITH 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

Using const char *p = pChar->getValue().c_str()
64 32 C9 3F 64 32 C9 3F 99 AA BB CC DD EE FF 11 

Using strcpy(const char *p)
64 32 C9 3F 64 32 C9 3F 99 AA BB CC DD EE FF 11 

Using std::string s = pChar->getValue()
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

Using strcpy(pChar->getValue().c_str())
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 

WITH  11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 

Using const char *p = pChar->getValue().c_str()
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 

Using strcpy(const char *p)
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 

Using std::string s = pChar->getValue()
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 

Using strcpy(pChar->getValue().c_str())
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 

Code

#include "NimBLEDevice.h"

#define S_SERVICE   "ffc0"
#define S_CHAR1     "ffc1"
#define BLE_NAME    "BLE"

BLEServer *pServer;
BLEService *pService;
BLECharacteristic *pCharacteristic_receive;
BLEAdvertising *pAdvertising;

class MyServerCallback : public BLEServerCallbacks
{
  void onConnect(BLEServer *pserver) {}
  void onDisconnect(BLEServer *pserver) {}
};

#define PRINTIT(it, le)  {for (int j = 0; j < (le); j++) {Serial.print((byte)((it)[j]), HEX); Serial.print(" ");}; Serial.println();}

class CharacteristicCallbacks: public BLECharacteristicCallbacks {
  void onWrite(BLECharacteristic* pChar) {
    int l;
    const char *p;
    char str[50];
    char str2[50];
    std::string s;

    s = pChar->getValue();
    l = pChar->getValue().length();
    strncpy(str, pChar->getValue().c_str(), 50);
    p = pChar->getValue().c_str();
    strncpy(str2, p, 50);

    Serial.println("Using const char *p = pChar->getValue().c_str()");
    PRINTIT(p,    l);
    Serial.println("Using strcpy(const char *p)");
    PRINTIT(str2, l);

    Serial.println("Using std::string s = pChar->getValue()");
    PRINTIT(s,    l);
    Serial.println("Using strcpy(pChar->getValue().c_str())");   
    PRINTIT(str,  l);   
  };
};

static CharacteristicCallbacks chrCallbacks_r;

void setup() {
  Serial.begin(115200);
  delay(3000);
  Serial.println("Starting");

  BLEDevice::init(BLE_NAME);
//  BLEDevice::setMTU(517);

  pServer = BLEDevice::createServer();
  pServer->setCallbacks(new MyServerCallback());  
  pService = pServer->createService(S_SERVICE);
  pCharacteristic_receive = pService->createCharacteristic(S_CHAR1, NIMBLE_PROPERTY::READ | NIMBLE_PROPERTY::WRITE | NIMBLE_PROPERTY::WRITE_NR);
  pCharacteristic_receive->setCallbacks(&chrCallbacks_r);
  pService->start();

  pAdvertising = BLEDevice::getAdvertising(); // create advertising instance
  pAdvertising->addServiceUUID(pService->getUUID()); // tell advertising the UUID of our service
  pAdvertising->setScanResponse(true);  
  pAdvertising->start(); 

  Serial.println("Here we go...");

  const char bytes1[] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF, 0x11}; 
  std::string s(bytes1, sizeof(bytes1));
  const char *c;
  char str[50];
  int l;

  l = s.length();
  c = s.c_str();
  strncpy(str, s.c_str(), 50);
  Serial.println("Using const char *c = std::string s(bytes1, sizeof(bytes1))"); 
  PRINTIT(c,   l);
  Serial.println("Using strcpy(s.c_str())"); 
  PRINTIT(str, l);
}

void loop() {
}
h2zero commented 2 years ago

It looks like the data you're expecting to see is being replaced with the memory address of the data. I believe this may have something to do with the templates for getValue() in version 1.3.x. Could you try this code with the master branch and see if there is any change?

paulhamsh commented 2 years ago

Strangely, it is 'worse' - but now consistent. Rather than it not working after 15 bytes, it now just doesn't work. So perhaps the error is in assigning to const char *p? Looks like the first 8 bytes of data are overwritten regardless of whether the data is >15 bytes long or not, and now truncated to the right length.

11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11   becomes 4 30 C9 3F 4 30 C9 3F 99 AA BB CC DD EE FF 11 
11 becomes 4
11 22 33 becomes 4 30 C9
Using const char *p = pChar->getValue().c_str()
4 30 C9 
Using strcpy(const char *p)
4 30 C9 
Using std::string s = pChar->getValue()
11 22 33 
Using strcpy(pChar->getValue().c_str())
11 22 33 
Using const char *p = pChar->getValue().c_str()
4 30 C9 3F 4 30 C9 3F 99 AA BB CC DD EE FF 11 
Using strcpy(const char *p)
4 30 C9 3F 4 30 C9 3F 99 AA BB CC DD EE FF 11 
Using std::string s = pChar->getValue()
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 
Using strcpy(pChar->getValue().c_str())
11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF 11 
Using const char *p = pChar->getValue().c_str()
4 
Using strcpy(const char *p)
4 
Using std::string s = pChar->getValue()
11 
Using strcpy(pChar->getValue().c_str())
11 

Exploring the contents of 3FC93004 gives:

3FC93004: 60
3FC93005: 14
3FC93006: EB
3FC93007: FC
3FC93008: 90
3FC93009: 90
3FC9300A: A3
3FC9300B: 45
3FC9300C: 4
3FC9300D: 30
3FC9300E: C9
3FC9300F: 3F
3FC93010: 7C
3FC93011: 53
3FC93012: CA
3FC93013: 3F
3FC93014: 1
3FC93015: 4
3FC93016: 0
3FC93017: 0
paulhamsh commented 2 years ago

This works fine using the new version:

    NimBLEAttValue v;
    const uint8_t *d;
    const char *sss;
    v = pChar->getValue();
    d = v.data();
    PRINTIT(d, l);
    sss = v.c_str();
    PRINTIT(sss, l);

So it seems to be an issue with returning a std;:string and assigning the c_str() to a const char * pointer, then using that pointer. Indexing the std::string directly works, and copying the string with strncpy works. Using a NimBLEAttValue works, and the c_str() and data() both work.

This works:

    s = pChar->getValue();
    l = pChar->getValue().length();
    strncpy(str, pChar->getValue().c_str(), 50);
    p = s.c_str();
    strncpy(str2, p, 50);

But this doesn't:

    s = pChar->getValue();
    l = pChar->getValue().length();
    strncpy(str, pChar->getValue().c_str(), 50);
    p = pChar->getValue().c_str();
    strncpy(str2, p, 50);

Nor this:

    s = pChar->getValue();
    l = pChar->getValue().length();
    strncpy(str, pChar->getValue().c_str(), 50);
    p = (pChar->getValue()).c_str();
    strncpy(str2, p, 50);

Nor this:

    int l = pChar->getValue().length();
    const uint8_t *p = pChar->getValue().data();  /instead of c_str()

So

  std::string s = pChar->getValue();
  const char * p = s.c_str();

is ok but

 const char * p = pChar->getValue().c_str();

is not!

h2zero commented 2 years ago

Yes, this makes sense now. The template for getValue is converting the code to something unexpected it seems.

Try this:

const char * p = pChar->getValue<std::string>().c_str();

paulhamsh commented 2 years ago

I'm not sure it is. I tried that, and it worked. Then I removed some other redundant code, and it stopped working. Then I put some back, and it worked. But the other code should not have impacted it. Then I rewrote it a bit and now it doesn't work.

These lines work:

    int leng = pChar->getValue().length();
    char str2[50];
    uint8_t dat2[40]; 
    strncpy(str2, pChar->getValue().c_str(), leng);
    memcpy(dat2, pChar->getValue().c_str(), leng);
    std::string s1 = pChar->getValue();
    String s2 = pChar->getValue();
    const char *ptr2 = s1.c_str();
    const char *ptr3 = s2.c_str();   

    PRINTIT(dat2, leng);
    PRINTIT(str2, leng);
    PRINTIT(s1, leng);
    PRINTIT(ptr2, leng);
    PRINTIT(ptr3, leng);

These lines don't:

    int leng = pChar->getValue().length();
    char str1[50];
    char str3[50];
    char str4[50];
    uint8_t dat1[40];
    const char *ptr = pChar->getValue<std::string>().c_str();
    const char *ptr4 = pChar->getValue<String>().c_str();
    strncpy(str1, ptr, leng);
    memcpy(dat1, ptr, leng);
    strncpy(str3, pChar->getValue<std::string>().c_str(), leng);
    strncpy(str4, pChar->getValue<String>().c_str(), leng);

    PRINTIT(ptr4, leng);
    PRINTIT(str1, leng);
    PRINTIT(str3, leng);
    PRINTIT(str4, leng);
    PRINTIT(dat1, leng);
    PRINTIT(ptr, leng);

    PRINTIT(pChar->getValue<std::string>().c_str(), pChar->getValue().length());
    PRINTIT(pChar->getValue<std::string>().data(), pChar->getValue().length());

0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 
0 0 0 0 40 4B CA 0 
0 0 0 0 40 4B CA 0 
0 0 0 0 40 4B CA 0 
0 0 0 0 40 4B CA 0 

It is odd that this works:

    strncpy(str2, pChar->getValue().c_str(), leng);
    memcpy(dat2, pChar->getValue().c_str(), leng);

but this doesn't

    strncpy(str3, pChar->getValue<std::string>().c_str(), leng);
    strncpy(str4, pChar->getValue<String>().c_str(), leng);

I can't see any useful pattern here - assigning pChar->getValue().c_str() to a pointer or using it directly doesn't seem to work, but using pChar->getValue or copying data using pChar->getValue().c_str() does. I wonder if it is a compiler issue for ESP32-C3? It has issues with some pointers?

h2zero commented 2 years ago

I wonder if it is a compiler issue for ESP32-C3? It has issues with some pointers?

I'm wondering about this as well. I'll try to test this tonight. I wonder if removing the getValue template would resolve it?

paulhamsh commented 2 years ago

If it helps, I instrumented your new code (main) to show where various calls were made - to Characteristic and AttValue classes.

   int leng = pChar->getValue().length();

    char str1[50];
    char str3[50];
    char str4[50];
    uint8_t dat1[40];
    Serial.println("Step 1");
    const char *ptr = pChar->getValue<std::string>().c_str();
    Serial.println("Step 2");
    Serial.print("ptr ");Serial.println((unsigned long) ptr, HEX);
    Serial.println("Step 3");
    const char *ptr4 = pChar->getValue<String>().c_str();
    Serial.print("ptr4 ");Serial.println((unsigned long) ptr4, HEX);
    Serial.println("Step 4");
    strncpy(str1, ptr, leng);
    Serial.println("Step 5");
    memcpy(dat1, ptr, leng);
    Serial.println("Step 6");
    strncpy(str3, pChar->getValue<std::string>().c_str(), leng);
    Serial.println("Step 7");
    strncpy(str4, pChar->getValue<String>().c_str(), leng);

    PRINTIT(ptr4, leng);
    PRINTIT(str1, leng);
    PRINTIT(str3, leng);
    PRINTIT(str4, leng);
    PRINTIT(dat1, leng);
    PRINTIT(ptr, leng);

    PRINTIT(pChar->getValue<std::string>().c_str(), pChar->getValue().length());
    PRINTIT(pChar->getValue<std::string>().data(), pChar->getValue().length());

    char str2[50];
    uint8_t dat2[40]; 
    Serial.println("Step 10");
    strncpy(str2, pChar->getValue().c_str(), leng);
    Serial.println("Step 11");
    memcpy(dat2, pChar->getValue().c_str(), leng);
    Serial.println("Step 12");
    std::string s1 = pChar->getValue();
    Serial.println("Step 13");
    String s2 = pChar->getValue();

    Serial.println("Step 14");
    const char *ptr2 = s1.c_str();
    Serial.print("ptr2 ");Serial.println((unsigned long) ptr2, HEX);
    Serial.println("Step 15");
    const char *ptr3 = s2.c_str();   
    Serial.print("ptr3 ");Serial.println((unsigned long) ptr3, HEX);

    Serial.println();

    PRINTIT(dat2, leng);
    PRINTIT(str2, leng);
    PRINTIT(s1, leng);
    PRINTIT(ptr2, leng);
    PRINTIT(ptr3, leng);

Output:

NimBLEAttValue::data()3FCA505C

NimBLECharacteristic::getValue() 3FCA5030
Step 1
Step 2
ptr 3FCA48E0
Step 3
ptr4 3FCA48D8
Step 4
Step 5
Step 6
Step 7
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 
0 0 0 0 40 4B CA 0 0 
0 0 0 0 40 4B CA 0 0 

NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
40 
NimBLECharacteristic::getValue() 3FCA5030
4B 
NimBLECharacteristic::getValue() 3FCA5030
CA 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030

NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
40 
NimBLECharacteristic::getValue() 3FCA5030
4B 
NimBLECharacteristic::getValue() 3FCA5030
CA 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030
0 
NimBLECharacteristic::getValue() 3FCA5030

Step 10

NimBLECharacteristic::getValue() 3FCA5030
NimBLEAttValue::c_str()3FCA5364
Step 11

NimBLECharacteristic::getValue() 3FCA5030
NimBLEAttValue::c_str()3FCA5364
Step 12

NimBLECharacteristic::getValue() 3FCA5030
Step 13

NimBLECharacteristic::getValue() 3FCA5030
Step 14
ptr2 3FCA47DC
Step 15
ptr3 3FCA47C4

9 7C CA 96 8C AC 65 9A CA 
9 7C CA 96 8C AC 65 9A CA 
9 7C CA 96 8C AC 65 9A CA 
9 7C CA 96 8C AC 65 9A CA 
9 7C CA 96 8C AC 65 9A CA 
NimBLEAttValue::data()3FCA505C

The first data() call is when LightBlue connects, the rest is my program.

It never seems to call NimBLEAttValue::getValue(), but does call NimBLEAttValue::c_str() and ::data() And calls to NimBLECharacteristic::getValue() return the same result, but when ->c_str() is called on that result the resulting pointer is different each time.

Also Step 6 and 7 don't call getValue at all, so perhaps the compiler caches the prior calls?

The lines that work either copy getValue() to a result object and then use that to get the c_str(), or they end up calling NimBLEAttValue::getValue(),

I wonder how the compiler knows when to use std::string() cstr() versus NimBLEAttValue cstr()?

Perhaps that helps.

paulhamsh commented 2 years ago

And with the current release code - updating NimBLECharacteristic.cpp to log the pointers. At no point does the pointer actually have the same value as the m_value.c_str()!

std::string NimBLECharacteristic::getValue(time_t *timestamp) {
    portENTER_CRITICAL(&m_valMux);
    std::string retVal = m_value;
    if(timestamp != nullptr) {
        *timestamp = m_timestamp;
    }
    portEXIT_CRITICAL(&m_valMux);
    NIMBLE_LOGI(LOG_TAG, "getValue() &retval:%lx &m_value:%lx m_value.c_str():%lx", 
        (unsigned long) &retVal, 
        (unsigned long) &m_value, 
        (unsigned long) m_value.c_str());
    return retVal;
} // getValue

And the code, slightly changed here:

    // String s2 = pChar->getValue();
    String s2 = pChar->getValue<String>();

Which makes it fail!!

    int leng = pChar->getValue().length();

    char str1[50];
    char str3[50];
    char str4[50];
    uint8_t dat1[40];
    Serial.println("Step 1");
    const char *ptr = pChar->getValue<std::string>().c_str();
    Serial.println("Step 2");
    Serial.print("ptr ");Serial.println((unsigned long) ptr, HEX);
    Serial.println("Step 3");
    const char *ptr4 = pChar->getValue<String>().c_str();
    Serial.print("ptr4 ");Serial.println((unsigned long) ptr4, HEX);
    Serial.println("Step 4");
    strncpy(str1, ptr, leng);
    Serial.println("Step 5");
    memcpy(dat1, ptr, leng);
    Serial.println("Step 6");
    strncpy(str3, pChar->getValue<std::string>().c_str(), leng);
    Serial.println("Step 7");
    strncpy(str4, pChar->getValue<String>().c_str(), leng);

    PRINTIT(ptr4, leng);
    PRINTIT(str1, leng);
    PRINTIT(str3, leng);
    PRINTIT(str4, leng);
    PRINTIT(dat1, leng);
    PRINTIT(ptr, leng);

    PRINTIT(pChar->getValue<std::string>().c_str(), pChar->getValue().length());
    PRINTIT(pChar->getValue<std::string>().data(), pChar->getValue().length());

    char str2[50];
    uint8_t dat2[40]; 
    Serial.println("Step 10");
    strncpy(str2, pChar->getValue().c_str(), leng);
    Serial.println("Step 11");
    memcpy(dat2, pChar->getValue().c_str(), leng);
    Serial.println("Step 12");
    std::string s1 = pChar->getValue();
    Serial.println("Step 13");
    // String s2 = pChar->getValue();
    String s2 = pChar->getValue<String>();

    Serial.println("Step 14");
    const char *ptr2 = s1.c_str();
    Serial.print("ptr2 ");Serial.println((unsigned long) ptr2, HEX);
    Serial.println("Step 15");
    const char *ptr3 = s2.c_str();   
    Serial.print("ptr3 ");Serial.println((unsigned long) ptr3, HEX);

    Serial.println();

    PRINTIT(dat2, leng);
    PRINTIT(str2, leng);
    PRINTIT(s1, leng);
    PRINTIT(ptr2, leng);
    PRINTIT(ptr3, leng);
Starting
I NimBLEDevice: "BLE Host Task Started"
I NimBLEDevice: "NimBle host synced."
Here we go...
I NimBLEServer: "mtu update event; conn_handle=1 mtu=255"
I NimBLEServer: "subscribe event; attr_handle=8, subscribed: false"
I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
Step 1
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
Step 2
ptr 3FCA4BB0
Step 3
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
ptr4 3FCA4BA8
Step 4
Step 5
Step 6
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
Step 7
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
0 0 0 0 0 0 0 
0 0 0 0 0 0 0 
0 0 0 0 0 0 0 
0 0 0 0 0 0 0 
0 0 0 0 CD FA 98 
0 0 0 0 CD FA 98 
I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
0 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
FD I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
C7 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
65 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
CD I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
FA I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
98 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"

I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
0 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
FD I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
C7 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
65 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
CD I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
FA I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
98 I NimBLECharacteristic: "getValue() &retval:3fca4ba8 &m_value:3fca543c m_value.c_str():3fca5444"

Step 10
I NimBLECharacteristic: "getValue() &retval:3fca4ae4 &m_value:3fca543c m_value.c_str():3fca5444"
Step 11
I NimBLECharacteristic: "getValue() &retval:3fca4aa4 &m_value:3fca543c m_value.c_str():3fca5444"
Step 12
I NimBLECharacteristic: "getValue() &retval:3fca4aa4 &m_value:3fca543c m_value.c_str():3fca5444"
Step 13
I NimBLECharacteristic: "getValue() &retval:3fca4a64 &m_value:3fca543c m_value.c_str():3fca5444"
Step 14
ptr2 3FCA4AAC
Step 15
ptr3 3FCA4A94

97 FD C7 65 CD FA 98 
97 FD C7 65 CD FA 98 
97 FD C7 65 CD FA 98 
97 FD C7 65 CD FA 98 
0 0 0 0 0 0 0 
paulhamsh commented 2 years ago

Perhaps its a object creation issue - when pChar->getValue() creates a std::string and returns it, and it is instantly used for c_str() perhaps then the object is destroyed and the c_str()pointer then is invalid. Perhaps when std::string s1 = pChar->getValue() is used a deep copy is made and the c_str still exists? That might explain why using the pointer in a memcpy or strcpy works as it is used instantly, but storing the pointer doesn't work. Also using a template seems to make it unhappy, so the std::string response works but template responses don't. The template code makes the same use of a pointer which is not used instantly, which seems to be the issue here.

    template<typename T>
    T                 getValue(time_t *timestamp = nullptr, bool skipSizeCheck = false) {
        std::string value = getValue();
        if(!skipSizeCheck && value.size() < sizeof(T)) return T();
        const char *pData = value.data();                          //////////////   <<<<<<<<<<<<<<<<<<<<<<<<<
        return *((T *)pData);
    }

Perhaps this is allocating a std::string in scope of the function, providing a pointer to it, then closing the scope and deleting the object? I don't know C++ well enough.

For current released version and ESP32-C3 only. Fine on a normal ESP32.

paulhamsh commented 2 years ago

I tried this on the current released version:

    template<typename T>
    T                 getValue(time_t *timestamp = nullptr, bool skipSizeCheck = false) {
        std::string value = getValue();
        NIMBLE_LOGI("TEMPLATE", "TEMPLATE USED"); 
        if(!skipSizeCheck && value.size() < sizeof(T)) { NIMBLE_LOGI("TEMPLATE", "return T();  %x %x", value.size(), sizeof(T)); return T(); };
        // const char *pData = value.data();
        //return *((T *)pData);
        NIMBLE_LOGI("TEMPLATE", "TEMPLATE getValue() &value:%lx value.data():%lx", 
            (unsigned long) &value, 
            (unsigned long) value.data());
        return *((T *) value.data());
    }

And it always fails on the size check!!

Perhaps the .size() for a std::string is always less than the sizeof(std::string)? My results are mostly like this:

I NimBLECharacteristic: "getValue() &retval:3fca4a54 &m_value:3fca543c m_value.c_str():3fca5444"
I TEMPLATE: "TEMPLATE USED"
I TEMPLATE: "return T();  7 10"

I NimBLECharacteristic: "getValue() &retval:3fca4a54 &m_value:3fca543c m_value.c_str():3fca5444"
I TEMPLATE: "TEMPLATE USED"
I TEMPLATE: "return T();  7 18"
h2zero commented 2 years ago

I'm impressed by your persistence, great deductive work :+1:

I think you nailed it here:

Perhaps its a object creation issue - when pChar->getValue() creates a std::string and returns it, and it is instantly used for c_str() perhaps then the object is destroyed and the c_str() pointer then is invalid. Perhaps when std::string s1 = pChar->getValue() is used a deep copy is made and the c_str still exists?

I should have noticed that also but I did not remember how that function worked until you brought it up. Since it returns a copy of the data it must store it in a container that can be copied too, this is necessary due to the volatility of the data, which is why it's done within the critical section. It seems that only by luck does it work on the esp32 and the esp32c3 just exposed the error first.

Perhaps this is allocating a std::string in scope of the function, providing a pointer to it, then closing the scope and deleting the object? I don't know C++ well enough.

In this case what is happening is not what it appears in the code. When you specify a std::string container for the return value the compiler will make a copy of the data into that container rather than construct a new one within the function. If you do not provide a container for the data it will construct a temporary within the function and the c_str method called on that would also be pointing to temporary data.

paulhamsh commented 2 years ago

Thanks :-) I hate a mystery bug. Is there an easy patch for this?

On 25 Jun 2022, at 00:22, h2zero @.***> wrote:

ļ»æ I'm impressed by your persistence, great deductive work šŸ‘

I think you nailed it here:

Perhaps its a object creation issue - when pChar->getValue() creates a std::string and returns it, and it is instantly used for c_str() perhaps then the object is destroyed and the c_str() pointer then is invalid. Perhaps when std::string s1 = pChar->getValue() is used a deep copy is made and the c_str still exists?

I should have noticed that also but I did not remember how that function worked until you brought it up. Since it returns a copy of the data it must store it in a container that can be copied too, this is necessary due to the volatility of the data, which is why it's done within the critical section. It seems that only by luck does it work on the esp32 and the esp32c3 just exposed the error first.

Perhaps this is allocating a std::string in scope of the function, providing a pointer to it, then closing the scope and deleting the object? I don't know C++ well enough.

In this case what is happening is not what it appears in the code. When you specify a std::string container for the return value the compiler will make a copy of the data into that container rather than construct a new one within the function. If you do not provide a container for the data it will construct a temporary within the function and the c_str method called on that would also be pointing to temporary data.

ā€” Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

h2zero commented 2 years ago

There's nothing that can be changed really. The code was written this way to avoid what you're trying to do, i.e holding a pointer to volatile data.

I'm not really sure why you need to do this anyway, can you provide some context?

paulhamsh commented 2 years ago

I don't actually want a pointer to volatile data, or at least not to keep one - I want to copy that data into a ringbuffer, byte by byte as effectively as possible. And I didn't use memcpy because it isn't necessarily contiguous.

I was doing this in the characteristic onWrite callback - and the code worked on ESP32 and not ESP32-C3. It looked like valid C code to me, and no reason for it not to work. And the data wasn't changing, so it wasn't because the data was volatile - it just failed on ESP32-C3.

So I wondered what the right way to copy the data to a ringbuffer might be, and why it didn't work - hence all my memcpy and strncpy attempts. I'm still not sure why it wouldn't work - I've researched some more C++ and can't really see why it is an issue. As you say, the returned object should be on the caller's stack and hence safe as long as the volatile data is safe. But it actually seems to provide pointers into rubbish memory, not even where the data actually was.

Why this works:

memcpy(dat2, pChar->getValue().c_str(), leng);

And this doesn't:

const char *p = pChar->getValue().c_str();
memcpy(dat2, p, leng);

Seems mysterious.

My kludge was to assign it to a std::string which works, But why that copy works, and a indexed array, or memcpy, or strncpy didn't was what confused me. Whilst the data is static, any pointer should do.

If no patch can be made, then I think it would be good to explain that this sort of code may not work, and what does, so anyone trying to create a copy of the data in the buffer can do so safely.

Or for the getValue() function to return non-volatile data.

Old code which fails on ESP32-C3:

class CharacteristicCallbacks: public BLECharacteristicCallbacks {
  void onWrite(BLECharacteristic* pCharacteristic) {
      int j, l;
      const char *p;
      byte b;
      l = pCharacteristic->getValue().length();

      /////  OH NO!!!! DOES NOT WORK ON ESP32-C3
      p = pCharacteristic->getValue().c_str();
      for (j=0; j < l; j++) {
        b = p[j];
        ble_app_in.add(b);
      }
      ble_app_in.commit();
  };
};

Kludge to fix it:

class CharacteristicCallbacks: public BLECharacteristicCallbacks {
  void onWrite(BLECharacteristic* pCharacteristic) {
    int j, l;
    std::string s;
    byte b;

    s = pCharacteristic->getValue();
    l = pCharacteristic->getValue().length();

    for (j = 0; j < l; j++) {
      b = (byte) s[j];
      ble_app_in.add(b);
    }
    ble_app_in.commit();
  };
};
paulhamsh commented 2 years ago

Also, that template code still confuses me. It always seems to return T(); which is empty

        if(!skipSizeCheck && value.size() < sizeof(T)) return T();

Changing it to ignore the size check, and return a T(pData) seems to help some of my tests - at least the ones that don't use a pointer to the c_str() now seem ok.

    template<typename T>
    T                 getValue(time_t *timestamp = nullptr, bool skipSizeCheck = false) {
        std::string value = getValue();
        //  if(!skipSizeCheck && value.size() < sizeof(T)) { return T(); };  // this seemed always to return an empty T()
        const char *pData = value.data();
        return T(pData);
    }

Using the pointer or a strncpy that used the pointer still doesn't work, regardless of which type is used.

h2zero commented 2 years ago

Thanks for providing the details.

I want to copy that data into a ringbuffer, byte by byte as effectively as possible. And I didn't use memcpy because it isn't necessarily contiguous..

Actually it is, which is how the templates work.

So I wondered what the right way to copy the data to a ringbuffer might be, and why it didn't work - hence all my memcpy and strncpy attempts.

You could also take advantage of iterators for this use case. Here is a modified version of your code that removes the std::string (works on master branch and 1.3.8):

class CharacteristicCallbacks: public BLECharacteristicCallbacks {
  void onWrite(BLECharacteristic* pCharacteristic) {
    for (auto & it : pCharacteristic->getValue()) {
      ble_app_in.add(it);
    }
    ble_app_in.commit();
  };
};

I'm still not sure why it wouldn't work - I've researched some more C++ and can't really see why it is an issue. As you say, the returned object should be on the caller's stack and hence safe as long as the volatile data is safe. But it actually seems to provide pointers into rubbish memory, not even where the data actually was.

This is only true if you have a std::string container in the caller to copy the data into. When you didn't you got undefined behavior, which seems to be "undefined" differently on the C3 šŸ˜„ . What may have happened is the std::string was allocated on the heap in the function and freed on returning (compiler dependent), so maybe the esp32 uses the stack but the C3 does not.

Which could also explain this:

Why this works:

memcpy(dat2, pChar->getValue().c_str(), leng);

And this doesn't:

const char *p = pChar->getValue().c_str(); memcpy(dat2, p, leng);

Seems mysterious.

My kludge was to assign it to a std::string which works, But why that copy works, and a indexed array, or memcpy, or strncpy > didn't was what confused me. Whilst the data is static, any pointer should do.

Also, that template code still confuses me. It always seems to return T(); which is empty

It depends on what you're trying to assign the data to, it's a feature to allow the data to be stored in a different type, such as a struct. The size check is there to make sure the data size is compatible with the type. But this check will fail with types like std::string or std::vector, which is why the new NimBLEAttValue class provides for this specialization.

paulhamsh commented 2 years ago

Thanks - I'll read that very carefully later. On the template, I think this is the wrong way round? value.size() < sizeof(T) Should be value.size() > sizeof(T) And this works return T(pData); where this crashes: return *((T *)pData);

paulhamsh commented 2 years ago

I wrote some code to overwrite the new and delete methods, and check the std::string construction and destruction. You are completely right - when using const char *ptr1 = blah().c_str(); It deletes the string content object in the function blah() before returning. This happens both on the ESP32 and ESP32-C3. But on the C3 the data gets overwritten somehow. One the ESP32 it doesn't. And things on the stack seem to be safe - so strings of 15 bytes or less, which use the short string optimisation. So - I guess I was lucky it ever worked at all.

Is this code in your examples ok, though? Given the string data object is deleted before returning, perhaps we must always assign to a std::string before using it then it persists. I think it only works because the Serial.println() uses the data before it can be overwritten, and I'm not sure that is a guaranteed behaviour?

class CharacteristicCallbacks: public NimBLECharacteristicCallbacks {
    void onRead(NimBLECharacteristic* pCharacteristic){
        Serial.print(pCharacteristic->getUUID().toString().c_str());
        Serial.print(": onRead(), value: ");
        Serial.println(pCharacteristic->getValue().c_str());
    };

    void onWrite(NimBLECharacteristic* pCharacteristic) {
        Serial.print(pCharacteristic->getUUID().toString().c_str());
        Serial.print(": onWrite(), value: ");
        Serial.println(pCharacteristic->getValue().c_str());
    };
}

Thanks for all your support on this - good to close the issue anyway.

h2zero commented 2 years ago

Is this code in your examples ok, though? Given the string data object is deleted before returning, perhaps we must always assign to a std::string before using it then it persists. I think it only works because the Serial.println() uses the data before it can be overwritten, and I'm not sure that is a guaranteed behaviour?

That's a good question, I've never heard of an issue with that but it could be at some point. I should put a debugger on it and see.

Thanks for all your support on this - good to close the issue anyway.

You're welcome šŸ˜„

paulhamsh commented 2 years ago

I think I have worked it out. On the ESP32C3 a freed block has some pointers written over the data - they are two linked list pointers for the previous and next free blocks in the list of memory blocks of a certain size. The ESP32 implementation is different and doesn't write pointers in the same place. I've checked the ESP-IDF and it uses a TLSF algorithm for heap management, and then further checking shows there is a compile flag which switches to using a ROM implementation of TLSF. I can only imagine that one of the two uses the ROM implementation and the other uses the IDF implementation.

[Kconfig](https://github.com/espressif/esp-idf/blob/master/components/heap/Kconfig)
[heap: use ROM implementation for heap tlsf on esp32c2](https://github.com/espressif/esp-idf/commit/068adfba70f969cfa0baf0c023dd5653755f380e)
CONFIG_HEAP_TLSF_USE_ROM_IMPL
(https://docs.espressif.com/projects/esp-idf/en/latest/esp32c3/api-reference/kconfig.html#config-heap-tlsf-use-rom-impl)
Use ROM implementation of heap tlsf library

Enable this flag to use heap functions from ROM instead of ESP-IDF.

If keeping this as ā€œnā€ in your project, you will have less free IRAM. If making this as ā€œyā€ in your project, you will increase free IRAM, but you will lose the possibility to debug this module, and some new features will be added and bugs will be fixed in the IDF source but cannot be synced to ROM.

Default value:
Yes (enabled) if ESP_ROM_HAS_HEAP_TLSF
paulhamsh commented 2 years ago

Actually I think it is even more simple. I think the ESP32 C3 IDF was compiled without canary poisoning in the heap management, and the ESP32 was compiled with canary poisoning. The canary introduces a head, a tail and another size field - using 3 words, followed by the data. After free() these fields are filled with a new length, and two linked list pointers - 3 words. So the old data could still be there, until the block is re-used. So the only difference really is the poisoning.

h2zero commented 2 years ago

Excellent analysis, that's quite a deep dive you took! Very interesting indeed, thanks for sharing. I like to do this sort of thing as well, can't stand not knowing why something does or doesn't do what it should/not.

jackjansen commented 1 year ago

I'm late to the party, but just want to note that I have seen this problem various times with all sorts of different software on all sorts of different platforms over the last twenty years or so.

If you have a function returning a std::string it is unsafe to use c_str() on that return value, because the std::string could be disposed right after returning from c_str() and thereby invalidate that const char *. You should always store the std::string in a temporary variable until you are sure the value return by c_str() is no longer needed.

And every time I'm hit by this I wonder why compilers don't issue a warning for it.