RIOT-OS / RIOT

RIOT - The friendly OS for IoT
https://riot-os.org
GNU Lesser General Public License v2.1
4.91k stars 1.98k forks source link

gnrc_sock_udp: Possible Race condition on copy in application buffer #10389

Open crest42 opened 5 years ago

crest42 commented 5 years ago

Description

When using gnrc_sock_udp in my application after a certain amount of hosts in the network (and thus a certain amount of packets/s on a single host), the buffer content returned by sock_udp_recv is filled with garbage. My message format is defined to be 4 bytes + 4 bytes + 4 bytes + 4 bytes (message type, app src id, app dst id, length of the remainder data).

As an example: A message of type 11 always sends a static sized buffer with size 32 and thus the message "header" would look like this: 0x0000000b 0x00000001 0x00000002 0x00000020. (Second and third 4 bytes can differ)

Sometimes if many hosts are joining the network it happens very seldom that a single host instead read something like: 0x0000000b 0xfe80aaf0 0x00000000 0x00afbb20

Two intresting observations here:

The first 4 bytes always are correct. The second 4 bytes always contains fe80something (Maybe i am just paranoid but this looks like the beginning of a link local ipv6 address)

Further debugging:

Steps to reproduce the issue

Reading content from a udp socket and sending a lots (TBD) of messages. Testing the read buffer for correctness. Sadly no better way yet than try and wait.

Expected results

Content of buffer is correct

Actual results

Buffer content is garbage

Versions

Operating System Environment

   Operating System: "Arch Linux" 
             Kernel: Linux 4.18.16-arch1-1-ARCH x86_64 unknown

Installed compiler toolchains

         native gcc: gcc (GCC) 8.2.1 20180831
  arm-none-eabi-gcc: arm-none-eabi-gcc (Arch Repository) 8.2.0
            avr-gcc: avr-gcc (GCC) 8.2.0
   mips-mti-elf-gcc: missing
         msp430-gcc: missing

riscv-none-embed-gcc: missing clang: clang version 7.0.0 (tags/RELEASE_700/final)

Installed compiler libs

arm-none-eabi-newlib: "3.0.0" mips-mti-elf-newlib: missing riscv-none-embed-newlib: missing avr-libc: "2.0.0" ("20150208")

Installed development tools

              cmake: cmake version 3.12.4
           cppcheck: missing
            doxygen: 1.8.14
             flake8: missing
                git: git version 2.19.1
               make: GNU Make 4.2.1
            openocd: Open On-Chip Debugger 0.10.0
             python: Python 3.7.1
            python2: Python 2.7.15
            python3: Python 3.7.1
         coccinelle: missing
crest42 commented 5 years ago

My check function in sock_udp_recv:


memcpy(data, pkt->data, pkt->size);
res = (int)pkt->size;
if (pkt->size == 48 && memcmp(data, pkt->data, pkt->size) != 0)
    {
        printf("%p-%p: ",data,(char *)data+pkt->size);
        for (int i = 0; i < 48; i++)
        {
            printf("%02x",((char *)data)[i]);
        }
        printf("\n");
        printf("%p-%p: ",pkt->data,(char *)pkt->data+pkt->size);
        for (int i = 0; i < 48; i++)
        {
            printf("%02x",((char *)pkt->data)[i]);
        }
        printf("\n");
        unsigned char *d = ((unsigned char *)data);
        uint32_t tmp;
        d = d + (3*sizeof(uint32_t));
        memcpy(&tmp, d, sizeof(uint32_t));
        uint32_t len = ntohl(tmp);
        assert(len < 1000);
    }    

In the case of an error the memcmp returns != 0 and the printed buffer are different. I try to reproduce and send the corresponding output.

crest42 commented 5 years ago

Was able to reproduce this: Checking the headers in grnc_sock.c:


printf("test\n");
if (pkt->size == 48)
{
    unsigned char *data = ((unsigned char *)pkt->data);
    uint32_t tmp;
    memcpy(&tmp,data,sizeof(uint32_t));
    uint32_t type = ntohl(tmp);
    data = data + sizeof(uint32_t);
    memcpy(&tmp, data, sizeof(uint32_t));
    uint32_t from = ntohl(tmp);
    data = data + sizeof(uint32_t);
    memcpy(&tmp, data, sizeof(uint32_t));
    uint32_t to = ntohl(tmp);
    data = data + sizeof(uint32_t);
    memcpy(&tmp, data, sizeof(uint32_t));
    uint32_t len = ntohl(tmp);
    printf("%d %d %d %d\n",type,from,to,len);
}

Returns:


test
11 65535 1023 32

Which is correct.

Running the memcmp test as described in my last comment returns:


copied 48 bytes
0x565a8bac-0x565a8bdc: 0000000b 1140ffff fffeffff ff800000 000000000000fffffffeffffff8000000000000000000000ffffffdeffffffadffffffbeffffffef0000000000000000ffffffdeffffffadffffffbeffffffefffffffa03e6756
0x565b50c0-0x565b50f0: 0000000b 0000ffff ffffffff ffff0000 03ffffffff00000020ffffffffffffffff00000004000000000000fffffffeffffff8000000000000000000000ffffffdeffffffadffffffbeffffffefffffffa03e6756

As you can clearly see the first 4 Byte 0x0000000b are the expected message type 11. The 4th to the 8th byte are wrong in the return buffer. As you can see in the packet the buffer is 0x0000ffff which is the expected node id 65535. The corresponding 4 byte in the return buffer are 0x1140ffff.

Starting at the 3th byte things do not look correct neither in the gnrc buffer and the return buffer. I will take a closer look and write an update on this

crest42 commented 5 years ago

I just fixed my buffer dump printing:

0x565e9bac-0x565e9bdc: 0000000b 1140fe80 00000000 0000c0c6 ff6f0000 00040000 00000000 fe800000 00000000 c0c637ff fe6ef8bc a0ce6656 
0x565f60c0-0x565f60f0: 0000000b 00006fff 000077ff 00000020 ff6f0000 00040000 00000000 fe800000 00000000 c0c637ff fe6ef8bc a0ce6656 

Now its easy to see that the 2nd to the 4th 4 bytes are somehow screwed. Rest looks fine. It looks like the wrong part is just part of the actuall payload of my protocol:

fe80 00000000 0000c0c6 is the beginning of an link local address is transferred in the payload: fe800000 00000000 c0c63

My best guess is that some kind of context switch happens inside the memcpy or some kind of weird alignment errors. I appreciate any help :)

kaspar030 commented 5 years ago

@crest42 Can you share the application code?

Also, please rule out stack size issues, either with ps() or by increasing your application stack to saomething very large (8k).

crest42 commented 5 years ago

Sure but application code is rather huge. One of the next things i will try is to minimize the application code to a few primitives to limit the problem to a specific part of the code.

Application code: https://github.com/crest42/RIOT ./chord_test. You need to clone it with --recurse-submodules. Just make a debug commit and the expected output when the fault happens is:


1542240127: [26623<-32768->53247] [INFO] thread_wait_for_msg: wait for message run 
7495
11 26623 32768 32
test
11 26623 32768 32
copied 48 bytes
0x5660dbac-0x5660dbdc: 0000000b 1140fe80 00000000 0000c0c6 ff670000 00040000 00000000 fe800000 00000000 c0c637ff fe6ef8bc a0ce6756 
0x5661a0c0-0x5661a0f0: 0000000b 000067ff 00008000 00000020 ff670000 00040000 00000000 fe800000 00000000 c0c637ff fe6ef8bc a0ce6756 
sys/net/gnrc/sock/udp/gnrc_sock_udp.c:251 => 0x565c4bdd
*** RIOT kernel panic:
FAILED ASSERTION.

pid | name                 | state    Q | pri | stack  ( used) | base addr  | current     
  - | isr_stack            | -        - |   - |   8192 (   -1) | 0x566137c0 | 0x566137c0
  1 | idle                 | pending  Q |  15 |   8192 (  584) | 0x566115e0 | 0x56613444 
  2 | main                 | pending  Q |   7 |  12288 ( 5580) | 0x5660e5e0 | 0x56611444 
  3 | ipv6                 | bl rx    _ |   3 |   8192 ( 1980) | 0x56620420 | 0x56622284 
  4 | udp                  | bl rx    _ |   4 |   8192 ( 1344) | 0x5661b800 | 0x5661d664 
  5 | gnrc_netdev_tap      | bl rx    _ |   2 |   8192 ( 3016) | 0x5661de40 | 0x5661fca4 
  6 | CHORD msg            | running  Q |   5 |  16384 ( 5128) | 0x5660a380 | 0x5660e1e4 
  7 | CHORD Periodic       | bl mutex _ |   6 |  16384 ( 4780) | 0x56606380 | 0x5660a1e4 
    | SUM                  |            |     |  86016 (22412)

*** halted.

You can start a new network with "chord new" and join additional node with "chord join". The problem start with about 16-18 nodes.

A short introduction: I try to build a distributed storage using a DHT (Chord) using a mtd device driver in RIOT. The Chord lib is using two threads. One Thread is an simple event loop to manage incomming UDP messages and the other is a periodic loop which implements the Chord state keeping protocol. The fault happens in wait_for_message thread which does the incomming message handling. The other thread mostly crafts UDP messages and wait for an answer. I already refactored the network code a little bit to use htons and friends but that should not make any difference, since i run it on the same machine anyway. (In fact the problem occures with both methods). Also i tried to null the read buffer on every run of wait_for_message in network.c (The buffer which gets passed to the gnrc_udp_recv function).

The stack sizes should be no problem as i already make them rather huge. Here is a example ps output with 18 nodes running. I tried to use static memory in the DHT implementation so it should be constant:


pid | name                 | state    Q | pri | stack  ( used) | base addr  | current     
  - | isr_stack            | -        - |   - |   8192 (   -1) | 0x5660c7c0 | 0x5660c7c0
  1 | idle                 | pending  Q |  15 |   8192 (  584) | 0x5660a5e0 | 0x5660c444 
  2 | main                 | running  Q |   7 |  12288 ( 3180) | 0x566075e0 | 0x5660a444 
  3 | ipv6                 | bl rx    _ |   3 |   8192 ( 1900) | 0x56619420 | 0x5661b284 
  4 | udp                  | bl rx    _ |   4 |   8192 ( 1344) | 0x56614800 | 0x56616664 
  5 | gnrc_netdev_tap      | bl rx    _ |   2 |   8192 ( 2812) | 0x56616e40 | 0x56618ca4 
  6 | CHORD msg            | bl mbox  _ |   5 |  16384 ( 4944) | 0x56603380 | 0x566071e4 
  7 | CHORD Periodic       | bl mutex _ |   6 |  16384 ( 4940) | 0x565ff380 | 0x566031e4 
    | SUM                  |            |     |  86016 (19704)

Three additional remarks:

  1. I decreased the PID of the udp and ipv6 thread to not interference with my second thread (I use MAIN_PID-1 and MAIN_PID-2) but the error occurs anyway
  2. Since this code is not meant to be viewed by others yet there may be some full qualified references to my home dir and stuff like that. Sorry for
  3. The state keeping protocol is not final yet so if you want to tun the code it is important to not simultaneously start node and instead run them with one or two seconds pause in between.

Regards, Robin

crest42 commented 5 years ago

I just disabled interrupts before and enabled them after the copy. Now the problem does not occur in a 5 minute run with 18 nodes which is way longer than without the "fix". I am not sure if this really has an direct effect since printing just before the copy also make the problem disappear but it could be a hint in the right direction.

crest42 commented 5 years ago

I made additional observations while debugging the problem:

  1. I tried to set an global variable (irq_test) in the isr in irq_cpu.c and tested in the gnrc_sock_udp implementation with:

    irq_test = false; memcpy(data, pkt->data, pkt->size); bool tmp_irq_test = irq_test; if(tmp_irq_test) { printf("irq called\n"); }

In 100% of all failure cases "irq called" was printed (>15 atm). This also happens on non-failure cases but on a much rare occassion.

  1. When i provide -fno-builtin-memcpy to CFLAGS the problem is not triggered anymore (at least at a first glance) Edit: Problem still gets triggered

I will take a further look next week.

crest42 commented 5 years ago

Atm i have a similar problem in this function:


int
copy_node(struct node* node, struct node* copy)
{
  assert(node);
  assert(copy);
  memcpy(copy, node, sizeof(struct node));
  assert(copy->id == node->id);
  if(memcmp(&node->addr,&copy->addr,sizeof(node->addr)) != 0) {
      printf("%p-%p -> %p-%p %d\n",(void*)node,
                                        (void *)((char *)node+sizeof(struct node)),
                                        (void*)copy,
                                        (void *)((char *)copy+sizeof(struct node)),
                                        sizeof(struct node));
      assert(false);
  }
  return 0;
}
struct node
{
  nodeid_t id; /*!< Id of the node. The node id is the hashed ipv6 address of
              the node modulo the ring size */
  uint32_t size;
  uint32_t used;
  struct in6_addr addr;
  struct node_additional* additional;   /*!< Pointer to our successor node. */
};

Which sometimes fails with:


> 0x566953e8-0x56695408 -> 0x5669cf7c-0x5669cf9c 32
chord_test/chord/src/chord.c:650 => 0x5665b30d
*** RIOT kernel panic:
FAILED ASSERTION.

pid | name                 | state    Q | pri | stack  ( used) | base addr  | current     
  - | isr_stack            | -        - |   - |   8192 (   -1) | 0x5671fe60 | 0x5671fe60
  1 | idle                 | pending  Q |  15 |   8192 (  584) | 0x5671dc80 | 0x5671fae4 
  2 | main                 | pending  Q |   7 | 524288 ( 4224) | 0x5669dc80 | 0x5671dae4 
  3 | ipv6                 | bl rx    _ |   3 |   8192 ( 2316) | 0x5672c540 | 0x5672e3a4 
  4 | udp                  | bl rx    _ |   4 |   8192 ( 1344) | 0x56727ea0 | 0x56729d04 
  5 | gnrc_netdev_tap      | bl rx    _ |   2 |   8192 ( 2380) | 0x5672a4e0 | 0x5672c344 
  6 | CHORD msg            | running  Q |   5 |  16384 ( 5100) | 0x56699a20 | 0x5669d884 
  7 | CHORD Periodic       | bl mbox  _ |   6 |  16384 ( 3456) | 0x56695a20 | 0x56699884 
    | SUM                  |            |     | 598016 (19404)

*** halted.

native: exiting

So basically the same as in the network stack happens, right after the memcpy a memcmp of the values fails. From the output you can see that neither the addresses provided to memcpy are invalid nor the memory region overlaps. So right now i am not sure anymore if the problem i reported first is even related to gnrc, but i can't see a way how i messed this up in my application code.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.