ARMmbed / mbed-os-example-mesh-minimal

Simplest Mesh capable test application for mbed OS
Apache License 2.0

[OOB mbed-os-5.4] Run time error (stack overflow) on NCS36510 #56

Closed mray190 closed 7 years ago

mray190 commented 7 years ago

Target: NCS36510. Toolchain: any. Error: stack overflow. Adjusting the stack and heap sizes changes which debug messages are printed, but the application always fails regardless of the sizes chosen.

Attempted mbed_app.json config:

    "target_overrides": {
        "*": {
            "target.features_add": ["NANOSTACK", "LOWPAN_HOST", "COMMON_PAL"],
            "mbed-mesh-api.6lowpan-nd-panid-filter": "0xffff",            
            "mbed-mesh-api.6lowpan-nd-channel-page": 0,
            "mbed-mesh-api.6lowpan-nd-channel": 12,
            "mbed-mesh-api.6lowpan-nd-device-type": "NET_6LOWPAN_HOST",
            "mbed-mesh-api.6lowpan-nd-channel-mask": "(1<<12)",
            "mbed-mesh-api.thread-config-panid": "0x0700",
            "mbed-mesh-api.thread-master-key": "{0x10, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}",
            "mbed-mesh-api.thread-config-channel": 22,
            "mbed-mesh-api.heap-size": 8000,
            "nanostack-hal.event_loop_thread_stack_size": 10000,
            "mbed-trace.enable": true,
            "platform.stdio-convert-newlines": true,
            "platform.stdio-baud-rate": 115200
        }
    }

MarceloSalazar commented 7 years ago

@SeppoTakalo could you please have a look?

0xc0170 commented 7 years ago

bump

SeppoTakalo commented 7 years ago

I cannot replicate this.

Without mbed_app.json modifications

Connecting...
connected. IP = 2001:999:10:bbf4:200::

With supplied mbed_app.json

Connecting...
[DBG ][nslp]: connect()
[DBG ][evlp]: event_loop_thread
[DBG ][m6LND]: Link-layer security NOT enabled.
[DBG ][m6LND]: Channel: 12
[DBG ][m6LND]: Channel page: 0
[DBG ][m6LND]: Channel mask: 4096
[INFO][m6LND]: Start 6LoWPAN ND Bootstrap
[DBG ][m6LND]: app_parse_network_event() 0
[INFO][m6LND]: 6LoWPAN ND bootstrap ready
[DBG ][m6LND]: ND Access Point: 2001:999:10:bbf4:0:ff:fe00:8454
[DBG ][m6LND]: ND Prefix 64: 20:01:09:99:00:10:bb:f4
[DBG ][m6LND]: GP IPv6: 2001:999:10:bbf4:200::
[DBG ][m6LND]: MAC 16-bit: ff:ff
[DBG ][m6LND]: PAN ID: 06:91
[DBG ][m6LND]: MAC 64-bit: 00:00:00:00:00:00:00:00
[DBG ][m6LND]: IID (Based on MAC 64-bit address): 02:00:00:00:00:00:00:00
[DBG ][m6LND]: Channel: 12
[DBG ][nslp]: getOwnIpAddress()
[DBG ][nslp]: getOwnIpAddress()
connected. IP = 2001:999:10:bbf4:200::

NOTE: Device I'm using does not have MAC address. Therefore the IPv6 address ends in :: (all zeroes)
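
For readers wondering why the log shows an IID of 02:00:00:00:00:00:00:00 for an all-zero 64-bit MAC: the interface identifier is formed as a modified EUI-64 (RFC 4291, Appendix A), which copies the MAC and inverts the universal/local bit of the first byte. The sketch below only illustrates that rule; `Eui64` and `iid_from_eui64` are hypothetical names, not Nanostack APIs.

```cpp
#include <array>
#include <cstdint>

// Hypothetical alias for a 64-bit MAC address (EUI-64), most significant byte first.
using Eui64 = std::array<std::uint8_t, 8>;

// Builds a modified EUI-64 interface identifier (RFC 4291, Appendix A):
// copy the MAC and invert the universal/local bit (0x02) of the first byte.
Eui64 iid_from_eui64(const Eui64 &mac)
{
    Eui64 iid = mac;
    iid[0] ^= 0x02;
    return iid;
}
```

With an all-zero MAC the result is 02:00:00:00:00:00:00:00, which matches the IID line in the log and explains the `:200::` suffix of the global address.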

Can you provide more information, perhaps a backtrace from the stack? Which compiler was used? I'm using:

$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.9.3 20150529 (release) [ARM/embedded-4_9-branch revision 227977]

mray190 commented 7 years ago

Compiler used:

$ arm-none-eabi-gcc --version
arm-none-eabi-gcc.exe (GNU Tools for ARM Embedded Processors) 5.4.1 20160919 (release) [ARM/embedded-5-branch revision 240496]

Utilizes mbed-os 5.4 (commit: https://github.com/ARMmbed/mbed-os/commit/3a27568a505bbc0bb8eeb973109d3d39ce823d1c)

Used the OOB test branch: https://github.com/ARMmbed/mbed-os-example-mesh-minimal/tree/oob_test_mbed-os-5.4

If you used all of these to produce the output above, I'll go back and grab some stack traces.

MarceloSalazar commented 7 years ago

@mray190 could you please try arm-none-eabi-gcc 4.9.3 (the officially supported version)?

SeppoTakalo commented 7 years ago

I also tried again with arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 5.4.1 20160919

I still cannot replicate this.

[DBG ][m6LND]: Restart bootstrap
[DBG ][m6LND]: Link-layer security NOT enabled.
[DBG ][m6LND]: Channel: 12
[DBG ][m6LND]: Channel page: 0
[DBG ][m6LND]: Channel mask: 4096
[INFO][m6LND]: Start 6LoWPAN ND Bootstrap
[DBG ][m6LND]: app_parse_network_event() 3
[DBG ][m6LND]: Link Layer Scan Fail: No Beacons

NOTE: My border router is not running now.

SeppoTakalo commented 7 years ago

Did you flash the device by copying the .bin image to device, or did you try to flash using GDB?

What debugger did you use to verify the Stack overflow?

I believe flashing through GDB's load command does not work.

MarceloSalazar commented 7 years ago

@mray190 please provide more info.

mray190 commented 7 years ago

@SeppoTakalo I flashed the device by copying over the .bin image to the device

Currently having issues with our border router but will get additional information as soon as possible

@maclobdell Could use your help

mray190 commented 7 years ago

Conclusion: All of the boards in our lab had incorrect flash settings. In fact they had no settings at all, as the flash memory was empty.

This caused all of the radio stacks to overflow because the rxRam variable was incrementing too far. See:

        /* Initialize frame status */
        for (uint8_t i=0; i < length; i++) {
            PHYPAYLOAD[i] = *rxRam++;
        }

in NanostackRfPhyNcs36510.cpp
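
A defensive variant of that loop might look like the sketch below. It assumes the copied length can be invalid when the radio is misconfigured, and rejects any frame that does not fit the destination buffer. `copy_rx_frame` and `MAX_FRAME_LEN` are hypothetical names used for illustration only; this is not the actual driver code or an official fix.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical buffer size: 127 bytes is the largest IEEE 802.15.4 PHY payload.
constexpr std::size_t MAX_FRAME_LEN = 127;

// Copies `length` bytes from radio RX RAM into `dst`.
// Returns the number of bytes copied, or 0 if the arguments are invalid or
// the reported length does not fit into the destination buffer.
std::size_t copy_rx_frame(std::uint8_t *dst, std::size_t dst_size,
                          const volatile std::uint8_t *rx_ram, std::size_t length)
{
    if (dst == nullptr || rx_ram == nullptr || length == 0 || length > dst_size) {
        return 0;  // reject the frame instead of overrunning the buffer
    }
    for (std::size_t i = 0; i < length; i++) {
        dst[i] = rx_ram[i];
    }
    return length;
}
```

A caller would pass the payload buffer together with its size, for example `copy_rx_frame(payload, MAX_FRAME_LEN, rx_ram, length)`, and drop the frame when the function returns 0.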

Flash settings were restored on all of our devices and no further action is required.

SeppoTakalo commented 7 years ago

@mray190 Thanks for investigating this.