ARMmbed / mbed-os

Arm Mbed OS is a platform operating system designed for the internet of things
https://mbed.com

5.11.0 - RC1 breaks PDMC on some ublox targets. #8929

Closed ashok-rao closed 5 years ago

ashok-rao commented 5 years ago

Description

The following targets are affected by this issue:

  1. UBLOX_EVK_ODIN_W2
  2. UBLOX_C030_U201

For UBLOX_EVK_ODIN_W2: The following tests are passing: UBLOX_EVK_ODIN_W2_ARM_output.txt

For the UBLOX_C030_U201: The following tests are passing: UBLOX_C030_U201_ARM_output.txt

With 5.10.4 the PDMC tests on these exact boards are passing fine. Logs below:

[2018-11-30 10:27:38] INFO: TEST RUNNER RESULTS:

+-------------------------+---------------------+-------------------------------------------------+--------+--------+
| target                  | platform_name       | test suite                                      | result |   time |
+-------------------------+---------------------+-------------------------------------------------+--------+--------+
| NUMAKER_PFM_M487-ARM    | NUMAKER_PFM_M487    | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |   69.3 |
| NUMAKER_PFM_M487-ARM    | NUMAKER_PFM_M487    | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 139.86 |
| K66F-ARM                | K66F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  92.49 |
| K66F-ARM                | K66F                | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 176.27 |
| NUCLEO_F429ZI-ARM       | NUCLEO_F429ZI       | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  83.55 |
| NUCLEO_F429ZI-ARM       | NUCLEO_F429ZI       | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 172.03 |
| DISCO_L475VG_IOT01A-ARM | DISCO_L475VG_IOT01A | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  88.64 |
| DISCO_L475VG_IOT01A-ARM | DISCO_L475VG_IOT01A | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 196.17 |
| UBLOX_EVK_ODIN_W2-ARM   | UBLOX_EVK_ODIN_W2   | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     | 175.92 |
| UBLOX_EVK_ODIN_W2-ARM   | UBLOX_EVK_ODIN_W2   | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 492.89 |
| DISCO_F413ZH-ARM        | DISCO_F413ZH        | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  88.77 |
| DISCO_F413ZH-ARM        | DISCO_F413ZH        | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 176.32 |
| NUCLEO_F746ZG-ARM       | NUCLEO_F746ZG       | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  74.32 |
| NUCLEO_F746ZG-ARM       | NUCLEO_F746ZG       | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 154.13 |
| NUMAKER_IOT_M487-ARM    | NUMAKER_IOT_M487    | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     | 110.85 |
| NUMAKER_IOT_M487-ARM    | NUMAKER_IOT_M487    | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 204.66 |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     | 161.46 |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 979.02 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  94.79 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     |  181.9 |
| MTB_MXCHIP_EMW3166-ARM  | MTB_MXCHIP_EMW3166  | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  99.82 |
| MTB_MXCHIP_EMW3166-ARM  | MTB_MXCHIP_EMW3166  | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK     | 231.66 |
+-------------------------+---------------------+-------------------------------------------------+--------+--------+

[2018-11-30 10:27:38] INFO: TEST FINISHED

However, updating ONLY Mbed OS to 5.11.0-RC1 breaks the same tests on the same boards.

For UBLOX_EVK_ODIN_W2: The following tests are failing: UBLOX_EVK_ODIN_W2_ARM_output.txt

For UBLOX_C030_U201: The following tests are failing: UBLOX_C030_U201_ARM_output.txt

With 5.11.0-RC1 the PDMC tests on these exact boards are failing. Results below:

[2018-11-30 13:16:25] INFO: TEST RUNNER RESULTS:

+-------------------------+---------------------+-------------------------------------------------+---------+---------+
| target                  | platform_name       | test suite                                      | result  |    time |
+-------------------------+---------------------+-------------------------------------------------+---------+---------+
| NUMAKER_PFM_M487-ARM    | NUMAKER_PFM_M487    | simple-mbed-cloud-client-tests-dev_mgmt-connect | TIMEOUT |  276.71 |
| NUMAKER_PFM_M487-ARM    | NUMAKER_PFM_M487    | simple-mbed-cloud-client-tests-dev_mgmt-update  | TIMEOUT |   828.2 |
| K66F-ARM                | K66F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   92.66 |
| K66F-ARM                | K66F                | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |   193.0 |
| NUCLEO_F429ZI-ARM       | NUCLEO_F429ZI       | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   83.24 |
| NUCLEO_F429ZI-ARM       | NUCLEO_F429ZI       | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |  174.48 |
| DISCO_L475VG_IOT01A-ARM | DISCO_L475VG_IOT01A | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |  103.72 |
| DISCO_L475VG_IOT01A-ARM | DISCO_L475VG_IOT01A | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |   185.7 |
| UBLOX_EVK_ODIN_W2-ARM   | UBLOX_EVK_ODIN_W2   | simple-mbed-cloud-client-tests-dev_mgmt-connect | TIMEOUT |  278.22 |
| UBLOX_EVK_ODIN_W2-ARM   | UBLOX_EVK_ODIN_W2   | simple-mbed-cloud-client-tests-dev_mgmt-update  | TIMEOUT |    78.5 |
| DISCO_F413ZH-ARM        | DISCO_F413ZH        | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   84.58 |
| DISCO_F413ZH-ARM        | DISCO_F413ZH        | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |  180.72 |
| NUCLEO_F746ZG-ARM       | NUCLEO_F746ZG       | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   74.27 |
| NUCLEO_F746ZG-ARM       | NUCLEO_F746ZG       | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |  171.83 |
| NUMAKER_IOT_M487-ARM    | NUMAKER_IOT_M487    | simple-mbed-cloud-client-tests-dev_mgmt-connect | TIMEOUT |   97.06 |
| NUMAKER_IOT_M487-ARM    | NUMAKER_IOT_M487    | simple-mbed-cloud-client-tests-dev_mgmt-update  | FAIL    |   48.46 |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-connect | TIMEOUT |  267.24 |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-update  | TIMEOUT | 1109.38 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   95.54 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |  183.65 |
| MTB_MXCHIP_EMW3166-ARM  | MTB_MXCHIP_EMW3166  | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |  189.74 |
| MTB_MXCHIP_EMW3166-ARM  | MTB_MXCHIP_EMW3166  | simple-mbed-cloud-client-tests-dev_mgmt-update  | OK      |  259.39 |
+-------------------------+---------------------+-------------------------------------------------+---------+---------+

[2018-11-30 13:16:25] INFO: TEST FINISHED

This indicates that something changed between the two versions that breaks these targets.
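One way to see exactly which target/suite pairs regressed between the two runs is to parse both result tables and compare them. The following is a minimal Python sketch, not part of the test tooling: the helper names `parse_results` and `find_regressions` are hypothetical, and the sample strings are a few rows copied from the two tables above.

```python
def parse_results(table_text):
    """Map (platform_name, test suite) -> result from a test-runner table."""
    results = {}
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 5 or cells[0] == "target":
            continue  # skip the header row and anything malformed
        _target, platform, suite, result, _time = cells
        results[(platform, suite)] = result
    return results


def find_regressions(before, after):
    """Return sorted (platform, suite, new_result) for entries that were OK and no longer are."""
    return sorted(
        (platform, suite, after[(platform, suite)])
        for (platform, suite), old in before.items()
        if old == "OK" and after.get((platform, suite), "OK") != "OK"
    )


# Sample rows taken from the 5.10.4 run (BEFORE) and the 5.11.0-RC1 run (AFTER).
BEFORE = """\
| target                  | platform_name       | test suite                                      | result |   time |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     | 161.46 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK     |  94.79 |
"""

AFTER = """\
| target                  | platform_name       | test suite                                      | result  |    time |
| UBLOX_C030_U201-ARM     | UBLOX_C030_U201     | simple-mbed-cloud-client-tests-dev_mgmt-connect | TIMEOUT |  267.24 |
| K64F-ARM                | K64F                | simple-mbed-cloud-client-tests-dev_mgmt-connect | OK      |   95.54 |
"""

for platform, suite, result in find_regressions(parse_results(BEFORE), parse_results(AFTER)):
    print(f"{platform}: {suite} went from OK to {result}")
```

Run over the full tables above, the same comparison would also list the NUMAKER_PFM_M487 and NUMAKER_IOT_M487 rows, which go from OK to TIMEOUT or FAIL.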

Steps to reproduce: (please note that this repo is private to Arm & partners only)

  1. mbed import pelion-enablement
  2. cd pelion-enablement
  3. mbed test -t ARM -n simple-mbed-cloud-client-tests-dev_mgmt* -m <targets_mentioned_above>

Expected: All tests for connect + update should pass.

Actual: Connect + update tests fail with either TIMEOUT or ERROR for the targets mentioned above.

This also impacts OoB for 5.11.0. cc: @MarceloSalazar @screamerbg @0xc0170 @cmonr .

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug

ciarmcom commented 5 years ago

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-231

cmonr commented 5 years ago

@ashok-rao Are there additional instructions on what the AP credentials should be set to so that the tests can be run?

Got to the point where I can run the tests, but the device cannot connect to a network. This is testing with an Odin W2 board.

ashok-rao commented 5 years ago

@cmonr : https://github.com/ARMmbed/pelion-enablement/blob/master/mbed_app.json#L21 .. you can put in your AP's credentials here ..

cmonr commented 5 years ago

> @cmonr : https://github.com/ARMmbed/pelion-enablement/blob/master/mbed_app.json#L21 .. you can put in your AP's credentials here ..

Ah, my mistake. Was following the instructions verbatim since it was late in the day. Will try it out early tomorrow.

c1728p9 commented 5 years ago

Hi @ashok-rao, @cmonr the error logs attached for the UBLOX_EVK_ODIN_W2-ARM and UBLOX_C030_U201 indicate that a null pointer write was caught by the MPU.

Bit 1 - DACCVIOL - Indicates that there was a data access violation
Bit 7 - MMARVALID - Indicates that MMFAR has valid contents

[1543580665.30][CONN][RXD] MMFSR: 00000082

The address of the memory location that caused an MPU fault:

[1543581892.72][CONN][RXD] MMFAR: 00000000
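The bit decoding above can be reproduced from the raw log lines. This is a small Python sketch under the assumption that the fault dump prints registers as `NAME: <hex>`, as in the two lines quoted here; `parse_register` and `decode_mmfsr` are hypothetical helper names, and the bit table is the Armv7-M MemManage Fault Status Register layout.

```python
import re

# Bit positions of the MemManage Fault Status Register (MMFSR) on Armv7-M.
MMFSR_BITS = {
    0: "IACCVIOL",   # instruction access violation
    1: "DACCVIOL",   # data access violation
    3: "MUNSTKERR",  # fault while unstacking on exception return
    4: "MSTKERR",    # fault while stacking on exception entry
    5: "MLSPERR",    # fault during lazy floating-point state preservation
    7: "MMARVALID",  # MMFAR holds a valid fault address
}


def parse_register(log_line, name):
    """Extract a hex register value such as 'MMFSR: 00000082' from a log line."""
    match = re.search(rf"\b{re.escape(name)}:\s*([0-9A-Fa-f]+)", log_line)
    return int(match.group(1), 16) if match else None


def decode_mmfsr(value):
    """Return the names of the MMFSR flags set in value, lowest bit first."""
    return [name for bit, name in sorted(MMFSR_BITS.items()) if value & (1 << bit)]


mmfsr = parse_register("[1543580665.30][CONN][RXD] MMFSR: 00000082", "MMFSR")
mmfar = parse_register("[1543581892.72][CONN][RXD] MMFAR: 00000000", "MMFAR")

flags = decode_mmfsr(mmfsr)
print(flags)
if "MMARVALID" in flags:
    # The fault address is only meaningful when MMARVALID is set.
    print(f"faulting address: 0x{mmfar:08X}")
```

For the values in this log, 0x82 has bits 1 and 7 set, so the fault is a data access violation at the address held in MMFAR, which is 0x00000000.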

@ashok-rao could you attach the logs from the NUMAKER runs so I could check those as well?

c1728p9 commented 5 years ago

The Odin problems are likely caused by the MPU being enabled. If this is the case then #8920 will fix this.

cmonr commented 5 years ago

@ashok-rao Please re-test with master, since #8920 has been merged.

Allergies ended up finally hitting me, and I wasn't able to progress on this today. If this is still an issue tomorrow, I can take another look.

adbridge commented 5 years ago

@ashok-rao @MarceloSalazar Can someone from PE please retest this to see if #8920 has fixed it?

ashok-rao commented 5 years ago

RC2 seems to have fixed this on u-blox (with MPU changes).

c1728p9 commented 5 years ago

@ashok-rao the MPU was only disabled for the UBLOX_EVK_ODIN_W2 Ublox board. The MPU is still enabled for the UBLOX_C030_U201 on RC2, so any problems caused by it are still present. I created #8994 to turn off the MPU for the UBLOX_C030 family of devices. The root cause of this problem still needs to be determined.

screamerbg commented 5 years ago

@c1728p9 Will provide you with test results during the next few hours for these 2 boards.

c1728p9 commented 5 years ago

I was finally able to identify the cause of the UBLOX_C030_U201 MPU fault, and it is the same issue that has been identified and fixed in #8946. This fix went into RC2, so this bug only affects RC1. Enabling the MPU is what caused this to manifest in RC1, though, so with either #8946 or #8994 there is no MPU fault.

Since the root cause has been fixed I updated #9020 to enable the MPU on the UBLOX_C030_U201.

The MPU on the UBLOX_EVK_ODIN_W2 must still be left disabled until #8930 is fixed.

screamerbg commented 5 years ago

@c1728p9 this is great! Thanks for your and @juhoeskeli's work on this. Do we have a test case / test to validate this across other partner implementations? It seems that the behaviour wasn't documented or at least wasn't tested against.

c1728p9 commented 5 years ago

Hi @screamerbg, I did a bit of digging to figure out why this wasn't caught in CI. There are two problems letting this get past CI:

  1. The target UBLOX_C030_U201 is not tested as part of mbed-os CI
  2. The mbedtls tests that would have run if it were part of CI, tests-mbedtls-multi and tests-mbedtls-selftest, do not catch this problem.

This target-specific code is not part of the mbed-os HAL or covered by a HAL specification. It is a feature with target-specific code residing in mbed-os/features/mbedtls/targets/TARGET_STM. @Patater what testing is done for vendor-supplied crypto accelerators?

adbridge commented 5 years ago

@ashok-rao Can this one be closed? We have #8930 for the underlying issue.

ashok-rao commented 5 years ago

Sure. Thanks @adbridge .