Azure / iot-edge-v1

Azure IoT Edge
http://azure.github.io/iot-edge/

Valgrind reports "Invalid read of size 4" when iot-edge is built with --config Release flag #413

Open villepalo opened 6 years ago

villepalo commented 6 years ago

Valgrind reports "Invalid read of size 4" when iot-edge is built with the '--config Release' flag. Without the '--config Release' flag, this error is not reported.

Reproducible with simulated_device_cloud_upload_sample. Tested on Ubuntu 14.04.

IotHub configuration:

      "Transport": "AMQP",
      "RetryPolicy": "INTERVAL"

==14350== Memcheck, a memory error detector
==14350== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==14350== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==14350== Command: ./samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
==14350==
Press return to exit the application.
==14350== Thread 3:
==14350== Invalid read of size 4
==14350==    at 0x7607F5E: twin_messenger_create (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x75FD75E: device_create (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x75FB573: IoTHubTransport_AMQP_Common_Register (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x7888171: IoTHubClient_LL_CreateWithTransport (in /home/user/iot-edge/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==14350==    by 0x7882389: IoTHubClient_CreateWithTransport (in /home/user/iot-edge/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==14350==    by 0x75F813E: IotHub_Receive (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x4E459EF: module_worker (in /home/user/iot-edge/build/core/libgateway.so)
==14350==    by 0x4E52A4D: ThreadWrapper (in /home/user/iot-edge/build/core/libgateway.so)
==14350==    by 0x56A2183: start_thread (pthread_create.c:312)
==14350==    by 0x59B5FFC: clone (clone.S:111)
==14350==  Address 0x6a6be04 is 36 bytes inside a block of size 38 alloc'd
==14350==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14350==    by 0x7607F0A: twin_messenger_create (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x75FD75E: device_create (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x75FB573: IoTHubTransport_AMQP_Common_Register (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x7888171: IoTHubClient_LL_CreateWithTransport (in /home/user/iot-edge/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==14350==    by 0x7882389: IoTHubClient_CreateWithTransport (in /home/user/iot-edge/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==14350==    by 0x75F813E: IotHub_Receive (in /home/user/iot-edge/build/modules/iothub/libiothub.so)
==14350==    by 0x4E459EF: module_worker (in /home/user/iot-edge/build/core/libgateway.so)
==14350==    by 0x4E52A4D: ThreadWrapper (in /home/user/iot-edge/build/core/libgateway.so)
==14350==    by 0x56A2183: start_thread (pthread_create.c:312)
==14350==    by 0x59B5FFC: clone (clone.S:111)
==14350==

aribeironovaes commented 6 years ago

Hi @villepalo,

Is it a new issue or a duplicate of #409?

Thanks,

Angelo Ribeiro.

villepalo commented 6 years ago

It is not a duplicate. Reproducing this does not require a connection break, and as far as I know there is no memory leak here.

damonbarry commented 6 years ago

Looks like a problem inside the AMQP transport adapter of the C device SDK. @villepalo Could you run Valgrind directly against one of the device SDK samples? E.g., https://github.com/Azure/azure-iot-sdk-c/tree/master/iothub_client/samples/iothub_client_sample_device_twin/.

villepalo commented 6 years ago

Sure, here it is. One suspicious "Invalid read of size 1" was found, but it doesn't seem to be the same problem?

==11233== Invalid read of size 1
==11233==    at 0x6055943: vfprintf (vfprintf.c:1661)
==11233==    by 0x605E3D8: printf (printf.c:33)
==11233==    by 0x414EAD: deviceTwinCallback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x41876D: IoTHubClient_LL_RetrievePropertyComplete (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x42729D: on_device_twin_update_received_callback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x42C04A: on_twin_state_update_callback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x43A288: on_amqp_message_received_callback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x43DC2A: on_message_received_internal_callback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x4731DF: on_transfer_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x46E4BC: link_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x47A2C9: on_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x46A86F: on_amqp_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==  Address 0xa9feb8b is 0 bytes after a block of size 123 alloc'd
==11233==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==11233==    by 0x47193D: message_add_body_amqp_data (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x472F78: decode_message_value_callback (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x462431: internal_decoder_decode_bytes (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x466009: amqpvalue_decode_bytes (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x473108: on_transfer_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x46E4BC: link_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x47A2C9: on_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x46A86F: on_amqp_frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x47BA4B: frame_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x46D10D: frame_codec_receive_bytes (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==    by 0x469962: connection_byte_received (in /home/user/azure-iot-sdk-c/cmake/iotsdk_linux/iothub_client/samples/iothub_client_sample_device_twin/iothub_client_sample_device_twin)
==11233==

iothub_client_sample_device_twin with valgrind: device_twin_valgrind.log

damonbarry commented 6 years ago

Yeah, that's different.

Actually, I'm realizing that was the wrong sample to try because it doesn't use the multiplexing feature in AMQP, which our iothub module uses.

Could you run valgrind against the iothub_client_sample_amqp_shared sample instead? I verified it takes the same code path that your original valgrind error reported (it calls twin_messenger_create via IoTHubTransport_AMQP_Common_Register, etc.).

Thanks and sorry about the run-around. I don't have a good valgrind setup right now...

villepalo commented 6 years ago

No problems with iothub_client_sample_amqp_shared:

valgrind.log

damonbarry commented 6 years ago

OK, thanks.

I just tried this in Ubuntu 17.04 with gcc 6.3.0 and valgrind 3.12.0, and had no errors:

damon@ubuntu-17:~/iot-edge/v1/build$ tools/build.sh --config Release --disable-ble-module --disable-native-remote-modules
...
[ 91%] Built target simulated_device_cloud_upload_sample
...

damon@ubuntu-17:~/iot-edge/v1/build$ valgrind samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
==10010== Memcheck, a memory error detector
==10010== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==10010== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==10010== Command: samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
==10010==
Press return to exit the application.
Device: 01:01:01:01:01:01, Temperature: 10.00
Device: 01:01:01:01:01:01, Temperature: 11.00
Device: 01:01:01:01:01:01, Temperature: 12.00

Device: 01:01:01:01:01:01, Temperature: 13.00
==10010==
==10010== HEAP SUMMARY:
==10010==     in use at exit: 960 bytes in 5 blocks
==10010==   total heap usage: 19,096 allocs, 19,091 frees, 1,241,131 bytes allocated
==10010==
==10010== LEAK SUMMARY:
==10010==    definitely lost: 0 bytes in 0 blocks
==10010==    indirectly lost: 0 bytes in 0 blocks
==10010==      possibly lost: 0 bytes in 0 blocks
==10010==    still reachable: 960 bytes in 5 blocks
==10010==         suppressed: 0 bytes in 0 blocks
==10010== Rerun with --leak-check=full to see details of leaked memory
==10010==
==10010== For counts of detected and suppressed errors, rerun with: -v
==10010== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

You said you're running Ubuntu 14.04, and I can see you're using valgrind 3.10.1. What version of the compiler are you using? Also, did you build release bits by passing --config Release to tools/build.sh, or some other way?

villepalo commented 6 years ago

Command line was:
./build.sh --disable-native-remote-modules --config Release --rebuild-deps

Used versions:
valgrind-3.10.1
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)

damonbarry commented 6 years ago

I spun up a new Ubuntu 14.04.5 LTS VM, and I was able to reproduce this error (using gcc 4.8.4 and valgrind 3.10.1):

damon@damon-ubuntu-14:~/iot-edge/v1/build$ valgrind samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
==23066== Memcheck, a memory error detector
==23066== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==23066== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==23066== Command: samples/simulated_device_cloud_upload/simulated_device_cloud_upload_sample ../samples/simulated_device_cloud_upload/src/simulated_device_cloud_upload_lin.json
==23066==
==23066== Thread 3:
==23066== Invalid read of size 4
==23066==    at 0x76079BE: twin_messenger_create (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x75FD1BE: device_create (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x75FAFD3: IoTHubTransport_AMQP_Common_Register (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x7887171: IoTHubClient_LL_CreateWithTransport (in /home/damon/iot-edge/v1/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==23066==    by 0x7881389: IoTHubClient_CreateWithTransport (in /home/damon/iot-edge/v1/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==23066==    by 0x75F7DC8: IotHub_Receive (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x4E459CF: module_worker (in /home/damon/iot-edge/v1/build/core/libgateway.so)
==23066==    by 0x4E52A2D: ThreadWrapper (in /home/damon/iot-edge/v1/build/core/libgateway.so)
==23066==    by 0x56A2183: start_thread (pthread_create.c:312)
==23066==    by 0x59B5FFC: clone (clone.S:111)
==23066==  Address 0x6a85a64 is 36 bytes inside a block of size 38 alloc'd
==23066==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23066==    by 0x760796A: twin_messenger_create (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x75FD1BE: device_create (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x75FAFD3: IoTHubTransport_AMQP_Common_Register (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x7887171: IoTHubClient_LL_CreateWithTransport (in /home/damon/iot-edge/v1/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==23066==    by 0x7881389: IoTHubClient_CreateWithTransport (in /home/damon/iot-edge/v1/install-deps/lib/x86_64-linux-gnu/libiothub_client.so)
==23066==    by 0x75F7DC8: IotHub_Receive (in /home/damon/iot-edge/v1/build/modules/iothub/libiothub.so)
==23066==    by 0x4E459CF: module_worker (in /home/damon/iot-edge/v1/build/core/libgateway.so)
==23066==    by 0x4E52A2D: ThreadWrapper (in /home/damon/iot-edge/v1/build/core/libgateway.so)
==23066==    by 0x56A2183: start_thread (pthread_create.c:312)
==23066==    by 0x59B5FFC: clone (clone.S:111)
==23066==

I'm not able to reproduce this in Ubuntu 17.04 with gcc 6.3.0 and valgrind 3.12.0.

On 14.04, if I run iothub_client_sample_amqp_shared in valgrind I get the exact same result (which is what I would expect):

damon@damon-ubuntu-14:~/iot-edge/v1/deps/iot-sdk-c/build$ valgrind iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared ../iothub_client/samples/iothub_client_sample_amqp_shared
==24700== Memcheck, a memory error detector
==24700== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24700== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==24700== Command: iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared ../iothub_client/samples/iothub_client_sample_amqp_shared
==24700==
Starting the IoTHub client sample AMQP...
==24700== Invalid read of size 4
==24700==    at 0x436EDE: twin_messenger_create (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x42C6DE: device_create (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x42A4F3: IoTHubTransport_AMQP_Common_Register (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x41DAD1: IoTHubClient_LL_CreateWithTransport (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x416C49: IoTHubClient_CreateWithTransport (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x41576A: iothub_client_sample_amqp_shared_hl_run (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x415088: main (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==  Address 0xa948714 is 36 bytes inside a block of size 38 alloc'd
==24700==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24700==    by 0x436E8A: twin_messenger_create (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x42C6DE: device_create (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x42A4F3: IoTHubTransport_AMQP_Common_Register (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x41DAD1: IoTHubClient_LL_CreateWithTransport (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x416C49: IoTHubClient_CreateWithTransport (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x41576A: iothub_client_sample_amqp_shared_hl_run (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==    by 0x415088: main (in /home/damon/iot-edge/v1/deps/iot-sdk-c/build/iothub_client/samples/iothub_client_sample_amqp_shared/iothub_client_sample_amqp_shared)
==24700==

So this helps narrow it down to a problem in the C SDK, specific to the AMQP transport in multiplexing scenarios. Note that I used the version of the C SDK that IoT Edge currently depends on; I also tried against the C SDK's master branch and got the same behavior, so this hasn't been fixed by a more recent commit.

@villepalo It would be great if you could file an issue directly against the C SDK (using iothub_client_sample_amqp_shared as the repro scenario). We can keep this issue open until we upgrade to a version of the SDK that fixes this issue.