warmcat / libwebsockets

canonical libwebsockets.org networking library
https://libwebsockets.org

*** Error in `./lws-minimal-mqtt-client': free(): invalid next size (fast): 0x0127f8b8 *** Aborted #2113

Closed assistuelectronics closed 3 years ago

assistuelectronics commented 3 years ago

Hi,

When I run the executable, I get the error below. I have no clue where to look in minimal-mqtt-client.c.

The only change I made was adding my certificate file.

Error in `./lws-minimal-mqtt-client': free(): invalid next size (fast): 0x0127f8b8 Aborted

lws-team commented 3 years ago

Hm if I build the example and run mosquitto on the same machine...

$ valgrind ./bin/lws-minimal-mqtt-client
==528376== Memcheck, a memory error detector
==528376== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==528376== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==528376== Command: ./bin/lws-minimal-mqtt-client
==528376== 
[2020/11/13 11:01:27:0809] U: LWS minimal MQTT client unencrypted [-d<verbosity>][-s]
[2020/11/13 11:01:27:1239] N: LWS: 4.1.99-v4.1.0-87-g032d0ea82, loglevel 1031
[2020/11/13 11:01:27:1242] N: NET CLI SRV H1 H2 WS MQTT SS-JSON-POL ASYNC_DNS NTPCLIENT IPv6-absent
[2020/11/13 11:01:27:1823] E: callback_ntpc: set up system ops for set_clock
[2020/11/13 11:01:27:1919] N: lws_state_notify_protocol_init: waiting for netlink coldplug
[2020/11/13 11:01:27:3342] N: lws_mqtt_generate_id: User space provided a client ID 'lwsMqttClient'
[2020/11/13 11:01:27:3766] N: _lws_mqtt_rx_parser: migrated nwsi 0x52b25d0 to sid 1 0x52b3050
[2020/11/13 11:01:27:3771] U: callback_mqtt: MQTT_CLIENT_ESTABLISHED
[2020/11/13 11:01:27:3821] U: callback_mqtt: WRITEABLE: Subscribing
[2020/11/13 11:01:27:3897] U: callback_mqtt: MQTT_SUBSCRIBED
^C[2020/11/13 11:01:29:9810] U: Completed: failed
[2020/11/13 11:01:30:0151] U: callback_mqtt: CLIENT_CLOSED
==528376== 
==528376== HEAP SUMMARY:
==528376==     in use at exit: 0 bytes in 0 blocks
==528376==   total heap usage: 70 allocs, 70 frees, 99,245 bytes allocated
==528376== 
==528376== All heap blocks were freed -- no leaks are possible
==528376== 
==528376== For lists of detected and suppressed errors, rerun with: -s
==528376== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

there are no problems. This is current main on Fedora 33...

assistuelectronics commented 3 years ago

Thanks for quick response.

I have cross-compiled it for an ARM platform and am running it there. The full log I could collect is below.

[image: image.png]

assistuelectronics commented 3 years ago

root@ts-imx6:~/Executables# ./lws-minimal-mqtt-client -s
[2020/11/13 11:04:38:1622] N: SSL Configuration
[2020/11/13 11:04:38:1632] U: LWS minimal MQTT client tls enabled [-d][-s]
[2020/11/13 11:04:38:1633] N: LWS: 4.1.99-v4.1.0-76-g7c5e5987, loglevel 1031
[2020/11/13 11:04:38:1634] N: NET CLI SRV H1 H2 WS MQTT ASYNC_DNS IPv6-absent
[2020/11/13 11:04:38:1650] N: lws_role_call_adoption_bind: incoming type 0x12
[2020/11/13 11:04:38:1652] N: lws_sort_dns_dump: 1: (2)192.168.43.1, gw (0), idi: 0, lbl: 0, prec: 0
[2020/11/13 11:04:38:1653] N: lws_role_call_adoption_bind: incoming type 0x1000012
[2020/11/13 11:04:38:1669] N: Loading client CA for verification ./rootCA.crt
[2020/11/13 11:04:38:1671] N: lws_tls_client_create_vhost_context: doing cert filepath ./52a2f01cc3-certificate.crt
[2020/11/13 11:04:38:1679] N: Loaded client cert ./52a2f01cc3-certificate.crt
[2020/11/13 11:04:38:1680] N: lws_state_notify_protocol_init: waiting for netlink coldplug
Error in `./lws-minimal-mqtt-client': free(): invalid next size (fast): 0x010728b8
Aborted

lws-team commented 3 years ago

Just running with -s doesn't make any problem here.

The best way forward is to run the same test under valgrind; if that means running it on a Linux laptop, that's also useful. Valgrind will give more information about any double free, including a backtrace.
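
Where valgrind is not available on the target, a rough stand-in is to place canary bytes after a suspect allocation and check them before freeing. As far as I know, glibc prints "free(): invalid next size" when the heap bookkeeping just past the freed block looks damaged, which can come from a write past the end of an allocation as well as from a second free. The sketch below is a minimal illustration of the canary idea only; guard_alloc() and guard_intact() are hypothetical helper names, not part of lws or glibc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 8

/* Arbitrary, recognisable pattern written after the usable bytes. */
static const unsigned char guard_pattern[GUARD_LEN] = {
	0xde, 0xad, 0xbe, 0xef, 0xde, 0xad, 0xbe, 0xef
};

/* Allocate `len` usable bytes followed by GUARD_LEN canary bytes. */
static void *
guard_alloc(size_t len)
{
	unsigned char *p = malloc(len + GUARD_LEN);

	if (!p)
		return NULL;
	memcpy(p + len, guard_pattern, GUARD_LEN);

	return p;
}

/* Return 1 if the canary after the `len` usable bytes is still intact. */
static int
guard_intact(const void *buf, size_t len)
{
	assert(buf != NULL);

	return !memcmp((const unsigned char *)buf + len, guard_pattern,
		       GUARD_LEN);
}
```

Calling guard_intact(buf, len) just before free(buf) and logging a failure narrows down which buffer was overrun. It only catches overruns of up to GUARD_LEN bytes, so a valgrind or gdb backtrace remains the better evidence.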

assistuelectronics commented 3 years ago

OK, sure. Let me try that and come back with further info.

It may take some time, but I am doing it now.

assistuelectronics commented 3 years ago

If I run it on a normal Linux machine, I get the error below. I know it is related to the SSL configuration somewhere. Any suggestions?

ubuntu@ip-172-31-0-27:~/libwebsockets/build/bin$ sudo ./lws-minimal-mqtt-client -s
[2020/11/13 12:32:26:9609] N: SSL Configuration
[2020/11/13 12:32:26:9611] U: LWS minimal MQTT client tls enabled [-d][-s]
[2020/11/13 12:32:26:9612] N: LWS: 4.1.99-v4.1.0-80-g155b84ef, loglevel 1031
[2020/11/13 12:32:26:9613] N: NET CLI SRV H1 H2 WS MQTT IPv6-absent
[2020/11/13 12:32:26:9629] N: lws_state_notify_protocol_init: waiting for netlink coldplug
[2020/11/13 12:32:27:0631] N: lws_mqtt_generate_id: User space provided a client ID 'lwsMqttClient'
[2020/11/13 12:32:27:0643] N: lws_sort_dns_dump: 1: (2)3.20.6.247, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 12:32:27:0644] N: lws_sort_dns_dump: 2: (2)3.12.127.17, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 12:32:27:0644] N: lws_sort_dns_dump: 3: (2)18.218.225.48, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 12:32:27:0671] E: SSL error: unable to get local issuer certificate (preverify_ok=0;err=20;depth=3)
[2020/11/13 12:32:27:0673] E: callback_mqtt: CLIENT_CONNECTION_ERROR: connect SSL err 1: error:00000001:lib(0):func(0):reason(1)
[2020/11/13 12:32:27:0674] U: Completed: failed
ubuntu@ip-172-31-0-27:~/libwebsockets/build/bin$

assistuelectronics commented 3 years ago

OK I have some observations here.

I added the first two lines below to the existing code, which originally set only "info.client_ssl_ca_filepath". If I comment out those two newly added statements, the program runs, but obviously it cannot connect to AWS IoT, because those files are required as well.

//info.client_ssl_cert_filepath = "./AWSCert.crt";
//info.client_ssl_private_key_filepath = "./AWSPKey.key";
info.client_ssl_ca_filepath = "./rootCA.crt";

Any suggestions on how to move forward?

lws-team commented 3 years ago

Hum... without lws involved at all, if I do

$ openssl s_client -connect 3.20.6.247:8883

on Fedora 33 I get

CONNECTED(00000003)
Can't use SSL_get_servername
depth=3 C = US, ST = Arizona, L = Scottsdale, O = "Starfield Technologies, Inc.", CN = Starfield Services Root Certificate Authority - G2
verify error:num=20:unable to get local issuer certificate

So you must give lws the CA cert to trust. The existing code says

#if defined(LWS_WITH_MBEDTLS) || defined(USE_WOLFSSL)
    /*
     * OpenSSL uses the system trust store.  mbedTLS has to be told which
     * CA to trust explicitly.
     */
    info.client_ssl_ca_filepath = "./mosq-ca.crt";
#endif

That is conditional because normally, in the OpenSSL case, the root CA is already trusted through the system trust bundle. But here the server seems to use its own CA, so you must remove the #if and #endif so that the CA file is also set when building against OpenSSL.

assistuelectronics commented 3 years ago

//#if defined(LWS_WITH_MBEDTLS) || defined(USE_WOLFSSL) /*

//#endif

But I still have the same issue. FYI, I am cross-compiling and running on the target, and I hit the same error there.

If I comment out this one line, the program runs, but again it cannot connect to AWS IoT.

info.client_ssl_cert_filepath = "./AWSCert.crt";

lws-team commented 3 years ago

So where did "rootCA.crt" come from and what is in it?

assistuelectronics commented 3 years ago

This "rootCA.crt" file was downloaded from AWS IoT when we created the thing; it is the CA certificate file. Apart from that, we download two more files: 1. the private key 2. the certificate file

lws-team commented 3 years ago

Yeah, so what is in it? It's a public CA cert, it's not your private credentials.
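
One quick way to see what kind of object a PEM file holds, before handing its path to lws, is to read its first line. The sketch below is a minimal illustration under that assumption; pem_probe() and enum pem_kind are hypothetical names, not lws APIs, and the check says nothing about whether the certificate chains to the right root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Kinds of PEM object we distinguish; anything else is PEM_UNKNOWN. */
enum pem_kind { PEM_UNREADABLE, PEM_UNKNOWN, PEM_CERTIFICATE, PEM_PRIVATE_KEY };

/*
 * Hypothetical helper (not part of lws): read the first line of a file and
 * classify it by its PEM "-----BEGIN ...-----" header.
 */
static enum pem_kind
pem_probe(const char *path)
{
	char line[128];
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return PEM_UNREADABLE;

	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return PEM_UNKNOWN;
	}
	fclose(f);

	/* "-----BEGIN " is 11 characters long */
	if (strncmp(line, "-----BEGIN ", 11))
		return PEM_UNKNOWN;

	/* the same header is used for CA, intermediate and client certs */
	if (!strncmp(line + 11, "CERTIFICATE-----", 16))
		return PEM_CERTIFICATE;

	/* matches "PRIVATE KEY", "RSA PRIVATE KEY", "EC PRIVATE KEY" */
	if (strstr(line + 11, "PRIVATE KEY-----"))
		return PEM_PRIVATE_KEY;

	return PEM_UNKNOWN;
}
```

With the file names used in this thread, "./rootCA.crt" and "./AWSCert.crt" would be expected to report PEM_CERTIFICATE and "./AWSPKey.key" PEM_PRIVATE_KEY; PEM_UNREADABLE points at a wrong path or working directory on the target.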

assistuelectronics commented 3 years ago

Yeah, that's right...

-----BEGIN CERTIFICATE-----
MIIDQTCCAimgAwIBAgITBmyfz5m/jAo54vB4ikPmljZbyjANBgkqhkiG9w0BAQsF
ADA5MQswCQYDVQQGEwJVUzEPMA0GA1UEChMGQW1hem9uMRkwFwYDVQQDExBBbWF6
b24gUm9vdCBDQSAxMB4XDTE1MDUyNjAwMDAwMFoXDTM4MDExNzAwMDAwMFowOTEL
MAkGA1UEBhMCVVMxDzANBgNVBAoTBkFtYXpvbjEZMBcGA1UEAxMQQW1hem9uIFJv
b3QgQ0EgMTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBALJ4gHHKeNXj
ca9HgFB0fW7Y14h29Jlo91ghYPl0hAEvrAIthtOgQ3pOsqTQNroBvo3bSMgHFzZM
9O6II8c+6zf1tRn4SWiw3te5djgdYZ6k/oI2peVKVuRF4fn9tBb6dNqcmzU5L/qw
IFAGbHrQgLKm+a/sRxmPUDgH3KKHOVj4utWp+UhnMJbulHheb4mjUcAwhmahRWa6
VOujw5H5SNz/0egwLX0tdHA114gk957EWW67c4cX8jJGKLhD+rcdqsq08p8kDi1L
93FcXmn/6pUCyziKrlA4b9v7LWIbxcceVOF34GfID5yHI9Y/QCB/IIDEgEw+OyQm
jgSubJrIqg0CAwEAAaNCMEAwDwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMC
AYYwHQYDVR0OBBYEFIQYzIU07LwMlJQuCFmcx7IQTgoIMA0GCSqGSIb3DQEBCwUA
A4IBAQCY8jdaQZChGsV2USggNiMOruYou6r4lK5IpDB/G/wkjUu0yKGX9rbxenDI
U5PMCCjjmCXPI6T53iHTfIUJrU6adTrCC2qJeHZERxhlbI1Bjjt/msv0tadQ1wUs
N+gDS63pYaACbvXy8MWy7Vu33PqUXHeeE6V/Uq2V8viTO96LXFvKWlJbYK8U90vv
o/ufQJVtMVT8QtPHRh8jrdkPSHCa2XV4cdFyQzR1bldZwgJcJmApzyMZFo6IQ6XU
5MsI+yMRQ+hDKXJioaldXgjUkK642M4UwtBV8ob2xJNDd2ZhwLnoQdeXeGADbkpy
rqXRfboQnoZsG4q5WTP468SQvvG5
-----END CERTIFICATE-----

lws-team commented 3 years ago

No idea what you have there, but the root CA cert that needs to be trusted to validate the remote server should be this:

https://www.amazontrust.com/repository/SFSRootCAG2.pem

if I wget that and try

$ openssl s_client -connect 3.20.6.247:8883 -CAfile SFSRootCAG2.pem

I can validate the server fine

---
SSL handshake has read 5364 bytes and written 431 bytes
Verification: OK
---

assistuelectronics commented 3 years ago

I am already using these certificates in Node-RED to test the MQTT client, and there they work and the AWS IoT connection succeeds.

I am reusing those same certificate files, so it shouldn't make any difference, right?

Why, if I add the statement "info.client_ssl_cert_filepath = "./AWSCert.crt";",

does it throw this memory-related error?

Error in `./lws-minimal-mqtt-client': free(): invalid next size (fast): 0x01d028b0
Aborted

lws-team commented 3 years ago

why if I add this statement

I dunno, because you haven't provided any valgrind backtrace for what's happening there. Is it mbedtls on the Arm device and OpenSSL elsewhere?

Instead of the backtrace, the topic changed to "SSL error: unable to get local issuer certificate (preverify_ok=0;err=20;depth=3)". That is caused by not telling lws to trust the CA; giving it the root CA cert to trust should solve it. The other cert you were giving it before is an intermediate, signed by the CA I used. That is not the same as trusting the root CA, but it should also work for that purpose.

I do not have a set of certs I can use to reproduce what you're doing either.

assistuelectronics commented 3 years ago

OK, now I used Linux running on a laptop and executed the program there. It runs, but makes no connection. Not connecting to AWS IoT is a different issue altogether; the issue is why it runs here but not on embedded Linux.

Most importantly, the program runs with all of the statements below, but the cross-compiled build does not. On ARM I use mbedTLS.

//if defined(LWS_WITH_MBEDTLS) || defined(USE_WOLFSSL) /*

//endif

assistuelectronics commented 3 years ago

For AWS IoT, the rootCA certificate alone is not enough. Does "lws_create_context" accept the other two files, or do some changes need to be made?

lws-team commented 3 years ago

From my perspective the main interesting thing is the double free on an error path, which seems to happen only with mbedtls. So it'd be great if you could run that on your Arm device under valgrind, or perhaps even gdb would do, to get a backtrace.

It's not useful to hear only that something "doesn't run", although it is helpful to know it only happens with mbedtls. Please build lws with -DCMAKE_BUILD_TYPE=DEBUG and pass -d1151 on the command line to get full logs of when it "doesn't run", so we stand a chance of understanding what it's about. A valgrind or gdb backtrace will also help with understanding.

The problem you wrote about, "unable to get local issuer certificate", is entirely about providing the trusted CA root. After you get past that, you must provide client certs, and that's the purpose of the other members. But to understand what actually happens, you need full logs as described above.
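
The number given to -d is a bitmask of log levels. The sketch below decodes such a mask, assuming the bit order I understand lws-logs.h to define (LLL_ERR as bit 0 up to LLL_THREAD as bit 11); lll_describe() is a hypothetical helper written for this illustration, not an lws function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror the LLL_* bits in lws-logs.h, bit 0 first. */
static const char * const lll_names[] = {
	"ERR", "WARN", "NOTICE", "INFO", "DEBUG", "PARSER",
	"HEADER", "EXT", "CLIENT", "LATENCY", "USER", "THREAD",
};

/*
 * Write the names of the bits set in `level` into `out`, separated by '|'.
 * Returns the number of bits recognised, or -1 if `out` is too small.
 */
static int
lll_describe(unsigned int level, char *out, size_t len)
{
	size_t used = 0;
	int n, count = 0;

	assert(out != NULL && len > 0);
	out[0] = '\0';

	for (n = 0; n < (int)(sizeof(lll_names) / sizeof(lll_names[0])); n++) {
		if (!(level & (1u << n)))
			continue;
		used += (size_t)snprintf(out + used, len - used, "%s%s",
					 count ? "|" : "", lll_names[n]);
		if (used >= len)
			return -1; /* caller's buffer was too small */
		count++;
	}

	return count;
}
```

Under that assumption, the default 1031 seen in the logs above decodes to ERR|WARN|NOTICE|USER, and 1151 adds INFO, DEBUG, PARSER and HEADER, which is why the I: and D: lines only show up with -d1151.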

assistuelectronics commented 3 years ago

Okay, can this be solved easily with wolfSSL instead of mbedTLS? Should I try it, or is it going to be a waste of time?

assistuelectronics commented 3 years ago

I ran lws-minimal-mqtt-client -s on the Linux machine (laptop) and it seems to work. But I have no clue why it does not run on embedded Linux, and I am not able to debug it.

ubuntu@ip-172-31-0-27://home/ubuntu/libwebsockets/build/bin$ ./lws-minimal-mqtt-client -s
[2020/11/13 20:12:23:2023] N: SSL Configuration
[2020/11/13 20:12:23:2025] U: LWS minimal MQTT client tls enabled [-d][-s]
[2020/11/13 20:12:23:2026] N: LWS: 4.1.99-v4.1.0-80-g155b84ef, loglevel 1031
[2020/11/13 20:12:23:2027] N: NET CLI SRV H1 H2 WS MQTT IPv6-absent
[2020/11/13 20:12:23:2050] N: lws_tls_client_create_vhost_context: doing cert filepath ./AWSCert.crt
[2020/11/13 20:12:23:2052] N: Loaded client cert ./AWSCert.crt
[2020/11/13 20:12:23:2053] N: lws_state_notify_protocol_init: waiting for netlink coldplug
[2020/11/13 20:12:23:3055] N: lws_mqtt_generate_id: User space provided a client ID 'lwsMqttClient'
[2020/11/13 20:12:23:3472] N: lws_sort_dns_dump: 1: (2)3.135.118.160, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 20:12:23:3473] N: lws_sort_dns_dump: 2: (2)18.219.221.37, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 20:12:23:3474] N: lws_sort_dns_dump: 3: (2)3.15.115.209, gw (0), idi: 2, lbl: 0, prec: 0
[2020/11/13 20:12:23:4089] N: _lws_mqtt_rx_parser: migrated nwsi 0x55e14a8779c0 to sid 1 0x55e14a88fec0
[2020/11/13 20:12:23:4090] U: callback_mqtt: MQTT_CLIENT_ESTABLISHED
[2020/11/13 20:12:23:4091] U: callback_mqtt: WRITEABLE: Subscribing
[2020/11/13 20:12:23:4389] U: callback_mqtt: MQTT_SUBSCRIBED
[2020/11/13 20:12:43:4390] N: rops_handle_POLLOUT_mqtt: issuing PINGREQ
^C[2020/11/13 20:12:56:9322] U: Completed: failed
[2020/11/13 20:12:56:9322] U: callback_mqtt: CLIENT_CLOSED
ubuntu@ip-172-31-0-27://home/ubuntu/libwebsockets/build/bin$

lws-team commented 3 years ago

Well, that's good news. For debugging it, please look at what I wrote earlier:

Please build lws with -DCMAKE_BUILD_TYPE=DEBUG and set the commandline -d1151 to get full logs of when it "doesn't run" so we stand a chance of understanding what it's about. Valgrind or gdb backtrace will also help with understanding.

assistuelectronics commented 3 years ago

Yes, I was able to do that and ran it; the output is below. However, I cannot make out where the issue is.

root@ts-imx6:~/Executables# ./lws-minimal-mqtt-client -s -d1151
[2020/11/13 20:18:37:5931] N: SSL Configuration
[2020/11/13 20:18:37:5943] U: LWS minimal MQTT client tls enabled [-d][-s]
[2020/11/13 20:18:37:5944] N: LWS: 4.1.99-v4.1.0-76-g7c5e5987, loglevel 1151
[2020/11/13 20:18:37:5944] N: NET CLI SRV H1 H2 WS MQTT ASYNC_DNS IPv6-absent
[2020/11/13 20:18:37:5946] I: lws_create_context: ev lib path /home/ubuntu/ts7970dev/ts7970cross/lib
[2020/11/13 20:18:37:5947] I: Event loop: poll
[2020/11/13 20:18:37:5948] D: _realloc: size 5992: context
[2020/11/13 20:18:37:5950] D: _realloc: size 56: lws_smd_register
[2020/11/13 20:18:37:5951] D: lws_smd_register: registered
[2020/11/13 20:18:37:5951] D: _realloc: size 48: fds table
[2020/11/13 20:18:37:5952] I: ctx: 5336B (1240 ctx + pt(1 thr x 4096)), pt-fds: 6, fdmap: 48
[2020/11/13 20:18:37:5953] I: http: ah_data: 4096, ah: 952, max count 6
[2020/11/13 20:18:37:5953] D: _realloc: size 24: lws_lookup
[2020/11/13 20:18:37:5954] I: mem: platform fd map: 24 B
[2020/11/13 20:18:37:5956] D: _realloc: size 656: lws_wsi_create_with_role
[2020/11/13 20:18:37:5957] D: lws_role_transition: 0x9d4928: wsistate 0x200, ops pipe
[2020/11/13 20:18:37:5958] D: event pipe fd 4
[2020/11/13 20:18:37:5958] D: insert_wsi_socket_into_fds: 0x9d4928: tsi=0, sock=4, pos-in-fds=0
[2020/11/13 20:18:37:5960] I: rops_pt_init_destroy_netlink: creating netlink skt
[2020/11/13 20:18:37:5961] D: _realloc: size 656: lws_wsi_create_with_role
[2020/11/13 20:18:37:5961] D: lws_role_transition: 0x9d4bc0: wsistate 0x200, ops netlink
[2020/11/13 20:18:37:5963] D: insert_wsi_socket_into_fds: 0x9d4bc0: tsi=0, sock=5, pos-in-fds=1
[2020/11/13 20:18:37:5965] D: rops_pt_init_destroy_netlink: starting netlink coldplug wait
[2020/11/13 20:18:37:5966] I: Compiled with MbedTLS support
[2020/11/13 20:18:37:5966] I: canonical_hostname = ts-imx6
[2020/11/13 20:18:37:5967] D: _realloc: size 616: lws_create_vhost
[2020/11/13 20:18:37:5968] D: _realloc: size 56: vhost-specific plugin table
[2020/11/13 20:18:37:5969] D: _realloc: size 24: same vh list
[2020/11/13 20:18:37:5969] I: Creating Vhost 'system' port 0, 2 protocols, IPv6 off
[2020/11/13 20:18:37:5972] D: _realloc: size 656: listen wsi
[2020/11/13 20:18:37:5972] D: _lws_vhost_init_server: lws_socket_bind says 57655
[2020/11/13 20:18:37:5973] D: lws_role_transition: 0x9d50c8: wsistate 0x200, ops listen
[2020/11/13 20:18:37:5974] D: lws_vhost_bind_wsi: vh system: wsi listen/lws-async-dns, count_bound_wsi 1
[2020/11/13 20:18:37:5974] D: insert_wsi_socket_into_fds: 0x9d50c8: tsi=0, sock=6, pos-in-fds=2
[2020/11/13 20:18:37:5975] I: Listening on port 57655
[2020/11/13 20:18:37:5978] I: lws_create_adopt_udp: 192.168.31.1:53
[2020/11/13 20:18:37:5979] D: lws_get_idlest_tsi: 3 5
[2020/11/13 20:18:37:5979] D: _realloc: size 656: new server wsi
[2020/11/13 20:18:37:5980] D: new wsi 0x9d5360 joining vhost system, tsi 0
[2020/11/13 20:18:37:5981] D: lws_vhost_bind_wsi: vh system: wsi none/none, count_bound_wsi 2
[2020/11/13 20:18:37:5981] D: lwsi_set_state(0x9d5360, 0x20000200)
[2020/11/13 20:18:37:5982] D: lws_ensure_user_space: 0x9d5360 protocol pss 0, user_space=(nil)
[2020/11/13 20:18:37:5982] N: lws_role_call_adoption_bind: incoming type 0x12
[2020/11/13 20:18:37:5983] D: _realloc: size 32: udp struct
[2020/11/13 20:18:37:5983] D: lws_role_transition: 0x9d5360: wsistate 0x119, ops raw-skt
[2020/11/13 20:18:37:5984] D: lws_ensure_user_space: 0x9d5360 protocol pss 0, user_space=(nil)
[2020/11/13 20:18:37:5985] D: _realloc: size 125: adns-numip
[2020/11/13 20:18:37:5986] D: lws_async_dns_complete: q: 0x7efbe628, c: 0x9d55f8, refcount 0 -> 1
[2020/11/13 20:18:37:9306] I: lws_sort_dns: sort_dns: 0x9d5638
[2020/11/13 20:18:37:9307] D: _realloc: size 72: lws_sort_dns
[2020/11/13 20:18:37:9308] I: lws_sort_dns: unsorted entry (af 2) 192.168.31.1
[2020/11/13 20:18:37:9308] N: lws_sort_dns_dump: 1: (2)192.168.31.1, gw (0), idi: 0, lbl: 0, prec: 0
[2020/11/13 20:18:37:9309] D: lws_async_dns_freeaddrinfo: c 0x9d55f8, 192.168.31.1, refcount 1 -> 0
[2020/11/13 20:18:37:9311] D: insert_wsi_socket_into_fds: 0x9d5360: tsi=0, sock=7, pos-in-fds=3
[2020/11/13 20:18:37:9311] N: lws_role_call_adoption_bind: incoming type 0x1000012
[2020/11/13 20:18:37:9312] D: lws_role_transition: 0x9d5360: wsistate 0x119, ops raw-skt
[2020/11/13 20:18:37:9313] D: lws_ensure_user_space: 0x9d5360 protocol pss 0, user_space=(nil)
[2020/11/13 20:18:37:9314] D: _realloc: size 616: lws_create_vhost
[2020/11/13 20:18:37:9314] D: _realloc: size 56: vhost-specific plugin table
[2020/11/13 20:18:37:9315] D: _realloc: size 12: same vh list
[2020/11/13 20:18:37:9315] I: Creating Vhost 'default' (serving disabled), 1 protocols, IPv6 off
[2020/11/13 20:18:37:9318] D: _realloc: size 1188: alloc_file
[2020/11/13 20:18:37:9340] N: Loading client CA for verification ./rootCA.crt
[2020/11/13 20:18:37:9341] N: lws_tls_client_create_vhost_context: doing cert filepath ./AWSCert.crt
[2020/11/13 20:18:37:9343] D: _realloc: size 1220: alloc_file
[2020/11/13 20:18:37:9356] N: Loaded client cert ./AWSCert.crt
[2020/11/13 20:18:37:9357] I: created client ssl context for default
[2020/11/13 20:18:37:9357] I: LWS_MAX_EXTENSIONS_ACTIVE: 1
[2020/11/13 20:18:37:9358] I: mem: per-conn: 656 bytes + protocol rx buf
[2020/11/13 20:18:37:9358] I: lws_plat_drop_app_privileges: not changing group
[2020/11/13 20:18:37:9359] I: lws_plat_drop_app_privileges: not changing user
[2020/11/13 20:18:37:9359] D: lws_cancel_service
[2020/11/13 20:18:37:9360] D: _lws_state_transition: system: changed 1 'CONTEXT_CREATED' -> 2 'INITIALIZED'
[2020/11/13 20:18:37:9361] N: lws_state_notify_protocol_init: waiting for netlink coldplug
[2020/11/13 20:18:37:9361] I: _report: system: prot_init: rejected 'INITIALIZED' -> 'IFACE_COLDPLUG'
[2020/11/13 20:18:37:9362] I: lws_state_transition_steps: CONTEXT_CREATED -> INITIALIZED
[2020/11/13 20:18:37:9364] D: _realloc: size 72: rops_handle_POLLIN_netlink
[2020/11/13 20:18:37:9365] D: _realloc: size 47: lws_smd_msg_alloc
[2020/11/13 20:18:37:9366] D: lws_cancel_service
[2020/11/13 20:18:37:9366] D: _realloc: size 72: rops_handle_POLLIN_netlink
[2020/11/13 20:18:37:9367] D: _realloc: size 47: lws_smd_msg_alloc
[2020/11/13 20:18:37:9368] D: lws_cancel_service
[2020/11/13 20:18:37:9368] D: _realloc: size 72: rops_handle_POLLIN_netlink
[2020/11/13 20:18:37:9369] D: _realloc: size 47: lws_smd_msg_alloc
[2020/11/13 20:18:37:9369] D: lws_cancel_service
[2020/11/13 20:18:37:9370] D: _realloc: size 72: rops_handle_POLLIN_netlink
[2020/11/13 20:18:37:9370] D: _realloc: size 47: lws_smd_msg_alloc
[2020/11/13 20:18:37:9371] D: lws_cancel_service
[2020/11/13 20:18:37:9372] D: _realloc: size 72: rops_handle_POLLIN_netlink
[2020/11/13 20:18:37:9372] D: _realloc: size 47: lws_smd_msg_alloc
[2020/11/13 20:18:37:9373] D: lws_cancel_service
[2020/11/13 20:18:37:9373] D: _realloc: size 76: lws_smd_msg_alloc
[2020/11/13 20:18:37:9374] D: lws_cancel_service
[2020/11/13 20:18:37:9375] I: (0)/0, gw: (2)192.168.31.1, ifidx: 7, pri: 1024, proto: 3
[2020/11/13 20:18:38:2696] I: (2)192.168.31.0/24, gw: (0), ifidx: 7, pri: -1, proto: 2
[2020/11/13 20:18:38:2697] I: (2)192.168.31.1/32, gw: (0), ifidx: 7, pri: 1024, proto: 3
[2020/11/13 20:18:38:2698] I: (2)127.0.0.0/8, gw: (0), ifidx: 1, pri: -1, proto: 2
[2020/11/13 20:18:38:2698] I: (2)127.0.0.1/32, gw: (0), ifidx: 1, pri: -1, proto: 2
[2020/11/13 20:18:38:2699] D: _lws_state_transition: system: changed 2 'INITIALIZED' -> 3 'IFACE_COLDPLUG'
[2020/11/13 20:18:38:2700] D: _lws_state_transition: system: changed 3 'IFACE_COLDPLUG' -> 4 'DHCP'
[2020/11/13 20:18:38:2700] D: _lws_state_transition: system: changed 4 'DHCP' -> 5 'CPD_PRE_TIME'
[2020/11/13 20:18:38:2701] D: _lws_state_transition: system: changed 5 'CPD_PRE_TIME' -> 6 'TIME_VALID'
[2020/11/13 20:18:38:2701] D: _lws_state_transition: system: changed 6 'TIME_VALID' -> 7 'CPD_POST_TIME'
[2020/11/13 20:18:38:2702] I: lws_state_notify_protocol_init: doing protocol init on POLICY_VALID
[2020/11/13 20:18:38:2702] I: lws_protocol_init
[2020/11/13 20:18:38:2703] D: _lws_state_transition: system: changed 7 'CPD_POST_TIME' -> 8 'POLICY_VALID'
[2020/11/13 20:18:38:2703] D: _lws_state_transition: system: changed 8 'POLICY_VALID' -> 9 'REGISTERED'
[2020/11/13 20:18:38:2704] D: _lws_state_transition: system: changed 9 'REGISTERED' -> 10 'AUTH1'
[2020/11/13 20:18:38:2705] D: _lws_state_transition: system: changed 10 'AUTH1' -> 11 'AUTH2'
[2020/11/13 20:18:38:2705] D: _lws_state_transition: system: changed 11 'AUTH2' -> 12 'OPERATIONAL'
[2020/11/13 20:18:38:2706] N: Debug Message: 1
[2020/11/13 20:18:38:2706] D: _realloc: size 656: client wsi
[2020/11/13 20:18:38:2707] D: lws_vhost_bind_wsi: vh default: wsi none/none, count_bound_wsi 1
[2020/11/13 20:18:38:2707] D: rops_client_bind_mqtt: i = 0x7efbe81c
[2020/11/13 20:18:38:2708] D: _realloc: size 208: client mqtt struct
[2020/11/13 20:18:38:2708] D: _realloc: size 27: lws_mqtt_str_create
[2020/11/13 20:18:38:2709] N: lws_mqtt_generate_id: User space provided a client ID 'lwsMqttClient'
[2020/11/13 20:18:38:2709] I: lws_create_client_mqtt_object: using client id 'lwsMqttClient'
[2020/11/13 20:18:38:2710] D: _realloc: size 66: lws_mqtt_str_create
[2020/11/13 20:18:38:2710] D: _realloc: size 21: lws_mqtt_str_create
[2020/11/13 20:18:38:2711] D: lws_role_transition: 0x9d6128: wsistate 0x10000200, ops mqtt
[2020/11/13 20:18:38:2711] I: lws_client_connect_via_info: role binding to mqtt
[2020/11/13 20:18:38:2712] I: lws_client_connect_via_info: vh default protocol binding to mqtt
[2020/11/13 20:18:38:2712] D: _realloc: size 12: user space
[2020/11/13 20:18:38:2713] I: lws_client_connect_via_info: wsi 0x9d6128: mqtt mqtt entry
[2020/11/13 20:18:38:2714] D: _realloc: size 143: client stash
[2020/11/13 20:18:38:2714] D: rops_client_bind_mqtt: i = (nil)
[2020/11/13 20:18:38:2715] D: lws_http_client_connect_via_info2: 0x9d6128 (stash 0x9d63c0)
[2020/11/13 20:18:38:2715] D: lws_client_connect_2_dnsreq: new conn on no pipeline flag
[2020/11/13 20:18:38:2716] D: _realloc: size 46: strdup
[2020/11/13 20:18:38:2716] I: lws_client_connect_2_dnsreq: adding active conn 0x9d6128
[2020/11/13 20:18:38:2717] D: lwsi_set_state(0x9d6128, 0x10000201)
[2020/11/13 20:18:38:2717] I: lws_client_connect_2_dnsreq: 0x9d6128: lookup af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com:8883
[2020/11/13 20:18:38:6030] D: lwsi_set_state(0x9d6128, 0x10000201)
[2020/11/13 20:18:38:6031] D: _realloc: size 238: lws_async_dns_query
[2020/11/13 20:18:38:6034] D: _lws_change_pollfd: wsi 0x9d5360: fd 7 events 1 -> 5
[2020/11/13 20:18:38:6035] I: lws_async_dns_query: created new query
[2020/11/13 20:18:38:6036] D: lws_client_connect_via_info: wsi 0x9d6128: adoption cb 200 to mqtt mqtt
[2020/11/13 20:18:38:6036] I: lws_state_transition_steps: INITIALIZED -> OPERATIONAL
Error in `./lws-minimal-mqtt-client': free(): invalid next size (fast): 0x009d68b8
Aborted
root@ts-imx6:~/Executables#

assistuelectronics commented 3 years ago

Thanks for this input again. I hope you have seen the log I captured with the debug statements. I can't make out the issue from it. Can you? Can you advise what can be done next to solve it?

lws-team commented 3 years ago

Hm, it seems to have to do with the async DNS lookup. But I tried resolving your server af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com here, also using async DNS on Linux, and it resolves fine:

[2020/11/13 20:27:25:4152] I: lws_async_dns_store: 0: af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com: 3.12.55.214
[2020/11/13 20:27:25:4152] I: lws_async_dns_store: 1: af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com: 3.12.236.22
[2020/11/13 20:27:25:4152] I: lws_async_dns_store: 2: af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com: 3.12.172.14

So it seems to be a platform-specific issue. Cross-building or natively building valgrind or gdb will get the backtrace needed to understand where it blows up.

assistuelectronics commented 3 years ago

Is there any alternative quick workaround? The Linux machine I am using runs in EC2, so it is a remote machine, and I am not sure what problems I will encounter. But I will post if I am able to set up gdb and run it.

lws-team commented 3 years ago

Isn't the problem happening on an NXP platform? I mean debug it there, with valgrind or gdb.

assistuelectronics commented 3 years ago

Meanwhile, I used the command below to build the package. Let me know if I missed something.

sudo cmake .. \
  -DCMAKE_TOOLCHAIN_FILE=/home/ubuntu/ts7970dev/ts7970toolchainfile \
  -DCMAKE_INSTALL_PREFIX:PATH=/home/ubuntu/ts7970dev/ts7970cross \
  -DLWS_WITHOUT_EXTENSIONS=0 \
  -DLWS_WITH_MINIMAL_EXAMPLES=1 \
  -DLWS_WITH_LWSWS=1 \
  -DLWS_WITH_MBEDTLS=1 \
  -DLWS_MBEDTLS_LIBRARIES="/home/ubuntu/ts7970dev/ts7970cross/lib/libmbedcrypto.so;/home/ubuntu/ts7970dev/ts7970cross/lib/libmbedtls.so;/home/ubuntu/ts7970dev/ts7970cross/lib/libmbedx509.so" \
  -DLWS_MBEDTLS_INCLUDE_DIRS=/home/ubuntu/ts7970dev/ts7970cross/include \
  -DLWS_LIBUV_LIBRARIES=/home/ubuntu/ts7970dev/ts7970cross/lib/libuv.so \
  -DLWS_LIBUV_INCLUDE_DIRS=/home/ubuntu/ts7970dev/ts7970cross/include \
  -DLWS_ZLIB_LIBRARIES=/home/ubuntu/ts7970dev/ts7970cross/lib/libz.so \
  -DLWS_ZLIB_INCLUDE_DIRS=/home/ubuntu/ts7970dev/ts7970cross/include \
  -DLWS_WITH_LEJP=1 \
  -DLWS_WITH_STRUCT_JSON=1 \
  -DLWS_WITH_MINIMAL_SQLITE3=1 \
  -DLWS_WITH_SQLITE3=1 \
  -DLWS_SQLITE3_LIBRARIES="/home/ubuntu/ts7970dev/ts7970cross/lib/libsqlite3.so" \
  -DLWS_SQLITE3_INCLUDE_DIRS=/home/ubuntu/ts7970dev/ts7970cross/include \
  -DLWS_ROLE_MQTT=1 \
  -DCMAKE_BUILD_TYPE=DEBUG

assistuelectronics commented 3 years ago

Maybe a weird question: if I am able to ping the Amazon MQTT endpoint from the imx6 platform, then it is resolving, right? Or is the issue somewhere else?

root@ts-imx6:~# ping af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com
PING af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com (3.135.118.160) 56(84) bytes of data.
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=1 ttl=223 time=244 ms
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=2 ttl=223 time=260 ms
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=3 ttl=223 time=252 ms
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=4 ttl=223 time=251 ms
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=5 ttl=223 time=251 ms
64 bytes from ec2-3-135-118-160.us-east-2.compute.amazonaws.com (3.135.118.160): icmp_seq=6 ttl=223 time=251 ms
^C
--- af5123kb4e6f6-ats.iot.us-east-2.amazonaws.com ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5005ms
rtt min/avg/max/mdev = 244.657/251.967/260.253/4.568 ms
root@ts-imx6:~#
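
A successful ping most likely shows only that the system resolver works, since ping normally resolves names through the C library. The failing run in this thread goes through lws's own async DNS (the log shows lws_async_dns_query), which is a separate code path. The sketch below exercises the system path from C so the two can be compared on the target; resolve_ipv4() is a hypothetical helper name and the choice of IPv4 only is an assumption based on the "IPv6 off" lines in the logs.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Resolve `host` through the system resolver and print each IPv4 address.
 * Returns the number of addresses found, or -1 if resolution failed.
 */
static int
resolve_ipv4(const char *host)
{
	struct addrinfo hints, *res, *p;
	char ip[64];
	int count = 0;

	assert(host != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET;       /* assumption: IPv4 only */
	hints.ai_socktype = SOCK_STREAM; /* MQTT over TLS runs on TCP */

	if (getaddrinfo(host, NULL, &hints, &res))
		return -1;

	for (p = res; p; p = p->ai_next) {
		/* convert the binary address back to dotted-quad text */
		if (getnameinfo(p->ai_addr, p->ai_addrlen, ip, sizeof(ip),
				NULL, 0, NI_NUMERICHOST))
			continue;
		printf("%s -> %s\n", host, ip);
		count++;
	}
	freeaddrinfo(res);

	return count;
}
```

If resolve_ipv4() succeeds on the imx6 for the endpoint name while the lws client still aborts right after "lws_async_dns_query: created new query", that would be consistent with the problem sitting in the async DNS path rather than in name resolution as such. It would not replace the backtrace.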

lws-team commented 3 years ago

Can I expect some debug information about this double free? If not I will close this.