Closed staowyze closed 3 years ago
This is a really odd one. I haven't seen any crash stacks like this. Perhaps there is a memory corruption already in place somewhere as the stack points to a fairly innocuous lines of code in libWebSockets.
Any debug logs? What's the easiest way to go with these crashes? Can we update some of the devices to the latest to eliminate a few of the issues that were fixed post v1.3 release?
I could definitely try if a new release is out
FYI after porting the #861 fix, now this core dump happens most
Will look into the ICE refresh shortly. Meanwhile, if you have any further info/data please let us know. I am assuming you are using the latest master.
not yet, still with v1.30, but i have ported some critical fixes, e.g. the ICE refresh one you mentioned. I just lowered the logging level, hope to get more meaningful log soon.
here is an similar core dump
#0 0x7740ddb8 in __lws_close_free_wsi (wsi=0x8e5598, reason=LWS_CLOSE_STATUS_NOSTATUS, caller=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/close.c:342
#1 0x7742204c in lws_client_connect_2 (wsi=0x8e5598)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/client/client-handshake.c:768
#2 0x77417cfc in lws_header_table_attach (wsi=0x8e5598, autoservice=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/server/parsers.c:322
#3 0x77419920 in rops_client_bind_h1 (wsi=0x8e5598, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:965
#4 rops_client_bind_h1 (wsi=0x8e5598, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:927
#5 0x7741fd80 in lws_client_connect_via_info (i=0x736c8e30)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/connect.c:257
#6 0x7744ee34 in lwsCompleteSync () from ./libkvsWebrtcSignalingClient.so
#7 0x77451018 in getIceConfigLws () from ./libkvsWebrtcSignalingClient.so
#8 0x77459090 in getIceConfig () from ./libkvsWebrtcSignalingClient.so
#9 0x7745b378 in executeGetIceConfigSignalingState () from ./libkvsWebrtcSignalingClient.so
#10 0x774d7e14 in stepStateMachine () from ./libkvsWebrtcClient.so
#11 0x77459e8c in stepSignalingStateMachine () from ./libkvsWebrtcSignalingClient.so
#12 0x774572d0 in refreshIceConfigurationCallback () from ./libkvsWebrtcSignalingClient.so
#13 0x774d6094 in timerQueueExecutor () from ./libkvsWebrtcClient.so
and the corresponding info log
[10-16 21:54:20][2020-09-16 13:54:20] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:54:22][2020-09-16 13:54:22] [KVS]refreshIceConfigurationCallback(): Refreshing the ICE Server Configuration
[10-16 21:54:22][2020-09-16 13:54:22] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000100, Next state: 0x0000000000000020
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Client append handshake header
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Sending the body {
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Connected with server response: 200
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Received client http
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Received client http read: 972 bytes
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Http client completed
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]lwsHttpCallbackRoutine(): Client http closed
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000020, Next state: 0x0000000000000040
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000040, Next state: 0x0000000000000080
[10-16 21:54:24][2020-09-16 13:54:24] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000080, Next state: 0x0000000000000100
[10-16 21:54:30][2020-09-16 13:54:30] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:54:40][2020-09-16 13:54:40] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:54:50][2020-09-16 13:54:50] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:00][2020-09-16 13:55:00] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:10][2020-09-16 13:55:10] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:20][2020-09-16 13:55:20] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:30][2020-09-16 13:55:30] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:41][2020-09-16 13:55:41] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:55:51][2020-09-16 13:55:51] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:01][2020-09-16 13:56:01] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:11][2020-09-16 13:56:11] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:21][2020-09-16 13:56:21] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:31][2020-09-16 13:56:31] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:42][2020-09-16 13:56:41] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:56:52][2020-09-16 13:56:52] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:02][2020-09-16 13:57:02] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:12][2020-09-16 13:57:12] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:22][2020-09-16 13:57:22] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:32][2020-09-16 13:57:32] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:43][2020-09-16 13:57:43] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:57:53][2020-09-16 13:57:53] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:03][2020-09-16 13:58:03] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:13][2020-09-16 13:58:13] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:23][2020-09-16 13:58:23] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:33][2020-09-16 13:58:33] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:44][2020-09-16 13:58:44] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:54][2020-09-16 13:58:54] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-16 21:58:54][2020-09-16 13:58:54] [KVS]refreshIceConfigurationCallback(): Refreshing the ICE Server Configuration
[10-16 21:58:54][2020-09-16 13:58:54] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000100, Next state: 0x0000000000000020
The last report is something different. This doesn't seem to be related to the termination sequence. I am really not sure what I am looking at - seems to be some sort of a crash (in absence of the registry dump) and the logs indicate a normal execution when refreshing ICE candidate. Is there a way we can take the entire master instead of cherry-picking commits?
Any way we can get more info on this?
i have collected a few core dumps pointing to the same back trace, regarding the log, I think it just means we don't have enough logging to reveal the problem.
Are you using a specific libWebSockets version/commit or you are using whatever is in the WebRtc SDK that we define? I believe we had ramped up a version but not since the last release of WebRtc
i am using the libwebsockets associated with v1.30.
GIT_REPOSITORY https://github.com/warmcat/libwebsockets.git
GIT_TAG v3.2.3
That's the version that we currently use in libWebSockets-CMakeLists.txt
Perhaps there is something that causes libWebSockets to complain - not sure what indeed. Let me do some thinking around this but it would be great to capture some more info on what's going on.
On thing to note is that we are attempting to refresh ICE configuration while staying connected.
That's the version that we currently use in libWebSockets-CMakeLists.txt
Perhaps there is something that causes libWebSockets to complain - not sure what indeed. Let me do some thinking around this but it would be great to capture some more info on what's going on.
- Try to get debug logs
- Try to run with debug libWebSockets by enabling -DCMAKE_BUILD_TYPE=DEBUG in the CMake file for libWebSockets.
Thank you, will do and share logs once i have them
Actually, i was using debug libwebsockets, just switch to debug log level.
Sounds good. I have a theory that I would like to give a shot. Will work in it soon
Sounds good. I have a theory that I would like to give a shot. Will work in it soon
great! looking forward to seeing it.
one more crash sample
#0 lws_plat_set_socket_options (vhost=0x0, fd=11, unix_skt=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/plat/unix/unix-sockets.c:96
#1 0x76f2031c in lws_client_connect_2 (wsi=0x661f50)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/client/client-handshake.c:612
#2 0x76f15cfc in lws_header_table_attach (wsi=0x661f50, autoservice=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/server/parsers.c:322
#3 0x76f17920 in rops_client_bind_h1 (wsi=0x661f50, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:965
#4 rops_client_bind_h1 (wsi=0x661f50, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:927
#5 0x76f1dd80 in lws_client_connect_via_info (i=0x739c6e30)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/connect.c:257
#6 0x76f4ce34 in lwsCompleteSync () from ./libkvsWebrtcSignalingClient.so
#7 0x76f4f018 in getIceConfigLws () from ./libkvsWebrtcSignalingClient.so
#8 0x76f57090 in getIceConfig () from ./libkvsWebrtcSignalingClient.so
#9 0x76f59378 in executeGetIceConfigSignalingState () from ./libkvsWebrtcSignalingClient.so
#10 0x76fd5e14 in stepStateMachine () from ./libkvsWebrtcClient.so
#11 0x76f57e8c in stepSignalingStateMachine () from ./libkvsWebrtcSignalingClient.so
#12 0x76f552d0 in refreshIceConfigurationCallback () from ./libkvsWebrtcSignalingClient.so
#13 0x76fd4094 in timerQueueExecutor () from ./libkvsWebrtcClient.so
debugging log
[10-17 16:56:05][2020-09-17 08:56:05] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-17 16:56:06][2020-09-17 08:56:06] [KVS]lwsWssCallbackRoutine(): WSS callback with reason 9
[10-17 16:56:06][2020-09-17 08:56:06] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:07][2020-09-17 08:56:07] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:08][2020-09-17 08:56:08] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:09][2020-09-17 08:56:09] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:10][2020-09-17 08:56:10] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:11][2020-09-17 08:56:11] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:12][2020-09-17 08:56:12] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:13][2020-09-17 08:56:13] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:14][2020-09-17 08:56:14] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:15][2020-09-17 08:56:15] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:16][2020-09-17 08:56:16] [KVS]lwsWssCallbackRoutine(): WSS callback with reason 10
[10-17 16:56:16][2020-09-17 08:56:16] [KVS]lwsWssCallbackRoutine(): Client is writable
[10-17 16:56:16][2020-09-17 08:56:16] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:17][2020-09-17 08:56:17] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]refreshIceConfigurationCallback(): Refreshing the ICE Server Configuration
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]stepStateMachine(): State Machine - Current state: 0x0000000000000100, Next state: 0x0000000000000020
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]signalingClientStateChanged(): Signaling client state changed to 6 - 'Get ICE Server Configuration'
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]lwsCompleteSync(): Perform secure synchronous call for URL: https://xxxx.kinesisvideo.us-west-2.amazonaws.com/v1/get-ice-server-config
[10-17 16:56:18][2020-09-17 08:56:18] [KVS]lwsHttpCallbackRoutine(): HTTPS callback with reason 85
[10-17 16:56:19][2020-09-17 08:56:19] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:20][2020-09-17 08:56:20] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:21][2020-09-17 08:56:21] [KVS]lwsWssCallbackRoutine(): WSS callback with reason 9
[10-17 16:56:21][2020-09-17 08:56:21] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:22][2020-09-17 08:56:22] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:23][2020-09-17 08:56:23] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:24][2020-09-17 08:56:24] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:25][2020-09-17 08:56:25] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:26][2020-09-17 08:56:26] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:27][2020-09-17 08:56:27] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:28][2020-09-17 08:56:28] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
[10-17 16:56:28][2020-09-17 08:56:28] [KVS]lwsHttpCallbackRoutine(): HTTPS callback with reason 30
[10-17 16:56:29][2020-09-17 08:56:29] [KVS]signalingClientGetCurrentState(): Signaling Client Get Current State
Sounds good. I have a theory that I would like to give a shot. Will work in it soon
Any clue?
Post bt full in case it may be helpful
bt full
#0 lws_plat_set_socket_options (vhost=0x0, fd=11, unix_skt=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/plat/unix/unix-sockets.c:96
optval = 1
#1 0x77a2a31c in lws_client_connect_2 (wsi=0xbaf150)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/client/client-handshake.c:612
context = <optimized out>
pt = <optimized out>
adsin = <optimized out>
plen = 0
n = <optimized out>
m = <optimized out>
port = 443
cce = 0x77a381a4 ""
iface = <optimized out>
psa = <optimized out>
meth = 0xb9b4b8 "\340\a\273"
result = 0xbaffb8
ads = <optimized out>
sa46 = {sa4 = {sin_family = 2, sin_port = 0, sin_addr = {s_addr = 3027191606}, sin_zero = "\000\000\000\000\000\000\000"}}
#2 0x77a1fcfc in lws_header_table_attach (wsi=0xbaf150, autoservice=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/http/server/parsers.c:322
context = <optimized out>
pt = <optimized out>
pa = {fd = 2007313408, events = 12254160, prev_events = 1}
n = <optimized out>
__func__ = "lws_header_table_attach"
#3 0x77a21920 in rops_client_bind_h1 (wsi=0xbaf150, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:965
No locals.
#4 rops_client_bind_h1 (wsi=0xbaf150, i=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/roles/h1/ops-h1.c:927
No locals.
#5 0x77a27d80 in lws_client_connect_via_info (i=0x74cd0e30)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core-net/connect.c:257
n = <optimized out>
wsi = 0xbaf150
safe = 0x0
p = <optimized out>
local = <optimized out>
---Type <return> to continue, or q <return> to quit---
__func__ = "lws_client_connect_via_info"
#6 0x77a56e34 in lwsCompleteSync (pCallInfo=0xbb6a70)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/LwsApiCalls.c:545
retStatus = 0
retVal = 0
pHostStart = 0xbb1200 "r-xxxxx.kinesisvideo.us-west-2.amazonaws.com"
pHostEnd = 0xbb122f ""
pVerb = 0x77a77500 "POST"
connectInfo = {context = 0xb91ce8, address = 0xbb1200 "r-xxxx.kinesisvideo.us-west-2.amazonaws.com", port = 443, ssl_connection = 1,
path = 0x74cd0ea0 "/v1/get-ice-server-config", host = 0xbb1200 "r-xxxx.kinesisvideo.us-west-2.amazonaws.com", origin = 0x0,
protocol = 0x77a77c80 "https", ietf_version_or_minus_one = 0, userdata = 0x0, client_exts = 0x0, method = 0x77a77500 "POST", parent_wsi = 0x0,
uri_replace_from = 0x0, uri_replace_to = 0x0, vhost = 0x0, pwsi = 0x74cd0e98, iface = 0x0, local_protocol_name = 0x0, alpn = 0x0, seq = 0x0,
opaque_user_data = 0xbb6a70, _unused = {0x0, 0x0, 0x0, 0x0}}
clientLws = 0x0
pContext = 0xb91ce8
secureConnection = 1
locked = 1
path = "/v1/get-ice-server-config", '\000' <repeats 9975 times>
__FUNCTION__ = "lwsCompleteSync"
#7 0x77a59018 in getIceConfigLws (pSignalingClient=0xb2a768, time=16030500265974471)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/LwsApiCalls.c:958
__status = 0
retStatus = 0
pRequestInfo = 0xbb11e8
url = "https://r-xxxx.kinesisvideo.us-west-2.amazonaws.com/v1/get-ice-server-config", '\000' <repeats 6712 times>...
paramsJson = "{\n\t\"ChannelARN\": \"arn:aws:kinesisvideo:us-west-2:xxxx:channel/xxxx/xxxx\",\n\t\"ClientId\": \"ProducerMaster\",\n\t\"Service\": \"TURN\"\n}", '\000' <repeats 470 times>...
pLwsCallInfo = 0xbb6a70
pResponseStr = 0x0
parser = {pos = 0, toknext = 0, toksuper = 0}
tokens = {{type = JSMN_UNDEFINED, start = 0, end = 0, size = 0} <repeats 27 times>, {type = JSMN_UNDEFINED, start = 1959626544, end = 2006948520,
size = 65538}, {type = 1959626680, start = 2006877920, end = 0, size = 2007049808}, {type = 2006074544, start = 1959626576, end = 2006949812,
size = 65538}, {type = 1959626680, start = -16843009, end = -2139062144, size = 9}, {type = 18, start = 11, end = 40, size = 26}, {type = 4262560,
start = 2007049808, end = 0, size = 0}, {type = 11264008, start = 1603050026, end = 26, size = 40}, {type = 11, start = 18, end = 9, size = 120}, {
type = JSMN_UNDEFINED, start = 291, end = 0, size = -28800}, {type = 2006843972, start = 842019419, end = 959458608, size = 540553517}, {
type = 876228913, start = 909261360, end = 1264263261, size = 1935496022}, {type = 1634625385, start = 1735289196, end = 1701407811, size = 1951626350}, {
type = 1130722401, start = 1735287144, end = 690512997, size = 1767055418}, {type = 1818324583, start = 543649385, end = 1701407843, size = 1931506798}, {
type = 1702125940, start = 1634231072, end = 1684367214, size = 544175136}, {type = 539828278, start = 1952794407, end = 1162037536, size = 1919243040}, {
type = 544367990, start = 1718513475, end = 1920296809, size = 1869182049}, {type = 665454, start = 0, end = 0, size = 0}, {type = JSMN_UNDEFINED,
start = 0, end = 0, size = 0} <repeats 51 times>, {type = JSMN_UNDEFINED, start = 2008019904, end = 0, size = 1603050026}, {type = 597699133, start = 0,
end = 2006074544, size = 2006006764}, {type = JSMN_UNDEFINED, start = 1959627680, end = 2007874568, size = 2008019904}, {type = JSMN_UNDEFINED,
start = 1603050026, end = 597700466, size = 0}}
---Type <return> to continue, or q <return> to quit---
pToken = 0x0
i = 0
strLen = 0
resultLen = 0
configCount = 0
tokenCount = 0
j = 0
ttl = 2006006764
jsonInIceServerList = 0
#8 0x77a61090 in getIceConfig (pSignalingClient=0xb2a768, time=16030500265974471)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/Signaling.c:1060
retStatus = 0
timerId = 4366400
#9 0x77a63378 in executeGetIceConfigSignalingState (customData=11708264, time=16030500265974471)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/StateMachine.c:559
retStatus = 0
pSignalingClient = 0xb2a768
#10 0x77adfe14 in stepStateMachine () from ./libkvsWebrtcClient.so
No symbol table info available.
#11 0x77a61e8c in stepSignalingStateMachine (pSignalingClient=0xb2a768, status=0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/StateMachine.c:98
__status = 2006006764
retStatus = 0
i = 4262560
locked = 1
currentTime = 16030500265974391
__FUNCTION__ = "stepSignalingStateMachine"
#12 0x77a5f2d0 in refreshIceConfigurationCallback (timerId=0, scheduledTime=16030500265971597, customData=11708264)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/Signaling.c:621
__status = 2006107328
retStatus = 0
pStateMachineState = 0xb30e78
pSignalingClient = 0xb2a768
iceRefreshErrMsg = "\354;\221wH\215\255w0\000\000]0\000\000\000\300\215\315t}\266Y9\300\363\257w\247\363\070\000\000\000\000\000\000\000\000\000\220\215\315t\024\252\245w8J\266\000\001\000\000\000\060\000\000]\300\215\315t0\000\000\000\000\000\000\000p\024\251w\000\000\000\000\017\000\000\000h\247\262\000\017\000\000\000\060\000\000\000Failed to reconnect with status code 0x0000000f.", '\000' <repeats 112 times>...
iceRefreshErrLen = 0
newTimerId = 2006074544
__FUNCTION__ = "refreshIceConfigurationCallback"
#13 0x77ade094 in timerQueueExecutor () from ./libkvsWebrtcClient.so
Thanks, it does shed more light. I believe I know the root cause and it seems that our understanding of the libWebSocket internal handling of the states is a little off. LibWebSockets entirely assumes a single threaded execution. This leaves us with two choices - either disconnect from the connected websocket when we do ICE refresh and reconnect it once that's done or to spin up two separate contexts. The first solution is more straight forward but at the expense of potentially losing a message or having latency when viewers are connecting. The dual context solution is bad because it doesn't scale down to small footprint devices. So we decided to use the same context to drive synchronous in-flight requests.
Unfortunately, if my theory is correct, it will require some certain reworking of the threading model to handle parallel execution when we still have a connection up and trying to do refresh. This will take a little longer and I don't have yet any ETAs. I will start looking at it tomorrow.
@staowyze it's hard for us to reproduce the issue you noticed on your devices. Apparently, none of the platforms we run or have seen encountered this issue.
I pushed a new PR: https://github.com/awslabs/amazon-kinesis-video-streams-webrtc-sdk-c/pull/878
This should serialize the connect API as well. Could you give this a try on your device/platform and see if this fixes the crash. The crash stack, while in the guts of the LWS, points to this potential issue that I can think of.
sure, #878 didn't pass all the tests, still good to patch?
it passed my tests though, and I just deployed it, let's wait for a day to see if the same crash happens or not.
I will be looking at the tests. There might be something in the thread sanitizer. Please let us know how this looks on your devices. Also, please take a look at the changes and add your comments.
Got something different with #878
bt full
#0 0x76ed0c8c in free () from /opt/nfs/lib/libuClibc-0.9.33.2.so
No symbol table info available.
#1 0x76f5ff50 in _realloc (ptr=<optimized out>, size=<optimized out>, reason=<optimized out>)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core/alloc.c:120
No locals.
#2 0x76f605c8 in lws_context_destroy3 (context=0x6d4aa0)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/open-source/libwebsockets/build/src/project_libwebsockets/lib/core/context.c:636
pcontext_finalize = 0x0
n = <optimized out>
#3 0x76facb50 in freeSignaling (ppSignalingClient=0x7fb91e88)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/Signaling.c:206
retStatus = 0
pSignalingClient = 0x6ca7e0
__FUNCTION__ = "freeSignaling"
#4 0x76fa19ac in freeSignalingClient (pSignalingHandle=0x6c4bc8)
at /home/admin/workspace/amazon-kinesis-video-streams-webrtc-sdk-c-mirror/src/source/Signaling/Client.c:45
__status = 3732584
retStatus = 0
pSignalingClient = 0x6ca7e0
__FUNCTION__ = "freeSignalingClient"
#5 0x0040747c in ?? ()
Aha, I know what's going on. Will provide a fix. I didn't change the lock here. Fix in 30 mins
alright, now most of the problems come from context destroy, i am going to update the patch accordingly.
crash in freeSignaling is still happening, I realized you have a new commit which I didn't update
Resolving race condition with peer connection tear-down.
would it cause the crash?
It would take us a day to verify, due to the crash happens not very often and some core dump uploading latency, is the fix ready for landing? if so, we could do one final verification.
We are about to land the changes in flight
i just see a commit in the PR, could you please notify me when the PR is ready? We want to address the bug asap.
We want to address the issue ASAP too. This is the reason we are expediting these changes. Again, please note that this behavior is only seen on your devices and we were not able to reproduce it as the issue is with a related open source component which is the dependency for the sdk.
At this stage, we are assuming that the lws APIs are properly serialized and that was the root cause of the crash.
Forgot to mention, as soon as we get a clean run in ci we will push the change and you can then try to upgrade to the latest. Make sure you remove the KVS common library so it can be rebuilt
Good to know, FYI CI just failed.
Hmm.. Still building: https://github.com/awslabs/amazon-kinesis-video-streams-webrtc-sdk-c/pull/878/checks?check_run_id=1284214993
I can assure you that we are monitoring this
@staowyze the change is in the master
no more crash today, thank you
Hi experts,
Got a core dump as follows, v1.30 tag, any comments are appreciated!