awslabs / amazon-kinesis-video-streams-pic

Apache License 2.0

[BUG] Deadlock in resetStream #103

Closed Nomidia closed 3 years ago

Nomidia commented 3 years ago

Describe the bug

When we use pic and c-producer, stopKinesisVideoStreamSync is called every time an upload completes. After a while, kinesisVideoStreamResetStream is called before streaming starts again. If the token has expired by then, a deadlock occurs across threads, and the blocked threads are never destroyed.

  1. resetStream locks the stream and calls stepStateMachine to reset the state machine.

https://github.com/awslabs/amazon-kinesis-video-streams-pic/blob/c020b496abf6a34d857b6daa22119e9c19adc593/src/client/src/Stream.c#L3223

https://github.com/awslabs/amazon-kinesis-video-streams-pic/blob/c020b496abf6a34d857b6daa22119e9c19adc593/src/client/src/Stream.c#L3323

  1. If the token has expired, the client state changes from CLIENT_STATE_READY to CLIENT_STATE_NEW, and getAuthInfo is called. https://github.com/awslabs/amazon-kinesis-video-streams-pic/blob/c020b496abf6a34d857b6daa22119e9c19adc593/src/client/src/ClientState.c#L159

  2. A timeout may occur and 0x15000011 will be returned.

    17:34:20 2020-12-30 09:34:19 ERROR   blockingCurlCall(): Curl perform failed for url https://c3qp4tl980s52m.credentials.iot.ap-south-1.amazonaws.com/role-aliases/dev-kvs-access-role-alias/credentials with result Timeout was reached : Operation timed out after 3004 milliseconds with 0 out of 0 bytes received 
  3. stepStateMachine returns 0x15000011 to resetStream, which then returns without releasing the lock.

At this point the stream state stays at STREAM_STATE_DESCRIBE instead of moving to STREAM_STATE_PUT_STREAM, so frames are dropped once the buffer is full.
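The lock leak described in the steps above can be sketched in isolation. This is a minimal illustration, not the pic source: `DemoStream`, `demoStep`, `demoResetLeaky`, `demoResetSafe`, `demoLockIsFree`, and the `DEMO_STATUS_*` names are all hypothetical, and a plain pthread mutex stands in for the stream lock. Only the value 0x15000011 comes from the report.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DEMO_STATUS;
#define DEMO_STATUS_SUCCESS     0x00000000u
#define DEMO_STATUS_CALL_FAILED 0x15000011u /* value from the report; the name is made up */

/* Hypothetical stand-in for the stream object: only the lock matters here. */
typedef struct {
    pthread_mutex_t lock;
} DemoStream;

/* Stand-in for stepStateMachine: fails on demand to mimic the curl timeout. */
static DEMO_STATUS demoStep(bool simulateTimeout)
{
    return simulateTimeout ? DEMO_STATUS_CALL_FAILED : DEMO_STATUS_SUCCESS;
}

/* Leaky pattern: the error path returns while the lock is still held. */
DEMO_STATUS demoResetLeaky(DemoStream *stream, bool simulateTimeout)
{
    DEMO_STATUS status;

    pthread_mutex_lock(&stream->lock);
    status = demoStep(simulateTimeout);
    if (status != DEMO_STATUS_SUCCESS) {
        return status; /* lock is never released on this path */
    }
    pthread_mutex_unlock(&stream->lock);
    return status;
}

/* Safer pattern: every exit goes through one cleanup label that unlocks. */
DEMO_STATUS demoResetSafe(DemoStream *stream, bool simulateTimeout)
{
    DEMO_STATUS status;
    bool locked = false;

    pthread_mutex_lock(&stream->lock);
    locked = true;

    status = demoStep(simulateTimeout);
    if (status != DEMO_STATUS_SUCCESS) {
        goto cleanup;
    }
    /* ... further reset work would go here ... */

cleanup:
    if (locked) {
        pthread_mutex_unlock(&stream->lock);
    }
    return status;
}

/* Probe: returns true if the lock can be taken right now, false if it is held. */
bool demoLockIsFree(DemoStream *stream)
{
    if (pthread_mutex_trylock(&stream->lock) == 0) {
        pthread_mutex_unlock(&stream->lock);
        return true;
    }
    return false;
}
```

After `demoResetLeaky(&stream, true)` the probe reports the lock as held, which is the state a later callback thread would block on. After `demoResetSafe(&stream, true)` the same error code is returned but the lock is free again.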

  1. The next time resetStream is called, getAuthInfo may succeed, and the stream state then needs to change to STREAM_STATE_DESCRIBE. A thread is created for the describe call. After it gets the response, describeStreamResult is called and tries to lock the stream. This is where the deadlock occurs, because the lock from the failed reset was never released. If resetStream is called again and again, more threads are created and block at the same point. The stream state cycles between STREAM_STATE_DESCRIBE and STREAM_STATE_STOPPED, and frames are never uploaded to KVS successfully.

https://github.com/awslabs/amazon-kinesis-video-streams-producer-c/blob/ee4eb03429f03737d9245e54f4d2fb13b08eaf80/src/source/CurlApiCallbacks.c#L1244

https://github.com/awslabs/amazon-kinesis-video-streams-pic/blob/c020b496abf6a34d857b6daa22119e9c19adc593/src/client/src/StreamEvent.c#L198


MushMal commented 3 years ago

Oh wow! This is a bug indeed. Thanks for providing a fix. Will resolve this issue as the fix is merged.