awslabs / amazon-kinesis-video-streams-webrtc-sdk-c

Amazon Kinesis Video Streams WebRTC SDK is for developers to install and customize real-time communication between devices and to enable secure streaming of video and audio to Kinesis Video Streams.
https://awslabs.github.io/amazon-kinesis-video-streams-webrtc-sdk-c/group__PublicMemberFunctions.html
Apache License 2.0
1.04k stars, 313 forks

[BUG] Reconnecting only possible for limited number #1523

Closed mahush closed 2 years ago

mahush commented 2 years ago

Describe the bug

After a few iterations of connecting to and disconnecting from a video stream provided by amazon-kinesis-video-streams-webrtc-sdk-c, connecting either takes a very long time (e.g. 15 seconds) or does not succeed at all. We expect to be able to reconnect as often as necessary.

The setup

We first came across this when using our own app, which streams video data and is built on top of amazon-kinesis-video-streams-webrtc-sdk-c and GStreamer. Step by step we simplified our setup and ended up with a very basic one:

We are using version 1.7.3 of amazon-kinesis-video-streams-webrtc-sdk-c.

Steps to reproduce

  1. Set up a signaling channel (here "our_signaling_channel" on https://eu-central-1.console.aws.amazon.com/kinesisvideo/home?region=eu-central-1#/signalingChannels/create)
  2. Start kvsWebrtcClientMasterGstSample (./amazon-kinesis-video-streams-webrtc-sdk-c/build/samples/kvsWebrtcClientMasterGstSample our_signaling_channel video-only testsrc)
  3. Open AWS web player (https://eu-central-1.console.aws.amazon.com/kinesisvideo/home?region=eu-central-1#/signalingChannels/signalingChannelName/our_signaling_channel)
  4. Press the play button
  5. Wait until video starts + 2 seconds
  6. Press the stop button
  7. Wait two seconds
  8. Proceed with step 4

After 6 to 15 iterations, connecting no longer succeeds and the website freezes. Usually the last connection that succeeds takes a long time (> 10 seconds).

Things that seem to have nothing to do with this issue

Maybe related to the following issues already reported here

Any help on this is highly appreciated :-)

disa6302 commented 2 years ago

@mahush ,

Do you see the CleanUp succeeding? What happens on the master side? Do you have logs?

mahush commented 2 years ago

Thanks @disa6302 for your reply!

Do you see the CleanUp succeeding?

The master does not terminate on its own, so no cleanup takes place.

What happens on the master side? Do you have logs?

I reproduced the issue once again and recorded a log with LOG_LEVEL_DEBUG. To provide some orientation in the log file, I added markers (starting with #) each time I did something by hand on the client side or if something else of interest happened. In the following I will explain the sequence that happened using the markers:

# Will start playback (1) # Will stop playback (1) --> Iteration 1 succeeds

# Will start playback (2) # Will stop playback (2) --> Iteration 2 succeeds

# Will start playback (3) # Will stop playback (3) --> Iteration 3 succeeds

# Will start playback (4) # Will stop playback (4) --> Iteration 4 succeeds

# Will start playback (5) # Will stop playback (5) --> Iteration 5 succeeds

# Will start playback (6) # Will stop playback (6) --> Iteration 6 succeeds

# Will start playback (7) # Still connecting (frozen) # Still connecting (frozen) # Will stop playback (7) --> Iteration 7 finally succeeds, but in the meantime the web player showed "connecting…" for a long time and the website itself was frozen

# Will start playback (8) # Still connecting (frozen) --> Iteration 8 did not succeed. Even after waiting some time, the web player still showed "connecting…" and the website was still frozen

# Will close website and open again --> Then I closed the website and opened it again without touching the master

# Will start playback (9) # Will stop playback (9) --> Iteration 9 succeeds again

# Will terminate --> Then I stopped the master by hand

In summary, the behavior is the same as described above.

But there is one new insight: the issue can be resolved by just reopening the website/web player. This is very interesting, as it suggests that the web player is misbehaving. Hopefully the logs will tell you whether the web player misbehaves on its own or because the master makes it stumble.

I am very much looking forward to hearing from you. Thank you in advance!

bug_reconnecting.log
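For anyone who wants to navigate a marked-up log like this one, a rough Python sketch follows. It splits the log at the `#` markers and counts how often a search phrase appears after each marker. It assumes that every marker sits on its own line starting with `#`; the function name `count_per_marker` and the sample lines are made up for illustration and are not real SDK output.

```python
def count_per_marker(lines, needle):
    """Return (marker, hits) pairs, one per '#' marker, in log order.

    Assumption: each marker is a separate line that starts with '#'.
    Lines before the first marker are grouped under a placeholder label.
    """
    segments = [["(before first marker)", 0]]
    for line in lines:
        text = line.strip()
        if text.startswith("#"):
            # A new marker opens a new segment; repeated markers stay separate.
            segments.append([text.lstrip("#").strip(), 0])
        elif needle in text:
            segments[-1][1] += 1
    return [(marker, hits) for marker, hits in segments]


# Invented sample input, only to show the shape of the result.
sample = [
    "placeholder line a",
    "# Will start playback (1)",
    "placeholder line b",
    "# Will stop playback (1)",
    "# Will start playback (2)",
    "Ice agent detected disconnection",
    "placeholder line c",
]

for marker, hits in count_per_marker(sample, "Ice agent detected disconnection"):
    print(f"{hits:3d}  {marker}")
```

To run it against the attached file, read the lines with `open("bug_reconnecting.log").read().splitlines()` and pass them in place of `sample`.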

disa6302 commented 2 years ago

@mahush ,

This is my observation from the logs:

The part where freezing happens: I see that the connection is established, but I am not sure whether frames are being sent by the viewer.

Towards the end, I also see "Ice agent detected disconnection", which indicates that the viewer is not sending keepalives or that something happened on the network that prevented the connection from staying alive. I am unable to reproduce this issue.
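As a rough illustration of the idea behind such a disconnection message, the sketch below treats a peer as disconnected once no traffic has arrived for longer than a timeout. This is not the SDK's actual implementation; the class name `KeepaliveMonitor` and the 5-second timeout are hypothetical and chosen only for the example.

```python
class KeepaliveMonitor:
    """Toy model: a peer counts as disconnected after a period of silence."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # hypothetical value, not taken from the SDK
        self.last_seen = None       # timestamp of the most recent packet

    def on_packet(self, now):
        # Any incoming packet (for example a keepalive) refreshes the timer.
        self.last_seen = now

    def is_disconnected(self, now):
        # Before the first packet there is nothing to time out.
        if self.last_seen is None:
            return False
        return now - self.last_seen > self.timeout_s


monitor = KeepaliveMonitor(timeout_s=5.0)
monitor.on_packet(now=10.0)
print(monitor.is_disconnected(now=14.0))  # still within the timeout
print(monitor.is_disconnected(now=15.5))  # silence exceeded the timeout
```

In this model, a viewer that stops sending keepalives and a network path that drops them look the same to the master, which matches the two explanations offered above.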

github-actions[bot] commented 2 years ago

It looks like this issue has not been active for 10 days. If the issue is not resolved, please add an update to the ticket, else it will be automatically resolved in a few days.