tfabris / CrowCam

A set of Bash scripts to control and maintain a YouTube live cam from a Synology NAS.
GNU General Public License v3.0

If the stream is unstable, then it waits too long to bounce, forcing a new stream to be created. #89

Open tfabris opened 5 months ago

tfabris commented 5 months ago

I observed a behavior several times on 2024-04-01 where:

Relevant log entries:

Error 2024-04-01 13:36:53 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 13:36:53 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 13:37:09 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 13:37:09 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 13:37:26 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 13:37:26 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 13:37:42 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 13:37:42 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 13:37:42 CrowCam Controller - The lifeCycleStatus is not good. Value retrieved was: complete.
Error 2024-04-01 13:37:42 CrowCam Controller - The recordingStatus is not good. Value retrieved was: recorded.
Error 2024-04-01 13:37:42 CrowCam Controller - The lifeCycleStatus and recordingStatus indicate that it's time to create a new livestream from scratch. Creating a new livestream now.
Info  2024-04-01 13:37:43 CrowCam Controller - Creating new YouTube Live Broadcast.
Info  2024-04-01 13:37:58 CrowCam Controller - New video is live. Video ID: j-KJTEKXjmI Status: active Key: xxxxxxxxxxxxxxxxx.
Info  2024-04-01 13:38:04 CrowCam Controller - Live stream came back up.
Error 2024-04-01 14:10:48 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:10:48 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:11:04 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:11:04 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:11:20 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:11:20 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:11:37 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:11:37 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:11:37 CrowCam Controller - Bouncing the YouTube stream for 4 seconds, because the stream was unexpectedly down during the quick test in the main code after having run the Test_Network function.
Error 2024-04-01 14:17:25 CrowCam Controller - The lifeCycleStatus is not good. Value retrieved was: complete.
Error 2024-04-01 14:17:25 CrowCam Controller - The recordingStatus is not good. Value retrieved was: recorded.
Error 2024-04-01 14:17:25 CrowCam Controller - The lifeCycleStatus and recordingStatus indicate that it's time to create a new livestream from scratch. Creating a new livestream now.
Info  2024-04-01 14:17:26 CrowCam Controller - Creating new YouTube Live Broadcast.
Info  2024-04-01 14:17:44 CrowCam Controller - New video is live. Video ID: IcqrG9obnYE Status: active Key: xxxxxxxxxxxy.
Info  2024-04-01 14:17:52 CrowCam Controller - Live stream came back up.
Error 2024-04-01 14:28:09 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:28:09 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:28:09 CrowCam Controller - The lifeCycleStatus is not good. Value retrieved was: complete.
Error 2024-04-01 14:28:09 CrowCam Controller - The recordingStatus is not good. Value retrieved was: recorded.
Error 2024-04-01 14:28:09 CrowCam Controller - The lifeCycleStatus and recordingStatus indicate that it's time to create a new livestream from scratch. Creating a new livestream now.
Info  2024-04-01 14:28:11 CrowCam Controller - Creating new YouTube Live Broadcast.
Info  2024-04-01 14:28:26 CrowCam Controller - New video is live. Video ID: g1aOS4opMvw Status: active Key: xxxxxxxxxxxxxxxxx.
Info  2024-04-01 14:28:32 CrowCam Controller - Live stream came back up.
Error 2024-04-01 14:33:39 CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
Error 2024-04-01 14:33:39 CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
Error 2024-04-01 14:33:39 CrowCam Controller - The lifeCycleStatus is not good. Value retrieved was: complete.
Error 2024-04-01 14:33:39 CrowCam Controller - The recordingStatus is not good. Value retrieved was: recorded.
Error 2024-04-01 14:33:39 CrowCam Controller - The lifeCycleStatus and recordingStatus indicate that it's time to create a new livestream from scratch. Creating a new livestream now.

tfabris commented 5 months ago

With checkin 1f98a91, I have reduced the number of hysteresis loops in Test_Stream from 4 to 2 and reduced the pause between tests from 15 seconds to 7 seconds. That means that if the stream is bad, instead of waiting a whole minute to bounce it, the program now waits only about 14 seconds.
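As a rough illustration of the hysteresis described above, here is a minimal sketch of a retry loop with a configurable attempt count and pause. It is not the actual Test_Stream code: `stream_is_healthy`, `test_stream_with_hysteresis`, `worst_case_delay`, and the `FAKE_STREAM_STATE` variable are hypothetical names, and the health check is a stub standing in for the real YouTube API query.

```bash
#!/bin/bash

# Hypothetical stand-in for the real stream status query.
# Succeeds only when FAKE_STREAM_STATE is "active".
stream_is_healthy() {
  [ "${FAKE_STREAM_STATE:-inactive}" = "active" ]
}

# Check the stream up to $1 times, pausing $2 seconds between attempts.
# Returns 0 as soon as one check passes, 1 if every attempt fails.
test_stream_with_hysteresis() {
  local attempts="$1" pause="$2" i
  for (( i = 1; i <= attempts; i++ )); do
    if stream_is_healthy; then
      return 0
    fi
    # No need to pause after the final failed attempt.
    if (( i < attempts )); then
      sleep "$pause"
    fi
  done
  return 1
}

# Upper bound on the delay before a bounce, computed the same way as in
# the comment above: number of loops multiplied by the pause length.
worst_case_delay() {
  echo $(( $1 * $2 ))
}
```

With the old settings, `worst_case_delay 4 15` prints 60; with the new ones, `worst_case_delay 2 7` prints 14. Because this sketch skips the pause after the last attempt, its real delay is slightly below that bound.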

This still isn't perfect, though, because CrowCam.sh doesn't keep re-querying the stream throughout its entire run. Instead, the program currently works like this:

  1. Do all the timing-related stuff first (sunrise, sunset, hourly quick bounce, etc.).
  2. Quickly check the stream state. If it's bad, bounce it (or rebuild it if lifeCycleStatus says so).
  3. For five minutes (well, four minutes and change), keep looping and testing the network ping.
  4. Check the stream state again. If it's bad, bounce it (or rebuild it if lifeCycleStatus says so).
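The four steps above can be sketched as a single pass through the script. This is only an outline under assumptions: every function name here is a hypothetical placeholder rather than a real CrowCam.sh function, and the stubs just record the order of calls so the control flow is visible.

```bash
#!/bin/bash

# Records the order in which the steps run.
trace=()

# Hypothetical placeholders for the real work done at each step.
handle_scheduled_events() { trace+=("schedule"); }      # sunrise, sunset, hourly bounce
check_and_repair_stream() { trace+=("stream-check"); }  # bounce or rebuild if bad
ping_network_once()       { trace+=("ping"); }          # network test only

# One pass of the program: $1 = number of ping iterations,
# $2 = pause in seconds between pings.
run_once() {
  local ping_iterations="$1" ping_pause="$2" i
  handle_scheduled_events                          # step 1
  check_and_repair_stream                          # step 2
  for (( i = 0; i < ping_iterations; i++ )); do    # step 3
    ping_network_once                              # the stream is not queried here
    sleep "$ping_pause"
  done
  check_and_repair_stream                          # step 4
}
```

Calling `run_once 3 0` leaves `trace` holding one schedule entry, a stream check, three pings, and a final stream check, which makes it easy to see that nothing inspects the stream while step 3 is running.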

Checkin 1f98a91 improves the hysteresis at steps 2 and 4, but still leaves a five-minute "hole" during step 3 where the stream isn't being queried at all. A worst-case scenario would look like this:

Options to fix this:

To do in the meantime:

tfabris commented 4 months ago

The error seemed to recur on April 14th, so this needs further investigation:

2024-04-14 18:21:10   CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
2024-04-14 18:21:10   CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
2024-04-14 18:21:18   CrowCam Controller - The streamStatus is not active. Value retrieved was: inactive.
2024-04-14 18:21:18   CrowCam Controller - The healthStatus is not good. Value retrieved was: noData.
2024-04-14 18:21:18   CrowCam Controller - Bouncing the YouTube stream for 4 seconds, because the stream was unexpectedly down in the main code after the Test_Stream function.
2024-04-14 18:22:21   CrowCam Controller - The lifeCycleStatus is not good. Value retrieved was: complete.
2024-04-14 18:22:21   CrowCam Controller - The recordingStatus is not good. Value retrieved was: recorded.
2024-04-14 18:22:21   CrowCam Controller - The lifeCycleStatus and recordingStatus indicate that it's time to create a new livestream from scratch. Creating a new livestream now.
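
To compare how long the stream went between a bounce and the rebuild that followed it, the timestamps from the log excerpts in this issue can be subtracted directly. The helper below is a small sketch for that arithmetic only; `to_seconds` and `seconds_between` are hypothetical names, and it assumes both timestamps fall on the same day.

```bash
#!/bin/bash

# Convert HH:MM:SS to seconds since midnight.
# The 10# prefix forces base 10 so fields like "08" are not parsed as octal.
to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Seconds elapsed from timestamp $1 to timestamp $2 (same day assumed).
seconds_between() {
  echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

seconds_between 14:11:37 14:17:25   # 2024-04-01: bounce to rebuild
seconds_between 18:21:18 18:22:21   # 2024-04-14: bounce to rebuild
```

This prints 348 for the April 1st pair and 63 for the April 14th pair, so the gap between the bounce and the rebuild was much shorter on April 14th.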