samtap / fang-hacks

Collection of modifications for the XiaoFang WiFi Camera

RTSP server hang #178

Open zekje opened 7 years ago

zekje commented 7 years ago

Hi,

After my problem with the steady orange LED (resolved), I have run into a new one. The camera works fine and then, for no apparent reason, there is no image.

I modified the script that launches the RTSP server to keep a log, and I found this:

RTSP TEARDOWN received
RTSP GET_PARAMETER received
RTSP GET_PARAMETER received
RTSP DESCRIBE received
RTSP SETUP received
RTSP SETUP received
RTSP PLAY received
afterGettingFrame1: start playing time 1498598886:833536
afterGettingFrame1: start playing time 1498598886:892145
RTSP TEARDOWN received
RTSP DESCRIBE received
RTSP SETUP received
RTSP SETUP received
RTSP PLAY received
afterGettingFrame1: start playing time 1498598894:478020
afterGettingFrame1: start playing time audio out tv_sec: 1498598895, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598896, framecount: 9, bitrate: 56 kbps
audio out tv_sec: 1498598897, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598898, framecount: 11, bitrate: 68 kbps
create audio sink 8 8000 PCMA
StreamReplicator::deliverReceivedFrame() Internal Error 2(60,4)!
audio out tv_sec: 1498598899, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598900, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598901, framecount: 9, bitrate: 56 kbps
create audio sink 8 8000 PCMA
StreamReplicator::deliverReceivedFrame() Internal Error 2(61,4)!
audio out tv_sec: 1498598902, framecount: 11, bitrate: 68 kbps
audio out tv_sec: 1498598903, framecount: 9, bitrate: 56 kbps
audio out tv_sec: 1498598904, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598905, framecount: 11, bitrate: 68 kbps
audio out tv_sec: 1498598906, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598907, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598908, framecount: 10, bitrate: 62 kbps
audio out tv_sec: 1498598909, framecount: 9, bitrate: 56 kbps
create audio sink 8 8000 PCMA
(and again, again ... )

Do you have any idea? In the web interface everything is green, but if I stop the RTSP server, I can't restart it. The only fix is to reboot the camera, and then everything works fine... until the next hang.

zekje commented 7 years ago

Before it hung, I saw this on my Jeedom: [screenshot, 2017-07-01 19:29:12]

Once it has hung (the camera shows as disconnected in Synology, and the image stays frozen in Jeedom), I can still ping the camera, open the status page, and reboot. After a reboot it is OK for a short time (until the next hang :( ).

zekje commented 7 years ago

More logs:

When the camera works fine, I have this in my log:
RTSP DESCRIBE received
RTSP DESCRIBE received
RTSP SETUP received
RTSP SETUP received
RTSP SETUP received
create audio sink 8 8000 PCMA
RTSP SETUP received
create audio sink 8 8000 PCMA
RTSP PLAY received
RTSP PLAY received
afterGettingFrame1: start playing time 1498995732:73889
afterGettingFrame1: start playing time 1498995732:85415
audio out tv_sec: 1498995732, framecount: 10, bitrate: 62 kbps
afterGettingFrame1: start playing time 1498995732:92501
afterGettingFrame1: start playing time 1498995732:93150
sonix audio --- audio_record_thr_func:Audio record buffer overrun!
out tv_sec: 1498995733, fps: 9, bandwidth: 4377 kbps
audio out tv_sec: 1498995733, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995734, fps: 16, bandwidth: 4166 kbps
audio out tv_sec: 1498995734, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995735, fps: 16, bandwidth: 4391 kbps
audio out tv_sec: 1498995735, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995736, fps: 10, bandwidth: 4496 kbps
audio out tv_sec: 1498995736, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995737, fps: 16, bandwidth: 4161 kbps
audio out tv_sec: 1498995737, framecount: 10, bitrate: 62 kbps
RTSP TEARDOWN received

When the image stops, I see:
RTSP DESCRIBE received
RTSP DESCRIBE received
RTSP SETUP received
RTSP SETUP received
RTSP SETUP received
create audio sink 8 8000 PCMA
RTSP SETUP received
create audio sink 8 8000 PCMA
RTSP PLAY received
RTSP PLAY received
afterGettingFrame1: start playing time 1498995732:73889
afterGettingFrame1: start playing time 1498995732:85415
audio out tv_sec: 1498995732, framecount: 10, bitrate: 62 kbps
afterGettingFrame1: start playing time 1498995732:92501
afterGettingFrame1: start playing time 1498995732:93150
sonix audio --- audio_record_thr_func:Audio record buffer overrun!
out tv_sec: 1498995733, fps: 9, bandwidth: 4377 kbps
audio out tv_sec: 1498995733, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995734, fps: 16, bandwidth: 4166 kbps
audio out tv_sec: 1498995734, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995735, fps: 16, bandwidth: 4391 kbps
audio out tv_sec: 1498995735, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995736, fps: 10, bandwidth: 4496 kbps
audio out tv_sec: 1498995736, framecount: 10, bitrate: 62 kbps
out tv_sec: 1498995737, fps: 16, bandwidth: 4161 kbps
audio out tv_sec: 1498995737, framecount: 10, bitrate: 62 kbps
RTSP TEARDOWN received

Over telnet, I kill the snx_rtsp_server PID, wait 30 seconds (because the port is still in use), and can then relaunch snx_rtsp_server manually. It works again... for 5 to 10 minutes before the next hang.

When it hangs, the log from snx_rtsp_server keeps going, so the process itself has not crashed... How can I test /dev/video1?

zekje commented 7 years ago

I have made a simple watchdog that detects when the RTSP server is down and restarts it. I modified 20-rtsp-server to write a log, and changed the stop part to wait until the RTSP port is free.

20-rtsp-server.txt
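
The "wait until the RTSP port is free" step could look roughly like the sketch below. It is not the content of the attachment. The function names are made up for illustration, and it assumes a `netstat` that prints the local address in the fourth column; the port number is passed in because the thread does not confirm which port the default server uses.

```shell
#!/bin/sh
# Sketch only: wait until no TCP socket is still bound to the RTSP port.
# Assumption: netstat prints the local address in column 4 (as BusyBox and
# net-tools `netstat -tan` do); not verified on the XiaoFang itself.

# port_busy PORT: reads `netstat -tan` output on stdin and exits 0 if any TCP
# socket (LISTEN, TIME_WAIT, ...) has PORT as its local port.
port_busy() {
  awk -v port="$1" '
    $1 ~ /^tcp/ && $4 ~ (":" port "$") { busy = 1 }
    END { exit (busy ? 0 : 1) }
  '
}

# wait_port_free PORT TIMEOUT: poll once per second; return 1 if the port is
# still busy after TIMEOUT seconds, 0 as soon as it is free.
wait_port_free() {
  port="$1"; left="$2"
  while netstat -tan 2>/dev/null | port_busy "$port"; do
    [ "$left" -le 0 ] && return 1
    left=$((left - 1))
    sleep 1
  done
  return 0
}

# Hypothetical use inside stop(), after the kill:
#   wait_port_free 554 60 || echo "port still busy, starting anyway"
```

Matching every socket state, not only LISTEN, is deliberate: after a kill the port can stay unavailable because of lingering connections.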

A new 99-check-rtsp script, launched at startup, continuously checks whether the server has hung by analysing the log (if it has, it simply runs stop/start on script 20).

99-check-rtsp.txt
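
A log-based check of this kind might be sketched as follows. This is not the attached script. It assumes that the `Internal Error` line from `StreamReplicator::deliverReceivedFrame()` seen in the first log is the symptom of a hang, and the log path in the commented loop is a placeholder.

```shell
#!/bin/sh
# Sketch only: decide from the log whether the server looks hung.
# Assumption: an "Internal Error" line near the end of the log marks a hang.

# log_shows_error LOGFILE [LINES]: exit 0 if the last LINES lines
# (default 20) of LOGFILE contain "Internal Error".
log_shows_error() {
  tail -n "${2:-20}" "$1" 2>/dev/null | grep -q 'Internal Error'
}

# Hypothetical watchdog loop (log path is a placeholder):
#   while sleep 60; do
#     if log_shows_error /tmp/rtsp.log; then
#       /media/mmcblk0p2/data/etc/scripts/20-rtsp-server stop
#       /media/mmcblk0p2/data/etc/scripts/20-rtsp-server start
#     fi
#   done
```

This only works if starting the server truncates the log; otherwise the old error line would trigger another restart on the next pass.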

This is not perfect, but it works. When it hangs, I lose 1 minute of recording (the time needed to free the port and restart). I have run it for 3 days and the video stream has always been on :) But I know from the log that it hangs very often (perhaps overheating? It is 35°C here at the moment... and probably more when the sun hits the camera).

waitfor1t commented 7 years ago

I have the same issue, and I think I have narrowed it down a little... It seems to hang if I have my QNAP NAS with Surveillance Station constantly recording the stream. When I turn off the recording, the RTSP server seems stable. Perhaps because the webcam was designed more for occasional streaming?

Thanks for the scripts. I modified 99-check-rtsp to look for /var/run/rtsp-server.pid instead, and if it doesn't exist, restart the service. Will push commits when I get my server reconfigured for development. I also modified the logger to include a time stamp in /var/log/messages.

zekje commented 6 years ago

My problem is that rtsp-server is alive but hung, so the PID is still present :p That is why I need to check the logs for the word 'Error'. Perhaps a newer version of rtsp-server would help.

I think our problem is the same: I use Synology Surveillance Station to check the stream and record it to HDD...

Out of curiosity, could you post your modified 99 script?

waitfor1t commented 6 years ago

Here is the updated script using PID and with date/timestamp in the messages log.

I've also updated the status page to include a button to look at the log from the status page. These need to be modified to work with the scripts start/stop and enable/disable method (using chmod) and then committed to the source rather than simply pasted here. Example: there's no reason to run 99-check-rtsp if you're disabling rtsp; it will work and not impact the system, but it's sloppy.

99-check-rtsp.txt action.txt status.txt

Also: I have confirmed that the RTSP server is quite STABLE without Surveillance Station, so I need to understand RTSP better to see whether a client can destabilize a server in some way (e.g., any commands that Surveillance Station may be sending to increase the rate or adjust the stream?)

waitfor1t commented 6 years ago

I removed the -a option in the script and that seems to do the trick. I am recording to surveillance station without issue so far.

psmanek commented 6 years ago

I'm getting "/tmp/www/cgi-bin/scripts: line 65: /media/mmcblk0p2/data/etc/scripts/99-check-rtsp: not found" :(

waitfor1t commented 6 years ago

The script is obsolete. The problem should be resolved by changing how snx_rtsp_server is called in 20-rtsp-server. Specifically, the call line should look like this: snx_rtsp_server >$LOGFILE 2>&1 & echo "$!" > "$PIDFILE"

The key change is to remove the -a option. Not 100% certain, but I saw audio buffering errors with -a

psmanek commented 6 years ago

I have it exactly as you wrote it, but it still says "not found". I made a test script with a simple echo and it says "not found" too :(

waitfor1t commented 6 years ago

Most likely because you didn't make it executable. You need to chmod the file with +x for root so that it is seen as executable. That said, it is no longer necessary. I found out why the server was crashing.

psmanek commented 6 years ago

No, it is executable. For me, removing -a doesn't help. The RTSP server still hangs.
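
One thing that may be worth ruling out when an executable script still reports "not found": a shell can print that message when the interpreter named on the `#!` line does not exist, which is what happens if the file was saved with Windows (CRLF) line endings. Nothing in this thread confirms that this is the cause here; the sketch below, with made-up function names, only checks for it.

```shell
#!/bin/sh
# Sketch only: two quick checks for a script that reports "not found".

# has_crlf FILE: exit 0 if FILE contains carriage returns (Windows line
# endings), which turn "#!/bin/sh" into a path that does not exist.
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# check_script FILE: print what looks wrong, if anything.
check_script() {
  [ -x "$1" ] || echo "$1: not executable"
  has_crlf "$1" && echo "$1: has CRLF line endings"
  return 0
}

# Hypothetical use on the camera:
#   check_script /media/mmcblk0p2/data/etc/scripts/99-check-rtsp
```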

waitfor1t commented 6 years ago

Here is the revised 20-rtsp-server script. Note that snx_rtsp_server is called with DEFAULT options. This has not crashed on me in several days of constant recording. For the benefit of others, to use it on your camera: remove the .txt extension, put the file in /media/mmcblk0p2/data/etc/scripts, and run 'chmod 777 20-rtsp-server'. It should then be rock-solid stable, and you won't need 99-check-rtsp.

20-rtsp-server.txt

psmanek commented 6 years ago

Ok, i'll check this. Thanks.

waitfor1t commented 6 years ago

I've made 2 pull requests: 1 to update 20-rtsp-server so that it doesn't crash anymore when recording to a server, and another to add the button on the status page to view the log for troubleshooting. At some point I will add a form in www to specify options for the snx_rtsp_server.

psmanek commented 6 years ago

I don't know why you think that removing all parameters resolves the problem. For me it doesn't change anything. The camera still hangs the same way.

waitfor1t commented 6 years ago

It solved the problem for me. There is less load on the camera with the default settings, and no loss of functionality as far as I am concerned. We may be dealing with a hardware issue... Are you using a different power supply? Have you tried a lower resolution?

yuyi commented 6 years ago

Hello,

I found out that it is about the -a parameter (enable audio output).

unstable

snx_rtsp_server -W 1920 -H 1080 -Q 10 -b 5000 -a >$LOG 2>&1 &
snx_rtsp_server -W 1280 -H 720 -Q 10 -b 2048 -a >$LOG 2>&1 &
snx_rtsp_server -W 1920 -H 1080 -Q 10 -b 2048 -a >$LOG 2>&1 &

stable

snx_rtsp_server >$LOG 2>&1 &
snx_rtsp_server -W 1920 -H 1080 -Q 10 -b 5000 >$LOG 2>&1 &

After I deleted the -a parameter, the fang can run stably at 1080p resolution. You can try it :p
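
Following the stable/unstable lists above, one way to keep audio off unless explicitly requested is to build the option string in a small helper. This is only a sketch: `rtsp_args` is a made-up name, and it uses only flags that appear in this thread (-W, -H, -Q, -b, -a).

```shell
#!/bin/sh
# Sketch only: build the snx_rtsp_server option string, audio off by default.
# Only flags quoted in this thread are used (-W -H -Q -b -a).

# rtsp_args WIDTH HEIGHT BITRATE [AUDIO]: print the options; -a is appended
# only when AUDIO is exactly "yes".
rtsp_args() {
  args="-W $1 -H $2 -Q 10 -b $3"
  [ "${4:-no}" = "yes" ] && args="$args -a"
  echo "$args"
}

# Hypothetical use in start():
#   snx_rtsp_server $(rtsp_args 1920 1080 5000) >$LOG 2>&1 &
```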

isitnikov commented 6 years ago

Guys! Let me share my experience with fixing this issue. I'm using a modified snx_rtsp_server which supports authentication (for my NVR), and one of my cameras (only one) has the bug with the hanging RTSP server. I made some changes to three files, which look like they solved the issue. I haven't had any errors in the last 2 hours.

20-rtsp-server

#!/bin/sh
PIDFILE="/var/run/rtsp-server.pid"

status()
{
  pid="$(cat "$PIDFILE" 2>/dev/null)"
  if [ "$pid" ]; then
    kill -0 "$pid" 2>/dev/null && echo "PID: $pid" || return 1
  fi
}

start()
{
  LOG=/dev/null
  echo "Starting RTSP server..."
  snx_rtsp_server -Q 10 -u media/stream1 -P 554 -A admin:12345 >$LOG 2>&1 &
  echo "$!" > "$PIDFILE"
}

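stop()
{
  pid="$(cat "$PIDFILE" 2>/dev/null)"
  if [ "$pid" ]; then
     # Stop the recorded process and always drop the PID file so it cannot go stale
     kill "$pid" 2>/dev/null
     rm -f "$PIDFILE"
  fi
  # Kill any leftover snx_rtsp_server instances so duplicates cannot pile up
  pids=$(ps w | grep snx_rtsp_server | grep -v 'grep' | awk '{print $1}')
  echo $pids
  for pid in $pids ; do
    kill -9 $pid
  done
}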

if [ $# -eq 0 ]; then
  start
else
  case $1 in start|stop|status)
    $1
    ;;
  esac
fi

where I added these lines

  pids=$(ps w | grep snx_rtsp_server | grep -v 'grep' | awk '{print $1}')
  echo $pids
  for pid in $pids ; do
    kill -9 $pid
  done

to avoid duplicate RTSP server processes.
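
The PID extraction can also be done in a single awk call, sketched below under the assumption that the PID is the first column of `ps w` output (as in BusyBox). `rtsp_pids` is a made-up name. Writing the pattern as `[s]nx_rtsp_server` means the awk process's own command line does not match, so no extra `grep -v` is needed.

```shell
#!/bin/sh
# Sketch only: pull snx_rtsp_server PIDs out of `ps w` output.
# Assumption: the PID is the first column, as in BusyBox ps.

# rtsp_pids: reads ps output on stdin, prints one PID per line.
# The [s] bracket stops the pattern from matching a command line that
# contains the pattern text itself (such as this awk call).
rtsp_pids() {
  awk '/[s]nx_rtsp_server/ { print $1 }'
}

# Hypothetical use in stop():
#   for pid in $(ps w | rtsp_pids); do kill -9 "$pid"; done
```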

99-check-rtsp

#!/bin/sh
PIDFILE="/var/run/check_rtsp.pid"

status()
{
  pid="$(cat "$PIDFILE" 2>/dev/null)"
  if [ "$pid" ]; then
    kill -0 "$pid" 2>/dev/null && echo "PID: $pid" || return 1
  fi
}

start()
{
  echo "Starting RTSP checker..."
  check_rtsp >/dev/null 2>&1 &
  echo "$!" > "$PIDFILE"
}

stop()
{
  pid="$(cat "$PIDFILE" 2>/dev/null)"
  if [ "$pid" ]; then
     kill $pid || rm "$PIDFILE"
  fi
}

if [ $# -eq 0 ]; then
  start
else
  case $1 in start|stop|status)
    $1
    ;;
  esac
fi

This starts a special script that monitors the state of the RTSP server by PID and by process list.

check_rtsp (I put it into /media/mmcblk0p2/data/usr/bin/)

#!/bin/sh
# check-rtsp script
# Uses PID instead of log file. Writes timestamp with message in messages log file.

PIDFILE="/var/run/rtsp-server.pid"

restart()
{
    /media/mmcblk0p2/data/etc/scripts/20-rtsp-server stop
    /media/mmcblk0p2/data/etc/scripts/20-rtsp-server start  
}

check_pid()
{
    pid=$(cat "$PIDFILE" 2>/dev/null)
    #echo $pid >>/var/log/messages
    if [ -z "$pid" ]; then
        TIMESTAMP="$(date +%c)"
        echo $TIMESTAMP "rtsp server seems hung. PID is absent." >>/var/log/messages
        restart
    fi
}

check_process()
{
    prc=$(ps w | grep snx_rtsp_server | grep -v 'grep')
    #echo $prc >> /var/log/messages
    if [ -z "$prc" ]; then
        TIMESTAMP=$(date +%c)
        echo $TIMESTAMP 'rtsp server seems hung. Process not found.' >>/var/log/messages
        restart
    fi
}

while true; do
    sleep 30
    check_pid
    check_process
done
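
A possible refinement, sketched here and not part of the script above: check_pid only notices a missing or empty PID file, so a stricter check could also confirm that the recorded PID belongs to a live process. `pid_alive` is a made-up name.

```shell
#!/bin/sh
# Sketch only: treat the server as gone when the PID file is missing or
# empty, OR when it names a process that no longer exists.

# pid_alive PIDFILE: exit 0 only if PIDFILE holds the PID of a live process.
pid_alive() {
  pid="$(cat "$1" 2>/dev/null)"
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical use in the watchdog loop:
#   pid_alive /var/run/rtsp-server.pid || restart
```

Like the PID-file check, this cannot detect a server that is still running but no longer delivering frames, which is the case zekje described earlier in the thread.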

I can make a pull request for this.

waitfor1t commented 6 years ago

I have discovered what may be another cause of snx_rtsp_server hanging... If you have your router set to use a 40 MHz channel width (802.11n capability), it is supposed to scan and fall back to a 20 MHz channel if there is interference. This happened to me... a LOT. When it does, it seems to cause trouble for snx_rtsp_server. So I have set my router not to scan, so that it won't fall back, PLUS I put my 40 MHz channel in a quieter part of the spectrum. The server has not hung since.

ioanbsu commented 5 years ago

Hi @isitnikov, is there a link to a compiled snx_rtsp_server that accepts a login and password, or did you compile your own version?