roleoroleo / yi-hack-MStar

Custom firmware for Yi 1080p camera based on MStar platform
GNU General Public License v3.0

Frequent connection drops #149

Closed fabiosci closed 3 years ago

fabiosci commented 4 years ago

Hi, I own two 9FUS cameras with stock firmware 4.2.0*. Inside the camera is written "Y203C_MB_RT2.0 2019/008/08". I successfully flashed both with the y25 firmware and the cameras seem to work as expected. At the moment one camera runs version 0.2.9 and the other 0.3.0. After a while (hours or days, at random) the cameras disconnect from the wifi network (they appear as not connected on my router page) and the only way to recover them is to unplug the power supply. This behaviour is not synchronized between the two cameras but occurs with both firmware versions, although 0.2.9 seems to be more stable than 0.3.0. One more detail: I turn off the wifi network during the night, but in the morning the cameras usually reconnect normally.

roleoroleo commented 4 years ago

It could be a memory and/or cpu usage problem. Try to disable unnecessary services.

fabiosci commented 4 years ago

I think unnecessary services are already disabled. This was and still is my configuration:

Another detail (maybe important): I randomly hear a sound from the cameras, like a click, as if they were rebooting, but they continue working after this "click". I never understood the meaning of or the reason for this random sound.

Thanks for all your work!

roleoroleo commented 4 years ago

Did you notice the same behavior with the original fw? I ask because my fw doesn't change the wifi part.

fabiosci commented 4 years ago

I don't know, because I flashed your firmware 5 min after unboxing and didn't back up the original firmware (at my own risk, I know). The strange thing is that both cameras act the same way...

roleoroleo commented 4 years ago

A serial log would be useful. Are you able to do it?

fabiosci commented 4 years ago

do you mean I have to follow the first part of this guide: https://github.com/roleoroleo/yi-hack-MStar/wiki/Dump-your-backup-firmware ? If not, could you please point me in the right direction? thanks

roleoroleo commented 4 years ago

Yes.

fabiosci commented 4 years ago

here is the putty log with internet access blocked by firewall: putty.log

I noticed a different log when internet access is allowed: putty2.log

both end with the line '[PWM] mstar_pwm_enable', which is still the last line even after some minutes. they seem very different from the logs posted in issues opened by other people... I don't know if that is ok... thank you!

roleoroleo commented 4 years ago

The logs are ok. Is it possible to record the log across a disconnection? Could you leave the log enabled for a long time?

fabiosci commented 4 years ago

sure! I'm already logging and I'll stop the next time it disconnects. In the meantime I took a picture, because the log is now continually changing, which is a difference from the one attached to the previous post: 20200419_183901_compress18

sorry for this horrible picture...

fabiosci commented 4 years ago

Just an update: after 3 days my cam is still online (unfortunately, I'd say...) and the log still keeps changing as shown in the previous picture. On the other cam I applied the new firmware release (0.3.1). I'm monitoring the updated cam to see if anything changes.

fabiosci commented 4 years ago

Another update: I noticed that the camera with v0.3.1 (not under log recording) worked fine for some days until I blocked its traffic at the firewall. After more or less 24h it stopped working: it disconnected from the wifi network and the blue led started blinking.

fabiosci commented 4 years ago

it finally stopped working while being monitored. I found the led blinking and the cam disconnected from wifi. Here is the serial log: putty.zip blinking led.zip

hope this can help in finding the issue thank you

roleoroleo commented 4 years ago

Unfortunately the log doesn't show any problem.

fabiosci commented 4 years ago

ok, thank you. I'll try by disabling one by one all features and check if one of them can cause the crash

fabiosci commented 4 years ago

Just an update: the cam seems to stop working when it can't communicate with the cloud. I tried the following scenarios:

SCENARIO 1 Option "disable cloud" ON --> the cam stops working after a while (hours or days) (tested with firmware 0.3.3)

SCENARIO 2 Option "disable cloud" OFF AND cam with access to the internet --> the cam is still working after several days (tested with firmware 0.3.3)

SCENARIO 3 Option "disable cloud" OFF AND cam traffic blocked by router --> the cam stops working after a while (hours or days) (tested with firmware 0.3.1, I'm going to test also with 0.3.3)

roleoroleo commented 4 years ago

I have the same behavior regarding scenarios 2 and 3 but not for scenario 1. My default config is "Disable cloud" ON and my cam works correctly. Are you blocking the traffic from your router in this scenario?

fabiosci commented 4 years ago

Sorry but I don't remember. I thought it didn't affect the behaviour and I didn't take note of it. In the next days I'll try the following scenarios and report the results:

SCENARIO 1.1: Option "disable cloud" ON AND traffic blocked from the router

SCENARIO 1.2: Option "disable cloud" ON AND traffic NOT blocked from the router

fabiosci commented 4 years ago

ok, I've got news regarding SCENARIO 1.1 (option "disable cloud" ON AND traffic blocked from the router): the cam stopped working (as expected) after less than 2 days and appears as not connected in the router's configuration. This is the cam with firmware 0.3.3.

regarding the SCENARIO 1.2 the cam (firmware 0.3.1) is still working

roleoroleo commented 4 years ago

Ok. So, when the cam can't access the internet, it stops working. I don't know why; I will check it.

fabiosci commented 4 years ago

thank you very much! if it can be useful for further tests, I can update the other cam from 0.3.1 to 0.3.4 and monitor its behaviour in the same conditions

fabiosci commented 4 years ago

another update on the following scenario:

SCENARIO 1.2: Option "disable cloud" ON AND traffic NOT blocked from the router

after weeks, the cam stopped working.

EDIT: I've just noticed that log.txt tends to grow when I block the internet connection. I say so because I find a lot of errors in it, such as:

p2p_tnp.c(state_statistics-6242) check_login fail 0

and some blocks like:

][7/6/14:25:35:588]: p2p_tnp.c(tnp_proc-6389) PPPP_API Version: d2020f04 210.2.15.4
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6398) PPPP_NetworkDetect() ret = 0
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6399) -------------- NetInfo: -------------------
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6400) Internet Reachable : NO
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6401) P2P Server IP resolved : YES
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6402) P2P Server Hello Ack : NO
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6403) Local NAT Type :[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6408) Unknow
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6421) My Wan IP : 0.0.0.0
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6422) My Lan IP : 0.0.0.0
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6424) InitStr(MMFBJPLDIEEPKPHOOIFEPNEHHDCJFMGOHJENKLIHIJBKDABIPMIIAONIPBCLBOKOPIKPDKPFNJDPAIDKBE)
[ ][7/6/14:25:40:379]: p2p_tnp.c(tnp_proc-6425) did(TNPUSAI-576296-BRCSC)
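To put a number on how often such errors appear, the matching lines can be counted. This is only a minimal sketch: the helper name `count_log_matches` is made up for illustration, and the location of log.txt on the camera is not stated in this thread, so the path in the usage line is a placeholder.

```shell
#!/bin/sh
# Count how many lines of a log file contain a given fixed string.
# Usage: count_log_matches FILE TEXT
count_log_matches() {
    # -c prints the number of matching lines, -F disables regex interpretation.
    # grep exits non-zero when nothing matches; "|| true" hides that status
    # so callers only have to look at the printed count.
    grep -c -F -e "$2" "$1" || true
}

# Example (placeholder path, adjust to where log.txt actually lives):
#   count_log_matches /path/to/log.txt 'check_login fail'
```

Comparing the count with and without the internet block over the same period would show whether the error rate really changes.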

this is the output of free -m command: Mem: total 59, used 52, free 6

In two or three hours the free memory decreased by 1 MB.

I'm starting to suspect that the cams work fine until they run out of memory, then crash. Is that possible?
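One way to test the out-of-memory hypothesis is to log the free memory at regular intervals and look at the trend before a crash. The sketch below is an assumption-laden illustration, not part of the firmware: the function names and the log path `/tmp/sd/mem_free.log` are hypothetical, and it reads `MemFree` from `/proc/meminfo` with plain shell so it does not depend on awk being present.

```shell
#!/bin/sh
# Print the MemFree value (in kB) from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo).
mem_free_kb() {
    while read -r key value unit; do
        if [ "$key" = "MemFree:" ]; then
            echo "$value"
            return 0
        fi
    done < "${1:-/proc/meminfo}"
    return 1
}

# Append one timestamped sample to a log file.
# $1 = log file (hypothetical default on the SD card), $2 = meminfo file.
log_mem_sample() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') MemFree=$(mem_free_kb "${2:-/proc/meminfo}") kB" \
        >> "${1:-/tmp/sd/mem_free.log}"
}

# Example: take one sample every 10 minutes (run in the background):
#   while true; do log_mem_sample; sleep 600; done &
```

Writing to the SD card keeps the samples available after a power cycle, which matters here because the cam can only be recovered by unplugging it.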

roleoroleo commented 4 years ago

Sorry for the delay in the response.

I tried your scenario again and I think your idea is correct. There is a memory leak when the connection is blocked. The cam runs low on memory and crashes.

fabiosci commented 4 years ago

just to let you know about my further progress in investigating this issue: it also happens when the cam has regular internet access, both at the router (internet connection allowed by the firewall) and in the firmware (option "disable cloud" off). this means that connectivity is not responsible for the problem. I also found a blog post where someone talks about your firmware and says that it works on the yi home camera v1. my cam is v3 (based on the amazon link used to buy it...). i don't know if this can be the reason...

at the moment the cams are useless, hope you'll find the issue soon. If there are tests I can do to help, let me know. Thanks for your support

elraro commented 4 years ago

I'm having exactly the same problem @fabiosci, but with the https://github.com/roleoroleo/yi-hack-Allwinner firmware

roleoroleo commented 4 years ago

I need more info. Try to run this script: https://github.com/roleoroleo/yi-hack-MStar/issues/215#issuecomment-682453320

I will check if there is a memory leak.

mhensema commented 4 years ago

I'm experiencing the issue as well on the Allwinner firmware/device. I have cloud disabled, but I do see the device making DNS lookups for api.eu.xiaoyi.com. This seems to only happen at boot. Internet access was blocked for the device, but enabled it for now to see if it makes any difference.

fabiosci commented 4 years ago

I ran the script from #215 with nohup ./startup.sh & on both cams. Both are configured with "disable cloud" on and internet connection blocked by the router. I also see DNS lookups for api.eu.xiaoyi.com and api.github.com. One of them stopped working last night, probably when wifi was off. Please ignore the date and time in the attached log file

top-yicam1.log

roleoroleo commented 4 years ago

I can't see memory problems. The mystery deepens.

fabiosci commented 4 years ago

when the second cam also stops working I'll post the other log to allow a comparison and find the differences. I'll also share both configurations (which are different). this might help the investigation.

UPDATE: yesterday I disconnected and reconnected the power supply of the first cam (the one which stopped working) and this morning I found it again offline. When it happens, the blue led blinks. The other cam is still working, I'll keep it monitored and in a few days I'll post both configurations

roleoroleo commented 4 years ago

If the led blinks the cam is disconnected from the wifi. Is it possible that you have a wifi problem?

fabiosci commented 4 years ago

i don't think so, for at least 3 reasons: 1) only the yi cams have problems, no other devices disconnect 2) the cams are 2 m from the router with just a door (usually open) in between, definitely closer than other devices 3) my router is a fritz box, which has a very good wifi signal

the only information i can add is that at night i turn off the wifi using the router's schedule feature. I don't think that turning off the wifi during the night is the problem, because the cam doesn't disconnect every day

mhensema commented 4 years ago

I don't believe Wifi is the issue. The device is still connected. See screenshot. I checked the system log on my router (OpenWRT based), and see no deauth(s) or disconnect(s) from the camera.

After some time the camera does start working again (i.e. the webserver becomes available again) without a power cycle. I'll see if I can get some logging from httpd.

Screenshot 2020-09-13 at 15 47 53

roleoroleo commented 4 years ago

Could you check the uptime of the cam?

mhensema commented 4 years ago

Uptime is at ~11d

image

fabiosci commented 4 years ago

I don't believe Wifi is the issue. The device is still connected. See screenshot. I checked the system log on my router (OpenWRT based), and see no deauth(s) or disconnect(s) from the camera.

After some time the camera does start working again (i.e. the webserver becomes available again) without a power cycle. I'll see if I can get some logging from httpd.

Screenshot 2020-09-13 at 15 47 53

it seems that we are experiencing different issues. in my case the device disconnects from the wifi network, the led starts blinking and it does not reconnect automatically even after several days. this is the 7th day without a disconnection. At the moment I'm powering both cams with a samsung smartphone charger and the original usb cables. I'm starting to think about a power supply issue (e.g. unstable 5 V output from the original power supply, thin cable, etc.). What do you think about this hypothesis?

roleoroleo commented 4 years ago

The only power problem I had is that if I use an underpowered supply the camera goes into a boot loop.

fabiosci commented 4 years ago

after 8 days (almost the best performance so far) the first cam stopped working; the second one is still connected. It seems that when it disconnects it is not able to re-establish the connection to the wifi network. It happens after some time, which could be one day or more, as in this case. The cam appears disconnected on the router's page and, of course, it is also unreachable from scanning apps (e.g. fing for android). I tried putty, but as expected I couldn't reach it.

As you said, it doesn't seem to be related to the power supply.

I will post the configurations of both cams as soon as possible, to highlight the differences

fabiosci commented 4 years ago

I monitored the free memory and it is always around 20000 kB, even when the cam stops working. I also disabled RTSP, ONVIF and all other options and services to isolate the problem, and it seems to be just a connection issue: the cam is unable to re-establish the connection when the wifi turns on in the morning. sometimes it reconnects, sometimes not. it seems unrelated to both the enabled options and the internet block.

my next test will be to allow internet access, enable cloud and keep wifi off during the night. if i'm right, sooner or later the cams will disconnect again

any suggestion to go deeper into the wifi matter?

roleoroleo commented 4 years ago

I don't know how to help you. I'm beginning to suspect a kernel driver problem. You should probably add a network watchdog that reboots the cam when it's disconnected from the router.

fabiosci commented 4 years ago

thanks anyway. in your opinion could the kernel driver problem lead me to my issue also in case of internet allowed and cloud enabled? i can add a watchdog. do you think that a solution like the one described here would work? in case of firmware upgrades, would the script and crontab setting be deleted?

roleoroleo commented 4 years ago

in your opinion could the kernel driver problem lead me to my issue also in case of internet allowed and cloud enabled?

The problem still exists, but the effect is probably less.

i can add a watchdog. do you think that a solution like the one described here would work?

Yes, but check if all the commands exist.

in case of firmware upgrades, would the script and crontab setting be deleted?

If you add it to the sd card, it will not be deleted.

fabiosci commented 3 years ago

I finally found some time to spend on this issue. I prepared the script and it works (i'm trying to reset only the wifi connection rather than reboot the cam) but it seems that crond is not running. the output of ps command doesn't show crond.

/home/yi-hack/script # ps
PID   USER     TIME  COMMAND
    1 root      0:02 /init
    2 root      0:00 [kthreadd]
    3 root      0:01 [ksoftirqd/0]
    5 root      0:00 [kworker/0:0H]
    6 root      0:00 [kworker/u2:0]
    7 root      0:04 [rcu_preempt]
    8 root      0:00 [rcu_sched]
    9 root      0:00 [rcu_bh]
   10 root      0:00 [watchdog/0]
   11 root      0:00 [khelper]
   12 root      0:00 [writeback]
   13 root      0:00 [crypto]
   14 root      0:00 [bioset]
   15 root      0:00 [kblockd]
   16 root      0:00 [kworker/0:1]
   17 root      0:00 [kswapd0]
   18 root      0:00 [fsnotify_mark]
   31 root      0:00 [SCLDAZA_THREAD]
   32 root      0:00 [VIPDazaTask]
   36 root      0:00 [deferwq]
   39 root      0:00 /bin/ueventd
   40 root      0:01 [jffs2_gcd_mtd3]
   41 root      0:00 [jffs2_gcd_mtd2]
   82 root      0:00 [kworker/0:2]
   83 root      0:00 [mmcqd/0]
   94 root      0:00 [cryptodev_queue]
  130 root      0:00 [spi0]
  161 root      0:00 [cfg80211]
  177 root      0:01 [RTW_CMD_THREAD]
  180 root      0:00 ./log_server
  181 root      0:05 ./dispatch
  645 root     10:10 ./rmm
  648 root      0:01 /home/base/tools/wpa_supplicant -c/tmp/wpa_supplicant.conf -g/var/run/wpa_supplicant-global -Dnl80211 -iwlan0 -B
  683 root      0:00 [kworker/u2:3]
  763 root      0:00 udhcpc -i wlan0 -b -s /home/app/script/default.script -x hostname:yicam-taverna
 1079 root      0:00 httpd -p 8080 -h /home/yi-hack/www/ -c /tmp/httpd.conf
 1099 root      0:00 dropbear -R
 1104 root      0:00 ipc_multiplexer
 1119 root      0:01 dropbear -R
 1125 root      0:00 ntpd -p 192.168.188.2
 1165 root      0:00 -sh
 1315 root      0:00 ps

do I miss something?

EDIT: crond enabled

roleoroleo commented 3 years ago

I'm working on a new version where you can configure cron through the web interface.

fabiosci commented 3 years ago

I'm working on a new version where you can configure cron through the web interface.

thank you! the possibility to schedule a custom script is good news

roleoroleo commented 3 years ago

Released in 0.3.9. Let me know if it works correctly.

fabiosci commented 3 years ago

just upgraded. first of all i checked if crond was running and ps gave me: 1235 root 0:00 /usr/sbin/crond -c /var/spool/cron/crontabs/

but crontab -l doesn't seem to work:

/home/yi-hack # crontab -l
crontab: can't change directory to '/crontabs': No such file or directory

the root file in /var/spool/cron/crontabs/ exists and its content is just 0 * * * * /home/yi-hack/script/clean_records.sh 5

I tried the new feature by adding the following in the "Configurations" page: */5 * * * * root /tmp/sd/watchdog/wd_wifi.sh then saved and rebooted. When the cam came back online, the job appeared neither in the Configurations page nor in the root file, which is unchanged.

this is my wd_wifi.sh

#!/bin/sh
# Log file named after the current date, stored on the SD card
now=$(date +"%Y_%m_%d")
LOGFILE=/tmp/sd/watchdog/log_$now.log

# Ping the router; if it does not answer, reset the wifi interface
if ping -c4 -q fritz.box > /dev/null ; then
  echo "[$(date)] Wifi OK" >> "$LOGFILE"
else
  echo "[$(date)] CONNECTION KO. Reboot wifi" >> "$LOGFILE"
  sleep 1
  /sbin/ifconfig wlan0 down
  # Give interface time to reset before bringing back up
  sleep 10
  /sbin/ifconfig wlan0 up
  # Give WAN time to establish connection
  sleep 10
fi

and it is executable. If I manually execute it, it works as expected.
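A possible refinement, combining the interface reset above with the reboot suggested earlier in the thread, is to count consecutive failed checks and reboot only after several resets in a row have not helped. This is a hedged sketch, not tested on the camera: `STATE_FILE`, `MAX_FAILS` and the function names are illustrative assumptions, and the counter lives in /tmp on the assumption that /tmp is cleared on reboot.

```shell
#!/bin/sh
# Sketch: escalate from "reset wlan0" to "reboot" after repeated failures.
# STATE_FILE and MAX_FAILS are hypothetical settings, not firmware options.
STATE_FILE="${STATE_FILE:-/tmp/wd_wifi.fails}"
MAX_FAILS="${MAX_FAILS:-3}"

# Print the number of consecutive failures recorded so far (0 if none).
read_fails() {
    if [ -f "$STATE_FILE" ]; then cat "$STATE_FILE"; else echo 0; fi
}

# Decide what to do after one connectivity check.
# $1 = exit status of the check (0 means the router answered).
# Prints one of: ok, reset, reboot.
next_action() {
    if [ "$1" -eq 0 ]; then
        echo 0 > "$STATE_FILE"      # success clears the counter
        echo ok
        return
    fi
    fails=$(( $(read_fails) + 1 ))
    echo "$fails" > "$STATE_FILE"
    if [ "$fails" -ge "$MAX_FAILS" ]; then
        echo reboot
    else
        echo reset
    fi
}

# How it could be wired into wd_wifi.sh (left as comments on purpose):
#   ping -c4 -q fritz.box > /dev/null 2>&1
#   status=$?
#   case "$(next_action "$status")" in
#     reset)  /sbin/ifconfig wlan0 down; sleep 10; /sbin/ifconfig wlan0 up ;;
#     reboot) reboot ;;
#   esac
```

With a 5-minute schedule and `MAX_FAILS=3`, the cam would be rebooted roughly 15 minutes after the router stops answering. If wifi is switched off every night, a larger threshold (or skipping the check during those hours) would avoid needless reboots.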

roleoroleo commented 3 years ago

I tested your line and it works. Did you clear the browser cache? Check if system.conf is correctly updated and check if the cron file contains the line:

cat /home/yi-hack/etc/system.conf
cat /var/spool/cron/crontabs/root

fabiosci commented 3 years ago

thanks, from another browser (so with a clean cache) it worked, and both system.conf and root contain the new job (which is also shown in the Configurations page after rebooting the cam). the only issue now is that the script is not executed. I ran a ping for 10 minutes and no packets were lost (so the wifi connection hasn't been reset) and the log file hasn't been created

roleoroleo commented 3 years ago

Try to remove "root" from the line.
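The reasoning behind this is probably the crontab format: a per-user file such as /var/spool/cron/crontabs/root normally takes five time fields followed directly by the command, while the extra user column belongs to the system-wide /etc/crontab style. Assuming the crond on the cam follows that convention, the word "root" would be taken as the command to run. The helper below is a hypothetical illustration (the name `strip_cron_user` is made up) that drops a user name from the sixth field of a line:

```shell
#!/bin/sh
# Remove a user name from the sixth field of a crontab line, if present.
# $1 = crontab line, $2 = user name to strip (defaults to root).
strip_cron_user() {
    user="${2:-root}"
    # "read" splits on whitespace without expanding the "*" fields.
    printf '%s\n' "$1" | {
        read -r min hour dom mon dow sixth rest
        if [ "$sixth" = "$user" ]; then
            echo "$min $hour $dom $mon $dow $rest"
        else
            echo "$min $hour $dom $mon $dow $sixth${rest:+ $rest}"
        fi
    }
}

# Example:
#   strip_cron_user '*/5 * * * * root /tmp/sd/watchdog/wd_wifi.sh'
```

For the job discussed here, the resulting line would be `*/5 * * * * /tmp/sd/watchdog/wd_wifi.sh`, which has the same shape as the existing clean_records.sh entry in the root file.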