Open jilleb opened 3 years ago
I don't get anything weird in dmesg, and it's been running 11 mins so far -- could also be something botched by the ppsapp mods (mine has exactly 1 byte changed).
Dmesg messages appear after the crash, give it a few more minutes.. seems to be 15 - 20 minutes when it crashes for me (it just happened again, this time in online mode)
Spoke too soon, crashed too, so it is not just your version. May just be a bug they haven’t fixed in their app.
Mine ran for a little over 13 minutes as well (while RTSP streaming in HD). I only use it to view it briefly and regardless of how many times I have viewed the feed this is the first time I have seen it crash.
Alternate options would be:
- Try SD to see if it crashes.
- Try to 'fix' the issue in their app (I don't see that happening).
- Work on a full RTSP version of streamer-arm (unfortunately I don't have the time to play with that). This should possibly be done using the variables (node, head, tail information) from the ppsapp ring buffer to make it more efficient than blind searching for changes in the buffer. I know for a fact that there are ways to make the blind search more efficient too (and likely not need as much memory), but again I don't have all the time required to work on this.
- Improve the MJPEG stream -- it should be possible to read the JPEG buffer AND mutex flag at the same time -- and discard the JPEG (and read another) when we detect the JPEG buffer was being updated (mutex flag).
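The MJPEG idea (read the buffer together with the flag, discard the frame if an update was in progress) could be sketched roughly as below. This is only an illustration in Python: `read_flag` and `read_buffer` are hypothetical callables standing in for however the mutex flag and the JPEG buffer are actually read from ppsapp memory, and the sketch assumes `read_buffer` returns exactly the bytes of one frame.

```python
JPEG_START = b"\xff\xd8"  # SOI marker that opens every JPEG
JPEG_END = b"\xff\xd9"    # EOI marker that closes every JPEG


def read_stable_jpeg(read_flag, read_buffer, max_attempts=5):
    """Return a frame that was not being updated while it was copied, or None.

    read_flag   -- hypothetical callable, truthy while the buffer is being written
    read_buffer -- hypothetical callable returning the current JPEG bytes
    """
    for _ in range(max_attempts):
        busy_before = read_flag()
        frame = read_buffer()
        busy_after = read_flag()
        if busy_before or busy_after:
            continue  # writer was active around the copy: discard and retry
        # Extra sanity check: a complete JPEG starts with SOI and ends with EOI.
        if frame.startswith(JPEG_START) and frame.endswith(JPEG_END):
            return frame
    return None
```

Sampling the flag before and after the copy cannot catch an update that starts and finishes entirely during the copy, which is why the marker check is kept as a second line of defence.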
I guess given this
[02:35:15.631 INFO pps_watchdog.c:147] exception ocurred: 1
we should try padding the watchdog. I'll look at it tomorrow.
@guino: I'll check again if I can modify the software_version string. As far as I remember it is hard-coded, though.
@jandy123 @m11tch you know it may actually be simpler to download the hisilicon SDK and run your own RTSP application than trying to fix ppsapp -- the only things we'd have to figure out are:
1. how to signal the watchdog so it doesn't reboot the board
2. how to detect the button press from the doorbell (it may show up in the kernel log already)
3. how to do motion detection (although I am pretty sure they have samples for this in the SDK too)

It may be required for us to load some kernel modules as it's possible ppsapp may be doing some of that (I'm not sure). I also believe there are some open projects with the required files/tools in progress... https://github.com/OpenIPC they may even have stuff on the watchdog (I have not looked)
some more interesting stuff: The watchdog is controlled by this function:
undefined4 FUN_00023ad0(int param_1,undefined4 param_2)
{
    int iVar1;
    undefined4 uVar2;
    ulong local_c;
    if (param_1 == 1) {
        local_c = 0x40044401;
    }
    else {
        if (param_1 == 0) {
            local_c = 0x40044403;
        }
        else {
            if (param_1 != 2) {
                return 2;
            }
            local_c = 0x40044402;
        }
    }
    iVar1 = ioctl(DAT_003a0a3c,local_c,param_2);
    if (iVar1 == 0) {
        uVar2 = 0;
    }
    else {
        uVar2 = 2;
    }
    return uVar2;
}
param_1=1 means start watchdog, param_1=2 means stop watchdog, param_1=0 means reset watchdog. param_2 is the watchdog timeout (I think 30secs default) when starting the watchdog, which would mean that if ppsapp stops for more than 30 seconds the board should reboot itself. It should be easy to make an application that manages it. The device handle for ioctl is an open file descriptor for /dev/Strnio .
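For reference, the three request codes can be unpacked using the standard Linux ioctl number layout (2 direction bits, 14 size bits, 8 type bits, 8 number bits). The following is a minimal Python sketch meant to be run on a PC, not on the doorbell. `feed_once` is a hypothetical helper, and passing 0 by value as the third ioctl argument is an assumption: the decompiled snippet does not show whether the driver expects a value or a pointer.

```python
# Request codes copied from the decompiled FUN_00023ad0.
WDT_START = 0x40044401  # param_1 == 1
WDT_STOP = 0x40044402   # param_1 == 2
WDT_RESET = 0x40044403  # param_1 == 0

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split a Linux ioctl request number into (direction, type, nr, size)."""
    direction = DIRECTIONS[(request >> 30) & 0x3]
    size = (request >> 16) & 0x3FFF          # size of the argument in bytes
    type_char = chr((request >> 8) & 0xFF)   # driver "magic" character
    nr = request & 0xFF                      # command number within the driver
    return direction, type_char, nr, size


def feed_once(fd):
    """Hypothetical helper: send one 'reset watchdog' request on an open fd.

    Assumption: the third argument is passed by value as 0.
    """
    import fcntl  # POSIX-only, imported here so decode_ioctl works anywhere
    fcntl.ioctl(fd, WDT_RESET, 0)
```

All three codes decode to a write-direction request of type `'D'` with a 4-byte argument, numbered 1 (start), 2 (stop) and 3 (reset).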
Also interesting (don't know if it works): https://github.com/chertov/hi_minihttp -- I believe the openIPC software also uses 'minihttpd' application (don't know if it's the same one) which is what serves the RTSP/ONVIF for the hisilicon chips.
I've been trying to intercept MQTTS traffic between the camera and the tuya server using IOXY, but still no luck because the camera checks for a self-signed certificate. Basically here is what I do:
Looks like ppsapp does TLS verification, so it refuses to connect because of the self-signed certificate, and the log says SSL Handshake error. I'm still looking to solve this issue using ghidra but I have very little knowledge in reverse engineering. Maybe anyone here who is interested can help me too.
@guino I've seen that function too. But, as you also said, it's weird since the watchdog timer seems to expire in ~30 sec. or so. Why would it reboot when rtsp streaming after 10-15 minutes? I still believe it is more of a bug, rather than the wdt.
Interesting idea to look into own rtsp server. Do you have the sdk ?
Have you checked CPU usage while streaming via RTSP? Mine reaches 100%; I think the watchdog gets disrupted because of the full CPU usage.
It reboots because ppsapp crashes/exits after 10-15 minutes of RTSP streaming. (So then ppsapp is no longer communicating to watchdog) I managed to restart ppsapp manually before the device rebooted, so throwing ppsapp in a loop could be a nasty workaround as suggested by @guino
@adwiraguna what are you looking for in the mqtt messages? Have you seen @jandy123 's logparser to send mqtt messages on motion/button press?
I left my camera with offline ppsapp running, and blocked from any internet access, while being connected to my Synology Surveillance Station by using the mjpeg.cgi. It didn't restart for 12 hours. Note: I didn't connect to RTSP during these hours. Previously, if I would connect to RTSP and left the connection open for more than 10 minutes or so, the cam would crash. I guess this confirms: it's not the watchdog causing the reboot. It's almost like there's a memory leak or something, that causes the app to crash after some time of streaming.
> @adwiraguna what are you looking for in the mqtt messages? Have you seen @jandy123 's logparser to send mqtt messages on motion/button press?
I want to check the mqtt message sent from the server when we open streaming from the tuya app; it looks like ipc p2p is triggered by an mqtt message. Also, if I know the messages from and to the camera I can set up a local mosquitto broker and use it instead of the cloud.
I've updated my first post in this thread with a collection of the current information/files for easy access, let me know if I missed anything. (https://github.com/guino/BazzDoorbell/issues/4#issuecomment-740644879)
@adwiraguna I found a mqtt related CA.crt in /home/, perhaps if you place your CA there you can get TLS verification to work?
We should make sure that we feed the wdt and test again rtsp streaming off/on-line.
for reference, here is the log running the 'original' ppsapp-rtsp with free access to internet for the doorbell cloudconnected.log
> @adwiraguna I found a mqtt related CA.crt in /home/, perhaps if you place your CA there you can get TLS verification to work?
Tried that too, but still not working, I think that file is an old file and not used anymore
@adwiraguna it seems they are using the same self-signed certificate for the mqtt server as for the webserver. Probably the certificate hash is hardcoded in ppsapp? (I assume it's something like that, because why else use a certificate with validity until 2118 :D )
> @adwiraguna it seems they are using the same self-signed certificate for the mqtt server as for the webserver. Probably the certificate hash is hardcoded in ppsapp? (I assume it's something like that, because why else use a certificate with validity until 2118 :D )
Yes, I think so, still looking in ghidra for those TLS verification
Feeding wdt seems to work. To test just run `test_wdt` before `ppsapp`. You may for instance kill ppsapp and the device shouldn't reboot by itself.
Anyways, please test rtsp streaming making sure that test_wdt is started before ppsapp.
@jandy123 I will test this shortly, but I don't see how this will stop ppsapp from crashing? or am I missing something? :wink:
@m11tch We were thinking that maybe due to high CPU occupancy while streaming RTSP, ppsapp may be unable to feed the wdt every 30 seconds for some reason. If this was so, the wdt would reboot the device.
However, I've just tested and still get the exception, while making sure wdt is properly fed (every 10 seconds) using test_wdt.
So, this is not the issue....
@jandy123 just crashed for me as well, exit code of ppsapp 255.
so time to make "our own" application to create RTSP Stream? :joy: the watchdog part is already covered by you :smile:
btw, I just had another thought, will ppsapp also crash if we restart the RTSP stream lets say each 5 minutes? (I'll try this, see if this makes any difference)
Hmm, I just found a weird thing, still related to wdt. I think it's supposed to stop after a while. Will get back to this.
Edit: The wdt exception is supposed to happen. I can only wonder why... The version below should disable that wdt exception, so please keep an eye on the log output.
Pls. test the version below. I'm testing it right now. ppsapp-rtsp4.zip
For the reference, I attached the wdt feeder with source code. feed_wdt.zip
@m11tch Still up and running :).
@jandy123 for what it's worth, restarting VLC every 5 minutes still resulted in a crash of ppsapp after about 15 minutes.. going to test your new version of ppsapp now..
@jandy123 looks like you found it! stream has been up without crashes for at least 23 minutes so far :)
@jandy123 still running after 1 hour. the nag will come back once they release 2.9.9 though :joy:
@jandy123 oh no, did it maybe just reset because of higher version?
do you still see "ThankYouGuino" in: http://admin:056565099@192.168.x.x/proc/cmdline
or is the device completely unreachable now?
@jandy123 are the files still on the SD card? i saw something in the other thread about files being wiped from SD if it is full
@jandy123 can you still access it via telnet? (if you start telnetd before attempting to kill ppsapp in custom.sh?)
Also http://192.168.2.179/proc/self/root/tmp/custom.sh exists.... ???
Telnet, ftp httpd should all be started before rtspapp...
I know, that's why I was wondering if you can still access it via telnet?
Sorry ! False alarm ;) custom.sh got corrupted on sdcard.
So it's all fine, including the no-nag version ;). Will reupload.
@m11tch: Thanks for the thumbs up !
Right, so back on track.
> the nag will come back once they release 2.9.9 though
If they'll make some fw update.
Till then, below is the no-fw-up-nag which includes all other fixes above and should be quite reluctant to perform fw update ;).
@jandy123 perhaps you can share more details on your fix for the RTSP Stream so people with other ppsapp versions can replicate :) if you have time for that of course :D
btw I don't think we need to use test_wdt with this version of ppsapp do we? (would be even better not to so the doorbell can fix itself if ppsapp does crash for some reason, especially if it is actually mounted outside :D )
@m11tch So, rtsp seems fine, right ? Have you tried the offline version ? I'd still like to have the possibility to choose between SD and HD somehow. Maybe we can ask @guino since he knows the exact ring-buffer details.
@m11tch
> perhaps you can share more details on your fix for the RTSP Stream
Sure, but I'll have to remember the details. There were three changes, I'll have to look through my notes and trace them in ghidra.
@m11tch
> btw I don't think we need to use test_wdt with this version of ppsapp do we?
Yes, we shouldn't need it, probably. Pls. test without it. Also in off-line mode. Btw. do you need to change versions ? On/off line ? Here it seems fine to use the same version in both cases.
currently i'm in offline mode running the RTSP stream for about 2 hours without any crashes without issues, have yet to test online mode with this version of ppsapp
edit: actually, ppsapp did not crash, but it seems the stream froze, image shows timestamp of 14:01 so something is up... hmm.. as I'm writing this the timestamp updated to 15:14:26 but then froze again...
vlc log shows:
[00007f532c133490] main decoder error: Could not convert timestamp 206338203841 for g711
[00007f532c133490] main decoder error: Timestamp conversion failed (delay 1000000, buffering 0, bound 3000000)
[00007f532c133490] main decoder error: Could not convert timestamp 206338283841 for g711
[00007f532c133490] main decoder error: Timestamp conversion failed (delay 1000000, buffering 0, bound 3000000)
[00007f532c133490] main decoder error: Could not convert timestamp 206338403841 for g711
and later on:
fb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
[00007f532c13dfb0] avcodec decoder error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
a restart of VLC fixed this... need to do some more monitoring of the RTSP stream over longer time...
@m11tch Maybe it's just a connection issue? I do not suppose the device is very fast in terms of wifi speed.
That's why I'd really like to see an SD version too ;).
@jandy123 looking at the log it seems there was indeed a connection issue around 13:00 which caused the RTSP stream to stop and start again, perhaps VLC got confused at this stage.. so could also be VLC related..
Hopefully @jilleb can do some tests with surveillance station and report back :)
edit: stream froze again after roughly one hour.. will try mplayer again to see if that shows the same behaviour
edit2: mplayer seems to have more issues.. will report back later..
@m11tch
> looking at the log it seems there was indeed a connection issue around 13:00
Exactly my point.
Some brief instructions for patching the ppsapp to work in off-line mode:
- locate main function, see post above by @guino
- look for a while loop similar to:
while (iVar2 = FUN_0009139c(), iVar2 != 1) {
    sleep(1);
}
- we need to disable the loop so that the application can proceed even if mqtt connection not yet established.
- for this, check out `FUN_0009139c`. It simply returns a (global) variable, in my case `DAT_0040c81c`. We need to force this variable to be set to 1, so that the loop is bypassed. Click on the variable and look at its XREFs. Clicking the very first (W) reference lands in a function which sets this variable either to 0 or 1, depending on mqtt connection status. We need to force both branches to set the var to 1. The patch consists in replacing #0x0 by #0x1 below
0009135c 00 20 a0 e3 mov r2,#0x0
00091360 00 20 83 e5 str r2,[r3,#0x0]=>DAT_0040c81c
i.e., 00 20 a0 e3 -> 01 20 a0 e3.
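Since addresses differ between ppsapp versions, one way to apply such a patch from a PC is to search for the instruction bytes together with the instruction that follows them, and refuse to patch unless that sequence is found exactly once. A minimal Python sketch under that assumption; `patch_unique` is a name made up for this illustration, and the demo runs on fake in-memory bytes rather than a real ppsapp file.

```python
def patch_unique(image, context, old, new):
    """Replace `old` with `new` at the single place where `context` occurs.

    `context` must start with `old`; the extra trailing bytes only serve to
    make the match unambiguous.
    """
    if len(old) != len(new):
        raise ValueError("patch must not change the file size")
    if not context.startswith(old):
        raise ValueError("context must begin with the bytes being replaced")
    matches = image.count(context)
    if matches != 1:
        raise ValueError(f"expected exactly one match, found {matches}")
    offset = image.index(context)
    return image[:offset] + new + image[offset + len(old):]


MOV_R2_0 = bytes.fromhex("00 20 a0 e3")   # mov r2,#0x0
MOV_R2_1 = bytes.fromhex("01 20 a0 e3")   # mov r2,#0x1
STR_R2_R3 = bytes.fromhex("00 20 83 e5")  # str r2,[r3,#0x0]

# Demo on fake data: 16 filler bytes, the two instructions, 16 filler bytes.
fake_image = b"\x00" * 16 + MOV_R2_0 + STR_R2_R3 + b"\x00" * 16
patched = patch_unique(fake_image, MOV_R2_0 + STR_R2_R3, MOV_R2_0, MOV_R2_1)
```

On a real binary this two-instruction sequence may well occur more than once, in which case the function raises and a longer context (more of the surrounding bytes as shown in Ghidra) is needed.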
////////////////////
- the mod above is still not enough, since the application insists connecting to the NTP tuya servers; we need to disable this too, but need to set correct date/time before the application is started.
- search for string `tuya_ipc_get_service_time`. Once located, click on the first XREF. Look for a variable `uVar1` which is set either to 0 (no error) or `0xffffffff` (error connecting). We need to force this to be set to 0. The assembly should look something like:
000d37f8 00 00 50 e3 cmp r0,#0x0
000d37fc 00 00 a0 13 movne r0,#0x0
000d3800 00 00 e0 03 mvneq r0,#0x0
- we need to change the last instruction to read `moveq r0,#0x0` (the original `mvneq r0,#0x0` loads the bitwise NOT of 0, i.e. `0xffffffff`). This boils down to: 00 00 e0 03 -> 00 00 a0 03.
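To double-check a hand-edited instruction before writing it into the binary, the four bytes can be decoded on a PC. Below is a minimal Python sketch that only understands the few ARM data-processing instructions with an immediate operand that appear in this write-up (mov, mvn, add, cmp, with eq/ne/always conditions). `decode_dp_imm` is a name made up for this illustration and is not part of any tool.

```python
import struct

CONDITIONS = {0x0: "eq", 0x1: "ne", 0xE: ""}  # 0xE = always, no suffix
OPCODES = {0x4: "add", 0xA: "cmp", 0xD: "mov", 0xF: "mvn"}


def decode_dp_imm(raw):
    """Decode 4 little-endian bytes of an ARM data-processing immediate op."""
    (word,) = struct.unpack("<I", raw)
    if (word >> 25) & 0b111 != 0b001:
        raise ValueError("not a data-processing instruction with an immediate")
    try:
        cond = CONDITIONS[word >> 28]
        op = OPCODES[(word >> 21) & 0xF]
    except KeyError:
        raise ValueError("condition or opcode not covered by this sketch")
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    # The immediate is an 8-bit value rotated right by twice the 4-bit field.
    rotate = 2 * ((word >> 8) & 0xF)
    imm8 = word & 0xFF
    imm = ((imm8 >> rotate) | (imm8 << (32 - rotate))) & 0xFFFFFFFF
    if op == "cmp":
        return f"cmp{cond} r{rn},#0x{imm:x}"
    if op == "add":
        return f"add{cond} r{rd},r{rn},#0x{imm:x}"
    return f"{op}{cond} r{rd},#0x{imm:x}"
```

For example, `decode_dp_imm(bytes.fromhex("00 00 e0 03"))` gives `mvneq r0,#0x0` and `decode_dp_imm(bytes.fromhex("00 00 a0 03"))` gives `moveq r0,#0x0`, so the patch flips only the opcode bits and leaves the condition and register untouched.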
//////////////////
- finally, when streaming over rtsp, the application generates a wdt exception and reboots after a certain amount of time has passed. To fix this we need to disable this time limit. Search for string: `exception ocurred: %d\n`. Once located, click first XREF. This function (in my case `FUN_0004cd70`) is responsible for rebooting when the "exception" happens. Locate the XREF of this function that calls it passing to it value 1. In my case this is the second XREF. Clicking it leads to:
iVar2 = FUN_0004cf60();
if (iVar2 != 0) {
    FUN_0004cd70(1);
}
Look at the function above it, `FUN_0004cf60` (click it). There we find something like:
uint FUN_0004cf60(void)
{
    int iVar1;
    iVar1 = FUN_000822d0();
    if (iVar1 < 0x5b) {
        DAT_00406620 = 0;
    }
    else {
        DAT_00406620 = DAT_00406620 + 1;
    }
    return (uint)(100 < DAT_00406620);
}
As you can see, this function returns a non-zero value (triggering the reboot) once `DAT_00406620` exceeds 100. We modify it so that it never increments `DAT_00406620`, i.e., we modify the add instruction below
0004cf90 01 30 83 e2 add r3,r3,#0x1
0004cf94 38 20 9f e5 ldr r2,[PTR_DAT_0004cfd4] = 00406620
0004cf98 00 30 82 e5 str r3,[r2,#0x0]=>DAT_00406620
so that it reads add r3,r3,#0x0: 01 30 83 e2 -> 00 30 83 e2.
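The effect of this last patch can be checked with a small model of `FUN_0004cf60`. This is a sketch in Python, not the real code: `reading` stands in for whatever `FUN_000822d0` returns (the thread does not establish what that value means), and `step` is 1 for the original add instruction and 0 for the patched one.

```python
THRESHOLD = 0x5B  # 91, the constant the reading is compared against
LIMIT = 100       # the function reports an exception once the counter exceeds this


def make_check(step):
    """Build a model of FUN_0004cf60 whose add instruction adds `step`."""
    counter = 0  # plays the role of DAT_00406620

    def check(reading):
        nonlocal counter
        if reading < THRESHOLD:
            counter = 0        # a low reading resets the counter
        else:
            counter += step    # original: +1, patched: +0
        return counter > LIMIT

    return check


original = make_check(1)
patched = make_check(0)
```

In this model the original version first returns true on the 101st consecutive reading at or above 0x5b, while the patched version never does. How that maps to the observed 10-15 minutes depends on how often the function is called, which the decompiled snippet does not show.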
@adwiraguna my cpu stays between 80-95% while streaming. @jandy123 I do have a version of the SDK (Hi3516EV200R001C01SPC010) around 2GB if you want to download it -- you can usually find it online somewhere (i.e. some aliexpress posts of hi3516/hi3518 oem boards have links to it). Also loved the write-up on the mods.
@guino Could you please have a look if it's possible without too much effort ;) to make an SD version ? I mean, you already invested lots of time in figuring out the ring buffers....
@guino Shall we wait a bit and see how this goes, before jumping into making our own rtsp server ? But, it's good to know that you have the SDK. Also the resources you posted seem very promising, indeed.
Regarding the tuya app upgrade nag, in my case (doorbell from Action NL) running software_version 2.9.6, it's enough to change two occurrences of string "2.9.6" to "2.9.x" (x is set to 8 in my case) in ppsapp. This will calm down the tuya app continuously asking for a fw upgrade...
Edit: The no-fw-upgrade version posted somewhere above, contains additional patches to increase chances of surviving fw upgrade attempts by tuya, though ;)
@jandy123 for SD version (from what I have seen) all you have to do is find the init echoshow function (as described in https://github.com/guino/ppsapp-rtsp):
Then modify that uStack32 parameter to 1 (this is the parameter that goes to the start_echoshow function, highlighted in yellow). If I remember correctly both uStack32 and uStack28 get set from the same 'mov' assembly command (just click on the uStack32 = 0 line to show it). I tried setting that to 1 as a way to bypass the HD channel check in the start_echoshow function, which worked but created an SD RTSP version instead of HD -- should be easy for you to try it.
@m11tch Looking at your edited post above, seems that "13" minutes still create problems when rtsp streaming ? Is this timing consistent among multiple runs ?
I mean, given the high CPU occupancy and temperature, there may be some reason that they stop rtsp streaming after xxx time.
Another idea. Has anyone looked into what happens if the tuya app streams for longer than 13 minutes ? This without any mod, with inet access, etc.
@jandy123 no it is longer than 13 minutes.. just wrote down the start time: started stream at 16:52:56, image froze at: 17:37:36
VLC on windows this time btw..
@guino, I hope you don't mind me creating a separate issue, but I think it helps to keep the other topics on-topic and easier to read.
So let's discuss here what's what when it comes to off-cloud/offline usage of the doorbell. Currently it uses the Tuya cloud, which means the following, according to @m11tch research