roleoroleo / yi-hack-Allwinner

Custom firmware for Yi 1080p camera based on Allwinner platform
MIT License

Uploading video recording in semi-realtime #41

Closed by Xandrios 4 years ago

Xandrios commented 4 years ago

Separate topic as this is a fairly separate item.

I'd like to be able to send/upload the tmp video file as it's being written to, in order to minimize the 1-minute delay that is introduced if you wait for the tmp file to be completed and renamed to its 'final' filename first.

However, as we have no control over when the tmp file is rolled (about once per minute), I thought it would be great to make a hard link. Uploading data through that hard link should work even if the tmp file is rolled in the meantime. It basically allows attaching a 'final' filename to a recording still in progress.

Typically the SD card uses FAT, and unfortunately FAT doesn't support hard links. I'm guessing the camera does not support the newer Unix filesystems... but maybe it does support something basic.
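A quick way to confirm whether a given mount point accepts hard links is simply to try one. This is a minimal sketch: `can_hardlink` is a hypothetical helper name, and the directory to probe is passed by the caller, since the SD card mount point is not stated in this thread.

```shell
#!/bin/sh
# can_hardlink DIR: return 0 if DIR's filesystem accepts hard links,
# 1 if ln fails (as expected on FAT), 2 if DIR is not writable.
can_hardlink() {
    hl_src="$1/.hl_src.$$"
    hl_dst="$1/.hl_dst.$$"
    touch "$hl_src" 2>/dev/null || return 2
    if ln "$hl_src" "$hl_dst" 2>/dev/null; then hl_rc=0; else hl_rc=1; fi
    rm -f "$hl_src" "$hl_dst"      # leave no probe files behind
    return "$hl_rc"
}

# Probe the directory given as first argument (default: current directory).
if can_hardlink "${1:-.}"; then
    echo "hard links work here"
else
    echo "no hard links here (or directory not writable)"
fi
```

Run against the SD card mount point, this should report a failure if the card is FAT-formatted.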

In /home/base/tools we only have the commands to create FAT file systems. The typical ways of checking support for filesystems do not seem to work well. Do we know which filesystems are supported other than FAT, squashfs and tmpfs, all of which are not usable in this case?

roleoroleo commented 4 years ago

This is the list returned from proc filesystem:

root@yi-hack-allwinner:~# cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   ramfs
nodev   bdev
nodev   proc
nodev   tmpfs
nodev   devtmpfs
nodev   configfs
nodev   debugfs
nodev   sockfs
nodev   pipefs
nodev   devpts
        squashfs
        vfat
nodev   jffs2
nodev   mqueue
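
For use in a script, the same check can be wrapped in a small helper. This is only a sketch: the function name `fs_supported` is made up here, and it assumes `awk` is available on the camera (BusyBox builds usually include it, but that is not confirmed in this thread).

```shell
#!/bin/sh
# fs_supported NAME [FILE]: succeed if NAME is listed in FILE
# (default /proc/filesystems). The last field of each line is the
# filesystem name; a leading "nodev" marks types that need no block device.
fs_supported() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

for fs in vfat ext4; do
    if fs_supported "$fs"; then
        echo "$fs: supported"
    else
        echo "$fs: not supported"
    fi
done
```

With the list above, `vfat` would be reported as supported and `ext4` as not supported.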
Xandrios commented 4 years ago

Oh I should have been able to find that myself, that was too easy - I thought the only way to get the kernel supported ones was to mess about with modprobe. Thanks.

But alas, it seems none of those would be actually suitable for the SD card. Would we be able to add support (for e.g. ext4) without recompiling the kernel? At least I don't think that recompiling the kernel currently is something that we can do?

A workaround may be to use the raw video data stream. Could we somehow get access to a continuous stream of the mp4 video and write that to disk as long as we see motion? Maybe by using RTSP? (Although I don't know if RTSP data can just be saved to mp4 and played back that way?)

That way we would create a 'secondary' video recording file, but... that allows us to start uploading straight away from the first second motion is detected. All of this will help, of course, when using the cam for security purposes... where the biggest risk is the camera being unplugged (and/or the SD card removed).

roleoroleo commented 4 years ago

> But alas, it seems none of those would be actually suitable for the SD card. Would we be able to add support (for e.g. ext4) without recompiling the kernel? At least I don't think that recompiling the kernel currently is something that we can do?

This is correct. I already tried to recompile kernel to add a module and I only got a lot of kernel panics. I could try ext4 but I don't think I will succeed.

> A workaround may be to use the raw video data stream. Could we somehow get access to a continuous stream of the mp4 video and write that to disk as long as we see motion? Maybe by using RTSP? (Although I don't know if RTSP data can just be saved to mp4 and played back that way?)

> That way we would create a 'secondary' video recording file, but... that allows us to start uploading straight away from the first second motion is detected. All of this will help, of course, when using the cam for security purposes... where the biggest risk is the camera being unplugged (and/or the SD card removed).

We can extract an h264 raw stream (it's the first function performed by rRTSPServer reading it from the buffer /dev/shm/fshare_frame_buf) and copy it somewhere. But at this point why not use rtsp?

Xandrios commented 4 years ago

Yes, I think that re-building the kernel might be a whole other project on its own indeed.

Fully agree, RTSP may be a good way to get a copy; at least that doesn't require any code changes. The only trick will be to convert the RTSP stream into mp4. ffmpeg can do that, and although I did see it being used when building one of the modules, I don't think we have the ffmpeg binary available/accessible. Maybe we could build RTSPClient, as that seems to be a popular client for this kind of thing.

roleoroleo commented 4 years ago

I have ffmpeg compiled, but we don't have the hardware resources (RAM and CPU) to do this conversion inside the cam. It would probably be better to use an external DVR (software or hardware) like motion or ZoneMinder.

Xandrios commented 4 years ago

There should not be a need to convert anything though... rtsp also uses the mp4 codec, right? Theoretically it should be just about removing the RTSP initialisation messages (SETUP/PLAY messages, SDP parsing, etc.)... but the video data stream itself should just be stored byte-for-byte, if I'm understanding it right. Something like this:

ffmpeg -i rtsp://10.2.2.19/live/ch01_0 -c copy -map 0 -f segment -segment_time 300 -segment_format mp4 "capture-%03d.mp4"

Maybe I'll play around with ffmpeg and/or the Live555 tool if I find some time. I'm looking at how the cross compilation works..theoretically this should work, right?

export CROSSPATH=/opt/yi/toolchain-sunxi-musl/toolchain/bin
export PATH=${PATH}:${CROSSPATH}

export TARGET=arm-openwrt-linux
export CROSS=arm-openwrt-linux
export BUILD=x86_64-pc-linux-gnu

export CROSSPREFIX=${CROSS}-

export STRIP=${CROSSPREFIX}strip
export CXX=${CROSSPREFIX}g++
export CC=${CROSSPREFIX}gcc
export LD=${CROSSPREFIX}ld
export AS=${CROSSPREFIX}as
export AR=${CROSSPREFIX}ar

./configure --enable-cross-compile --cross-prefix=${CROSSPREFIX} --arch=armel --target-os=linux --prefix=${CROSSPATH} [... other compile options ...]
Xandrios commented 4 years ago

Yes, I got that to work. ffmpeg had heaps of problems properly writing the video data (even when I re-compiled ffmpeg with all options enabled). Live555 works:

./bin/openRTSP -4 -b 200000 "rtsp://192.168.1.9/ch0_0.h264" > video.mp4

That just fills the video file realtime with data. I think I should be able to make this work with a 'realtime' file upload.

Xandrios commented 4 years ago

Playing around a bit more with this. An issue may be that the video starts with a 'green' picture. When splitting into smaller files each file has a second of green..that is kind of annoying. When using 2-second files it basically means that all video data is green.

However when not splitting into smaller files the video is not playable because the file is incomplete, I'm assuming that mp4/h264 needs some kind of header/footer data for playback? I'm not sure if it would be possible to 'repair' a half-uploaded video file if that header/footer data is not there.

Maybe because of missing keyframes?

roleoroleo commented 4 years ago

I don't know... Do you need both audio and video or only video?

Xandrios commented 4 years ago

Preferably both; but mainly video.

One thing that I may be able to do is just record to one large file. The first few seconds may be green, but the rest of the video should not. If the power would be interrupted while recording (and uploading) I may be able to repair that incomplete video file. VLC does not play it (at all), but perhaps there are ways..

roleoroleo commented 4 years ago

Another way... Export the raw h264 from the buffer with h264grabber and convert it with this library: https://github.com/lieff/minimp4 At the moment without audio. Do you want to try? (Attachment: Tools.zip)

Xandrios commented 4 years ago

That actually works incredibly well, thanks! That works much better than using RTSP! Could it be that the video quality is also better or is that just in my mind?

The only thing I noticed is that on one of my test recordings, somewhere in the middle, it shows these artifacts for just over a second. But I'd say that is pretty minor...it also only showed up with one of my tests so far.

image

roleoroleo commented 4 years ago

> That actually works incredibly well, thanks! That works much better than using RTSP! Could it be that the video quality is also better or is that just in my mind?

I think it's the same quality.

> The video recording starts in about ~1 second after issuing the h264grabber command, not much loss there

That is the time needed to receive the first I-frame (there is one I-frame every 2 seconds).

> The only thing I noticed is that on one of my test recordings, somewhere in the middle, it shows these artifacts for just over a second. But I'd say that is pretty minor...it also only showed up with one of my tests so far.

It's an h264grabber issue, I will try to study it.

Xandrios commented 4 years ago

Thanks! Would you need samples for the h264grabber issue, and/or is there anything that I can do to help with this?

roleoroleo commented 4 years ago

No, thanks.

pbanj commented 4 years ago

To kind of piggy back off of this. Would we be able to add recording to say a network drive?

Xandrios commented 4 years ago

That probably depends on what protocols the network drive supports. And keep in mind that you'll need about 1 Mbit/s of bandwidth available towards your network drive (especially important if your network drive is, for example, a cloud-based drive that depends on your internet connection).

If your drive supports ONVIF/RTSP (Like the Synology/Qnap ones) you're probably best off using the RTSP streams. That allows realtime recording of the video stream.

If it supports FTP/SFTP/RSYNC you can use the file-based approach where you sync the recorded files from the SD card towards your network drive. The files that are generated by the Yi software are easiest but those are delayed by about a minute (which in some cases may be an issue, like when you use the camera for security purposes).

This issue/topic is about improving on that mechanism, allowing to start moving files to a remote drive almost realtime (just a few seconds delay). Issue #30 is about making that sync-process motion-driven. Still work-in-progress though.

pbanj commented 4 years ago

For me, having it uploaded in real time isn't an issue, as I need it more for going back and looking at it later. I use a WD My Cloud and access it over SMB. It also has FTP, but I don't think I can set it up to pull files from anything; it is more for accessing the drive itself over FTP.

Xandrios commented 4 years ago

I noticed something interesting on the 'official' recording feature: The tmp file that is being recorded to can be renamed to another file - and will continue to grow/be written to by the Yi software - even after renaming. Apparently the fact that the file descriptor/pointer was already opened by Yi is enough to keep the writing working, even though the file is moved. This also means that it would be possible to start uploading this file after renaming it - as the name won't change any more at that point.
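
The same effect can be reproduced with plain shell tools, which may help when experimenting off-camera. This is just an illustrative sketch: the file names are placeholders, `rename_while_open` is a made-up helper, and file descriptor 3 stands in for the descriptor held open by the Yi recorder.

```shell
#!/bin/sh
# rename_while_open DIR: write to a file through an open descriptor,
# rename the file in between, and print what ends up under the new name.
rename_while_open() {
    exec 3> "$1/tmp.mp4"               # stand-in for the recorder opening its tmp file
    printf 'before-rename ' >&3
    mv "$1/tmp.mp4" "$1/final.mp4"     # rename while descriptor 3 is still open
    printf 'after-rename' >&3          # later writes still reach the renamed file
    exec 3>&-                          # close the descriptor
    cat "$1/final.mp4"
}

demo_dir=$(mktemp -d)
rename_while_open "$demo_dir"; echo
rm -rf "$demo_dir"
```

On a typical Linux filesystem both writes end up in `final.mp4`, because the descriptor refers to the file itself rather than to its name.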

When working off motion triggers it still may have up to 59 seconds of "non-interesting" video in that file though. Unless we have Yi only record when it sees motion... but then we would have to find another way to record the continuous 24/7 files to SD (for the archive).

The problem with this method is that if the upload is cut off (i.e. the camera is shut down), this results in an incomplete file on the remote host. I've tried various recovery tools but was unable to repair one of these mp4 files that was not fully completed... so this does not seem like a feasible method. Unless somebody else knows of a way to get these mp4 files to work if they were cut while recording? That would be kind-of perfect if that were possible.

Using h264grabber this issue does not exist and even cut files can be easily played still. But we are seeing some artifacts with h264grabber every now and then. In general, the quality of the video that is being produced isn't great ...most likely due to compression. h264grabber obviously grabs the already compressed video so at that stage there is not much that can be done. Do we have a way to change some of the video quality settings within Yi software?

In general, having a not-so-strongly-compressed 720p stream would probably be better than the heavily compressed 1080p stream. The question is whether we can get the camera to produce such a stream... most likely not. How do we get the low-quality RTSP stream? Would that be less compressed since the resolution is lower?

roleoroleo commented 4 years ago

> How do we get hold of the low-resolution stream for RTSP?

You can choose the stream by passing an argument to h264grabber. But the sub stream is 640x360, which I think is not enough.

About compression parameters, unfortunately we cannot change anything.

Xandrios commented 4 years ago

640x360 is, indeed, a bit too low. 720p would be ideal but it seems that we don't have access to this format then. At least not without reverse-engineering or decompiling the Yi software.

So I installed one of these cameras as a security camera today. It's still running the default Yi software. It looks out through an alleyway to a street in front, like this:

image

When somebody passes on the street in front we get a movement notification, and the standard 6-seconds movement clip. What I noticed though is that this 6 second clip often starts just before the person passes. So before the motion was detected..

The only way I can explain this is that the Yi software seems to have some kind of buffer of most-recent video, which is used as a starting point when movement is detected. It seems to do this in memory (the temp file only exists when actually recording). But it is kind of a neat feature, to be honest.

Theoretically I can build something like this using the h264grabber tool as well, but it would be tricky. To do that we may have to start a new recording every 5 seconds or so, and only in case of movement keep recording (and upload the file). In case of no movement, discard the 5-second clip and start a new 5-second clip. It's possible... but tricky to implement.

It would be nice if we can use the native videos that are recorded by Yi when motion is observed. They also have audio which is a plus. But the main problem is, when we are syncing the video files (in realtime) and the camera is cut off (Which is a realistic scenario for a security camera), the mp4 file that has so far been uploaded cannot be played because of some mp4 metadata missing at the end of the file.

I think it would be useful if we could find a way how to repair mp4 files that have been cut short. For this specific scenario, but also other cases where people may end up with incomplete files. If you don't mind I'll open another topic/issue for that.

roleoroleo commented 4 years ago

Ok, no problem.

Xandrios commented 4 years ago

So I implemented this last night using h264grabber (from here), just to see how this would work out. It basically records 4-second videos, two in parallel with a 2-second offset. Like this:

image

When no motion is detected the 4-second clips are deleted. However, when motion is detected, there is always a recording already underway for ~2 seconds...meaning that this recording already has had keyframe(s) and contains a little bit of history. Like so:

image
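
The timing of that scheme can be checked with a bit of arithmetic. The sketch below assumes whole seconds, recorder A restarting at t = 0, 4, 8, ... and recorder B at t = 2, 6, 10, ..., and ignores the time needed to reach the first keyframe; `history_at` is a hypothetical helper name, and T is expected to be at least `OFFSET`.

```shell
#!/bin/sh
CLIP_LEN=4    # seconds per clip, as described above
OFFSET=2      # recorder B starts this many seconds after recorder A

# history_at T: seconds of video already captured by the longest-running
# clip when motion is detected at second T (T >= OFFSET).
history_at() {
    ha_a=$(( $1 % CLIP_LEN ))                # age of recorder A's current clip
    ha_b=$(( ($1 - OFFSET) % CLIP_LEN ))     # age of recorder B's current clip
    if [ "$ha_a" -gt "$ha_b" ]; then echo "$ha_a"; else echo "$ha_b"; fi
}

for t in 4 5 6 7; do
    echo "motion at t=$t s -> $(history_at "$t") s of history"
done
```

Under these assumptions the older of the two clips always holds between 2 and 4 seconds of video at the moment motion is detected.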

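But I noticed something odd when using h264grabber in this way, basically having it record the stream twice. This is something you can easily test.

* Start h264grabber, write to file. You will see the file size increase slowly..100K, 200K, 300K, etc.

* After 10 seconds start (in another window/thread) a second instance of h264grabber, writing to another file. It starts 100K, 400K, 800K, 1.2M...until it is in sync with the first instance.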

In many cases I end up with two files that are almost identical in size/video length. Even though the two h264grabber processes were started with 5-10 seconds time in between. I would have expected at least 0.5 to 1MB of file size difference between them.

In these cases, when playing back those videos, they actually do start at the same point. Basically one of them 'went back in time', catching up to the other one. It's very strange. I don't believe that we have the h264grabber source code public in this project? I'm wondering if there is some kind of shared buffer that doesn't work well when the h264grabber executable runs multiple times in parallel.

Another thing I noticed with the h264grabber video result is that the time is not consistent. When you record a clock for example, you will see that some seconds on the clock take looong, while others are almost totally skipped. I have the feeling that we are not capturing all video data correctly yet with the h264grabber. Do we see the same behavior with RTSP streams? Probably as it uses the same mechanism right?

roleoroleo commented 4 years ago

> But I noticed something odd when using h264grabber in this way, basically having it record the stream twice. This is something you can easily test.

> * Start h264grabber, write to file. You will see the file size increase slowly..100K, 200K, 300K, etc.

> * After 10 seconds start (in another window/thread) a second instance of h264grabber, writing to another file. It starts 100K, 400K, 800K, 1.2M...until it is in sync with the first instance.

> In many cases I end up with two files that are almost identical in size/video length. Even though the two h264grabber processes were started with 5-10 seconds time in between. I would have expected at least 0.5 to 1MB of file size difference between them.

This is the normal behavior, because the read index is initialized to the buffer start regardless of the write index. The buffer is about 1.7 MB, so it holds many seconds of video. If you want the recording to start from "now", we have to initialize buf_idx_1=buf_idx_w.
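
To put a rough number on "many seconds", here is a back-of-the-envelope sketch. It takes the buffer file size (1786088 bytes) and the 300-byte header from figures given elsewhere in this thread, and assumes a stream bitrate of about 1 Mbit/s. That bitrate is an assumption for illustration, not a measured value, and `history_seconds` is a made-up helper name.

```shell
#!/bin/sh
BUF_SIZE=1786088                       # size of fshare_frame_buf in bytes
HEADER_SIZE=300                        # header length in bytes
BITRATE_BPS=${BITRATE_BPS:-1000000}    # ASSUMPTION: roughly 1 Mbit/s main stream

# history_seconds: whole seconds of video that fit in the payload area.
history_seconds() {
    echo $(( (BUF_SIZE - HEADER_SIZE) * 8 / BITRATE_BPS ))
}

echo "approx. $(history_seconds) s of history at $BITRATE_BPS bit/s"
```

At the assumed bitrate this works out to roughly 14 seconds, which fits the observation that a second h264grabber instance quickly "catches up" with the first.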

> In these cases, when playing back those videos, they actually do start at the same point. Basically one of them 'went back in time', catching up to the other one. It's very strange. I don't believe that we have the h264grabber source code public in this project?

No problem, h264grabber is my own source code: https://github.com/roleoroleo/yi-hack-Allwinner/tree/0.1.3/src/h264grabber

> Another thing I noticed with the h264grabber video result is that the time is not consistent. When you record a clock for example, you will see that some seconds on the clock take looong, while others are almost totally skipped. I have the feeling that we are not capturing all video data correctly yet with the h264grabber. Do we see the same behavior with RTSP streams? Probably as it uses the same mechanism right?

h264grabber is not perfect, I know. It is the result of complicated reverse engineering. We can probably do better (with a lot of time available).

Xandrios commented 4 years ago

Ah, it was removed from the latest source, that's probably why I could not find it. Would it be worth keeping it in the source tree so that it's being built, but maybe not package it (by default) with a release? This way it won't take space on the device for people who will not use it.

No problem for it not being perfect - that's why we are here...trying to improve and build a capable product out of a trashy 18 Euro Chinese camera. The work already done on this is impressive.

The buffer is maintained by Yi software right? I'm not very well versed in C, from what I understand you find two buffer indices (buf_idx_1, buf_idx_2). What do these two positions represent?

If the buffer is a FiFo-like system and you will always get ~1.7MB of video data because that is already in the buffer, then I don't need to do the trickery as described in my previous post. Because that would mean that, every time you start a recording, you will get 1.7MB of history anyway.

But that would only work if the buffer is always filled (continuously). Is it circular, does it go from 0 to 100% and back to 0.. how does the buffer work?

One other thing that I was wondering... while mmap() is being called, munmap() is not upon SIGINT or TERM/KILL. Should we catch signals and perform munmap() to be sure that we release these resources? To prevent running out of memory or file-descriptors when we start/stop h264grabber many times?

Xandrios commented 4 years ago

> The only thing I noticed is that on one of my test recordings, somewhere in the middle, it shows these artifacts for just over a second. But I'd say that is pretty minor...it also only showed up with one of my tests so far.

> It's an h264grabber issue, I will try to study it.

This issue is a bit..weird. Sometimes it does not happen for a long time...and then it happens almost continuously. As if something gets out of sync.

What I noticed is that when this happens for a longer amount of time, that the regular frames are distorted...but the keyframes are not. So it has a proper keyframe, then 2 seconds of broken video, a good keyframe again, 2 seconds of broken video, and so on. This seems to point to something related to the code for regular-frame handling perhaps..

roleoroleo commented 4 years ago

> The buffer is maintained by Yi software right? I'm not very well versed in C, from what I understand you find two buffer indices (buf_idx_1, buf_idx_2). What do these two positions represent?

The buffer is maintained by rmm (the core yi process) and it's a circular buffer. There is a complicated system of locks and semaphores that I don't completely understand.

-rw-------    1 root     0          1786088 Jun  2 13:32 fshare_frame_buf
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_lock
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_0
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_1
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_10
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_11
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_12
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_13
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_14
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_15
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_16
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_2
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_3
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_4
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_5
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_6
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_7
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_8
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_read_notify_9
-rw-------    1 root     0               16 Jun  2 13:32 sem.fshare_write_lock

The buffer is simply an mmapped file that rmm writes to and other processes read from. If you read the first 32 bytes of the buffer /dev/shm/fshare_frame_buf you can see two 4-byte values changing, ranging from the start of the buffer (offset 300) to the end. I don't know exactly what these indices contain, but they are close to the write index of the buffer. So I wrote a procedure that reads the buffer down to the "write index" looking for a NAL_START sequence (00000001). Then it repeats the search, looking for the next NAL_START. This gives me a complete frame that I can analyze and write to stdout. This step is repeated indefinitely, moving forward through the buffer.
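
To watch those header values change without writing any C, something along these lines might be enough. It is a sketch only: `dump_header` is a made-up name, the buffer path is the /dev/shm location mentioned earlier in the thread, and it assumes an `od` that accepts the POSIX options `-A`, `-t` and `-N` (a BusyBox build may not support all of them).

```shell
#!/bin/sh
# dump_header FILE: print the first 32 bytes of FILE as hex, one byte per
# column, with decimal offsets in the first column.
dump_header() {
    od -A d -t x1 -N 32 "$1"
}

# The buffer only exists on the camera; elsewhere just report that.
BUF=${BUF:-/dev/shm/fshare_frame_buf}
if [ -r "$BUF" ]; then
    dump_header "$BUF"
    sleep 1
    dump_header "$BUF"    # compare with the first dump to spot the changing values
else
    echo "$BUF not readable on this machine"
fi
```

Comparing the two dumps should show which byte positions hold the moving indices.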

> If the buffer is a FiFo-like system and you will always get ~1.7MB of video data because that is already in the buffer, then I don't need to do the trickery as described in my previous post. Because that would mean that, every time you start a recording, you will get 1.7MB of history anyway.

Yes, the buffer contains 1.7 MB of history.

> But that would only work if the buffer is always filled (continuously). Is it circular, does it go from 0 to 100% and back to 0.. how does the buffer work?

It's circular. There is a 300-byte header.
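
As an illustration of what "circular with a 300-byte header" implies for a read index, here is a small arithmetic sketch. It assumes the payload area runs from offset 300 to the end of the 1786088-byte file and that data wraps byte-wise; the real layout used by rmm is not fully known, and `advance_index` is a hypothetical name, not a function from h264grabber.

```shell
#!/bin/sh
BUF_SIZE=1786088    # size of fshare_frame_buf from the listing above
HEADER_SIZE=300     # header length in bytes

# advance_index IDX N: move a read index N bytes forward, wrapping from the
# end of the file back to the first byte after the header.
advance_index() {
    echo $(( HEADER_SIZE + ( $1 - HEADER_SIZE + $2 ) % ( BUF_SIZE - HEADER_SIZE ) ))
}

advance_index 300 10        # well inside the buffer: prints 310
advance_index 1786080 10    # crosses the end and wraps: prints 302
```

A reader that follows the writer this way never leaves the payload area and never touches the header.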

> One other thing that I was wondering... while mmap() is being called, munmap() is not upon SIGINT or TERM/KILL. Should we catch signals and perform munmap() to be sure that we release these resources? To prevent running out of memory or file-descriptors when we start/stop h264grabber many times?

The file is already mmapped by yi processes. If I am not wrong, the new mmap should not take up new memory.

Xandrios commented 4 years ago

Thanks for the details.

Interestingly I noticed that the behavior currently is that:

Look at this example file. The moment that the movement starts, the distortion starts as well. It does resolve itself after a while though. Additionally I get the feeling, though not proven yet, that it happens only once after a camera restart (though only with large amount of motion).

Maybe, if we cannot determine what is causing it, we can reduce the amount of time/frames the problem is showing? Assuming it is h264grabber that now detects the issue and resets, causing the distortion to stop.

What kind of tools do you use to visualize the buffer contents, to see changes happening and so on?

roleoroleo commented 4 years ago

> Maybe, if we cannot determine what is causing it, we can reduce the amount of time/frames the problem is showing? Assuming it is h264grabber that now detects the issue and resets, causing the distortion to stop.

I don't know if I am able to detect the problem from h264grabber. I need to check it. Lost frames? Corrupted frames? I would like to know whether the problem is in the grabbing or in the sequence grabber -> rtsp server. Could you make the following test?

> What kind of tools do you use to visualize the buffer contents, to see changes happening and so on?

I wrote and compiled small C utils.

Xandrios commented 4 years ago

Apologies for the delayed reply. The above video sample was recorded with h264grabber and then converted into mp4 with the minimp4 tool. I have not used RTSP in any way.

Original h264 file made by piping h264grabber output to file: http://upload.xandrios.net/20200605T200852.raw

Converted into mp4: http://upload.xandrios.net/20200605T200852.raw.mp4

VLC shows the following statistics. Interestingly enough, it does show lost frames... but that happens from the beginning, even when there is no distortion. It is losing frames even when the video looks fine. When the distortion happens the ratio of good/lost frames does not seem to change.

image

So perhaps we are seeing two separate problems here? Lost frames along the whole video, and distortion when a lot of motion is visible.

Since the distortion is definitely motion-related, and since keyframes seem to be fine, I am wondering if the format of regular frames in the buffer changes when there is a lot of motion. More motion means larger changes in each frame, and thus a larger size. Could it be that we are seeing a situation where the frame size grows larger than what we are expecting - causing the grabbing to get out of sync?

roleoroleo commented 4 years ago

> So perhaps we are seeing two separate problems here? Lost frames along the whole video, and distortion when a lot of motion is visible.

I think the first problem is not a real problem. VLC probably interprets the movie as 25 fps while it is 20 fps. If you look at the counter, it increases by 5 at a time.

> Since the distortion is definitely motion-related, and since keyframes seem to be fine, I am wondering if the format of regular frames in the buffer changes when there is a lot of motion. More motion means larger changes in each frame, and thus a larger size. Could it be that we are seeing a situation where the frame size grows larger than what we are expecting - causing the grabbing to get out of sync?

I couldn't find any problems when there is a video distortion. If you run the grabber with the debug option you can see the frame size and the frame counter (extracted directly from the frame buffer). There is a specific log entry when the counter skips a value. If there were problems (e.g. lost sync) we would have lost frames.