scottlamb / moonfire-nvr

Moonfire NVR, a security camera network video recorder
Other
1.22k stars 137 forks source link

NVR is recording, but api does not show up-to-date stream info #156

Closed clydebarrow closed 3 years ago

clydebarrow commented 3 years ago

Via the web gui, I'm not seeing recordings after 9:53AM today, but the live view works, and I see files in the sample directory corresponding to current recordings with no obvious gaps.

Checking via the api, the /api/ call returns a maxEndTime90k corresponding to now, but the calls to the endpoint for each stream (e.g. /api/cameras/7f2e2a50-1e68-4647-817b-03089ca2003e/sub/recordings?... returns an endTime90k of 9:53AM.

Any ideas on what the issue is and/or how to troubleshoot?

clydebarrow commented 3 years ago

Stop and restart makes no difference.

As an aside, what was the reason for choosing to represent timestamps in units of 90kHz?

clydebarrow commented 3 years ago

After restart, the recording table now is being populated, but the missing recordings are not in the database:

select datetime(start_time_90k/90000, 'unixepoch', 'localtime')  from recording where stream_id=1 order by start_time_90k desc limit 10;
2021-08-30 13:39:07
2021-08-30 13:37:42
2021-08-30 13:32:13
2021-08-30 13:30:12  <-- After restart
2021-08-30 09:52:33
2021-08-30 09:51:02
2021-08-30 09:50:17
2021-08-30 09:49:14
2021-08-30 09:48:28
2021-08-30 09:47:12
scottlamb commented 3 years ago

I've never seen/noticed a disparity between maxEndTime90k and the actual max end time 90k. I'd probably start diagnosing by looking for a particular recording of interest and looking at its current state in the db (looks like you're on top of that) and in logs.

Missing recordings could happen if some problem prevented Moonfire from flushing the database. Then on restart it will delete the actual files to match the database. Here I'd start with the logs.

Also, keep in mind the possibility the recordings are there but not with the expected timestamp, as mentioned [here]9https://github.com/scottlamb/moonfire-nvr/blob/master/guide/troubleshooting.md#incorrect-timestamps). Doesn't explain the maxEndTime90k thing though.

scottlamb commented 3 years ago

As an aside, what was the reason for choosing to represent timestamps in units of 90kHz?

This is the timescale H.264 over RTP uses. It just seemed simplest to keep everything matching that. Although I may revisit this when adding audio support.

clydebarrow commented 3 years ago

Ok, so it turns out the Pi was power cycled at around 9:53 (someone who shall remain nameless unplugged it.) The system came up with a time over a month in the past, but corrected this within a minute or so via ntp. So this is the same as issue #9 .

I am puzzled as to why the nvr did not pick up the corrected time - if it only reads the system time once on startup, how does it actually keep track of time - does it use an internal timer? And why would it not update its time from the system, at least when starting a new recording file?

Secondly, I see the comments about using an RTC and clearly for a system that has to work offline this would be a must, but for a network connected system it should be possible to make sure the time has been updated before starting NVR. I have added ntp.service to the Want line in the Docker service config. That should help.

scottlamb commented 3 years ago

Yes, sounds like #9, except I'm confused about this part:

Checking via the api, the /api/ call returns a maxEndTime90k corresponding to now,

Is it possible you were mistaken about this? If the new recordings were getting added with an old time, I don't know how the maxEndTime90k would have been set to current.

I am puzzled as to why the nvr did not pick up the corrected time - if it only reads the system time once on startup, how does it actually keep track of time - does it use an internal timer? And why would it not update its time from the system, at least when starting a new recording file?

It reads the system time when it starts an RTSP session / "run". All the recordings from a run are supposed to be seamless, so later ones' wall times are just the first one's wall time + elapsed wall duration. (see the glossary and time handling doc)

The downside is this behavior. If the time switches mid-session, the newer time is probably more accurate. So maybe we should go back and update the times in some way. Eg maybe shift all times in the current "open" by how much wall time has jumped relative to monotonic time. This could be a very large database operation though, because there's no bound on how long Moonfire NVR can be running or how long a run can last.

Secondly, I see the comments about using an RTC and clearly for a system that has to work offline this would be a must, but for a network connected system it should be possible to make sure the time has been updated before starting NVR. I have added ntp.service to the Want line in the Docker service config. That should help.

Does that have a timeout, or will it just not bring up Moonfire NVR when the network is down?

scottlamb commented 3 years ago

I suppose I could force a new run to start if there's been a time jump (details below), splitting the RTSP session into two runs. The older run would still have the wrong time, but the newer one would be correct. Would this be less surprising?

(Detecting a time jump: one option would be to check periodically / on new recording and see if difference in (wall - monotonic) since previous measurements has exceeded some threshold. Another would be to use timerfd + TFD_TIMER_CANCEL_ON_SET on Linux as systemd does and not support this on other platforms. (I keep macOS working because it's convenient for developers, but Linux is a much better deployment target anyway.))

clydebarrow commented 3 years ago

Is it possible you were mistaken about this?

It's possible. Unfortunately the terminal output for the query I ran has scrolled off - largely due to doing an ls in the sample directory - there are over 55000 files in there. Maybe that should have a hierarchical structure.

I was composing the original message in real time so I'm fairly certain that what I wrote reflected what I saw. I could try to reproduce the situation and check.

Does that have a timeout, or will it just not bring up Moonfire NVR when the network is down?

The service (I'm actually using chrony, not ntp, basically the same thing) should always start (something is broken if it does not) but without internet connectivity will not get a new time. If it is connected, the first time update should be almost immediate, so hopefully forcing the order of service startup will be enough in most cases.

I suppose I could force a new run to start if there's been a time jump

That makes sense to me, especially since there shouldn't be much of a delay updating the time. That should be fairly easy, whereas trying to go back and rewrite history would be a much bigger job and fraught with peril.

I just skimmed your time handling doc - clearly you have put a lot of thought into this. I'm also impressed at the system's ability to deliver seamless video streams for arbitrary time periods. The UI is coming along well, BTW.

scottlamb commented 3 years ago

Copied the splitting idea over into #9. I'll close this issue now. Please let me know though if you run across a repeatable case of maxEndTime90k being inconsistent with the max endTime90k.

The service (I'm actually using chrony, not ntp, basically the same thing) should always start (something is broken if it does not) but without internet connectivity will not get a new time. If it is connected, the first time update should be almost immediate, so hopefully forcing the order of service startup will be enough in most cases.

Hard for me to guess if it will win that race or not. If you can find an easy way to say in moonfire-nvr.service that it wants to wait eg x seconds for time-sync.target, I'll suggest that in the instructions. I don't see one though.

clydebarrow commented 2 years ago

FYI, adding the dependency on chrony seemed to work fine when I rebooted yesterday. The initial time was about a month ago, this was corrected by chrony well before moonfire started.