marcus-j-davies / nvr-js

A simple, lightweight, but very functional NVR aimed at 24/7 recording using nodejs.
MIT License

[Enhancement]: Review Storage mechanics PLUS! #10

Open KeithHanson opened 2 years ago

KeithHanson commented 2 years ago

Hello!

Firstly, I would like to commend you on the work you've done here. It's precisely what I need for a very important project for my city as well.

I am creating a fork of the repository now to begin diving into the codebase and attempt to determine what may be happening on our end, so please know I plan on rolling up my sleeves and helping contribute to a solution, NOT just log the bug :P

Anyhow, please see below:

ISSUE: Many hours after starting the PM2 service for NVRJS, the timeline stops showing video segments. Context: We are looking to use NVRJS for the camera systems we've built using Raspberry Pis, PoE cams + switch, and USB hard drives.

We are so very close thanks to your work here. But after testing for roughly a week, we see timeline issues.

We DO see that it is properly restarting the ffmpeg processes and the files are logging to disk.

So everything seems to be working (ffmpeg, UI), except that for some reason the segments stored to disk do not show up in the timeline.

Also, NVRJS runs rock solid (haven't seen it rebooting over and over or anything for days at a time).

Once I DO end up restarting NVRJS, the timeline begins working normally, though it is missing files that are definitely on disk.

I'll log here if I make progress on it!

Thank you for any insight you can help with though :)

marcus-j-davies commented 2 years ago

@KeithHanson

The patch is ready to test.

To Disable Login: add this to your config file.

module.exports = {
    /* System Settings */
    system: {
        /* Disable Security - Know what you're doing before changing this! */
        disableUISecurity: true,
        .....
    }
}

Changes

{timestamp}_placeholder.json: A file used to store a temporary metadata payload (created when the FFMPEG process starts). It also allows events to keep being created before the main metafile is written.

{timestamp}.json: The main metafile, created at the end of each segment (as advised by FFMPEG). Any events created in {timestamp}_placeholder.json are moved into this file, and finally a new {timestamp}_placeholder.json is created.

Metafile creation is now tied to FFMPEG activity and no longer relies on 'watching' for new files, which I still believe has problems recovering after IO errors.
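
The hand-off between the placeholder and the main metafile described above might look roughly like the sketch below. This is not the code from the patch: the function names (`startSegment`, `addEvent`, `finishSegment`), the `events` field, and the decision to delete the old placeholder are all assumptions made for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: path of the placeholder for a segment still being recorded.
function placeholderPath(dir, timestamp) {
  return path.join(dir, `${timestamp}_placeholder.json`);
}

// Assumed to run when FFMPEG starts a segment: create a placeholder that can collect events.
function startSegment(dir, timestamp) {
  fs.writeFileSync(placeholderPath(dir, timestamp), JSON.stringify({ events: [] }));
}

// Assumed to run while a segment is recording: append an event to the placeholder.
function addEvent(dir, timestamp, event) {
  const file = placeholderPath(dir, timestamp);
  const meta = JSON.parse(fs.readFileSync(file, 'utf8'));
  meta.events.push(event);
  fs.writeFileSync(file, JSON.stringify(meta));
}

// Assumed to run when FFMPEG reports the end of a segment: move the collected
// events into {timestamp}.json, drop the old placeholder, and create a new
// placeholder for the next segment.
function finishSegment(dir, timestamp, nextTimestamp, segmentInfo) {
  const file = placeholderPath(dir, timestamp);
  const pending = JSON.parse(fs.readFileSync(file, 'utf8'));
  const meta = { ...segmentInfo, events: pending.events };
  fs.writeFileSync(path.join(dir, `${timestamp}.json`), JSON.stringify(meta));
  fs.unlinkSync(file);
  startSegment(dir, nextTimestamp);
  return meta;
}
```

Because every step is driven by FFMPEG activity rather than by a file watcher, nothing in this flow depends on the watcher surviving an IO error.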

KeithHanson commented 2 years ago

THANK YOOOOOU!!!! Will deploy tonight!

KeithHanson commented 2 years ago

Got excited. Deployed to one of our systems for testing and so far things are working :P

I've got about 30+ systems I'll deploy to tonight to verify.

KeithHanson commented 2 years ago

Ok - I've got this running on 10 of our live deployments as of about 20 minutes ago :)

Going to let this run for 24 hours or so and report back :)

KeithHanson commented 2 years ago

Oooh, as a bonus - I moved the files from one folder to the renamed camera folders and bam - everything was picked up. I did have to restart the service for it to show in the timeline, but suuuuper convenient!

KeithHanson commented 2 years ago

Ok, I have deployed to all systems we have under control. ~30.

Something else I thought of that is a win because of this is the way we want to back up our data.

I'm planning on having a second NVRJS running on 81, but it will pull the low quality streams to disk. We're designing an algorithm that will pick the most appropriate "buddy" to back up to and from and kick off some kind of copy process.

Because of these changes, we can easily obtain at least low quality footage by simply using the interface and updating the config to display the backed up folder <3

This lets us focus on the hard/critical part for us (choosing the right pole to back up to with a variety of factors) while getting a UI for free! :) :) :)

marcus-j-davies commented 2 years ago

You are excited! 😆

There are APIs built in to fetch segments (metadata), current system utilisation and camera details, if you wanted to create a dashboard to monitor all instances (not footage, more health). I have not spent a great deal of time documenting them, but they do exist.

Also, if I am reading your comments correctly: NVRJS has not been tested to run 80+ cameras (i.e. pole 81). Just one to bear in mind.

Let me know if the recent patch fixes the missing meta files (I'm hoping it has).

KeithHanson commented 2 years ago

30 total systems - We'll get to 80 before end of year :)

Excellent - we will definitely tap into that - we have a heartbeat service that checks all kinds of things relevant to us (disks, encryption status, temps, software versions, etc).

KeithHanson commented 2 years ago

We did hit an issue on two systems.

If the metadata file is corrupted for any reason, parsing it fails and NVRJS goes into a boot loop.

2|NVRJS  | SyntaxError: Unexpected end of JSON input
2|NVRJS  |     at JSON.parse (<anonymous>)
2|NVRJS  |     at /home/pi/nvr-js/NVRJS.js:494:19
2|NVRJS  |     at Array.forEach (<anonymous>)
2|NVRJS  |     at InitCamera (/home/pi/nvr-js/NVRJS.js:492:15)
2|NVRJS  |     at /home/pi/nvr-js/NVRJS.js:393:2
2|NVRJS  |     at Array.forEach (<anonymous>)
2|NVRJS  |     at Object.<anonymous> (/home/pi/nvr-js/NVRJS.js:391:9)
2|NVRJS  |     at Module._compile (internal/modules/cjs/loader.js:1068:30)
2|NVRJS  |     at Object.Module._extensions..js (internal/modules/cjs/loader.js:1097:10)
2|NVRJS  |     at Module.load (internal/modules/cjs/loader.js:933:32)
PM2      | App [NVRJS:2] exited with code [1] via signal [SIGINT]
PM2      | App [NVRJS:2] starting in -fork mode-
PM2      | App [NVRJS:2] online

I am fairly sure we just need a try-catch here: https://github.com/marcus-j-davies/nvr-js/blob/v3.0.0/NVRJS.js#L494

I patched in a try/catch and things seemed to work:

$ node NVRJS.js 
 - Checking config.
 - Config loaded: /home/pi/nvrjs.config.js
 - Checking volumes and ffmpeg.
 - Creating express application.
 - Compiling pages.
 - Configuring camera: Camera 1
Unable to parse: 1660650121.json
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at /home/pi/nvr-js/NVRJS.js:495:20
    at Array.forEach (<anonymous>)
    at InitCamera (/home/pi/nvr-js/NVRJS.js:492:15)
    at /home/pi/nvr-js/NVRJS.js:393:2
    at Array.forEach (<anonymous>)
    at Object.<anonymous> (/home/pi/nvr-js/NVRJS.js:391:9)
    at Module._compile (internal/modules/cjs/loader.js:1068:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1097:10)
    at Module.load (internal/modules/cjs/loader.js:933:32)
 - Configuring camera: Camera 2
Unable to parse: 1660650120.json
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at /home/pi/nvr-js/NVRJS.js:495:20
    at Array.forEach (<anonymous>)
    at InitCamera (/home/pi/nvr-js/NVRJS.js:492:15)
    at /home/pi/nvr-js/NVRJS.js:393:2
    at Array.forEach (<anonymous>)
    at Object.<anonymous> (/home/pi/nvr-js/NVRJS.js:391:9)
    at Module._compile (internal/modules/cjs/loader.js:1068:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1097:10)
    at Module.load (internal/modules/cjs/loader.js:933:32)
 - Configuring camera: Camera 3
Unable to parse: 1660650120.json
SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at /home/pi/nvr-js/NVRJS.js:495:20
    at Array.forEach (<anonymous>)
    at InitCamera (/home/pi/nvr-js/NVRJS.js:492:15)
    at /home/pi/nvr-js/NVRJS.js:393:2
    at Array.forEach (<anonymous>)
    at Object.<anonymous> (/home/pi/nvr-js/NVRJS.js:391:9)
    at Module._compile (internal/modules/cjs/loader.js:1068:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1097:10)
    at Module.load (internal/modules/cjs/loader.js:933:32)
 - Strting purge timer.
 - NVR JS is Ready!
 - Purging data.

marcus-j-davies commented 2 years ago

Yeah - a try catch should be added in a couple of places really. I am more interested in the root problem.

What did 1660650120.json & 1660650121.json look like?

Just want to make sure I am doing nothing silly, and that it's more a case of disk IO errors when writing files.

KeithHanson commented 2 years ago

Almost undoubtedly disk error - we have to deal with it on our crappy power grid and recover automatically (sad, I know - drink one for me in commiseration :P).

Both files were empty.

KeithHanson commented 2 years ago

If you want I can add in this try catch for now and submit a pull request. I've patched the two systems that failed manually for now.

If you're already off to the races on the patch I can hang back ofc :)

marcus-j-davies commented 2 years ago

Cool - well not, but you know 😅

I'll add a try catch to any reading of a file (and writing for good measure). I can't do anything special here, other than backing out of the current operation, in favour of not bringing down the instance in its entirety.

EDIT:

"If you're already off to the races on the patch I can hang back ofc :)"

Already on it.
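
A minimal sketch of the kind of guard discussed above, assuming metadata files are read synchronously and parsed with `JSON.parse`. The helper names `readMetaFileSafe` and `loadSegments` are hypothetical and are not taken from NVRJS.js. An empty file (as seen on the two failing systems) makes `JSON.parse` throw, so the helper backs out of that one file instead of letting the exception take down the process.

```javascript
const fs = require('fs');

// Hypothetical helper: returns the parsed metadata, or null when the file is
// missing, unreadable, empty, or otherwise not valid JSON.
function readMetaFileSafe(filePath) {
  try {
    const raw = fs.readFileSync(filePath, 'utf8');
    return JSON.parse(raw);
  } catch (err) {
    // Back out of this one file only; the caller carries on with the rest.
    console.log(`Unable to parse: ${filePath}`);
    console.log(err.message);
    return null;
  }
}

// Hypothetical caller: load every segment that can be read and skip the others.
function loadSegments(filePaths) {
  return filePaths
    .map((filePath) => readMetaFileSafe(filePath))
    .filter((meta) => meta !== null);
}
```

A skipped file shows up as a gap in the timeline rather than as a boot loop, which matches the trade-off described in the comment above.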

KeithHanson commented 2 years ago

Awesome :) And that is fine - I'm not sure if there is any magical code that could solve that lol. But that's also a reason I reduce things to 3 minute chunks (faster streaming, faster downloading, less risk of missing important things because of gremlins/dragons like this).

KeithHanson commented 2 years ago

Just did a count of all the systems. Deployed the most recent commit to 41 Raspberry Pis with 2TB and 4TB drives attached :)

I tested it on the previously failing system, and everything went smoothly.

Thank you! :) I'll report in if I find anything.

KeithHanson commented 2 years ago

Rock solid :D

marcus-j-davies commented 2 years ago

Nice!

This is with the recent patch to stop reading corrupted JSON files?

The changes were made here:

https://github.com/marcus-j-davies/nvr-js/commit/5639de6b9d71b975a18ffc7a9c73df6731618316

KeithHanson commented 2 years ago

Correct! I deployed it after I saw the update.

I see some gaps here and there, but that's from whatever failures happened.

But the only problem I've had is HDD space filling up at this point :D Just means I need to tune the retention.

I've spot checked about 10 of the 41 systems and those that didn't fill up their drive have a full 2 days of history!

Brilliant! :)

EDIT: they all have little gaps here and there (3-6 minutes), but it's obvious our code is recovering from the issue, and your code is handling the recovery gracefully :)

marcus-j-davies commented 2 years ago

Assuming all is well? I am aiming to publish 3.0 soon.

trc-turing commented 2 years ago

I think this is a great project, and it has given me a good idea of how to build an NVR system. Thank you very much. I tested both v2.0 and v3.0. Each .mp4 file records one minute of video, and the recording part works well. After recording for 16 hours, if I scroll the mouse wheel many times to zoom the timeline in and out quickly, and then drag the timeline from right to left for a few seconds, it crashes. Pressing the refresh button doesn't help either. Both versions have the same issue.

  1. I tested vis-timeline using this link: https://visjs.github.io/vis-timeline/examples/graph2d/08_performance.html, and the issue didn't happen.
  2. When recording for only 5 minutes (not many hours), the issue didn't happen.

I think that when I scroll the mouse wheel, timeline.on('rangechanged') is called many times, and the SQLite query or ReadMetaFile may introduce some delay.

marcus-j-davies commented 2 years ago

Hi @trc-turing,

Version 3 has removed SQLite entirely and is now based on a JSON file per segment. This seems to have removed a lot of problems. v3 is currently being used in a very large installation and seems to be quite stable.

See v3 Change Log https://github.com/marcus-j-davies/nvr-js/blob/v3.0.0/CHANGELOG.md

As for rangechanged - yes! This is what triggers the query (previously SQLite, now the file mechanism). The rangechanged handler only proceeds if rangechanged is not triggered again within 500ms.

https://github.com/marcus-j-davies/nvr-js/blob/5639de6b9d71b975a18ffc7a9c73df6731618316/web/static/js/scripts.js#L78

In other words, the scroll/zoom must be static for 500ms - try messing with this value.
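
The 500 ms quiet period described above is a debounce. The sketch below shows the general pattern only; it is not the code at the linked line of scripts.js, and the `debounce` helper and the `loadSegments` callback in the usage comment are hypothetical names.

```javascript
// Returns a wrapper that runs `fn` only once `waitMs` milliseconds have passed
// without another call. Each new call restarts the countdown.
function debounce(fn, waitMs) {
  let timer = null;
  const wrapped = (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
  // Lets the caller drop a pending run, for example when the page is torn down.
  wrapped.cancel = () => {
    clearTimeout(timer);
    timer = null;
  };
  return wrapped;
}

// Hypothetical usage with a vis-timeline instance named `timeline`:
// timeline.on('rangechanged', debounce((props) => loadSegments(props.start, props.end), 500));
```

A longer wait means fewer queries during rapid scrolling, at the cost of a longer delay before the timeline refreshes after the view settles.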

trc-turing commented 2 years ago

Thanks for your reply. Yes, you are right. If I change the setTimeout duration from 500 to 1000 or 3000, it doesn't crash. But after that change, clicking the timeline is delayed because of the longer duration, or the click event doesn't work at all.

marcus-j-davies commented 2 years ago

Yup, the timeline needs to load data based on the time span in view +- 2 hours.

As the NVR can be running for weeks/months/years at a time, I don't load everything - as the browser could be overloaded with data, can you imagine 1 min segments over a week? That's 10,080 segments in the timeline!

But then zooming out will do the same 😅

I therefore need to cap it based on the current view.

I load what timespan is in view (plus 2 hours)

You can override the 2 hour buffer by changing SearchTimeBufferHours in the scripts file.

At the moment, I don't have a method (or time) to stop the unnecessary loading.

I can probably improve it, but I need to find the time to do so, and it's currently not a priority of mine.

I welcome PRs if you want to contribute 😇
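
The windowing described above (the visible time span plus a buffer) can be sketched as follows. `SearchTimeBufferHours` is the name mentioned in the thread, but how scripts.js actually applies it is not shown here, so applying the buffer on both sides of the view is an assumption, as are the helper names `bufferedRange` and `segmentCount`.

```javascript
// Name taken from the thread; the default of 2 hours matches the comment above.
const SearchTimeBufferHours = 2;

// Widen the visible range by the buffer on each side (assumed behaviour).
function bufferedRange(viewStart, viewEnd, bufferHours = SearchTimeBufferHours) {
  const bufferMs = bufferHours * 60 * 60 * 1000;
  return {
    start: new Date(viewStart.getTime() - bufferMs),
    end: new Date(viewEnd.getTime() + bufferMs),
  };
}

// Rough number of segments a range would load, for a given segment length.
function segmentCount(rangeStart, rangeEnd, segmentMinutes = 1) {
  return Math.ceil((rangeEnd - rangeStart) / (segmentMinutes * 60 * 1000));
}

// One week of 1 minute segments: 7 * 24 * 60 = 10080.
const weekStart = new Date('2022-08-01T00:00:00Z');
const weekEnd = new Date('2022-08-08T00:00:00Z');
console.log(segmentCount(weekStart, weekEnd));
```

With a one hour view and the default buffer, the loaded window is five hours, or 300 one minute segments, which is why capping the query to the current view keeps the browser responsive.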

baozi510 commented 1 year ago

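Hello @trc-turing,

Version 3 has removed SQLite entirely, and it is now based on a JSON file per segment. This seems to have removed a lot of problems. v3 is currently being used in a very large installation that seems to be quite stable.

See the v3 change log: https://github.com/marcus-j-davies/nvr-js/blob/v3.0.0/CHANGELOG.md

As for rangechanged - yes! This is what causes the query (previously SQLite), but it now uses the file mechanism. rangechanged will only continue if rangechanged is not triggered again within 500 ms.

https://github.com/marcus-j-davies/nvr-js/blob/5639de6b9d71b975a18ffc7a9c73df6731618316/web/static/js/scripts.js#L78

In other words, the scroll/zoom must be static for 500 ms - try messing with this value.

Has this been published to NPM yet?

Waiting for it to be published to npm.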

marcus-j-davies commented 1 year ago

@baozi510

The 3.0 Pull Request is ready -> https://github.com/marcus-j-davies/nvr-js/pull/13

I'll publish in a day or so.