ZoneMinder / zoneminder

ZoneMinder is a free, open source Closed-circuit television software application developed for Linux which supports IP, USB and Analog cameras.
http://www.zoneminder.com/
GNU General Public License v2.0
5.12k stars 1.22k forks

nph-zms does not terminate and continues to use CPU #3375

Closed gcormier closed 1 year ago

gcormier commented 3 years ago

Describe Your Environment

Describe the bug nph-zms does not terminate after watching a video event.

To Reproduce Steps to reproduce the behavior:

  1. Restart zoneminder and launch htop or other process monitor
  2. Open zoneminder web console
  3. Click on one of the events and then watch the video
  4. Browse to another site or close the tab or go back
  5. Process does not terminate
  6. Repeat step 3 and observe multiple instances stacking up

Expected behavior Process should terminate and no longer consume CPU when done watching a video clip.
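Steps 5 and 6 can also be checked without watching htop. The following is a minimal sketch, assuming a Linux `ps` that accepts `-eo pid,pcpu,args`; the function names `find_zms` and `list_running_zms` are made up for illustration and are not part of ZoneMinder.

```python
import subprocess


def find_zms(ps_output):
    """Return (pid, cpu_percent) for each nph-zms line in `ps -eo pid,pcpu,args` output."""
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, %cpu, full command line
        if len(parts) == 3 and "nph-zms" in parts[2]:
            found.append((int(parts[0]), float(parts[1])))
    return found


def list_running_zms():
    """Run ps and report the nph-zms processes that exist right now (assumes Linux procps)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_zms(out)
```

Calling `list_running_zms()` before step 3 and again after step 4 should return the same list if the process terminated as expected; a list that grows with each replay matches the behavior described here.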


welcome[bot] commented 3 years ago

Thanks for opening your first issue here! Just a reminder, this forum is for Bug Reports only. Be sure to follow the issue template!

mcesnik commented 2 years ago

I also noticed this on my setup (same as OP using dlandon/zoneminder.machine.learning without any of the event server or ML stuff running). From what I see it seems to happen on the "preview" thumbnails but not always. In my case I can end up with 5-10 processes using up all 100% of my CPU.

connortechnology commented 2 years ago

I suspect this is specific to the dlandon docker. It does not happen here on bare Ubuntu. You could try posting debug logs, but I suspect that the docker is using a different CGI method, like php-fpm, and it is not noticing the dropped connection. Not sure how we can fix that on the ZM side.
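To illustrate what "noticing the dropped connection" means in general terms, here is a minimal sketch that is not taken from zms. A streaming loop usually learns that its peer is gone only when a write fails. If something in between keeps accepting the data, that failure never arrives and the loop keeps running. The function name `stream_frames` and the use of a plain pipe are assumptions for the example.

```python
import os


def stream_frames(fd, frames):
    """Write frames to fd until done or the peer goes away; return how many were sent."""
    sent = 0
    for frame in frames:
        try:
            os.write(fd, frame)
        except BrokenPipeError:
            # The reader closed its end: stop instead of looping forever.
            break
        sent += 1
    return sent
```

With a pipe whose read end is already closed, the first write raises `BrokenPipeError` and the loop exits at once. A streamer that never sees such an error has no signal to stop.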

gcormier commented 2 years ago

I'm not using dlandon, but rather this one https://github.com/zoneminder-containers/zoneminder-base

connortechnology commented 2 years ago

Then you are going to have to take it up with them there.

mcesnik commented 2 years ago

I will need to do some further digging. I changed my configuration to use nginx instead. I have the following:

    location /zm/cgi-bin {
        gzip off;
        alias /usr/lib/zoneminder/cgi-bin;

        include /etc/nginx/fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $request_filename;
        fastcgi_intercept_errors on;
        fastcgi_pass  unix:/run/fcgiwrap.socket;
    }

This uses fcgiwrap and spawn-fcgi.

spanchy commented 2 years ago

Same problem on version 1.36.7.

Carbenium commented 2 years ago

@freexmonster This report is specifically about zms and not zmc. Additionally, I'd say that CPU load is to be expected with the amount of cameras you try to run on 2 cores.

jangrewe commented 2 years ago

I'm not using any Docker containers, just bare-metal ZM and ZM-ES, and I experience the same issue. There are always 1-2 nph-zms processes running, consuming ~25% CPU each, with no web console, zmNinja, or any other client running.

When I do have the web console open, there are plenty more nph-zms processes running, but they are all at ~2-4% CPU each.

maverick214 commented 2 years ago

I have the same problem running the ZM plugin in TrueNAS 12.0-U6. Each time I view a thumbnail, it opens a new zms process. Turning off the option WEB_ANIMATE_THUMBS stops it.

mcgman commented 2 years ago

Same issue here with the ZM plugin in TrueNAS. The live video feed works fine: clicking back or closing the tab kills the zms process. However, when viewing an event video, zms is spawned but never exits.

I should mention I saw processes running named php-fpm (mentioned by @connortechnology above) but I don't understand what that means, how it relates, and/or why the behavior is different between zms spawned from live feed and zms spawned from event view.

connortechnology commented 2 years ago

I believe this is fixed. Please update to 1.36.17 or higher and report.

daveymg-nz commented 2 years ago

Using the Zoneminder-containers/zoneminder-base docker image with ZoneMinder 1.36.19, I can report that this issue still exists. Disabling WEB_ANIMATE_THUMBS does not appear to affect it. I can reproduce it by replaying events and clicking the "next event" button before the currently playing event has completed. Doing this repeatedly results in more and more nph-zms processes and a gradual depletion of free RAM. Edit: I tried letting the replays run to completion before watching the next one. After 8 replays I had 8 nph-zms processes listed, so clicking before replay completion isn't relevant.

pgrandin commented 2 years ago

I can confirm this issue on v1.36.21 from the docker image here: https://github.com/zoneminder-containers/zoneminder-base/pkgs/container/zoneminder-base/29421610?tag=amd64-nightly-822

crazy28 commented 2 years ago

Yes, it's happening on my setup.

Debian 11
Nginx
PHP-FPM 7.4
ZoneMinder v1.36.21

zms is not closed after use.

SmokeyBR commented 2 years ago

I suspect this is specific to the dlandon docker. It does not happen here on bare Ubuntu. You could try posting debug logs, but I suspect that the docker is using a different CGI method, like php-fpm, and it is not noticing the dropped connection. Not sure how we can fix that on the ZM side.

This happens to me on Ubuntu 20.04 with nothing under Docker, but I do use nginx with php-fpm. I have noticed it since 1.36 but never figured out how to reproduce it: sometimes it happens, sometimes it doesn't. It seems more likely to happen when using the zmNinja app, although I can't say for sure. I wish I could be more help. Is there a specific log I could watch for the problem?

waynieack commented 1 year ago

Version of ZoneMinder: 1.37.25
How you installed ZoneMinder: PPA (deb-src http://ppa.launchpad.net/iconnor/zoneminder-master/ubuntu focal main)
Ubuntu 20.04.5 LTS
apache2 2.4.41
Not using php-fpm

I also see this issue, it happens when I replay recorded events as described by @daveymg-nz.

In addition to the hung process, I also noticed that when you are viewing a recorded event and you delete it, the nph-zms process will not only hang but will consume 100% CPU as well.

       PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    240985 www-data  20   0  310760  42836  33848 R 100.0   0.1   0:18.83 /usr/lib/zoneminder/cgi-bin/nph-zms

waynieack commented 1 year ago

I finally did some digging on this and found that there are two issues here.

  1. When you watch a recorded video and click the next button, the nph-zms process for the last event is left running.

    For this issue I noticed that the Apache connection was listed as tcp6 for IPv4 connections in netstat:

        tcp6 0 0 192.168.99.35:80 192.168.99.179:60448 ESTABLISHED 37662/apache2

    To resolve this, I disabled IPv6 in the Apache config by changing every "Listen 80" and "Listen 443" to "Listen 0.0.0.0:80" and "Listen 0.0.0.0:443", then restarting Apache. So far, I no longer see nph-zms processes continuing to run after going to the next event. I have no idea what the actual issue is, but I don't use IPv6, so I don't care to have it enabled.

  2. When you watch an event and delete it, the nph-zms process hangs and uses 100% cpu. I also see errors like this "Can't open /mnt/ext-video/events/5/2022-12-23/14326436/086-capture.jpg: No such file or directory" because it seems that the event is deleted while the nph-zms process is accessing it.

    For this issue I added "streamReq({command: CMD_QUIT});" in "function manageDelConfirmModalBtns()" in file "www/skins/classic/views/js/event.js" which seems to stop nph-zms before deleting the event.

This may not be the right way to fix this, because I get an error like the one below after deleting the event, but at least the process is no longer hung using 100% CPU:

    Can't send /mnt/ext-video/events/7/2022-12-23/14326377/110-capture.jpg: Broken pipe

Below is the code update:

File: www/skins/classic/views/js/event.js

    function manageDelConfirmModalBtns() {
      document.getElementById("delConfirmBtn").addEventListener("click", function onDelConfirmClick(evt) {
        if (!canEdit.Events) {
          enoperm();
          return;
        }

        pauseClicked();
        evt.preventDefault();
        // Added line: ask nph-zms to quit before the delete request is sent.
        streamReq({command: CMD_QUIT});
        $j.getJSON(thisUrl + '?request=event&action=delete&id='+eventData.Id)
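Going back to issue 1, the netstat check and the Listen change can be sketched as two small helpers. This is only an illustration in Python, not something that ships with ZoneMinder or Apache. The names `is_ipv4_on_tcp6` and `bind_listen_to_ipv4` are hypothetical, and the rewrite only touches bare `Listen <port>` lines, optionally followed by a protocol.

```python
import re

# Matches "Listen 80" or "Listen 443 https", but not "Listen 0.0.0.0:80".
LISTEN_RE = re.compile(r"^(\s*Listen\s+)(\d+)(\s.*)?$", re.IGNORECASE)


def bind_listen_to_ipv4(conf_text):
    """Rewrite bare 'Listen <port>' lines to 'Listen 0.0.0.0:<port>'; keep other lines."""
    out = []
    for line in conf_text.splitlines():
        m = LISTEN_RE.match(line)
        if m:
            line = f"{m.group(1)}0.0.0.0:{m.group(2)}{m.group(3) or ''}"
        out.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


def is_ipv4_on_tcp6(netstat_line):
    """True if a netstat line is labelled tcp6 but its local address is dotted IPv4."""
    fields = netstat_line.split()
    if len(fields) < 4 or fields[0] != "tcp6":
        return False
    return re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}:\d+", fields[3]) is not None
```

Lines that already name an address, such as `Listen 0.0.0.0:80`, are left alone, so running the rewrite twice changes nothing further. Apache still needs a restart afterwards, as described above.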

connortechnology commented 1 year ago

Some great insights here. We definitely need to stop streaming before delete. I will look at getting that code in. It's kind of a bigger problem though: someone else could delete the event while we are viewing it. We almost need to lock the DB record while viewing in order to prevent any such action.
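As a rough illustration of the locking idea, and not a proposal for how ZoneMinder should implement it, the sketch below keeps an in-memory count of viewers per event and refuses a delete while anyone is watching. The class `EventViewers` and its methods are hypothetical. Since zms runs as separate processes, a real version would have to keep this state somewhere shared, such as the database.

```python
import threading


class EventViewers:
    """In-memory count of active viewers per event id (hypothetical, not ZoneMinder code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._viewers = {}

    def start_viewing(self, event_id):
        with self._lock:
            self._viewers[event_id] = self._viewers.get(event_id, 0) + 1

    def stop_viewing(self, event_id):
        with self._lock:
            remaining = self._viewers.get(event_id, 0) - 1
            if remaining > 0:
                self._viewers[event_id] = remaining
            else:
                self._viewers.pop(event_id, None)

    def try_delete(self, event_id, delete):
        """Call delete(event_id) only if nobody is viewing; return whether it ran."""
        with self._lock:
            if self._viewers.get(event_id, 0) > 0:
                return False
            delete(event_id)
            return True
```

The check and the delete happen under the same lock, so a viewer cannot start between them. The caller decides what to do with a refused delete, for example retry later or tell the user the event is in use.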

Also, zms shouldn't consume 100%... if it can't find a jpeg, it should just sleep and try to send the next as per usual... and if we are replaying a single event.
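The "sleep and try the next one" behavior can be sketched roughly as follows. This is an illustration in Python rather than the actual zms code. The function `replay_frames`, its parameters, and the 0.04 second default interval are all assumptions.

```python
import time


def replay_frames(frame_paths, send, frame_interval=0.04, sleep=time.sleep):
    """Send each frame that exists; on a missing file, wait one interval and move on."""
    sent = skipped = 0
    for path in frame_paths:
        try:
            with open(path, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            # The event may have been deleted mid-replay: pause, then try the next frame
            # instead of retrying the same file in a tight loop.
            skipped += 1
            sleep(frame_interval)
            continue
        send(data)
        sent += 1
        sleep(frame_interval)
    return sent, skipped
```

Because every iteration sleeps, whether or not the file was found, a deleted event costs at most one idle interval per missing frame rather than a spinning CPU.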