matiasdelellis / facerecognition

Nextcloud app that implements a basic facial recognition system.
GNU Affero General Public License v3.0

Test if we are resilient to OOM during face recognition #63

Closed stalker314314 closed 5 years ago

stalker314314 commented 5 years ago

What should we do here? We need to find out whether PHP just crashes immediately, leaving us unable to do anything, or whether we can catch some exception from dlib. If we can catch it and act on it, what should we do? Skip the image and go on to the next one? I don't think so, because I expect all images to fail in a similar fashion, so we need to stop. Also, how do we notify the user, since the server obviously doesn't have enough memory? We will need some notification in settings/user/facerecognition, but the backend needs to know that the last iteration ended with an OOM, and I'm not sure how to preserve that info (in config?). @matiasdelellis, I am interested in your ideas here.

matiasdelellis commented 5 years ago

> we can catch some exception from dlib?

No. dlib makes no provision for this. In every test I ran, the kernel ended up killing the process that was consuming all the memory.

Something like this:

Dec 25 15:52:24 Ubuntu-1604-xenial-64-minimal kernel: [1818692.409025] Out of memory: Kill process 2059 (a.out) score 961 or sacrifice child
Dec 25 15:52:24 Ubuntu-1604-xenial-64-minimal kernel: [1818692.409162] Killed process 2059 (a.out) total-vm:99272240kB, anon-rss:64928316kB, file-rss:0kB, shmem-rss:0kB
Dec 25 15:52:25 Ubuntu-1604-xenial-64-minimal kernel: [1818693.695334] oom_reaper: reaped process 2059 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

> See if PHP just crashes immediately and we cannot do anything.

As I understand it, the kernel would kill the same PHP process that runs the background task, so we cannot do anything. We could try to work with threads, but I think that would be very difficult.
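A related option, swapped in here for the threads idea, is to run the heavy step in a separate child process so that the parent can at least observe the kill. The following is a minimal Python sketch of that technique, not the app's PHP code; `run_isolated` is a hypothetical helper name. It assumes a POSIX system, and it assumes the kernel picks the child rather than the parent as the OOM victim, which is likely when the child holds most of the memory but is not guaranteed.

```python
import signal
import subprocess
import sys


def run_isolated(cmd):
    """Run cmd in a child process and classify how it ended.

    Returns "ok", "killed" (ended by SIGKILL, which is what the kernel
    OOM killer sends), or "failed" (any other non-zero exit).
    """
    proc = subprocess.run(cmd)
    if proc.returncode == 0:
        return "ok"
    # On POSIX a negative return code -N means "terminated by signal N".
    if proc.returncode == -signal.SIGKILL:
        return "killed"
    return "failed"


# Stand-in for a detection worker that the kernel kills: the child
# sends SIGKILL to itself, which cannot be caught or handled.
SELF_KILL = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]
```

Calling `run_isolated(SELF_KILL)` returns `"killed"`. A SIGKILL can also come from an administrator or a supervisor, so a `"killed"` result only suggests an OOM; the kernel log remains the authoritative source.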

> Also, how to notify user on this, as obviously it doesn't have enough memory.

The administrator has to find this through the logs. If it is executed from cron, the report stays in syslog. If it runs as a Nextcloud background task, you will probably see it in your log.

P.S. I have to go; I will continue answering later.

stalker314314 commented 5 years ago

I was afraid that PHP would just crash. Not sure about you, but I am very worried about this; it is one of the biggest blockers for "easy install/adoption". The biggest one, I think. I have to think about this a lot, and I hope you will too. I have no solution at the moment. We could:

  1. Forbid the cron job (a crash can take down other apps' cron jobs, and we don't want that) and just go with the command
  2. Be smarter. For example, before doing face detection, update the column oc_face_recognition_images.last_try for that row with the size the image was rescaled to. After detection, clear this column. If any existing images still have this value set, assume that there was an OOM and that the given image scaling is not good enough, and either:
    • Refuse to work any more, until admin resolves it manually (sets different limits)
    • Try with lower image size (here we are entering "we-are-smart/dynamic-resizing" territory, which I don't particularly like, unless there are other options)
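The marker idea in option 2 can be sketched roughly as follows, in Python with an in-memory SQLite database rather than the app's PHP and Nextcloud database. The `images` table, the function names, and the `detect` callback are all hypothetical stand-ins; only the `last_try` column name comes from the proposal above. The sketch takes the "refuse to work" branch when it finds a leftover marker.

```python
import sqlite3


def open_db():
    # In-memory database for the demo only; a real marker must live in
    # persistent storage so that it survives the process being killed.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, last_try INTEGER)"
    )
    return conn


def interrupted_images(conn):
    """Rows whose marker was never cleared: a run died during detection."""
    return conn.execute(
        "SELECT id, last_try FROM images "
        "WHERE last_try IS NOT NULL ORDER BY id"
    ).fetchall()


def detect_with_marker(conn, image_id, scaled_size, detect):
    # Commit the marker before the heavy work starts.
    conn.execute(
        "UPDATE images SET last_try = ? WHERE id = ?", (scaled_size, image_id)
    )
    conn.commit()
    try:
        return detect(image_id, scaled_size)
    finally:
        # Reached on success or on a catchable error, never after SIGKILL,
        # so a leftover marker points at an abrupt death such as an OOM kill.
        conn.execute(
            "UPDATE images SET last_try = NULL WHERE id = ?", (image_id,)
        )
        conn.commit()


def run_batch(conn, scaled_size, detect):
    stale = interrupted_images(conn)
    if stale:
        # "Refuse to work" branch: wait for the admin to change the limits.
        raise RuntimeError(f"previous run died on {stale}; refusing to run")
    ids = [row[0] for row in conn.execute("SELECT id FROM images ORDER BY id")]
    return [detect_with_marker(conn, i, scaled_size, detect) for i in ids]
```

The second branch (retrying with a smaller image size) would replace the `raise` with a retry at a size below the stored `last_try` value. An admin-facing notification could be driven by the same query that `interrupted_images` runs.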

Huh, hard decisions, will have to think about this more.

matiasdelellis commented 5 years ago

I also keep thinking about it..

> Forbid cron job (you can take down other apps' cron jobs when you crash, we don't want that) and just go with command

When I say a cron job, I mean a dedicated one, configured by the administrator, that only does this. If it crashes, it should not affect other tasks. However, I think we should strongly recommend running the task by hand the first time, which will be the hardest run. This should be the fundamental test of whether it is viable on the server. If it succeeds, there will probably be no problems in the future, or any problems will be isolated cases caused by system load.

matiasdelellis commented 5 years ago

https://github.com/matiasdelellis/facerecognition/wiki/Performance-analysis-of-DLib%E2%80%99s-CNN-face-detection#results-for-the-project
https://github.com/matiasdelellis/facerecognition/wiki/FAQ

:wink:

nervnet65 commented 5 years ago

Referring to issue #97: I am using Nextcloud on a machine with 16 GB of memory. I watched the memory usage during face recognition. Analyzing one image takes about 40 seconds. During this time PHP uses 40 to 60% of memory. Shortly before the recognition finishes successfully, this value rises to over 90%.