jakowenko / double-take

Unified UI and API for processing and training images for facial recognition.
https://hub.docker.com/r/jakowenko/double-take
MIT License
1.21k stars, 94 forks

[FEAT] Person/body detection? (I know, I know...) #192

Open: Hukuma1 opened 2 years ago

Hukuma1 commented 2 years ago

Like you, I went down the rabbit hole of presence detection and tried all sorts of methods, Double Take being the latest endeavor. Thank you for making this! I'm currently geeking out as I type this. I just set it up and am training it on some images. By the way, Room Assistant's BLE beacon tracking came close, but you're right: requiring everyone to carry a phone or some other device felt more like a workaround than a cool solution.

The issue I run into with Double Take is that if my cameras are set up to cover a wide field of view, the faces are obviously going to be small. And I can't exactly run 8MP cameras indoors, let alone feed that into Frigate without an accelerator. The problem gets worse if the rooms aren't well lit. Our bedroom, for instance, is dimmer than the living room in the evening.
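To put rough numbers on the small-face problem, here is a back-of-the-envelope sketch. It assumes an ideal rectilinear (pinhole) lens with no distortion and an average face width of about 0.15 m. The function name `face_width_px` and the example camera values (1920 px wide, 110° horizontal field of view, subject 4 m away) are made up for illustration and are not taken from Double Take or Frigate.

```python
import math


def face_width_px(image_width_px: int, hfov_deg: float, distance_m: float,
                  face_width_m: float = 0.15) -> float:
    """Estimate how many pixels wide a face appears in the frame.

    Assumes an ideal pinhole camera: at a given distance, the frame spans
    2 * distance * tan(hfov / 2) metres horizontally, and the face takes up
    a proportional share of the image width. Real wide-angle lenses distort,
    so treat the result as a rough estimate only.
    """
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return image_width_px * face_width_m / scene_width_m


if __name__ == "__main__":
    # Hypothetical 1080p camera with a 110 degree horizontal field of view.
    print(f"1080p, 4 m away: {face_width_px(1920, 110, 4.0):.1f} px")
    # Same lens on a hypothetical 8MP (3840 px wide) sensor.
    print(f"8MP,   4 m away: {face_width_px(3840, 110, 4.0):.1f} px")
```

Under these assumptions a face across the room is only a couple of dozen pixels wide at 1080p, and doubling the horizontal resolution only doubles that figure.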

I know Double Take is for faces. But is there a way to extend it to recognize people as a whole? We already get the person object type from Frigate. Could a set of full-body images be used for training in the same way, to tie a body to a specific person? I only need this for my home and the family members inside it. For all of us who don't have a camera pointed directly at our faces to get a solid reading in Double Take, I think it could be another game-changing feature, or in the extreme case, I suppose, a standalone product.

I found this other issue that's closed: https://github.com/jakowenko/double-take/issues/142

654456 commented 2 years ago

I don't know how you would do this effectively, as body shape changes so much depending on clothing, distance from the camera, and focal length.

Hukuma1 commented 2 years ago

It would definitely be limited at first. Not designed for crowds, for sure. But surely we can start with basics like man/woman detection? For example, hair length could be an attribute (yes, I know man buns are a thing...). Or difference in height?