vladmandic / human

Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition
https://vladmandic.github.io/human/demo/index.html
MIT License

Face genders are the same for multiple faces. #119

Closed ikebastuz closed 3 years ago

ikebastuz commented 3 years ago

In the raw detector output (specifically the faces), the gender and genderConfidence fields are identical for every person in a frame.

  const result = await human.detect(source, config);
  console.log(result.face.map(f => `${f.gender}-${f.genderConfidence}`));

Generates logs like this:

[Screenshot: CV Accuracy 2021-05-18 15-27-16]

So adjacent frames with almost the same people (probably one person leaves and one comes in) are completely different: X males might turn into X females within a single frame. And all of them share the same genderConfidence.

The config is the default one with almost everything enabled:

[Screenshot: CV Accuracy 2021-05-18 15-32-27]

Environment

vladmandic commented 3 years ago

will look into it, most likely a bug around cached values.
can you try with config.videoOptimized = false as a quick test? (that disables caching)

also, what is the type of your source? and if it's an image, can you share it so i can reproduce the exact results?
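A minimal sketch of that quick test, reusing the detect call from the original report. The human instance and the source element are assumed to exist already, so the call itself is left as a comment. Only videoOptimized comes from this thread, and it applies to Human versions before 1.9.0.

```javascript
// Quick-test override: turn off frame-to-frame caching (Human < 1.9.0).
// Only `videoOptimized` is taken from this thread; all other options stay at their defaults.
const quickTestConfig = { videoOptimized: false };

// Assumed usage, with `human` and `source` created elsewhere:
// const result = await human.detect(source, quickTestConfig);
// console.log(result.face.map((f) => `${f.gender}-${f.genderConfidence}`));
```

If the per-face values start to differ with this override, the cached values are the likely culprit.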

ikebastuz commented 3 years ago
[Screenshot: CV Accuracy 2021-05-18 16-21-47]

Seems like it did the trick, thank you! (some {genderConfidence: 0, gender:""} appeared, but that's not a problem)

We are using your library for a webcam feed (real-time age/gender/... estimation). To test its accuracy, I was processing a video (as a regular HTML video element).

Should this config flag be used for webcam also?

vladmandic commented 3 years ago

basically, the videoOptimized flag means that some calculations are skipped for n frames and Human returns cached values instead. the goal is to avoid re-running models for items that rarely change on relatively stable inputs such as webcams (e.g., if you're sitting in front of a webcam, chances are gender or age do not change).

if videoOptimized is enabled, then each model has its own skipFrames value that is used.

videoOptimized is disabled automatically if input type is image as then it makes no sense to perform caching.
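To make the skip-frames idea concrete, here is a minimal sketch of such a cache. It is not Human's internal code: createSkipFrameCache and runModel are hypothetical names, and the model is replaced by a stub so the snippet runs on its own.

```javascript
// Hypothetical helper: wraps an expensive model so that it only runs on one frame,
// then returns the cached result for the next `skipFrames` frames.
function createSkipFrameCache(runModel, skipFrames) {
  let skipped = 0;
  let cached = null;
  return function detect(frame) {
    if (cached !== null && skipped < skipFrames) {
      skipped += 1; // reuse the previous result without touching the model
      return cached;
    }
    cached = runModel(frame); // cache is empty or expired: run the model again
    skipped = 0;
    return cached;
  };
}

// Stub model that records how often it runs and which frame it saw.
let modelRuns = 0;
const detectGender = createSkipFrameCache((frame) => {
  modelRuns += 1;
  return { gender: 'female', computedOnFrame: frame };
}, 2);

const outputs = [0, 1, 2, 3, 4, 5, 6].map((frame) => detectGender(frame));
console.log(modelRuns); // 3 (the model ran on frames 0, 3 and 6)
console.log(outputs[2].computedOnFrame); // 0 (frame 2 was served from the cache)
```

A cache like this pays off while the scene is stable, but returns stale values for the skipped frames when the people in view change.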

there is still room for improvement, i'm working on input similarity check to dynamically determine if caching should be used or not instead of relying on fixed skipFrames values.

but i do see one more error in your configuration - face.description is a new combo model that replaces the older individual face.age, face.gender and face.embedding models. so if face.description is enabled, the other 3 should be disabled (they are actually removed in the latest version of Human). output format is the same, so no need to change anything else in your code.
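A sketch of the corresponding face section of the config. It assumes each sub-model is toggled through an enabled flag nested under face; that nesting is inferred from the names face.description, face.age, face.gender and face.embedding, so check it against the defaults shipped with your version.

```javascript
// Assumed layout: one `enabled` flag per face sub-model (verify against your Human version).
const faceConfig = {
  face: {
    enabled: true,
    description: { enabled: true }, // combo model: age + gender + embedding
    age: { enabled: false },        // replaced by face.description
    gender: { enabled: false },     // replaced by face.description
    embedding: { enabled: false },  // replaced by face.description
  },
};

// Assumed usage, with `human` and `source` created elsewhere:
// const result = await human.detect(source, faceConfig);
```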

ikebastuz commented 3 years ago

Ah, that seems pretty reasonable. Will tweak it for my needs (I also need to track and count the number of people).

Yep, I've seen face.description, couldn't yet handle custom model paths, but will figure it out.
Thank you for your instant help! Will keep an eye on the updates

vladmandic commented 3 years ago

Human version 1.9.0 beta is now on GitHub, but I'm going to keep it in testing for a few more days before publishing on npmjs.

The breaking change is that config.videoOptimized has been removed; instead there is config.cacheSensitivity = number. The cache is now reset automatically if the input is determined to have changed by more than this relative amount. So there is no need to change it depending on whether the input is video or images, or whether the video scene changes.

Current default is cacheSensitivity = 0.005, meaning that if the input image is the same as the previous one to within 0.5%, cached values are used.
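One way to picture the threshold is the sketch below. It is an illustration only: the mean absolute pixel difference used here is an assumed metric, relativeDifference and shouldUseCache are hypothetical names, and Human's actual similarity check may work differently. Only the 0.005 default comes from this thread.

```javascript
// Hypothetical metric: mean absolute difference between two frames of 8-bit values,
// scaled to the range 0..1. Human's real check may use something else.
function relativeDifference(prev, next) {
  if (prev.length !== next.length) return 1; // treat a size change as a completely new input
  if (prev.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < prev.length; i++) sum += Math.abs(prev[i] - next[i]);
  return sum / (prev.length * 255);
}

// Use cached results only when the input changed by less than `cacheSensitivity`.
function shouldUseCache(prev, next, cacheSensitivity = 0.005) {
  if (!prev) return false; // nothing cached yet
  return relativeDifference(prev, next) < cacheSensitivity;
}

const previous = new Uint8Array(100).fill(100);
const nearlySame = Uint8Array.from(previous);
nearlySame[0] = 101; // one value off by one: far below 0.5%
const brighter = new Uint8Array(100).fill(110); // every value off by 10: about 3.9%

console.log(shouldUseCache(previous, nearlySame)); // true
console.log(shouldUseCache(previous, brighter)); // false
```

With a check of this kind the same setting works for webcams, videos and still images, because a scene change simply exceeds the threshold and forces the models to run again.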

vladmandic commented 3 years ago

anyhow, i'll close this issue as the original problem is resolved. feel free to open a new issue for any problems.