vladmandic / human

Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition
https://vladmandic.github.io/human/demo/index.html
MIT License
2.39k stars, 326 forks

Gaze Direction #105

Closed lghasemzadeh closed 3 years ago

lghasemzadeh commented 3 years ago

Hello,

Regarding the talk we had, I attached 4 photos. I think there is something wrong, because I don't get a correct gaze direction for even a single frame. IMG_5710 IMG_5709 IMG_5708 photo6046379394110436368

I have another question: where can I find the FACE-related text printed in the upper-left corner, first line? I simply want to change 'FACE: FACING CAMERA' to something else, e.g. 'FACE: FACING CENTER', just changing a word. I checked the index.js and node.js files but did not find where you print those words.

Thank you

vladmandic commented 3 years ago

Regarding the talk we had, I attached 4 photos. I think there is something wrong, because I don't get a correct gaze direction for even a single frame.

Those are good examples - and yes, my simplified math is just too simple - comparing area sizes of irises is insufficient when you're facing the camera but looking away from it

I'll update here when I have a new solution ready

where can I find the FACE-related text printed in the upper-left corner, first line? I simply want to change 'FACE: FACING CAMERA' to something else, e.g. 'FACE: FACING CENTER', just changing a word. I checked the index.js and node.js files but did not find where you print those words.

inside demo/index.js:drawResults() it just calls built-in helper functions inside Human to draw
and the one you're looking for is src/draw/draw.ts:face()
but again, they just print what they get from human.detect() result object - words are defined in src/gesture/gesture.ts

so you can either:

for example, really silly but simple way of converting entire result to string, replacing strings and returning back as result
(better way would be to walk the object and replace values as needed)

let result = await human.detect(input);
const replace = JSON.stringify(result).replace('facing camera', 'facing center');
result = JSON.parse(replace);

lghasemzadeh commented 3 years ago

This is what I have in the Human folder. I don't have the demo and src folders. Screenshot from 2021-04-19 01-36-42 Screenshot from 2021-04-19 01-35-50 Is it OK to just download the missing folders and put them into my Human folder?

vladmandic commented 3 years ago

this does not look like human official npm package or git clone.
this looks like someone manually copied demo folder and placed dist and models folders inside demo and that's what you're using.

you're missing all of the sources and documentation.

where does this copy come from and how was it installed?

lghasemzadeh commented 3 years ago

A friend of mine prepared it and gave it to me. I just got the file and extracted it. How should I fix it, or can I continue with it?

vladmandic commented 3 years ago

you can continue using it, but then you won't be able to update easily as i roll out any fixes or changes (your friend would have to prepare it again)

documentation you can see online here on github, so that's not an issue
but you're missing actual sources, you only see the demo (things in '/dist' are compiled and minified, so not readable) - that's why you cannot find things when you're searching for strings

you could also download everything from here on github, but then you also wouldn't be able to update automatically
so it would be best to either use npm install ... to install NPM package or git clone ... to clone repository

anyhow, for purpose of this issue, to rename facing camera to facing center you don't need any of that, you can still do that by doing either:

vladmandic commented 3 years ago

ok, i've just made some changes to gestures:

i've also tried 'looking up' and 'looking down', but the human eye just doesn't have enough height relative to iris height for the math to be precise enough - it's ok for close zooms of the eye, but otherwise not precise enough, so i won't enable it.

new code is already on github and will be published on npmjs later this week (waiting for some fixes from tfjs team to be able to bundle new tfjs 3.4.0)

lghasemzadeh commented 3 years ago

Hello Vladimir,

* iris: 'looking center', 'looking left' and 'looking right' (new algorithm comparing distance of iris center to corner of eye)

Yes, this is exactly what I was looking for, thank you. Looking left and right now works very well, since you use the corner points as reference. For up and down, it is possible to use the top and bottom extreme points of the eye as reference. Since each eye has several landmark points, it is possible to get the ID/index of the uppermost and lowermost landmark points, and with those as reference, looking up and down should work. This is what I did in the algorithm I wrote previously.
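The corner-reference idea can be sketched as follows. This is only an illustration, not the code used in Human and not the earlier algorithm mentioned above: the function name `irisPosition`, the point names, and the `tolerance` value are all assumptions, and "left" and "right" refer to image coordinates (which may be mirrored relative to the person).

```javascript
// Illustrative sketch only; not the implementation used in Human.
// All points are {x, y} in image pixels. The tolerance is an assumed value.
const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);

function irisPosition(eye, tolerance = 0.1) {
  // distances from the iris center to the four reference points of the eye
  const dLeft = dist(eye.iris, eye.leftCorner);
  const dRight = dist(eye.iris, eye.rightCorner);
  const dTop = dist(eye.iris, eye.top);
  const dBottom = dist(eye.iris, eye.bottom);
  // normalized position: 0.5 means the iris is equally far from both references
  const h = dLeft / (dLeft + dRight);
  const v = dTop / (dTop + dBottom);
  let horizontal = 'center';
  if (h < 0.5 - tolerance) horizontal = 'left';
  else if (h > 0.5 + tolerance) horizontal = 'right';
  let vertical = 'center';
  if (v < 0.5 - tolerance) vertical = 'up';
  else if (v > 0.5 + tolerance) vertical = 'down';
  return { h, v, horizontal, vertical };
}

// Example with made-up coordinates: iris shifted toward the left corner.
const sample = irisPosition({
  leftCorner: { x: 0, y: 10 },
  rightCorner: { x: 30, y: 10 },
  top: { x: 15, y: 5 },
  bottom: { x: 15, y: 15 },
  iris: { x: 8, y: 10 },
});
```

The vertical ratio is computed the same way, but because the eye opening is small compared to the iris, small landmark errors shift `v` much more than `h`.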

photo6046379394110436412

You changed the text, thank you, but I still need to find where you print the text in the upper-left corner. The only thing I found in the folder that I have is shown in the screenshot below. Screenshot from 2021-04-19 08-24-41

I don't know, maybe JS is completely different from what I know (Python). I want to see the lines of code where you print those strings, especially the first line (FACE: .....)

Here, when I comment out line 165, all the text in the upper-left corner disappears, but I want to access each line of that corner text separately. Screenshot from 2021-04-19 09-17-30

It is very straightforward when I go through the GitHub folders -> src -> gesture -> gesture.ts -> lines 42 to 65. But I don't have the src folder, and I cannot find the source for the last explanation you made (simple object -> string -> replace -> object conversion - walk the object and replace values as needed). Screenshot from 2021-04-19 09-59-21

vladmandic commented 3 years ago

For up and down, it is possible to use the top and bottom extreme points of the eye as reference. Since each eye has several landmark points, it is possible to get the ID/index of the uppermost and lowermost landmark points, and with those as reference, looking up and down should work.

Yes, it can be done and I've tried it - the problem is reliability of results:

and I cannot find the source for the last explanation you made (simple object -> string -> replace -> object conversion - walk the object and replace values as needed).

There is no source, that is just how it can be done on your side
And those are two different methods, not two steps in one method:

simple object -> string -> replace -> object conversion:

let result = await human.detect(input);
if (result && result.gesture) {
  const replace = JSON.stringify(result).replace('facing center', 'i want to say something different');
  result = JSON.parse(replace);
}

walk the object and replace values as needed:

const result = await human.detect(input);
if (result && result.gesture) {
  for (const gesture of result.gesture) {
    if (gesture.gesture === 'facing center') {
      gesture.gesture = 'i want to say something different';
    }
  }
}

or the same with a map and regex function:

const result = await human.detect(input);
if (result && result.gesture) {
  result.gesture = result.gesture.map((gesture) => {
    gesture.gesture = gesture.gesture.replace(/facing center/, 'i want to say something different');
    return gesture;
  });
}

if you do that before human.draw is called, it will replace values as you want and then draw will draw your new results.
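If several labels need renaming at once, the loop above generalizes to a lookup table. A minimal sketch, assuming `result.gesture` is an array of objects with a string `gesture` field as in the snippets above; `relabelGestures`, the `labels` table, and the mocked result are hypothetical and not part of Human:

```javascript
// Hypothetical helper, not part of Human: rename gesture labels via a lookup table.
// Assumes result.gesture is an array of objects with a string `gesture` field.
function relabelGestures(result, labels) {
  if (!result || !Array.isArray(result.gesture)) return result;
  for (const item of result.gesture) {
    // Replace only labels that have an entry in the table; leave the rest untouched.
    if (Object.prototype.hasOwnProperty.call(labels, item.gesture)) {
      item.gesture = labels[item.gesture];
    }
  }
  return result;
}

// Example with a mocked result object (no camera or model needed).
const mockResult = {
  gesture: [
    { face: 0, gesture: 'facing center' },
    { iris: 0, gesture: 'looking left' },
  ],
};
const relabeled = relabelGestures(mockResult, {
  'facing center': 'i want to say something different',
});
```

Unlike the JSON string replacement, this only touches the `gesture` field, so an identical string elsewhere in the result cannot be changed by accident.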

vladmandic commented 3 years ago

ok, i've changed it to include up/down:

as suspected, precision is not perfect, but why not

vladmandic commented 3 years ago

let me know if there are any remaining questions regarding gaze detection or this issue can be closed?

lghasemzadeh commented 3 years ago

Currently I don't have any, but I will as soon as I am able to install the library (Human). As you said, it is better to have the right package, but I am having difficulty installing it. I cloned the repository and ran npm i, but I get this error. I tried some solutions but they didn't resolve the error. Screenshot from 2021-04-20 12-36-32

Hope you can help :)

vladmandic commented 3 years ago

to install human, you don't need to clone it and then install its dependencies
instead, just run npm install @vladmandic/human to install human package from npmjs and that's it

then to update human at any time, just run npm update to update to latest minor version or npm update --latest to update to latest major version

or alternatively copy human from git main branch using git clone https://github.com/vladmandic/human
in that case, to update to latest changes, you'd use git pull

but again, no need to run npm install inside human in either case

installing dependencies inside human is only needed if you plan to make changes to the library as it installs devDependencies and in that case, best procedure would be to have a separate tree for human:

and such local fork can be used in your actual project by installing it from local path

vladmandic commented 3 years ago

i'm closing this issue as related to gaze detection. if there are any other questions, let me know.

lghasemzadeh commented 3 years ago

Hello, I am now able to play with the text in the upper-left corner and change it the way I want. I need to find two more things:

1) There are several lines of text for FACE and IRIS, and their order is not fixed, which is really confusing. For example, sometimes the second line is FACE: HEAD UP/DOWN and sometimes it is the blink parameter. A user who is used to checking head up/down on the second line suddenly sees blink there instead. Each line needs to belong to a unique parameter; otherwise the observer gets confused, because he/she has to search through the text for the parameter he/she wants while the frames pass and the actions change. I want to make each parameter unique in name and order (order is not a priority now). For example, instead of two FACE lines, I want the first one (for left and right) to be FACE1: and the second one (for up and down) to be FACE2:, and the same for the IRIS lines. Where can I edit them?

2) In the box drawn over the face, I want to deactivate some modules, or at least their text, because it covers the face and makes it hard to see the face clearly. How can I do that? When I comment out line 165 of index.js, all the text in the box disappears.

Thank you

vladmandic commented 3 years ago

best is to take a look at how drawing is done in /src/draw/draw.ts and copy those functions to your code and use them instead of built-in one. for example, in index.js, inside function drawResults, it calls human.draw.face() - remove that and replace with call to your function. and you can use /src/draw/draw.ts:face() as template to build that function. same for gestures.
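As a rough illustration of such a replacement for the gesture text, the sketch below assigns each gesture to a fixed, uniquely named line. Everything here is an assumption made for the example: the slot names, the gesture strings being matched, the item shape `{ face: 0, gesture: '...' }`, and the helper names `fixedLines` and `drawLines` are not part of Human. Only `fillText` is a standard canvas 2D call.

```javascript
// Hypothetical slot table: one fixed line per parameter, always in this order.
// The gesture strings matched below are assumed from this thread.
const slots = [
  { label: 'FACE1', owner: 'face', pattern: /^facing / },
  { label: 'FACE2', owner: 'face', pattern: /^head (up|down)$/ },
  { label: 'FACE3', owner: 'face', pattern: /^blink / },
  { label: 'IRIS1', owner: 'iris', pattern: /^looking (left|right|center)$/ },
  { label: 'IRIS2', owner: 'iris', pattern: /^looking (up|down)$/ },
];

// Build one text line per slot; slots listed in `hidden` are skipped entirely.
function fixedLines(gestures, hidden = []) {
  return slots
    .filter((slot) => !hidden.includes(slot.label))
    .map((slot) => {
      const found = gestures.find(
        (item) => slot.owner in item && slot.pattern.test(item.gesture),
      );
      // A slot with no matching gesture still gets its line, so positions never shift.
      return `${slot.label}: ${found ? found.gesture : '-'}`;
    });
}

// Draw the lines on a canvas 2D context, one below the other.
function drawLines(ctx, lines, x = 6, y = 20, lineHeight = 18) {
  lines.forEach((line, i) => ctx.fillText(line.toUpperCase(), x, y + i * lineHeight));
}

// Example with mocked gestures arriving in arbitrary order.
const lines = fixedLines([
  { face: 0, gesture: 'head up' },
  { iris: 0, gesture: 'looking left' },
  { face: 0, gesture: 'facing center' },
]);
```

In your own drawResults, a call such as `drawLines(canvas.getContext('2d'), fixedLines(result.gesture, ['FACE3']))` could then take the place of the built-in gesture drawing.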

lghasemzadeh commented 3 years ago

Hello Vladimir,

1) I have tested the gaze direction in several different situations and the performance is not robust: as soon as I change the camera position or the lighting conditions, the accuracy drops. Actually, I expected this, because the gaze direction function is rule-based (mathematics-based) and the ratios change with the positions of both the camera and the person. I think it would be a good idea to integrate a robust DL-based gaze estimation model into Human. There is an open-source pre-trained Python model that I am using for my study. Do I need to convert the algorithm to JS and then integrate it into Human, or is there any way to skip the conversion?

What do you think? Is there any way to make the gaze direction part more accurate?

2) The iris distance works in reverse: when I get closer to the webcam it shows a higher distance, and when I get farther from the camera it shows a lower distance.

Thx

vladmandic commented 3 years ago

I have tested the gaze direction in several different situations and the performance is not robust: as soon as I change the camera position or the lighting conditions, the accuracy drops. Actually, I expected this, because the gaze direction function is rule-based (mathematics-based) and the ratios change with the positions of both the camera and the person. I think it would be a good idea to integrate a robust DL-based gaze estimation model into Human. There is an open-source pre-trained Python model that I am using for my study. Do I need to convert the algorithm to JS and then integrate it into Human, or is there any way to skip the conversion?

Can you share the model so I can take a look?

The iris distance works in reverse: when I get closer to the webcam it shows a higher distance, and when I get farther from the camera it shows a lower distance.

Iris was calculating iris size, not actual distance. I've corrected that, latest code is on git.

Values returned now are distance from camera in cm corrected for a typical webcam field of view of 88 degrees
(for example, when i'm sitting in front of notebook, iris distance will be ~30cm).

Note that there is no way to determine camera field of view programmatically, so for more correct measurements user should adjust this value accordingly.
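To show why the field of view matters, here is a pinhole-camera sketch of such an estimate. It is not the code used in Human: the function name `irisDistanceCm`, the assumed average iris diameter of about 11.7 mm, and the treatment of the 88 degrees as a horizontal field of view are all assumptions made for the example.

```javascript
// Sketch of a pinhole-camera distance estimate; not the implementation used in Human.
const IRIS_DIAMETER_CM = 1.17; // assumed average human iris diameter (~11.7 mm)

function irisDistanceCm(irisDiameterPx, imageWidthPx, fovDegrees = 88) {
  // Focal length in pixels, derived from the (assumed horizontal) field of view.
  const halfFovRadians = (fovDegrees * Math.PI) / 180 / 2;
  const focalPx = imageWidthPx / 2 / Math.tan(halfFovRadians);
  // Similar triangles: real size / distance = size in pixels / focal length.
  return (IRIS_DIAMETER_CM * focalPx) / irisDiameterPx;
}

// Example with made-up measurements: a 26 px wide iris in a 1280 px wide frame.
const estimate = irisDistanceCm(26, 1280);
```

The iris size in pixels is in the denominator, so a larger iris on screen means a smaller distance, and a wrong field of view scales every result by the same factor.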

Btw, this iris distance was an actual issue - can you in the future open a separate issue for such items so it can be tracked and closed correctly?

lghasemzadeh commented 3 years ago

1) What exactly do you do in Human to find the gaze? 2) Is there any room to improve the gaze-finding part? Is it possible to make it more robust?

I just shared the link via email. Sure, I will open a separate issue for iris distance. Thanks

vladmandic commented 3 years ago

No need to open separate issue for Iris distance anymore since it's already fixed in main branch as of few days ago
Just please do so for any new issues you find in the future

Anyhow, I've just created a new discussion item and tried to answer most of the questions there: https://github.com/vladmandic/human/discussions/124

lghasemzadeh commented 2 years ago

Hi Vladimir,

I checked the iris detection of MediaPipe and had a problem getting the directions right, and while searching for similar issues I found this one. It gives the iris position inside the eye correctly for left and right, but for up and down the estimate is wrong: the eyelid landmarks move together with the iris landmarks, so the position of the iris relative to the sclera (eye) stays constant. See the video in the link I shared. I think you have this problem in Human as well. Do you have any idea?

Thx

vladmandic commented 2 years ago

exactly the same problem and i don't see an easy way out. but it's also a low priority for me given everything else.