vladmandic / human

Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition
https://vladmandic.github.io/human/demo/index.html
MIT License

Couldn't get any results for face, hand, etc. #113

Closed (okeoke85 closed this issue 3 years ago)

okeoke85 commented 3 years ago

I guess there is a configuration issue: with the same config I get results on the client side, but on the Node side I don't get any.

Config;

{"backend":"tensorflow","modelBasePath":"http://localhost:5000/app/ai/humanapi/models/","wasmPath":"../assets/","debug":true,"async":false,"profile":false,"deallocate":false,"scoped":false,"videoOptimized":false,"warmup":"face","filter":{"enabled":true,"width":0,"height":0,"return":true,"brightness":0,"contrast":0,"sharpness":0,"blur":0,"saturation":0,"hue":0,"negative":false,"sepia":false,"vintage":false,"kodachrome":false,"technicolor":false,"polaroid":false,"pixelate":0},"gesture":{"enabled":true},"face":{"enabled":true,"detector":{"modelPath":"blazeface-back.json","rotation":true,"maxFaces":10,"skipFrames":21,"skipInitial":false,"minConfidence":0.2,"iouThreshold":0.1,"scoreThreshold":0.2,"return":false,"enabled":true},"mesh":{"enabled":true,"modelPath":"facemesh.json"},"iris":{"enabled":true,"modelPath":"iris.json"},"description":{"enabled":true,"modelPath":"faceres.json","skipFrames":31},"emotion":{"enabled":true,"minConfidence":0.1,"skipFrames":32,"modelPath":"emotion.json"},"age":{"enabled":false,"modelPath":"age.json","skipFrames":33},"gender":{"enabled":false,"minConfidence":0.1,"modelPath":"gender.json","skipFrames":34},"embedding":{"enabled":false,"modelPath":"mobileface.json"}},"body":{"enabled":true,"modelPath":"posenet.json","maxDetections":10,"scoreThreshold":0.3,"nmsRadius":20},"hand":{"enabled":true,"rotation":false,"skipFrames":12,"skipInitial":false,"minConfidence":0.1,"iouThreshold":0.1,"scoreThreshold":0.5,"maxHands":1,"landmarks":true,"detector":{"modelPath":"handdetect.json"},"skeleton":{"modelPath":"handskeleton.json"}},"object":{"enabled":true,"modelPath":"nanodet.json","minConfidence":0.2,"iouThreshold":0.4,"maxResults":10,"skipFrames":41}}

Code snippet;

let imageDetection = (await HUMAN_API.detectHuman(imageUrl));
 console.log(`imageDetectionHuman : ${JSON.stringify(imageDetection)}`);

Function;

async function detectHuman(url: string) {

    const tensor = await urlImageHuman(url);
    const result: vladHuman.Result = await human.detect(tensor) as vladHuman.Result;

    console.log('Results:');
    for (let i = 0; i < result.face.length; i++) {
        const face = result.face[i];
        const emotion = face.emotion.reduce((prev, curr) => (prev.score > curr.score ? prev : curr));
        console.log(`  Face: #${i} boxConfidence:${face.boxConfidence} faceConfidence:${face.boxConfidence} age:${face.age} genderConfidence:${face.genderConfidence} gender:${face.gender} emotionScore:${emotion.score} emotion:${emotion.emotion} iris:${face.iris}`);
    }
    for (let i = 0; i < result.body.length; i++) {
        const body = result.body[i];
        console.log(`  Body: #${i} score:${body.score}`);
    }
    for (let i = 0; i < result.hand.length; i++) {
        const hand = result.hand[i];
        console.log(`  Hand: #${i} confidence:${hand.confidence}`);
    }
    for (let i = 0; i < result.gesture.length; i++) {
        const [key, val] = Object.entries(result.gesture[i]);
        console.log(`  Gesture: ${key[0]}#${key[1]} gesture:${val[1]}`);
    }
    for (let i = 0; i < result.object.length; i++) {
        const object = result.object[i];
        console.log(`  Object: #${i} score:${object.score} label:${object.label}`);
    }
    result.face.length = 0;
    return result;
}

Result; imageDetectionHuman : {"face":[],"body":[],"hand":[],"gesture":[],"object":[{"id":11,"strideSize":4,"score":0.35,"class":57,"label":"chair","center":[301,253],"centerRaw":[0.47115384615384615,0.5288461538461539],"box":[264,225,87,46],"boxRaw":[0.41256009615384615,0.47025240384615385,0.13671875,0.09765625]},{"id":4,"strideSize":1,"score":0.3,"class":1,"label":"person","center":[418,276],"centerRaw":[0.6538461538461539,0.5769230769230769],"box":[268,51,350,412],"boxRaw":[0.41947115384615385,0.10817307692307687,0.546875,0.859375]},{"id":0,"strideSize":1,"score":0.2,"class":60,"label":"bed","center":[270,240],"centerRaw":[0.4230769230769231,0.5],"box":[20,15,600,480],"boxRaw":[0.03245192307692307,0.03125,0.9375,1]}],"performance":{"backend":0,"load":504,"image":0,"face":46,"body":33,"hand":115,"object":73,"gesture":0,"total":267}}

vladmandic commented 3 years ago

I don't see anything wrong with the config off the top of my head. Can you:

okeoke85 commented 3 years ago

I do need the URI method to load models, because I will probably use a serverless solution; I didn't have any problems with it. The other things you asked for are below.

thanks.

Image; 1_2021-04-23 22_18_10

Function;

async function urlImageHuman(url: string): Promise<undefined | any> {

    const fetch = require("node-fetch");
    const res = await fetch(url);
    let buffer = await res.buffer();

    const decoded = human.tf.node.decodeImage(buffer);
    const casted = decoded.toFloat();
    let tensor = casted.expandDims(0);

    tensor = human.tf.tidy(() => {
        let res;
        // If the image is RGBA, convert it to RGB
        if (tensor.shape[2] !== 3) {
            const channels = human.tf.split(tensor, 4, 2); // split rgba to channels
            const rgb = human.tf.stack([channels[0], channels[1], channels[2]], 2); // stack channels back to rgb and ignore alpha
            res = human.tf.reshape(rgb, [1, decoded.shape[0], decoded.shape[1], 3]); // move extra dim from the end of tensor and use it as batch number instead
        } else {
            res = casted.expandDims(0); // only add batch number
        }
        return res;
    });

    return tensor;
}

vladmandic commented 3 years ago

> I do need the URI method to load models, because I will probably use a serverless solution; I didn't have any problems with it.

That's fine, I just wanted to check.

Anyhow, I've run a test with your image and detection works just fine:

# node demo/node /tmp/test.jpg
2021-04-24 12:03:50 INFO:  @vladmandic/human version 1.6.1
...
2021-04-24 12:03:51 INFO:  Loading image: /tmp/test.jpg
2021-04-24 12:03:51 STATE: Processing: [ 1, 480, 640, 3, [length]: 4 ]
2021-04-24 12:03:52 DATA:  Results:
2021-04-24 12:03:52 DATA:    Face: #0 boxConfidence:0.88 faceConfidence:0.88 age:33.3 genderConfidence:0.98 gender:male emotionScore:0.41 emotion:sad iris:1.17
2021-04-24 12:03:52 DATA:    Body: #0 score:0.8 landmarks:9
2021-04-24 12:03:52 DATA:    Hand: N/A
2021-04-24 12:03:52 DATA:    Gesture: face#0 gesture:facing right
2021-04-24 12:03:52 DATA:    Gesture: body#0 gesture:leaning left
2021-04-24 12:03:52 DATA:    Gesture: iris#0 gesture:looking left
2021-04-24 12:03:52 DATA:    Object: #0 score:0.73 label:person

I also tested loading from a URL, with the same results:

# node demo/node http://localhost:10020/test.jpg

Can you check your urlImageHuman function against the reference implementation found at https://github.com/vladmandic/human/blob/main/demo/node.js#L54

Update

Ah! Just saw it: you're running expandDims before the RGBA->RGB conversion, so the tensor shape is different from what the conversion expects, and the conversion basically corrupts the input image. The conversion function already takes care of batching, so your explicit expandDims before the conversion should be removed.
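
The ordering described above (convert channels on the 3D image first, add the batch dimension last) can be sketched without any TensorFlow dependency. This is only an illustration under stated assumptions: the `Image` and `Batched` types and the `dropAlpha` and `toBatch` helpers are hypothetical names, not part of Human or TensorFlow.js, and real code would use the tensor ops from the thread instead of plain typed arrays.

```typescript
// Hypothetical, dependency-free stand-ins for a decoded image and a batched input.
interface Image {
  data: Uint8Array;                 // pixel values in height x width x channels order
  shape: [number, number, number];  // [height, width, channels]
}

interface Batched {
  data: Float32Array;
  shape: [number, number, number, number]; // [batch, height, width, channels]
}

// Step 1: work on the 3D image, where shape[2] really is the channel count.
function dropAlpha(img: Image): Image {
  const [height, width, channels] = img.shape;
  if (channels === 3) return img; // already RGB, nothing to convert
  if (channels !== 4) throw new Error(`expected 3 or 4 channels, got ${channels}`);
  const rgb = new Uint8Array(height * width * 3);
  for (let p = 0; p < height * width; p++) {
    rgb[p * 3] = img.data[p * 4];         // R
    rgb[p * 3 + 1] = img.data[p * 4 + 1]; // G
    rgb[p * 3 + 2] = img.data[p * 4 + 2]; // B (alpha at p * 4 + 3 is skipped)
  }
  return { data: rgb, shape: [height, width, 3] };
}

// Step 2: cast to float and add the batch dimension only at the very end.
function toBatch(img: Image): Batched {
  const [height, width, channels] = img.shape;
  return { data: Float32Array.from(img.data), shape: [1, height, width, channels] };
}

// A 1x2 RGBA image used purely as sample data.
const rgba: Image = {
  data: new Uint8Array([10, 20, 30, 255, 40, 50, 60, 255]),
  shape: [1, 2, 4],
};
const input = toBatch(dropAlpha(rgba));
console.log(input.shape, Array.from(input.data));
```

Because the channel check happens while the data is still three-dimensional, `shape[2]` means what the check assumes it means.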

okeoke85 commented 3 years ago

Thank you so much. I guess there will be lots of questions about turning images into tensors coming up in time :)

vladmandic commented 3 years ago

Decoding results in a 3D tensor: height x width x depth.
A tensor is basically a specially created n-dimensional array that can live anywhere (e.g. main memory, GPU memory, etc.) and is accessed through getters and setters, not directly.

expandDims(0) adds one more dimension at the beginning, as that is standard for 90% of models: the first dimension is the image number, so you could batch multiple images in a single request. But since the models here don't support batching, the first dimension is just 1.

So after that, when you checked shape[2] !== 3, instead of checking the color depth you were checking the image width, which is never 3 since all the dimensions had moved by one :). The conversion branch therefore always ran, and you can guess what split and stack then did: corrupt the image.

Btw, the easiest way to see what's going on is to print tensor.shape after each operation.
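
As a rough illustration of that shape check, the sketch below traces shapes only, with no tensors involved. `expandDimsShape` and `trace` are hypothetical helpers written for this example, and the 480 x 640 x 3 shape is taken from the log output earlier in the thread.

```typescript
type Shape = number[];

// Shape that results from expandDims(axis): a dimension of size 1 is inserted at `axis`.
function expandDimsShape(shape: Shape, axis: number): Shape {
  return [...shape.slice(0, axis), 1, ...shape.slice(axis)];
}

// Print a labelled shape and hand it back, mimicking a tensor.shape log after each op.
function trace(label: string, shape: Shape): Shape {
  console.log(`${label}: [${shape.join(", ")}]`);
  return shape;
}

const decoded = trace("decoded", [480, 640, 3]);                     // height, width, channels
const batched = trace("expandDims(0)", expandDimsShape(decoded, 0)); // batch, height, width, channels

// The same index means different things before and after batching.
const needsConversionBefore = decoded[2] !== 3; // index 2 is the channel count here
const needsConversionAfter = batched[2] !== 3;  // index 2 is now the image width
console.log({ needsConversionBefore, needsConversionAfter });
```

On the 3D shape the check correctly reports that no conversion is needed, while on the batched shape it compares the width against 3 and wrongly requests one.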

okeoke85 commented 3 years ago

Thank you, checking tensor.shape is what I'm going to do, but I'll keep asking :)