justadudewhohacks / face-api.js

JavaScript API for face detection and face recognition in the browser and nodejs with tensorflow.js

Conditionally use High-Level API to detect face landmarks only when 1 face detected? #552

Open Infinitay opened 4 years ago

Infinitay commented 4 years ago

For my use case, I am attempting to detect the facial landmarks only when exactly one face was detected. I am well aware that detectSingleFace returns the face with the highest score, but since I am scraping images to build a database, I am not sure the highest-scoring face will belong to the right subject. In fact, I have already come across an example where it does not.

I could execute faceapi.detectAllFaces(input).withFaceLandmarks(), but that doesn't seem efficient to me: it would detect facial landmarks even when there are multiple faces, which is exactly what I am trying to avoid.

Is it possible to conditionally use the high-level API, perhaps using promises or something similar? Below is a non-working snippet I was messing around with; hopefully it portrays my idea better.

await new Promise((resolve, error) => {
    faceapi.detectAllFaces(img).then(detectedFaces => {
        console.log(`Detected ${detectedFaces.length} faces.`);
        if (detectedFaces.length === 1) {
            resolve(detectedFaces)
        } else {
            error('Found more than one face');
        }
    }).withFaceLandmarks(landmarks => {
        console.log(landmarks);
    });
});
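
For clarity, here is roughly the same idea written with async/await instead of the explicit Promise wrapper. The landmark step is a placeholder (somehowDetectLandmarksOnly is made up), since I don't know whether there is a cheaper landmark-only call I could use there:

async function detectIfSingleFace(img) {
    // Run the face detector first
    const detectedFaces = await faceapi.detectAllFaces(img);
    console.log(`Detected ${detectedFaces.length} faces.`);
    if (detectedFaces.length !== 1) {
        throw new Error('Expected exactly one face');
    }
    // Only now run the landmark step, and only for that single face
    // (placeholder: I don't know the right call for this yet)
    const landmarks = await somehowDetectLandmarksOnly(img, detectedFaces[0]);
    return landmarks;
}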
justadudewhohacks commented 4 years ago

You can use faceapi.detectAllFaces to get the bounding boxes, and then use the low level API to detect landmarks if detections === 1:

    const results = await faceapi.detectAllFaces(img)
    if (results.length === 1) {
      const faces = await faceapi.extractFaces(img, results.map(res => res.detection))
      const landmarks = await faceapi.nets.faceLandmark68Net.detectLandmarks(faces[0])
    }

You can also have a look at the implementation of the high level API, to get a better idea of how the low level API works: DetectFaceLandmarksTasks.ts

Infinitay commented 4 years ago

Thank you for the snippet and the additional link to an implementation.

I do have a few follow-up questions, if you don't mind clearing them up.

Here is my current implementation with your help:

try {
    const image = await canvas.loadImage(await googleUtils.getImageAsBuffer(file.id));
    //console.log(`[${i++}] Detecting faces in ${file.id}`);
    const detections = await faceapi.detectAllFaces(image, faceapiOptions);
    if (detections.length == 1) {
        //console.log(`[${i}] Detecting single face in ${file.id} | Total found: ${facecount++}`);
        const faces = await faceapi.extractFaces(image, detections[0].detection); // doesn't work unless the second argument is detections#map
        const landmarks = await faceapi.detectLandmarks(faces[0]);
        const descriptors = await faceapi.computeFaceDescriptor(faces[0]);
        return descriptors;
    }
} catch (err) {
    console.warn(`Hit an error trying to detect faces in file '${file.name}' (${file.id})`);
    console.error(err);
}
Infinitay commented 4 years ago

Scratch that; I forgot to remove my previous code where I was using the high-level API, and that code was executing with no problem: await faceapi.detectAllFaces(image, faceapiOptions).withFaceLandmarks().withFaceDescriptors();

But when I got rid of the chained methods after detectAllFaces, leaving just await faceapi.detectAllFaces(image, faceapiOptions), I get the following error:

~~TypeError: Cannot read property 'clipAtImageBorders' of undefined at \node_modules\face-api.js\build\commonjs\dom\extractFaces.js:48:58~~

~~I assume this is because of the image I am passing in: const image = await canvas.loadImage(await googleUtils.getImageAsBuffer(file.id));~~

Going to look into this a bit more

EDIT: Isn't faceapi.detectAllFaces supposed to return FaceDetection results that have a detection property? Because if so, that isn't happening in the snippet you provided.

After looking at DetectFacesTasks, I see that you should be passing in just the results array, not results.map(...). At least, that's what fixed the error I was having above: the map was trying to reference res.detection, but detectAllFaces returns FaceDetection objects that have no detection property.
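
To be concrete, this is the adjusted version of the snippet that made the error go away for me (passing results itself to extractFaces rather than mapping to res.detection); I haven't checked the landmark output beyond that:

const results = await faceapi.detectAllFaces(img);
if (results.length === 1) {
    // Pass the FaceDetection array directly; mapping to res.detection fails
    // here because plain detectAllFaces results have no detection property
    const faces = await faceapi.extractFaces(img, results);
    const landmarks = await faceapi.detectLandmarks(faces[0]);
}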

Infinitay commented 4 years ago

Please ignore the previous two comments. I attempted the solution you provided, passing in my img (const img = await canvas.loadImage(await googleUtils.getImageAsBuffer(<fileId>));), and received the following error:

UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'clipAtImageBorders' of undefined at ...\node_modules\face-api.js\build\commonjs\dom\extractFaces.js:48:58


However, I noticed that if I change your snippet to detectAllFaces followed by withFaceLandmarks, it works fine. For example:

const img = await canvas.loadImage(await googleUtils.getImageAsBuffer(<fileId>));
const results = await faceapi.detectAllFaces(img).withFaceLandmarks();
if (results.length === 1) {
    const faces = await faceapi.extractFaces(img, results.map(res => res.detection));
    const landmarks = await faceapi.detectLandmarks(faces[0]);
}

Once again, if I remove withFaceLandmarks, it results in an error. However, I forgot to mention in the main post that I want the facial descriptors. So, when I attempt to compute the descriptors:

const img = await canvas.loadImage(await googleUtils.getImageAsBuffer(<fileId>));
const img2 = await canvas.loadImage(await googleUtils.getImageAsBuffer(<fileId>));
const results = await faceapi.detectAllFaces(img).withFaceLandmarks();
if (results.length === 1) {
    const faces = await faceapi.extractFaces(img, results.map(res => res.detection));
    const landmarks = await faceapi.detectLandmarks(faces[0]);
    console.log(JSON.stringify(await faceapi.computeFaceDescriptor(faces[0]))); // Returns descriptor A
    console.log(JSON.stringify(await faceapi.computeFaceDescriptor(img))); // Returns descriptor B
}
console.log(JSON.stringify((await faceapi.detectAllFaces(img2).withFaceLandmarks().withFaceDescriptors())[0].descriptor)); // Returns descriptor C

All three of those calls return different descriptors. I feel like I am not understanding something properly here. Firstly, I thought that detectLandmarks manipulated the input and aligned it, as described in the README for the high-level API.
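
If the descriptor step in the high-level pipeline aligns the face using the landmarks before computing the descriptor, I would have expected something along these lines to reproduce descriptor C. This is just my guess at what happens under the hood, and I'm assuming landmarks.align() is the right call for getting the aligned box:

const results = await faceapi.detectAllFaces(img).withFaceLandmarks();
if (results.length === 1) {
    // Guess: align via the detected landmarks, extract the aligned crop,
    // then compute the descriptor on that crop rather than on the raw image
    // (assumes landmarks.align() returns the aligned box; I may be wrong here)
    const alignedBox = results[0].landmarks.align();
    const alignedFaces = await faceapi.extractFaces(img, [alignedBox]);
    console.log(JSON.stringify(await faceapi.computeFaceDescriptor(alignedFaces[0])));
}

Is that roughly what withFaceDescriptors does internally, or am I off base?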