ziyuan-linn opened 1 year ago
Hi everyone, I think we ran into the same concerns about the model prediction output for face mesh, so I'm putting my solution here (feedback needed!).
The tf.js original output for face mesh looks like this:

```js
[
  {
    box: {
      xMin: 304.6476503248806,
      xMax: 502.5079975897382,
      yMin: 102.16298762367356,
      yMax: 349.035215984403,
      width: 197.86034726485758,
      height: 246.87222836072945
    },
    keypoints: [
      {x: 406.53152857172876, y: 256.8054528661723, z: 10.2, name: "lips"},
      {x: 406.544237446397, y: 230.06933367750395, z: 8},
      ...
    ],
  }
]
```
My idea for organizing the keypoints is to expose each feature's centerX, centerY, width, and height based on some basic calculations:
```js
featuresData {
  "leftEye": {
    "centerX": ,
    "centerY": ,
    "width": ,
    "height":
  },
  "rightEye": {
    "centerX": ,
    "centerY": ,
    "width": ,
    "height":
  },
  {...},
  {...},
  {...},
}
```
Here is my function for getting the data. I'm wondering whether we should organize all the face features so that users can access them directly, or just leave an example?
```js
// A function to store basic data for certain facial features; this example is for lips.
// The named features are faceOval, rightEyebrow, leftEyebrow, rightEye, leftEye, and lips.
function featuresData() {
  if (predictions.length > 0) {
    for (let i = 0; i < predictions.length; i += 1) {
      const face = predictions[i];
      const fKeypointX = [];
      const fKeypointY = [];
      for (let j = 0; j < face.keypoints.length; j += 1) {
        const keypoint = face.keypoints[j];
        // keypoint.name holds the facial feature the point belongs to
        if (keypoint.name === "lips") {
          fKeypointX.push(keypoint.x);
          fKeypointY.push(keypoint.y);
        }
      }
      // An example object holding the important data of a facial feature
      const featureData = {
        lips: {
          centerX: avg(fKeypointX),
          centerY: avg(fKeypointY),
          width: extent(fKeypointX),
          height: extent(fKeypointY),
        },
      };
      // console.log(featureData);
    }
  }

  // Midpoint between the smallest and largest values
  function avg(values) {
    return (max(values) + min(values)) / 2;
  }

  // Distance between the smallest and largest values
  function extent(values) {
    return max(values) - min(values);
  }
}
```
I found that the facemesh model does not have a preset nose area. Do we need to add a preset nose?
The keypoint diagram is really useful for me! I also added a function that lets users get the index of the point closest to their mouse.
Here's my function to show the index of the points:
```js
// Show the index of the keypoint closest to the mouse
function directPoints() {
  if (predictions.length > 0) {
    for (let i = 0; i < predictions.length; i += 1) {
      const face = predictions[i];
      const dMouse = [];
      for (let j = 0; j < face.keypoints.length; j += 1) {
        const keypoint = face.keypoints[j];
        // calculate the distance between the mouse and each point
        dMouse.push(dist(keypoint.x, keypoint.y, mouseX, mouseY));
      }
      const closest = dMouse.indexOf(min(dMouse));
      fill(255, 0, 0);
      ellipse(face.keypoints[closest].x, face.keypoints[closest].y, 5, 5);
      console.log(closest);
    }
  }
}
```
Feedback Needed!
Hi @B2xx, if you take a look at @ziyuan-linn's latest in #35, this may help as a guide for the face keypoints!

One comment about your earlier post is that the `featuresData` property isn't a clear name for me. Does the API output an array of faces or just one face? Regardless, I think any face object can include the "parts" directly along with a `keypoints` array. I'm imagining something like:
```js
function gotFaces(faces) {
  // all faces
  console.log(faces);
  // one face
  console.log(faces[0]);
  // bounding box of face (not sure if x, y should be centered or top left?)
  console.log(faces[0].x, faces[0].y, faces[0].width, faces[0].height);
  // all keypoints
  console.log(faces[0].keypoints);
  // one keypoint
  console.log(faces[0].keypoints[0].x, faces[0].keypoints[0].y);
  // x, y of a part (should this be center or top left?
  // should width and height also be included for part bounding box?)
  console.log(faces[0].mouth.x, faces[0].mouth.y);
  // all of the part keypoints
  console.log(faces[0].mouth.keypoints);
  // x, y of one part keypoint
  console.log(faces[0].mouth.keypoints[0].x, faces[0].mouth.keypoints[0].y);
  // etc.
}
```
Hi @shiffman, we have updated the output of the facemesh model according to @ziyuan-linn's latest in #35, and its output looks like this now!
```js
[
  {
    "box": { //<-----------------------add
      "height": 115.38676768541336,
      "width": 93.99256706237793,
      "xMax": 249.73242282867432,
      "xMin": ...,
      "yMax": ...,
      "yMin": ...,
    },
    "faceOval": [ //<-----------------------add
      {
        "x": 202.27954387664795,
        "y": 50.33646672964096,
        "z": 2.1165020763874054,
      },
      {...},
      {...},
      {...},
      {...},
    ],
    "keypoints": [
      {
        "x": 201.72533988952637,
        "y": 122.80799746513367,
        "z": 13.084457814693451,
        "name": "lips"
      },
    ],
    "leftEye": [], //<-----------------------add
    "leftEyebrow": [], //<-----------------------add
    "ringFinger": [], //<-----------------------add
    "lips": [], //<-----------------------add
    "rightEye": [], //<-----------------------add
    "rightEyebrow": [], //<-----------------------add
  }
]
```
Besides, I have made a pull request to merge our newest facemesh-noeventestr branch into main. Could you look into it?
Thank you @ziyuan-linn for helping us debug the output of facemesh!
One other option is to return an object which has methods and not just raw data, e.g. we would define a `HandPrediction` class and return instances of it.
Possible APIs:

```js
prediction.getKeypoint('pinkyTip'); // returns x, y
prediction.getKeypoint3D('pinkyTip'); // returns x, y, z
prediction.getShape('ringFinger'); // returns an array of points?
prediction.getBoundingBox(); // returns the rectangle dimensions
prediction.getKeypoints(); // returns the array of all x, y points
```
Let me know if you want my help with this.
Hi @lindapaiste, thank you so much for following the continued development of this library! Your previous work and pull requests have been an invaluable resource as we look to reboot and release a "next generation" ml5.js!
I like this idea and see how it could help simplify things, especially for a face detection model which includes many parts, keypoints, etc. Returning a `p5.Vector` could also be very convenient (but then reduces compatibility outside of p5.js). Curious to hear from everyone else! cc @MOQN @gohai @ziyuan-linn @sproutleaf (and more!)
> Returning a `p5.Vector` could also be very convenient (but then reduces compatibility outside of p5.js)

We could potentially return a `p5.Vector` when p5 is loaded and otherwise return a plain object with `x`, `y`, and `z`. The `p5.Vector` has properties `x`, `y`, and `z`, so it would not be dramatically different between the two modes.
Rough code based on the handpose data
```js
function maybeVector(point) {
  if (p5Utils.checkP5()) {
    const p5 = p5Utils.p5Instance;
    return p5.createVector(point.x, point.y, point.z);
  } else return point;
}

class DetectedHand {
  constructor(data) {
    this.data = data;
  }

  getKeypoints() {
    return this.data.keypoints.map(maybeVector);
  }

  getKeypoints3D() {
    return this.data.keypoints3D.map(maybeVector);
  }

  _findKeypoint(array, partName) {
    const point = array.find(point => point.name === partName);
    if (!point) {
      throw new Error(
        `No keypoint found with name ${partName}.
        Available names: ${this.data.keypoints.map(point => point.name).join(', ')}`
      );
    }
    return maybeVector(point);
  }

  getKeypoint(partName) {
    return this._findKeypoint(this.data.keypoints, partName);
  }

  getKeypoint3D(partName) {
    return this._findKeypoint(this.data.keypoints3D, partName);
  }

  getShape(partName) {
    // may require a specific mapping of keypoints to parts
    return this.data.keypoints
      .filter(point => point.name.startsWith(partName))
      .map(maybeVector);
  }
}
```
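For context, a hypothetical usage of the class above (assuming the wrapped data names its points `pinkyTip` etc., following the earlier API sketch in this thread):

```js
// Hypothetical usage of the DetectedHand wrapper; rawHand is one prediction from the model
const hand = new DetectedHand(rawHand);
const tip = hand.getKeypoint('pinkyTip'); // p5.Vector inside p5, plain {x, y, z} otherwise
console.log(tip.x, tip.y);
```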
I think this is also in a settled place as we move towards the 1.0 release, and perhaps this issue should also be closed? The discussion here is of course welcome to continue, but I'm hesitant to make any major API changes before release!
Hi everyone, I'm opening this thread to discuss the model prediction output for hand detection, though I think a lot of things here can also be applied to other landmark detection models.
Keypoints
The tf.js original output for hand detection looks like this:
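An abridged reconstruction of that output (values here are illustrative; the structure follows the `@tensorflow-models/hand-pose-detection` output with `score`, `handedness`, `keypoints`, and `keypoints3D`):

```js
[
  {
    score: 0.97,
    handedness: "Right",
    keypoints: [
      {x: 105.6, y: 107.0, name: "wrist"},
      {x: 108.2, y: 160.7, name: "thumb_cmc"},
      ...
    ],
    keypoints3D: [
      {x: -0.007, y: 0.087, z: 0.033, name: "wrist"},
      ...
    ]
  }
]
```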
One idea is to expose each keypoint by name so they can be more intuitively accessed, for example:
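```js
// Hypothetical sketch; the property name thumbTip is a placeholder, not a settled API
const hand = predictions[0];
console.log(hand.keypoints[4].x); // by index, requires checking the keypoint diagram
console.log(hand.thumbTip.x);     // by name, more intuitive
```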
@yining1023 suggested grouping landmarks of each finger together with intuitive names like wrist, thumb, etc...
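A rough sketch of what that grouped output could look like (names and nesting are illustrative, not a settled API):

```js
// Illustrative grouped structure for one hand (coordinates are made up)
const hand = {
  wrist: { x: 105.6, y: 107.0 },
  // each finger: an array of its landmarks, base to tip
  thumb: [
    { x: 108.2, y: 160.7 }, // thumb_cmc
    { x: 112.9, y: 205.8 }, // thumb_mcp
    { x: 119.3, y: 238.4 }, // thumb_ip
    { x: 125.1, y: 265.2 }, // thumb_tip
  ],
  // indexFinger, middleFinger, ringFinger, pinky follow the same pattern
};
console.log(hand.thumb[3]); // the thumb tip
```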
Handedness
I think this feature could potentially be very useful for users. However, the handedness is the opposite of the actual hand (a left hand is labeled as right). I found that when `flipHorizontal` is set to true, the handedness is labeled correctly. We could potentially flip the handedness value within ml5 when `flipHorizontal` is false.
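A minimal sketch of that correction, assuming the raw result carries a `handedness` string of "Left" or "Right" (`correctHandedness` is a hypothetical helper, not existing ml5 code):

```js
// Hypothetical helper: swap the reported handedness when the image is not mirrored
function correctHandedness(hand, flipHorizontal) {
  if (!flipHorizontal) {
    hand.handedness = hand.handedness === "Left" ? "Right" : "Left";
  }
  return hand;
}
```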
Keypoint Diagram
Tf.js has a diagram outlining the index and name of each keypoint.
I personally find this kind of diagram very helpful when trying to find a landmark point quickly. I think there are similar diagrams for the other tf.js landmark detection models. @MOQN, do you think we could display or link these diagrams on the new website?
I'm happy to hear any suggestions or ideas!