Open ghost opened 8 years ago
Thanks for your interest in this project.
I know what you mean about the noisy, jumpy landmarks. Unfortunately, I have not found a solution to this problem yet.
What do you mean regarding OSX? Did you run the same project on OSX?
I have one suggestion that could work, but I have not tested it. Try plotting the rectangles that iOS gives you onto the camera image as well; maybe they are a little too small for dlib to find the face in. You could try padding the face rectangles by 5-50px on each side and test whether that works better.
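The padding idea could be sketched like this; `Rect` is a plain stand-in for `dlib::rectangle` (which also stores left/top/right/bottom), and the pixel amounts are just the guesses from above:

```cpp
#include <algorithm>

// Plain stand-in for dlib::rectangle.
struct Rect {
    long left, top, right, bottom;
};

// Grow the detection box by `pad` pixels on every side, clamped to the
// image bounds so dlib never receives coordinates outside the frame.
Rect padRect(const Rect& r, long pad, long imgWidth, long imgHeight) {
    return Rect{
        std::max(0L, r.left - pad),
        std::max(0L, r.top - pad),
        std::min(imgWidth - 1,  r.right + pad),
        std::min(imgHeight - 1, r.bottom + pad)
    };
}
```

The clamping matters: a padded rect that extends past the frame can make dlib sample out-of-bounds memory or simply fail to match.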
Thanks for the quick update. Good idea! I'll give it a try and let you know.
Regarding OSX, I built from source for OSX per Satya's tutorial.
Could you please advise on the best place to do that (increase the face rectangle)?
I did some digging, and it seems
(dlib::rectangle)convertScaleCGRect:(CGRect)rect toDlibRectacleWithImageSize:(CGSize)size
is doing the conversion from the AVMetadataFaceObject bounds to a dlib rectangle, but increasing the size and origin in that method won't fix it; instead, it breaks face recognition.
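For context, AVMetadataFaceObject reports its bounds as normalized [0, 1] values, so a conversion method like the one above has to scale them by the pixel-buffer size. A minimal sketch of just that scaling (struct names are hypothetical stand-ins for CGRect and dlib::rectangle; the project's real method also has to handle orientation and mirroring, which this omits):

```cpp
// Hypothetical stand-ins: NormRect mirrors a normalized CGRect,
// PixelRect mirrors dlib::rectangle's left/top/right/bottom layout.
struct NormRect  { double x, y, w, h; };
struct PixelRect { long left, top, right, bottom; };

// Scale normalized [0,1] bounds to pixel coordinates for a given image size.
PixelRect toPixelRect(const NormRect& r, long imgWidth, long imgHeight) {
    return PixelRect{
        static_cast<long>(r.x * imgWidth),
        static_cast<long>(r.y * imgHeight),
        static_cast<long>((r.x + r.w) * imgWidth) - 1,
        static_cast<long>((r.y + r.h) * imgHeight) - 1
    };
}
```

If padding is applied to the normalized rect before this scaling, the growth is uneven in x and y (the image is wider than it is tall), which may be why adjusting size and origin inside the conversion broke detection.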
I even tried another approach with transformedMetadataObjectForMetadataObject, passing in the actual rect in the view instead of the scalar values provided by AVMetadataFaceObject, but that didn't help either.
At this point I am certain of one thing: the face rect returned by AVMetadataFaceObject is way too small, so your suspicion is correct.
I am also considering switching to CIDetector to see if I can get a better value for the face. Any thoughts?
Bump. Did anyone figure out how to improve the noise level? I don't think I'm the only one with this issue.
I also increased the detected rectangle, but that didn't help.
I'm working with a portrait-orientation version of this app, and I think it's less stable because we initially pass the DlibWrapper function an image with smaller dimensions than the one we output. At the smaller scale the jumpiness wouldn't be so conspicuous, but when we ultimately scale the image up, we amplify the noise.
@zweigraf, could that be it, at least for my issues? If so, I guess I should try passing in a properly scaled image before sending it off to dlib.
I also see significant jitter/noise on iOS vs. OSX. I've been rendering the face detection box to debug, and the noise and jitter are present even when the face detection remains constant and everything in the frame is motionless. My OSX webcam is of worse quality and about the same resolution. Has anyone resolved this issue?
I'm also using my own iOS project (just ended up here from googling), however I also built this one and see similar behavior.
@mosn / For anyone else who lands here from Googling:
I was able to figure out the issue on my end: it was the result of reading the sample buffer as if it contained BGR pixels when in fact it was BGRA. I fixed it by reading the buffer as BGRA into a CV_8UC4 mat and then converting the mat to BGR (because assertions in dlib's methods reject mats with an alpha channel).
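The fix amounts to dropping every fourth byte. A minimal sketch of the same conversion on a raw buffer, assuming the capture session delivers kCVPixelFormatType_32BGRA (the OpenCV route would be `cv::cvtColor(mat, out, cv::COLOR_BGRA2BGR)`):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a packed BGRA byte buffer (4 bytes per pixel) to BGR
// (3 bytes per pixel) by discarding the alpha byte of each pixel.
std::vector<uint8_t> bgraToBgr(const std::vector<uint8_t>& bgra) {
    std::vector<uint8_t> bgr;
    bgr.reserve(bgra.size() / 4 * 3);
    for (std::size_t i = 0; i + 3 < bgra.size(); i += 4) {
        bgr.push_back(bgra[i]);     // B
        bgr.push_back(bgra[i + 1]); // G
        bgr.push_back(bgra[i + 2]); // R
        // bgra[i + 3] is alpha -- discarded
    }
    return bgr;
}
```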
@faoiseamh I think I'm experiencing the same issue - I used two iPads both stationary, with one viewing a video on the other (instead of shakily holding a camera up at my own face), and the amount of jitter was the same.
I'm not using a cv::Mat at the point where I read from the AVCaptureSession sample buffer; I just step through the buffer four elements at a time, ignoring the fourth value each time, and create a dlib::bgr_pixel from each set of three values.
Does that mean I need to introduce a step where I create a mat from this sample buffer, convert it, and then step through that in the same way as I do with the original buffer, as described above? I'm not sure how I'd do that yet, but it would be good to know whether this sounds like the exact process you followed for a reliable fix!
@Cloov If you're building the dlib data structure directly, that should be fine. The format of the buffer depends on the pixel format you've set for the AV capture session. If it's BGRA then your logic is right, but it may be one of several other, more exotic formats. I would first validate the format you set. I'd also try rendering the dlib data structure back to a UIImage or something you can view in the debugger, to validate that it's actually what you expect it to be. I was able to spot the bug easily once I rendered a frame.
I still have more jitter on an iPad I'm testing on vs. the same code on desktop, and I'm tracking down the sources of this. I'll update you if I resolve it. The fix I described above significantly improved the jitter though.
I gave the native iOS face detection a try as a faster alternative to dlib's frontal face detector. The results are significantly faster, but the resulting face bounding box is not optimal as input for dlib's landmarking. It tends to be smaller, and as a result the landmarking gets confused if the head is at even relatively small angles. This is probably a large source of the "noise" everyone is experiencing.
I've been using the native iOS face detection, but it was already in the project I began with. I tried simply expanding those boxes by 5, 20, and then 50 pixels on each side (as a temporary measure), since the position of the face detection was fine; it just seemed that the rectangles were tightly highlighting the facial features without encompassing any more of the face.
However, that didn't improve the jitter for me. If I look at videos online of the same 68-point landmark detection, there is some jitter in a similar place: for me, it's mostly around the chin area, and on all types of faces. Mouth movements don't track very well either.
I know you can retrain these models or create your own, but since the model file I'm using is so widely used, I don't think that's the solution at all.
@faoiseamh, I may still try skipping the native face detection and using dlib's, as I have no other ideas at the moment! I did inspect the camera data as it goes through dlib, and the pixel formats look fine; besides, I now also ensure I'm using the iOS camera's BGRA format.
@Cloov The face detection being too large is a problem as well. It seems the landmarking relies on a fairly precise detection area: if it's too large, the edges of the face can jump around, so the chin and jaw line are problematic. My current iteration uses the native iOS face detection as a starting point, expands that by 25% in all directions, then downsizes the resulting region and runs dlib's face detection on the downsized area. That's the best balance of performance and precision I've found. To reduce noise, I'm also using a simple moving average of the output. You could enhance this with statistical outlier-rejection techniques, but this was sufficient for my needs.
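The 25% expansion step described above might look roughly like this (again with a plain struct standing in for dlib::rectangle, and clamping added so the expanded crop stays inside the frame):

```cpp
#include <algorithm>

struct Box { long left, top, right, bottom; };

// Expand the native iOS detection box by `frac` (e.g. 0.25) of its own
// width/height in every direction, clamped to the image bounds so the
// resulting crop region stays valid.
Box expandBox(const Box& b, double frac, long imgW, long imgH) {
    long padX = static_cast<long>((b.right - b.left + 1) * frac);
    long padY = static_cast<long>((b.bottom - b.top + 1) * frac);
    return Box{
        std::max(0L, b.left - padX),
        std::max(0L, b.top - padY),
        std::min(imgW - 1, b.right + padX),
        std::min(imgH - 1, b.bottom + padY)
    };
}
```

Running dlib's detector only inside this expanded region, rather than on the full frame, is what recovers most of the speed while keeping dlib's tighter, landmark-friendly detection box.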
There are a variety of other facial landmarking algorithms out there, and iOS 11 also has a native landmarking feature in Vision.framework, so if I need to improve this further I'm going to try other libraries. I think I've squeezed as much performance out of dlib as I can. My main additional need is more robust performance at extreme angles (profile faces), which is not intended functionality in dlib.
@faoiseamh I think a lot of the more extreme noise I was experiencing was down to reflections! I was often pointing at 3D models on my screen. I also think I need to average out changes in my pose transform, because even when detection is going well, a quite-still target face causes a lot of shaking. Do you have any advice on averaging this motion out, in terms of techniques or algorithms? Are they in OpenCV/dlib?
I rolled my own, and it's not advanced at all. I'm just using a moving average, i.e. averaging the last several values. You could take the trailing X values and apply a more powerful statistical technique to remove outliers if you wanted even better results.
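A minimal sketch of such a smoother, for a single scalar coordinate (in practice you would run one instance per landmark coordinate, or per pose-transform component):

```cpp
#include <cstddef>
#include <deque>

// Sliding-window moving average: keep the last `window` samples and
// return their mean each time a new sample arrives.
class MovingAverage {
public:
    explicit MovingAverage(std::size_t window) : window_(window) {}

    double add(double v) {
        samples_.push_back(v);
        if (samples_.size() > window_) samples_.pop_front();
        double sum = 0.0;
        for (double s : samples_) sum += s;
        return sum / static_cast<double>(samples_.size());
    }

private:
    std::size_t window_;
    std::deque<double> samples_;
};
```

The window size trades latency for smoothness: a larger window suppresses more jitter but makes the landmarks lag behind real motion.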
@Cloov @faoiseamh Hi guys, I ended up here from googling too. @faoiseamh I'm not sure your suggestion about changing BGRA to BGR will make a difference, because if you print the alpha components of your BGRA sample buffer you can see that alpha is always \xff, i.e. 100%. So, as far as I understand, it should basically be the same as just ignoring the alpha component. I use this exact project, and while testing on my iPhone 4 I too experience constant jumping/noise around the eyes. I wonder if it can be improved somehow?
Currently, the landmarks seem noisy and jump a lot (in a clear environment with sufficient light). However, this is a lot more stable when compiling dlib for OSX. Is there something we can do to improve the noise level on iOS?
Many thanks for your awesome work!