takuya-takeuchi / FaceRecognitionDotNet

The world's simplest facial recognition api for .NET on Windows, MacOS and Linux
MIT License

Code find a error #1

Closed chinapsu closed 6 years ago

chinapsu commented 6 years ago

The original Python source code is:

def _raw_face_locations(img, number_of_times_to_upsample=1, model="hog"):
    """
    Returns an array of bounding boxes of human faces in a image

    :param img: An image (as a numpy array)
    :param number_of_times_to_upsample: How many times to upsample the image looking for faces. Higher numbers find smaller faces.
    :param model: Which face detection model to use. "hog" is less accurate but faster on CPUs. "cnn" is a more accurate
                  deep-learning model which is GPU/CUDA accelerated (if available). The default is "hog".
    :return: A list of dlib 'rect' objects of found face locations
    """
    if model == "cnn":
        return cnn_face_detector(img, number_of_times_to_upsample)
    else:
        return face_detector(img, number_of_times_to_upsample)

But the code here is:

        private IEnumerable<MModRect> RawFaceLocations(Image faceImage, int numberOfTimesToUpsample = 1, Models model = Models.Hog)
        {
            switch (model)
            {
                case Models.Hog:
                    return CnnFaceDetectionodelV1.Detect(this._CnnFaceDetector, faceImage.Matrix, numberOfTimesToUpsample);
                default:
                    return this._FaceDetector.Operator(faceImage.Matrix, numberOfTimesToUpsample).Select(rectangle => new MModRect() { Rect = rectangle });
            }
        }

This is in the FaceRecognition.cs file.

Can you fix this? Thank you.

The first case in the switch should be 'case Models.Cnn'.
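As a minimal sketch of the intended dispatch, written in Python to mirror the original: the `Model` enum and the two stand-in detector functions below are hypothetical and only return labels. The point is that the CNN detector is chosen only when the model is explicitly CNN, and HOG is the fallback.

```python
from enum import Enum


class Model(Enum):
    """Hypothetical stand-in for the Models enum in the C# port."""
    HOG = "hog"
    CNN = "cnn"


def hog_detector(img, upsample):
    # Stand-in only: a real detector would return face rectangles.
    return ("hog", upsample)


def cnn_detector(img, upsample):
    # Stand-in only: a real detector would return face rectangles.
    return ("cnn", upsample)


def raw_face_locations(img, number_of_times_to_upsample=1, model=Model.HOG):
    # CNN only when explicitly requested; everything else falls back to HOG,
    # matching the if/else in the Python original.
    if model is Model.CNN:
        return cnn_detector(img, number_of_times_to_upsample)
    return hog_detector(img, number_of_times_to_upsample)
```

With the labels swapped, as in the reported code, a request for `Model.HOG` would silently run the CNN path and the reverse.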


takuya-takeuchi commented 6 years ago

Oh, what a stupid bug!! Thank you for your report!!

You will see the fixed library tomorrow morning :)

chinapsu commented 6 years ago

Does this part need to be changed to the following?

        private IEnumerable<MModRect> RawFaceLocations(Image faceImage, int numberOfTimesToUpsample = 1, Models model = Models.Hog)
        {
            switch (model)
            {
                case Models.Cnn:
                    return CnnFaceDetectionodelV1.Detect(this._CnnFaceDetector, faceImage.Matrix, numberOfTimesToUpsample);
                default:
                    {
                        using (var pyr = new PyramidDown(2))
                        {
                            var rects = new List<MModRect>();
                            var image = faceImage.Matrix;
                            var levels = numberOfTimesToUpsample;
                            while (levels > 0)
                            {
                                levels--;
                                DlibDotNet.Dlib.PyramidUp<PyramidDown>(image, 2);
                            }
                            var dets = this._CnnFaceDetector.Operator(image);
                            foreach (var d in dets.First())
                            {
                                var drect = pyr.RectDown(new DRectangle(d.Rect), (uint)numberOfTimesToUpsample);
                                d.Rect = new Rectangle((int)drect.Left, (int)drect.Top, (int)drect.Right, (int)drect.Bottom);
                                rects.Add(d);
                            }
                            return rects;
                        }
                    }
            }
        }

Otherwise, no face is detected in the Yalta.jpg image, although DlibDotNet can find 4 faces.

takuya-takeuchi commented 6 years ago

Could you tell me where the above source comes from? I checked the dlib C++ code behind the Python bindings, but there is no special code there.

    if model == "cnn":
        return cnn_face_detector(img, number_of_times_to_upsample)
    else:
        return face_detector(img, number_of_times_to_upsample)

I guess number_of_times_to_upsample is a misleading name. face_detector is a typedef of 'dlib::object_detector<dlib::scan_fhog_pyramid<dlib::pyramid_down<6u>, dlib::default_fhog_feature_extractor> >'. You can see it at http://dlib.net/python/index.html#dlib.simple_object_detector. The operator of this class is defined in https://github.com/davisking/dlib/blob/master/dlib/image_processing/object_detector.h#L420-L461

Therefore, I do not need to change the body of the switch cases.

Please let me know if anything is wrong.

takuya-takeuchi commented 6 years ago

First of all, I fixed the switch case statement in f7290dcd722a24625c89e50b31f8d01d156602f8. Please pull the latest develop branch and give it a try!!!

chinapsu commented 6 years ago

Hello. The code:

    if model == "cnn":
        return cnn_face_detector(img, number_of_times_to_upsample)
    else:
        return face_detector(img, number_of_times_to_upsample)

is copied from here: https://github.com/ageitgey/face_recognition/blob/master/face_recognition/api.py

But in this branch:

    else:
        return face_detector(img, number_of_times_to_upsample)

The argument number_of_times_to_upsample is documented as: "How many times to upsample the image looking for faces. Higher numbers find smaller faces."
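To illustrate why higher values find smaller faces, here is a rough sketch. It assumes that each upsampling step doubles the image size and that the detector has a fixed minimum window. The function name and the 80-pixel window in the usage note are illustrative and are not stated in this thread.

```python
def smallest_detectable_face(base_window_px, upsample_times, factor=2):
    """Approximate smallest face size (in original-image pixels) that a
    fixed-size detector window can match after upsampling.

    Assumption: each upsampling step enlarges the image by `factor`, so a
    face that is too small for the window grows by `factor` per step.
    """
    if upsample_times < 0:
        raise ValueError("upsample_times must be non-negative")
    return base_window_px / (factor ** upsample_times)
```

For example, with an assumed 80-pixel window, `smallest_detectable_face(80, 1)` gives 40.0 and `smallest_detectable_face(80, 2)` gives 20.0. The cost is that the image holds four times as many pixels after each step.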

But in the DlibDotNet project:

https://github.com/takuya-takeuchi/DlibDotNet/blob/5d25c234b7d8c9de10071c4279e2ded65a3fbff5/src/DlibDotNet/ImageProcessing/FrontalFaceDetector.cs

        #region Methods

        public Rectangle[] Operator(Array2DBase image, double threshold = 0d)
        {
            this.ThrowIfDisposed();

            if (image == null)
                throw new ArgumentNullException(nameof(image));

            image.ThrowIfDisposed();

            using (var dets = new StdVector<Rectangle>())
            {
                var inType = image.ImageType.ToNativeArray2DType();
                var ret = Native.frontal_face_detector_operator(this.NativePtr, inType, image.NativePtr, threshold, dets.NativePtr);
                switch (ret)
                {
                    case Dlib.Native.ErrorType.InputArrayTypeNotSupport:
                        throw new ArgumentException($"Input {inType} is not supported.");
                }

                return dets.ToArray();
            }
        }

        public Rectangle[] Operator(MatrixBase image, double threshold = 0d)
        {
            this.ThrowIfDisposed();

            if (image == null)
                throw new ArgumentNullException(nameof(image));

            image.ThrowIfDisposed();

            using (var dets = new StdVector<Rectangle>())
            {
                var inType = image.MatrixElementType.ToNativeMatrixElementType();
                var ret = Native.frontal_face_detector_matrix_operator(this.NativePtr, inType, image.NativePtr, threshold, dets.NativePtr);
                switch (ret)
                {
                    case Dlib.Native.ErrorType.InputElementTypeNotSupport:
                        throw new ArgumentException($"Input {inType} is not supported.");
                }

                return dets.ToArray();
            }
        }

        #region Overrides

The second argument of the Operator method is double threshold = 0d, not an upsample count.

So we should add code like this:

https://github.com/takuya-takeuchi/FaceRecognitionDotNet/blob/master/src/FaceRecognitionDotNet/Dlib/Python/CnnFaceDetectionodelV1.cs

                    // Upsampling the image will allow us to detect smaller faces but will cause the
                    // program to use more RAM and run longer.
                    var levels = upsampleNumTimes;
                    while (levels > 0)
                    {
                        levels--;
                        DlibDotNet.Dlib.PyramidUp<PyramidDown>(image, 2);
                    }

                    var dets = net.Operator(image);

                    // Scale the detection locations back to the original image size
                    // if the image was upscaled.
                    foreach (var d in dets.First())
                    {
                        var drect = pyr.RectDown(new DRectangle(d.Rect), (uint)upsampleNumTimes);
                        d.Rect = new Rectangle((int)drect.Left, (int)drect.Top, (int)drect.Right, (int)drect.Bottom);
                        rects.Add(d);
                    }

                    return rects;

That is, PyramidUp first and RectDown at the end.
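The idea can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the dlib or DlibDotNet implementation: it models the image only by its size, uses a hypothetical `detect` callback, and ignores any sub-pixel offsets that a real pyramid implementation may apply when mapping coordinates.

```python
def upsample_size(width, height, levels, factor=2):
    """Size of the image after `levels` upsampling steps."""
    scale = factor ** levels
    return width * scale, height * scale


def rect_down(rect, levels, factor=2):
    """Map a (left, top, right, bottom) rectangle found in the upsampled
    image back to original-image coordinates, truncating to int."""
    scale = factor ** levels
    return tuple(int(v / scale) for v in rect)


def detect_with_upsample(detect, image_size, levels, factor=2):
    """Upsample first, run the detector, then scale each hit back down.

    `detect` is a hypothetical callback that takes (width, height) and
    returns a list of rectangles in that image's coordinate system.
    """
    up_w, up_h = upsample_size(image_size[0], image_size[1], levels, factor)
    return [rect_down(r, levels, factor) for r in detect((up_w, up_h))]


def toy_detector(size):
    # Toy stand-in: always "finds" one face covering the central region.
    w, h = size
    return [(w // 4, h // 4, 3 * w // 4, 3 * h // 4)]
```

Because the rectangles are scaled back at the end, callers always receive coordinates in the original image, whatever upsample count was used.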

The result:

        private IEnumerable<MModRect> RawFaceLocations(Image faceImage, int numberOfTimesToUpsample = 1, Models model = Models.Hog)
        {
            switch (model)
            {
                case Models.Cnn:
                    return CnnFaceDetectionodelV1.Detect(this._CnnFaceDetector, faceImage.Matrix, numberOfTimesToUpsample);
                default:
                    {
                        using (var pyr = new PyramidDown(2))
                        {
                            var rects = new List<MModRect>();
                            var image = faceImage.Matrix;
                            var levels = numberOfTimesToUpsample;
                            while (levels > 0)
                            {
                                levels--;
                                DlibDotNet.Dlib.PyramidUp<PyramidDown>(image, 2);
                            }
                            var dets = this._CnnFaceDetector.Operator(image);
                            foreach (var d in dets.First())
                            {
                                var drect = pyr.RectDown(new DRectangle(d.Rect), (uint)numberOfTimesToUpsample);
                                d.Rect = new Rectangle((int)drect.Left, (int)drect.Top, (int)drect.Right, (int)drect.Bottom);
                                rects.Add(d);
                            }
                            return rects;
                        }
                    }
            }
        }

If we do this, the FaceLocations method finds the same number of faces with Hog and Cnn when detecting faces in Yalta.jpg. See https://github.com/takuya-takeuchi/FaceRecognitionDotNet/blob/master/src/FaceRecognitionDotNet/FaceRecognition.cs

takuya-takeuchi commented 6 years ago

Thank you for your many contributions. I believe you are right.

However, FaceRecognitionDotNet aims to be a port to C#. I want to avoid writing an original implementation that diverges from face_recognition.

Therefore, I posted https://github.com/ageitgey/face_recognition/issues/573.

Please give me time to resolve this issue!!

takuya-takeuchi commented 6 years ago

This is not a bug. Refer to #4. I checked the Python and dlib code by adding printf calls, and the current implementation calls the correct function/method.

takuya-takeuchi commented 6 years ago

I'm very sorry. You are correct. _raw_face_locations does not call object_detector.operator() directly. _raw_face_locations creates the following call stack

This source comes from tools/python/src/object_detection.cpp

    {
    typedef simple_object_detector type;
    py::class_<type, std::shared_ptr<type>>(m, "fhog_object_detector",
        "This object represents a sliding window histogram-of-oriented-gradients based object detector.")
        .def(py::init(&load_object_from_file<type>),
"Loads an object detector from a file that contains the output of the \n\
train_simple_object_detector() routine or a serialized C++ object of type\n\
object_detector<scan_fhog_pyramid<pyramid_down<6>>>.")
        .def("__call__", run_detector_with_upscale2, py::arg("image"), py::arg("upsample_num_times")=0,
"requires \n\