Qualeams / Android-Face-Recognition-with-Deep-Learning-Library

Face Recognition library for Android devices is an Android library (module) that includes several face recognition methods.
Apache License 2.0

Black screen while recognizing #8

Closed Zumbalamambo closed 7 years ago

Zumbalamambo commented 7 years ago

I tried to recognise a face using the latest library build, but the mobile screen then goes fully black.

I'm getting the following messages in Logcat.

06-10 16:28:58.578 8532-8542/ch.zhaw.facerecognition W/art: Suspending all threads took: 17.306ms
06-10 16:29:24.238 8532-8542/ch.zhaw.facerecognition W/art: Suspending all threads took: 8.277ms
06-10 16:29:35.658 8532-8548/ch.zhaw.facerecognition W/art: Suspending all threads took: 7.428ms
06-10 16:29:39.818 8532-8542/ch.zhaw.facerecognition W/art: Suspending all threads took: 9.988ms
06-10 16:29:40.508 8532-8548/ch.zhaw.facerecognition W/art: Suspending all threads took: 6.632ms
06-10 16:29:41.828 8532-8542/ch.zhaw.facerecognition W/art: Suspending all threads took: 15.030ms
06-10 16:29:49.358 8532-8542/ch.zhaw.facerecognition W/art: Suspending all threads took: 8.884ms

This occurs on most of my phones. May I know how to sort this out?

Zumbalamambo commented 7 years ago

I have tried adding logging. The following method in EigenFaces.java takes forever to finish.

   public void loadFromFile() {
        Log.d(TAG, "Loading from file");
        FileHelper fh = new FileHelper();
        MatName mOmega = new MatName("Omega", Omega);
        MatName mPsi = new MatName("Psi", Psi);
        MatName mEigVectors = new MatName("eigVectors", eigVectors);
        List<MatName> listMat = new ArrayList<MatName>();
        listMat.add(mOmega);
        listMat.add(mPsi);
        listMat.add(mEigVectors);
        Log.d(TAG, "Finished Populating list mat and its size is " + listMat.size());
        listMat = fh.getMatListFromXml(listMat, fh.EIGENFACES_PATH, filename);
        Log.d(TAG, "Done loading mat list from xml");

        Log.d(TAG, "Path " + fh.EIGENFACES_PATH);
        Log.d(TAG, "Filename " + filename);
        for (MatName mat : listMat) {
            Log.d(TAG, "Mat name " + mat.getName());
            switch (mat.getName()) {
                case "Omega":
                    Log.d(TAG, "Done loading omega");
                    Omega = mat.getMat();
                    break;
                case "Psi":
                    Log.d(TAG, "Done loading psi");
                    Psi = mat.getMat();
                    break;
                case "eigVectors":
                    Log.d(TAG, "Done loading eigvectors");
                    eigVectors = mat.getMat();
                    break;
            }
        }

        Log.d(TAG, "out of for loop");
        labelList = fh.loadIntegerList(fh.createLabelFile(fh.EIGENFACES_PATH, "train"));
        Log.d(TAG, "labelList size" + labelList.size());
        labelMap = fh.getLabelMapFromFile(fh.EIGENFACES_PATH);
        Log.d(TAG, "labelMap size" + labelMap.size());
    }

How can I optimise it?

sladomic commented 7 years ago

@Zumbalamambo does your first problem still persist?

Regarding the second one: the Eigenfaces implementation hasn't been touched for a year, since Eigenfaces is outdated anyway. But if you want to improve the performance, you would need to serialize the Mat objects into a binary file. The problem is that OpenCV's FileStorage is not implemented for Android (http://answers.opencv.org/question/8873/best-way-to-store-a-mat-object-in-android/?answer=30688#post-id-30688). Alternatively, you could probably write a JNI wrapper for the C++ function of OpenCV.
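
Something like the following could work as a rough, untested sketch (my own illustration here, not code from the library; it assumes single-channel CV_32F Mats like the Eigenfaces matrices, and MatSerializer is just a hypothetical helper name):

import org.opencv.core.CvType;
import org.opencv.core.Mat;

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class MatSerializer {

    // Write rows, cols and the raw float data to a compact binary file.
    public static void writeMat(Mat mat, String path) throws IOException {
        float[] data = new float[(int) mat.total()];
        mat.get(0, 0, data);
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(path))) {
            out.writeInt(mat.rows());
            out.writeInt(mat.cols());
            for (float f : data) {
                out.writeFloat(f);
            }
        }
    }

    // Read the header back and rebuild the Mat from the raw floats.
    public static Mat readMat(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            int rows = in.readInt();
            int cols = in.readInt();
            float[] data = new float[rows * cols];
            for (int i = 0; i < data.length; i++) {
                data[i] = in.readFloat();
            }
            Mat mat = new Mat(rows, cols, CvType.CV_32F);
            mat.put(0, 0, data);
            return mat;
        }
    }
}

Reading the raw floats back should be much faster than parsing the same values out of XML.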

Zumbalamambo commented 7 years ago

Okay, thank you so much. This is very helpful for my learning. I have tried the TensorFlow implementation, and it gives very good recognition accuracy. May I know why we need optimized_facenet.pb and how you created this model?

sladomic commented 7 years ago

@Zumbalamambo with the facenet model it works 4 times faster for me (the model is also 5-6 times smaller). It also allows you to include the model as an asset in your app (the maximum app size is 100 MB, so this wouldn't work with the vgg_faces model, which is 553 MB alone).

I used the frozen facenet model, thanks to the user apollo-time, and then optimized it for inference on Android devices using graph_transforms with these parameters:

bazel build tensorflow/tools/graph_transforms:transform_graph
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=tmp/facenet.pb \
--out_graph=tmp/optimized_facenet.pb \
--inputs='input' \
--outputs='embeddings' \
--transforms='
  strip_unused_nodes(type=float, shape="1,160,160,3")
  fold_constants(ignore_errors=true)
  fold_batch_norms
  fold_old_batch_norms
  round_weights(num_steps=256)'
Zumbalamambo commented 7 years ago

You are a super genius. Amazing effort, really mind-blowing. I have learnt a lot. I'm afraid to try a bazel build, though. Two weeks back I tried retraining with the MNIST data set. It worked, but later on I wasn't able to restart my Mac. I had to erase everything and reinstall the OS for my Mac to work as usual. Lost all my projects. :(

Zumbalamambo commented 7 years ago

I have analysed the TensorFlow implementation and I have the following observations. Please correct me if I'm wrong.

1. Training using KNN:

  1. Convert the image to grayscale.
  2. a) Crop and resize the grayscale image to 160x160. (I understand that resizing keeps the facial data, but why do we need to crop it if we are resizing it later anyway?)
     b) Reshape the image (convert the 128x1 Mat to a 1x128 Mat) and add the Mat to the trainingList (as sketched in the code after the next list).
     c) Check if the person's face has already been added to the database (featuresAlreadyExtracted = true). If true, we skip adding it; otherwise we add it to the database.
  3. Save the preprocessed Mat as an image (preprocessedImage.png).
  4. Save the trainingList to an XML file.

2. Recognition using KNN:

  1. In the loadFromFile method, we load the labelList from label_train.xml, the OneToOneMap from labelMap_train.xml, and the trainingList from knn_traininglist.xml.
  2. Convert the labelList to a Mat. (Why should we fill the shorter labels with 0?)
  3. The KNN algorithm finds the nearest neighbor. The nearest value will have the name (the key, which is the name of the person). (But which data is used for the plot? Is it the trainingList or the labelList? The recognition part is a bit confusing to me :( )
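
To check my understanding of step 2b and the nearest-neighbor search, here is roughly how I picture the flow (just my own sketch against the OpenCV 3.x ML API, not your library's exact code; the Mats are assumed inputs):

import org.opencv.core.Mat;
import org.opencv.ml.KNearest;
import org.opencv.ml.Ml;

public class KnnFlowSketch {

    // 2b) flatten a feature vector into a single row and append it,
    // so every training sample occupies one row of the sample matrix
    public static void addSample(Mat trainingSamples, Mat featureVector) {
        trainingSamples.push_back(featureVector.reshape(1, 1)); // 1 channel, 1 row -> 1xN
    }

    // train KNN with one row per sample and one numeric label per row
    public static KNearest train(Mat trainingSamples, Mat trainingLabels) {
        KNearest knn = KNearest.create();
        knn.train(trainingSamples, Ml.ROW_SAMPLE, trainingLabels);
        return knn;
    }

    // recognition: the returned value is the numeric label of the nearest neighbor
    public static float recognize(KNearest knn, Mat featureVector, int k) {
        Mat results = new Mat();
        return knn.findNearest(featureVector.reshape(1, 1), k, results);
    }
}
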
Zumbalamambo commented 7 years ago

Also, please correct me if I'm wrong anywhere in my understanding of the feature vector part.

public Mat getFeatureVector(Mat img) {

        // Resize the image to 160 x 160
        Imgproc.resize(img, img, new Size(inputSize, inputSize));

        /*
        Copy the input data into TensorFlow.
          input layer -> input
          getPixels(img) -> normalized image data (pre-process the image data from 0-255 int to float)
          inputSize = 160
          output size = 160
          channels = 3  (R-Red G- Green B-Blue so 3 channels)
         */
        inferenceInterface.feed(inputLayer, getPixels(img), 1, inputSize, inputSize, channels);
        // Run the inference call.
        inferenceInterface.run(new String[]{outputLayer}, logStats);
        float[] outputs = new float[outputSize];
        // Copy the output Tensor (Feature vector) back into the output array. (If you could tell what happens inside the inference interface after feeding, it will be very nice for me to understand deeply. Im so curious)
        inferenceInterface.fetch(outputLayer, outputs);

        List<Float> fVector = new ArrayList<>();
        for (float o : outputs) {
            fVector.add(o);
        }
        //Convert the feature vector list to mat
        return Converters.vector_float_to_Mat(fVector);
    }
sladomic commented 7 years ago

Haha thanks, you're welcome. A year ago I was in the same position as you, and it would have been nice if someone could have shown me all of this.

I've never had problems with my Mac though, I hope you had a backup at least :)

Training KNN:

  1. Grayscale is not needed for the CNN model, since it is trained on color images. But it is needed for the detection. What is done in the PreprocessorFactory is that a copy of the image is used for detection. This copy is converted to grayscale for the Viola-Jones detection, but the original image remains colored (if you use my default settings).
  2. a) Crop is cropping the image to only contain the face. That's to exclude distraction from the background. Resizing is either for performance reasons (so processing with fewer pixels gets faster) or to fit the image to the next step (e.g. the CNN model with an input size of 160x160).
     c) featuresAlreadyExtracted is not used in KNN (see the first comment in the code) - this flag is only used in SVM, I think, to indicate whether the input to the method already contains the features or still the raw image.
  4. This is what's slow and could be improved with serialization.

Recognition KNN, 2): :) that's a hack, because the Mat needs to be filled with something. We fill it with the ASCII characters of the names and pad the shorter names with zeros to fill the Mat (see the toy example below). 3) What do you mean by plot? Showing the name on the screen? The t-SNE plot?
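
As a toy illustration of that padding (just to show the idea, not the library's exact code): each name becomes one row of ASCII codes, and the shorter names are padded with zeros so all rows have the same length.

import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;

import java.util.Arrays;
import java.util.List;

public class LabelMatSketch {

    public static Mat namesToMat(List<String> names) {
        int maxLen = 0;
        for (String n : names) {
            maxLen = Math.max(maxLen, n.length());
        }
        // zeros() already provides the padding for the shorter names
        Mat labels = Mat.zeros(names.size(), maxLen, CvType.CV_32F);
        for (int row = 0; row < names.size(); row++) {
            String n = names.get(row);
            for (int col = 0; col < n.length(); col++) {
                labels.put(row, col, (double) n.charAt(col)); // ASCII code as float
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // load the OpenCV native lib first
        Mat labels = namesToMat(Arrays.asList("anna", "bob"));
        System.out.println(labels.dump()); // the "bob" row ends with a padding 0
    }
}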

getFeatureVector: the output size is 128. inferenceInterface: maybe this helps: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java - in general it's similar to the old implementation in that JNI is used to run TensorFlow, but this time using the library.

Zumbalamambo commented 7 years ago

No backups :( I have a bad habit of being too lazy to back up my data :P Lost all my projects, including the one with the working Eigenfaces app. Now I'm too lazy to try Eigenfaces again after using the CNN recognition :).

Can you please point me in a direction where I could apply serialization? If there were something like MatJson instead of MatXml, I guess it would be fast, since XML is time-consuming to read and parse.

Yes - how does it recognize the name using the t-SNE plot? Which value is used for the t-SNE plot, and how does the plot give the exact name? This is like rocket science for me to understand what's happening behind the scenes :| Can you please help me learn this?

The link that you shared with me to understand TensorFlowInferenceInterface is really useful. I'm thankful and grateful that you shared your knowledge.

sladomic commented 7 years ago

"Can you please point me in a direction where I could apply serialization? If there is something like MatJson instead of MatXml , it will be fast i guess since XML is time consuming to read and retrieve."

You can try this https://stackoverflow.com/a/29297123 but I cannot just reuse it because of the license.

"Yes How it recognises the name using the T-SNE plot. Which value is being used for the T-SNE plot and how the plot gives the exact name. This is like a rocket science for me to understand whats happening behind the scene :| . Can you please help me to learn this..."

Ah, it's no rocket science; I think I just made it unclear in the code :) We only use t-SNE in Matlab to visualize the results for ourselves. For the recognition itself it's much simpler: e.g. for SVM, LIBSVM is used to retrieve the label by classification (jniSvmPredict(prediction + " " + model + " " + output)). Then the name is retrieved from the labelMap using the label (labelMap.getKey(iLabel)).

Zumbalamambo commented 7 years ago

Okay, thank you so much. It became simple once you explained it.

List<Mat> images = ppF.getProcessedImage(processedImage, PreProcessorFactory.PreprocessingMode.RECOGNITION);
if (images == null || images.size() > 1) {
    // More than 1 face detected --> cannot use this file for training
    continue;
} else {
    processedImage = images.get(0);
}

In the above code, why are you choosing only the Mat at the first index, processedImage = images.get(0)?

sladomic commented 7 years ago

@Zumbalamambo we exclude from training all images where we detect more than one face, because every image in the training folder should contain only one face. So if two or more faces are detected, it means that a wrong face has been detected (it could be anything, for example a bookshelf in the background). If we included this "face", it would affect the training.

Zumbalamambo commented 7 years ago

Thank you so much. Now I'm understanding the code. The way you explain is very nice; thank you for your patience in explaining it to me.

I'm trying to code it from scratch based on your code so I can understand it at a deeper level.

I have a doubt. In PreProcessorFactory there is an enum, public enum PreprocessingMode {DETECTION, RECOGNITION}, and the instance variables private PreProcessor preProcessorRecognition; and private PreProcessor preProcessorDetection;. In PreferenceHelper I find a similar enum, public enum Usage {RECOGNITION, DETECTION};.

Can you please tell me the purpose of these enums and of those two instance variables? :)

sladomic commented 7 years ago

You're right, I guess one of those enums could be replaced by the other, since both are public.

This is used because in the beginning we had the preprocessing as one single part (before the detection we did a grayscale preprocessing, and before the recognition you could choose from a list of preprocessing steps). But then we started to experiment with different preprocessing for the detection, and therefore we needed two separate preprocessing pipelines. Now preProcessorDetection uses a different preprocessing list than preProcessorRecognition, and the image also gets copied, so you can for example use grayscale for the detection but still color for the recognition (simplified in the sketch below).
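
Simplified, the idea looks roughly like this (a sketch with hypothetical names, not the actual PreProcessorFactory code):

import org.opencv.core.Mat;
import org.opencv.core.MatOfRect;
import org.opencv.core.Rect;
import org.opencv.imgproc.Imgproc;
import org.opencv.objdetect.CascadeClassifier;

public class TwoPipelineSketch {

    // detection runs on a grayscale copy, recognition keeps the original colors
    public static Mat detectAndCropFace(Mat colorImage, CascadeClassifier detector) {
        Mat gray = new Mat();
        Imgproc.cvtColor(colorImage, gray, Imgproc.COLOR_BGR2GRAY);

        MatOfRect faces = new MatOfRect();
        detector.detectMultiScale(gray, faces); // Viola-Jones on the gray copy

        Rect[] rects = faces.toArray();
        if (rects.length != 1) {
            return null; // no face, or more than one face
        }
        // crop the *color* image with the rect found on the gray copy
        return new Mat(colorImage, rects[0]);
    }
}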

One thing we want to add in the future is a better detector. So instead of using Viola-Jones, we will replace it with a CNN, and then you will also be able to choose between different detectors instead of only between classifiers.

Zumbalamambo commented 7 years ago

Please correct me if my understanding is wrong.

During face detection, we use preProcessorDetection. During recognition, we use preProcessorDetection to find the face and preProcessorRecognition for the recognition itself.

Also, please consider using [IntDef](https://developer.android.com/reference/android/support/annotation/IntDef.html), because enums sometimes result in memory issues on Android.
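
For example, the enum could be replaced roughly like this (a sketch using the support annotations library; the names are hypothetical):

import android.support.annotation.IntDef;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class PreprocessingModes {

    public static final int DETECTION = 0;
    public static final int RECOGNITION = 1;

    // lint warns if anything other than these constants is passed
    @Retention(RetentionPolicy.SOURCE)
    @IntDef({DETECTION, RECOGNITION})
    public @interface Mode {
    }

    public static void preprocess(@Mode int mode) {
        // dispatch on the plain int instead of an enum instance
    }
}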

Wow, detection using a CNN! Is that possible on an Android device? You are so brilliant!

sladomic commented 7 years ago

Hehe.. no, I'm just applying the algorithms someone else found.. the researchers are the brilliant people.. for example these: https://arxiv.org/pdf/1503.03832.pdf https://arxiv.org/pdf/1704.04861.pdf http://www.vision.caltech.edu/html-files/EE148-2005-Spring/pprs/viola04ijcv.pdf

Zumbalamambo commented 7 years ago

You are my inspiration!... I had only been dreaming about image processing until I came across your code... I'm grateful for your patience in helping me learn...

You have made me understand it as easily as 1-2-3. I read the papers; they are like an alien language with lots of expressions. Is there any open-source library that implements this? It sounds really cool...

Zumbalamambo commented 7 years ago

Oops... sorry... I just wanted to check what the Close and comment button on GitHub does...

Zumbalamambo commented 7 years ago

I tried to get a confidence score while predicting with KNN on the TensorFlow feature vectors.

The following is my code:

 private synchronized double getConfidenceScore(Mat featureVectorToRecognize, Mat defaultFeature) {

        double dotProduct                   = defaultFeature.dot(featureVectorToRecognize);
        double normFeatureVector            = Core.norm(defaultFeature, Core.NORM_L2);
        double normFeatureVectorToRecognize = Core.norm(featureVectorToRecognize, Core.NORM_L2);
        double cosineSimilarity             = dotProduct / (normFeatureVector * normFeatureVectorToRecognize);
        double absoluteCosineSimilarity     = Math.abs(cosineSimilarity);
        Log.i(getClass().getName(), "Absolute cosine similarity : " + absoluteCosineSimilarity);
        return absoluteCosineSimilarity;
    }

I have called it in the following way:

   public String recognize(Mat img, String expectedLabel) {
        Mat   result = new Mat();
        float nearest;
        img = getFeatureVector(img);
        addImage(img, expectedLabel, true);
        nearest = knn.findNearest(img, k, result);
        getConfidenceScore(result, img);
        return labelMap.getKey((int) nearest);
    }

This throws the following error :(

CvException [org.opencv.core.CvException: cv::Exception: /build/master_pack-android/opencv/modules/core/src/matmul.cpp:3402: error: (-215) mat.type() == type() && mat.size == size && func != 0 in function double cv::Mat::dot(cv::InputArray) const
                                                   ]
Zumbalamambo commented 7 years ago

Can you please help me find the confidence score using the TensorFlow KNN? It keeps crashing even after lots of tries :(

sladomic commented 7 years ago

I was trying to implement it, but then I had no time, so I read through some literature instead. Usually KNN and SVM don't provide any probability, because they are discrete classifiers and not regression algorithms. But a workaround to get at least a hint of the confidence is to use the cosine similarity. However, this is used on the raw feature vectors and not on the output of SVM or KNN. (By the way, that's also what your CvException says: dot() requires both Mats to have the same size and type, but result holds the KNN output, not a feature vector, so it doesn't match featureVectorToRecognize.) If you want to implement it, you need to calculate the meanFeatureVector of the recognized person/label, as done in the function trainClassifier here https://github.com/literacyapp-org/literacyapp-android/blob/master/app/src/main/java/org/literacyapp/authentication/thread/TrainingThread.java and then calculate the cosine similarity between the image's featureVector and the meanFeatureVector.

So

  1. You get the label by using KNN or SVM.
  2. You calculate the meanFeatureVector for this label using the featureVectors of this label from the trainingList.
  3. You calculate the cosine similarity between the input image's featureVector and the meanFeatureVector (see the sketch below).

Please note that this is only a workaround and doesn't fully represent the confidence of KNN or SVM.
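
A rough sketch of steps 2 and 3 (assuming the feature vectors are 1xN CV_32F row Mats, e.g. the 128-dimensional facenet embeddings; ConfidenceHelper is just a hypothetical name):

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;

import java.util.List;

public class ConfidenceHelper {

    // step 2: average all training feature vectors of the recognized label
    public static Mat meanFeatureVector(List<Mat> featureVectorsOfLabel) {
        Mat mean = Mat.zeros(featureVectorsOfLabel.get(0).size(),
                featureVectorsOfLabel.get(0).type());
        for (Mat v : featureVectorsOfLabel) {
            Core.add(mean, v, mean);
        }
        Core.divide(mean, new Scalar(featureVectorsOfLabel.size()), mean);
        return mean;
    }

    // step 3: cosine similarity between the input vector and the mean vector
    public static double cosineSimilarity(Mat a, Mat b) {
        double dot = a.dot(b);
        return dot / (Core.norm(a, Core.NORM_L2) * Core.norm(b, Core.NORM_L2));
    }
}

The closer the similarity is to 1, the more confident you can be in the recognized label.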

Zumbalamambo commented 7 years ago

Okay, thank you... looking forward to learning about CNN-based face detection.. :)

ShahBhavya101094 commented 5 years ago

Hey! Are you trying to achieve a face identification approach for unknown faces?

Is it possible to detect a face based on an eye-blink feature?