google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

use single bitmap image as input #665

Closed sobhan87068 closed 3 years ago

sobhan87068 commented 4 years ago

Hello, I am trying to use MediaPipe's face detection as part of a face recognition app that should be able to detect faces in different images and extract them for the recognition process. Unfortunately, I am unable to find a way to use a Bitmap as the input for face detection. I have inspected the repo linked from #417, but it essentially turns a single bitmap into a stream that is processed continuously. Is there any way to achieve this? Thank you in advance for your help.

afsaredrisy commented 4 years ago

The repository you inspected uses the HandTracking sample; you can apply the same changes to face detection. But what is the source of the bitmap in your case? If you have bitmaps arriving continuously from a source, you can update the bmp object of BmpProducer by adding a setter method to the BmpProducer class. That will transmit the updated bitmap object to the MediaPipe converter. Let me know if you are looking for something else.
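The setter idea can be sketched as below. This is a minimal, Android-free illustration only: the `BmpProducer` internals, the listener interface, and the method names are assumptions modeled on the repository discussed in #417, and `Object` stands in for `android.graphics.Bitmap` so the sketch compiles off-device.

```java
// Minimal sketch of a BmpProducer-like class with a thread-safe setter.
// On Android, 'Object' would be android.graphics.Bitmap.
public class BmpProducer {
    public interface FrameListener {
        void onFrame(Object bmp);
    }

    private volatile Object bmp;          // latest frame to transmit
    private final FrameListener listener;

    public BmpProducer(FrameListener listener) {
        this.listener = listener;
    }

    // Setter: callers hand in a new bitmap whenever one arrives.
    public void setBitmap(Object newBmp) {
        this.bmp = newBmp;
    }

    // One iteration of the producer loop: push the current frame downstream.
    public void transmitOnce() {
        Object current = bmp;
        if (current != null) {
            listener.onFrame(current);
        }
    }

    public static void main(String[] args) {
        final Object[] received = new Object[1];
        BmpProducer producer = new BmpProducer(f -> { received[0] = f; });
        producer.setBitmap("frame-1");
        producer.transmitOnce();
        System.out.println(received[0]);
    }
}
```

In the real producer, the run loop would call `transmitOnce()` repeatedly, so whatever was most recently passed to `setBitmap` is what the converter receives.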

sobhan87068 commented 4 years ago

Consider the following scenario: a directory of image files is given, each file has to be processed separately, and its detected faces cropped. The cropped faces are then passed to the recognition module (which I have already implemented) to be grouped into similar faces.

afsaredrisy commented 4 years ago

Passing bitmaps from a directory can be achieved by making small changes to BmpProducer. For example, say all your frames are in the drawable resource directory; you can then send them to the converter sequentially like this:

// Changes in the BmpProducer class.
int frameRate = 10;
int frameCount = 1;
int totalFrames = 100; // for example

    @Override
    public void run() {
        super.run();
        while (true) {
            // Load the next Bitmap from the drawable resources. Name the files so
            // they form a sequence (bmp_name_1.png, bmp_name_2.png, ..., bmp_name_100.png)
            // and resolve the resource id at runtime with getIdentifier().
            int resId = context.getResources().getIdentifier(
                    "bmp_name_" + frameCount, "drawable", context.getPackageName());
            bmp = BitmapFactory.decodeResource(context.getResources(), resId);
            bmp = Bitmap.createScaledBitmap(bmp, 480, 640, true);
            customFrameAvailableListner.onFrame(bmp);
            // Restart the sequence once all frames have been processed.
            frameCount = (frameCount == totalFrames) ? 1 : frameCount + 1;
            try {
                Thread.sleep(1000 / frameRate);
            } catch (InterruptedException e) {
                Log.d(TAG, e.toString());
            }
        }
    }

Now the main concern is how you combine the detection and recognition modules. One way is to create a graph with two sub-graphs: the first sub-graph detects the face (you can modify the existing face detection graph) and crops it, and the second sub-graph recognizes it and renders the result to the output surface.
If you have implemented both modules as explained above, you can pass a sequence of Bitmap objects by making the changes shown. If you instead use two separate graphs, you need to extract the bitmap from the first graph's output and pass it to the second graph; this is less optimal, as it reduces the number of frames processed per second.
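To make the two-sub-graph idea concrete, a combined graph might look roughly like the pbtxt fragment below. This is only a hypothetical sketch: the recognition sub-graph (`FaceRecognitionSubgraph`) and all stream names are placeholders, and even the detection/cropping node names and their exact stream tags should be verified against the definitions in the MediaPipe repo before use.

```pbtxt
# Hypothetical combined graph: detect -> crop -> recognize.
# All names below are illustrative; check the real subgraph and
# calculator definitions in the MediaPipe repo before using them.
input_stream: "input_image"
output_stream: "recognition_result"

# Sub-graph 1: face detection.
node {
  calculator: "FaceDetectionShortRangeCpu"
  input_stream: "IMAGE:input_image"
  output_stream: "DETECTIONS:face_detections"
}

# Convert detections into a cropping rectangle.
node {
  calculator: "DetectionsToRectsCalculator"
  input_stream: "DETECTIONS:face_detections"
  output_stream: "NORM_RECT:face_rect"
}

# Crop the face region out of the input image.
node {
  calculator: "ImageCroppingCalculator"
  input_stream: "IMAGE:input_image"
  input_stream: "NORM_RECT:face_rect"
  output_stream: "IMAGE:cropped_face"
}

# Sub-graph 2: your custom recognition sub-graph (placeholder name).
node {
  calculator: "FaceRecognitionSubgraph"
  input_stream: "IMAGE:cropped_face"
  output_stream: "RESULT:recognition_result"
}
```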

sobhan87068 commented 4 years ago

Thank you for your response. I still have a couple of questions:
1. The recognition process is done outside of MediaPipe, since I am not familiar with the Bazel build system (I use the MediaPipe AAR in a Gradle project). As a result, I need to match the detection data with its source bitmap. How can I achieve this in the packet callback? (I need the matching bitmap to know which images contain similar faces.)
2. Since the images are not from the same context (they usually show different scenes), I am not using any tracking/inference between sequences of images, and therefore don't really need to make streams of images. Is there any way to input a bitmap directly rather than making a stream of it? (I think that would perform better in my case, with less complexity.)
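For the first question, a common pattern is to key each submitted frame by the timestamp you send it with, then look the frame up again when the result packet for that timestamp arrives in the callback. The sketch below is Android-free and purely illustrative: the class and method names are invented, `String` stands in for `android.graphics.Bitmap`, and it assumes the packet callback can report the same timestamp the frame was submitted with.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: remember which bitmap was sent at which timestamp, so a result
// arriving in the packet callback can be matched back to its source image.
public class FrameRegistry {
    private final Map<Long, String> inFlight = new ConcurrentHashMap<>();
    private long nextTimestamp = 0;

    // Called when a frame is sent into the graph; returns the timestamp used.
    public synchronized long submit(String bitmap) {
        long ts = nextTimestamp++;
        inFlight.put(ts, bitmap);
        return ts;
    }

    // Called from the packet callback with the packet's timestamp.
    public String resolve(long timestamp) {
        return inFlight.remove(timestamp);  // remove to free the reference
    }

    public static void main(String[] args) {
        FrameRegistry registry = new FrameRegistry();
        long t0 = registry.submit("image-A");
        long t1 = registry.submit("image-B");
        // Results may arrive out of order; the timestamp still finds the source.
        System.out.println(registry.resolve(t1));  // image-B
        System.out.println(registry.resolve(t0));  // image-A
    }
}
```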

MaskeZen commented 3 years ago

Hello @sobhan87068, did you get it working? I need to do the same thing, but I'm still at the research stage. Any suggestions are welcome, thanks!

sgowroji commented 3 years ago

Hi @sobhan87068, could you respond if you are still looking for an answer to your query? Thanks!

MaskeZen commented 3 years ago

Hello @sgowroji, I am trying to do the same thing but with face pose; if you have a reply for @sobhan87068, it may fit my goal too. Thanks!

google-ml-butler[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

sgowroji commented 3 years ago

Hi @MaskeZen, can you try using our latest Android solutions to input an image or video? Thanks!

sobhan87068 commented 3 years ago

Hi @MaskeZen, sorry for the late response. Unfortunately, no, I could not solve it. From what I understood, the project seems to be implemented with streams in mind, so the solution seems to be what @afsaredrisy mentioned, but sadly I could not get it to work with my recognition process. I haven't checked the newer versions, so I don't know whether anything has changed that would help.

sgowroji commented 3 years ago

You can try the code below to input an image or video and display the graph on top of it: https://github.com/google/mediapipe/blob/master/mediapipe/examples/android/solutions/facemesh/src/main/java/com/google/mediapipe/examples/facemesh/MainActivity.java#:~:text=%7D)%3B-,Button%20loadImageButton%20%3D%20findViewById(R.id.button_load_picture)%3B,%7D,-/**%20The%20core%20MediaPipe
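For context, the Android Solutions API that the linked example uses can accept a single Bitmap directly when static image mode is enabled, which is close to what the original question asked for. The fragment below is a non-runnable sketch from memory of that (since-deprecated) API; the option and method names should be verified against the linked MainActivity before use.

```java
// Sketch based on the legacy MediaPipe Android Solutions API
// (com.google.mediapipe.solutions.facemesh). Verify names against
// the linked example before relying on them.
FaceMeshOptions options =
    FaceMeshOptions.builder()
        .setStaticImageMode(true)   // treat each input as an independent image
        .build();
FaceMesh faceMesh = new FaceMesh(context, options);

faceMesh.setResultListener(result -> {
    // The result carries the source image back, so detections can be
    // matched to the bitmap that produced them.
});

faceMesh.send(bitmap);  // a single android.graphics.Bitmap, no stream needed
```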
