tucan9389 / PoseEstimation-CoreML

An example project for running pose estimation inference with Core ML
https://github.com/motlabs/awesome-ml-demos-with-ios
MIT License
680 stars, 136 forks

How to add another CoreML model? Detection output feed to another CoreML model (GAN)? #9

Open Russzheng opened 5 years ago

Russzheng commented 5 years ago

Hi, my group is trying to plug another Core ML model into this app (a GAN that generates new targets in the same poses; paper: Everybody Dance Now). However, we are all new to Swift, and after a lot of struggling we still could not get it working. Basically, we just want to feed the skeletons into the new model and display the image/video that the GAN generates.

Do you have any advice on which parts we should modify, add, or remove? We would much appreciate your help. Thank you so much!

P.S. We are working on a Swift version of ShuffleNet-like OpenPose 17-keypoint detection and would love to merge it once everything is finished!

tucan9389 commented 5 years ago

@Russzheng Hi!

Typically, iOS projects have an XXViewController.swift that contains the business logic. In this project, that is JointViewController.swift. Also check MLMultiArray+PoseEstimation.swift, which is used as the post-processor after Core ML inference.

Please check these sources.
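
For readers following along, here is a minimal sketch of what a post-processing step after inference might look like. It is not the code in MLMultiArray+PoseEstimation.swift. It assumes the model emits one heatmap per joint, flattened as [joint][row][column], and it uses a plain `[Double]` in place of `MLMultiArray` so it runs anywhere. The `Keypoint` type, the `keypoints(from:...)` function, and the 0.5 threshold are all hypothetical.

```swift
import Foundation

/// One detected joint: grid position plus the heatmap score at that position.
/// (Hypothetical type, not taken from the repo.)
struct Keypoint: Equatable {
    let x: Int
    let y: Int
    let confidence: Double
}

/// Finds the highest-scoring cell in each joint's heatmap.
/// ASSUMPTION: `heatmaps` is flattened as [joint][row][column].
/// A joint whose best score does not exceed `threshold` is returned as nil.
func keypoints(from heatmaps: [Double], joints: Int, height: Int, width: Int,
               threshold: Double = 0.5) -> [Keypoint?] {
    precondition(heatmaps.count == joints * height * width, "unexpected heatmap size")
    return (0..<joints).map { joint -> Keypoint? in
        var best: Keypoint? = nil
        for row in 0..<height {
            for col in 0..<width {
                let score = heatmaps[(joint * height + row) * width + col]
                // Compare against the current best, or the threshold if none yet.
                if score > (best?.confidence ?? threshold) {
                    best = Keypoint(x: col, y: row, confidence: score)
                }
            }
        }
        return best
    }
}
```

The resulting array of optional keypoints is the "skeleton" data. Anything a second model needs would be derived from values like these, not from the camera preview layer.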

tucan9389 commented 5 years ago

The OpenPose 17-keypoint detection suggestion is really cool. If you run into problems while contributing, open an issue on the repo. I will help as much as I can.

Russzheng commented 5 years ago

Hi @tucan9389, thank you so much for replying. We have actually already been modifying these two files. Our logic is: 1) find the UIImage that represents the skeletons, and 2) feed it to our GAN model and then modify the original UIImage. However, we could not locate which variable represents the skeletons. We have tried 1) adding layers, or setting the layer contents of self.videoPreview.layer, and 2) modifying pixelBuffer, but it is a 'let' constant and cannot be modified. Our logic is really simple at a high level, but we found it really hard to implement. Could you give us some further insights? Thank you so much in advance.
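
One possible way to approach the two steps above, sketched under several assumptions: the camera buffer does not need to be modified at all, because the skeleton can be drawn into a brand-new image that is then passed to the second model through Vision. The `bones` list, `pixelPoints`, `renderSkeleton`, and `generate` are hypothetical names, the joint order depends on the pose model, and `ganModel` stands in for the group's own converted GAN, which is assumed to take an image and return an image.

```swift
import Foundation
#if canImport(UIKit) && canImport(Vision)
import UIKit
import Vision
#endif

// Hypothetical skeleton: pairs of joint indices to connect with a line.
let bones: [(Int, Int)] = [(0, 1), (1, 2), (2, 3)]

/// Scales normalized joints (0...1) to pixel coordinates; missing joints stay nil.
func pixelPoints(_ joints: [CGPoint?], in size: CGSize) -> [CGPoint?] {
    joints.map { joint in
        joint.map { CGPoint(x: $0.x * size.width, y: $0.y * size.height) }
    }
}

#if canImport(UIKit) && canImport(Vision)
/// Draws the skeleton into a new image instead of mutating the camera buffer.
func renderSkeleton(_ joints: [CGPoint?], size: CGSize) -> UIImage {
    let points = pixelPoints(joints, in: size)
    let format = UIGraphicsImageRendererFormat()
    format.scale = 1  // one image pixel per point, independent of screen scale
    return UIGraphicsImageRenderer(size: size, format: format).image { context in
        UIColor.black.setFill()
        context.fill(CGRect(origin: .zero, size: size))
        let path = UIBezierPath()
        for (a, b) in bones {
            guard a < points.count, b < points.count,
                  let start = points[a], let end = points[b] else { continue }
            path.move(to: start)
            path.addLine(to: end)
        }
        UIColor.white.setStroke()
        path.lineWidth = 4
        path.stroke()
    }
}

/// Runs a second, image-to-image model on the rendered skeleton.
/// ASSUMPTION: `ganModel` wraps a model with an image output.
func generate(from skeleton: UIImage, using ganModel: VNCoreMLModel) -> CVPixelBuffer? {
    guard let cgImage = skeleton.cgImage else { return nil }
    let request = VNCoreMLRequest(model: ganModel)
    request.imageCropAndScaleOption = .scaleFill
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do { try handler.perform([request]) } catch { return nil }
    // Models with an image output yield VNPixelBufferObservation results.
    return (request.results?.first as? VNPixelBufferObservation)?.pixelBuffer
}
#endif
```

The returned pixel buffer could then be converted to a UIImage and shown in its own image view on the main thread, which avoids touching self.videoPreview.layer. Since `perform` is synchronous, it is probably best called off the main queue.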

tucan9389 commented 5 years ago

@Russzheng
Does that mean you couldn't implement that logic? Please tell me what you tried and describe the issue in more detail. If you have trouble with basic Swift syntax, check this document or Stack Overflow first.

Russzheng commented 5 years ago

Hi @tucan9389, thanks for the reply. I figured it out myself. Thanks!