WangLongZJU / DeepAC

[ICCV 2023] Deep Active Contours for Real-time 6-DoF Object Tracking

Demo on iOS/macOS device #12

Open graboosky opened 6 months ago

graboosky commented 6 months ago

First of all... very impressive!

I have run the demo and successfully converted the model to be compatible with iOS.

And now I can see 7 models

[Screenshot: the 7 converted models]

And a question: does recreating the cat demo require all 7 models?

btw, is it OK that model-last.ckpt is ~105 MB, but all 7 converted models together are less than 12 MB?

WangLongZJU commented 6 months ago

Yes, it is OK, because we convert the PyTorch model to mlpackage (Float16) via coremltools. First, we use extractor.mlpackage to extract image feature maps. Second, we use histogram.mlpackage to compute the color statistics. Third, we use contour_feature_extractor*.mlpackage to extract contour feature maps; since a converted mlpackage needs a fixed input size, we convert three mlpackages, one per level, e.g. for feature map (or image) input sizes of 64x64, 128x128, and 256x256. Next, we use boundary_predictor.mlpackage to predict the boundary probability for each correspondence line. Finally, we use derivative_calculator.mlpackage to compute the Hessian and Jacobian matrices, which are used to optimize the object pose via the Newton method.
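For reference, a Float16 mlpackage conversion along these lines looks roughly like the sketch below. The `Extractor` module is a hypothetical stand-in for DeepAC's actual feature extractor, but the coremltools calls are the standard ones, and they show why each package has a fixed input size and why Float16 roughly halves the on-disk weight size:

```python
import torch
import coremltools as ct

# Hypothetical stand-in for DeepAC's image feature extractor;
# the real module comes from the repo's PyTorch code.
class Extractor(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 16, 3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = Extractor().eval()
# mlpackage inputs are fixed-size, which is why one package is
# converted per pyramid level (e.g. 64x64, 128x128, 256x256).
example = torch.rand(1, 3, 256, 256)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=tuple(example.shape))],
    compute_precision=ct.precision.FLOAT16,  # Float16 weights: about half the Float32 size
    convert_to="mlprogram",
)
mlmodel.save("extractor.mlpackage")
```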

graboosky commented 6 months ago

Thanks for clarifying. I can't wait to play with it! :)

graboosky commented 6 months ago

One more question regarding the demo, sorry to bother you @WangLongZJU.

I thought the most important part of this project was producing the content of pose.txt? :P But it seems the demo already ships with that file.

Please correct me if I am wrong, but each row in pose.txt represents the position and rotation of the 3D object so that it fits the image, correct?

WangLongZJU commented 6 months ago

You just need to provide the pose of the first frame in the first row of pose.txt. Each row in pose.txt has 12 values (data_on_each_row): data_on_each_row[:9] is the rotation matrix and data_on_each_row[9:12] is the translation of the 3D object.
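In other words, a row can be parsed like the following minimal sketch (assuming whitespace-separated values and that the 9 rotation values fill a 3x3 matrix row by row; both are assumptions, not confirmed by the repo):

```python
import numpy as np

def parse_pose_row(line: str):
    # Assumption: values are whitespace-separated floats.
    vals = np.array(line.split(), dtype=np.float64)
    assert vals.size == 12, "each pose.txt row carries 12 values"
    R = vals[:9].reshape(3, 3)  # rotation matrix (row-major reshape is an assumption)
    t = vals[9:12]              # translation of the 3D object
    return R, t

# Only the first row (the pose of the first frame) is strictly required.
with open("pose.txt") as f:
    R0, t0 = parse_pose_row(f.readline())
```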

graboosky commented 6 months ago

Thanks for the clarification!

graboosky commented 6 months ago

I must say it is not easy to get into, definitely academic level :) ... Is there any chance to look into the Xcode project?

Cursky commented 5 months ago

@graboosky Hello, may I ask how your work imports the pkl/obj models into iOS and maps them onto the 2D image plane, and then runs images through these seven models and performs the Newton iterations to compute the pose? This seems very complex, and I don't know how you completed it. I am currently researching this and hope to communicate with you. Thank you

WangLongZJU commented 4 months ago

> One more question regarding the demo, sorry to bother you @WangLongZJU.
>
> I thought the most important part of this project was producing the content of pose.txt? :P But it seems the demo already ships with that file.
>
> Please correct me if I am wrong, but each row in pose.txt represents the position and rotation of the 3D object so that it fits the image, correct?

I am sorry; DeepAC will be used in products at our company. I would like to share the mobile code, but I am not allowed to.

WangLongZJU commented 4 months ago

> @graboosky Hello, may I ask how your work imports the pkl/obj models into iOS and maps them onto the 2D image plane, and then runs images through these seven models and performs the Newton iterations to compute the pose? This seems very complex, and I don't know how you completed it. I am currently researching this and hope to communicate with you. Thank you

You can check the structure of the pkl model, which is just a numpy array, and convert it to a binary file. Then you need to understand how DeepAC uses this pkl model in the algorithm. You can use Bundle.path and MDLAsset to load the obj model in Xcode. Finally, see https://github.com/WangLongZJU/DeepAC/issues/12#issuecomment-2001613144, where I explain the function of every converted mlmodel.
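A minimal sketch of the pkl-to-binary step, assuming the pkl holds a single numpy array (the filenames here are hypothetical; inspect your actual pkl first):

```python
import pickle
import numpy as np

# Hypothetical filename; check what your pkl actually contains.
with open("template_views.pkl", "rb") as f:
    data = pickle.load(f)

arr = np.ascontiguousarray(data, dtype=np.float32)
print(arr.shape, arr.dtype)  # inspect the structure before exporting

# Raw dump of the values; the binary file carries no shape/dtype
# metadata, so record those separately for the iOS loader.
arr.tofile("template_views.bin")
```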