ismailbozk / ObjectScanner

This project is a technical experiment with the iOS Metal API
MIT License

[not an issue] bunch of questions. #1

Closed johndpope closed 7 years ago

johndpope commented 7 years ago

Hi Ismail,

Well done on your implementation. I managed to resurrect this project, which is a Photosynth clone: https://github.com/johndpope/pixelstruct (it's in C++). I'd really love to port the project to SceneKit.

Did you consider using SceneKit for this project?

I was considering porting pixelstruct to SceneKit as an academic exercise, but perhaps upgrading your code to fit SceneKit may be the easier option.

Why SceneKit? It's pretty impressive. Check out the slide show from 2013, if you haven't already: https://developer.apple.com/library/content/samplecode/SceneKit_Slides_WWDC2013/Introduction/Intro.html

I've dug around GitHub for point cloud code and it's pretty light on; not many people are attempting to use meshes and such here. I wonder why... is the performance bad? https://github.com/FernandoDoming/CodinGame/blob/c9d88791b514d27f258d32f7dc56c4f12cf0f8fc/Attractors/Attractors/GameViewController.swift

I noticed one of the methods, OSScanManager / startSingleFrameOperations: I set a breakpoint, but it doesn't hit. Did you attempt to use the camera on device?

You may be interested in some work being done on depth perception using monocular images. Check this out: http://www.terraai.org/depth/index.html

I'd like to use TensorFlow + RGB-D training sets, run the camera live, and detect edges in an indoor space. It seems like even rudimentary real-time line detection (for walls and such) is really complex: LSD (line segment detection), rectangle perspective, and so on.

ismailbozk commented 7 years ago

Hi John,

Thanks for your kind compliments.

Well done on your implementation. I managed to resurrect this project, which is a Photosynth clone: https://github.com/johndpope/pixelstruct (it's in C++). I'd really love to port the project to SceneKit.

As far as I know, SceneKit is a high-level framework specialized for gaming. Yes, it uses the GPU as Metal does, but it may do so in a very implicit way; maybe you know more about it than I do. On the other hand, for point clouds I found Metal very straightforward. I recommend you check the OSPointCloudView.swift and PointCloudShader.metal files. In summary, you can easily configure the point cloud display preferences on the device screen. In particular, search for MTLPrimitiveType.point in OSPointCloudView.swift: it tells the GPU not to assemble triangles from the vertices but to draw them as points with the corresponding Metal shader, PointCloudShader.metal. So, for your question:

Did you consider using SceneKit for this project?

"pixelstruct" seems require additional computations like finding the lines and aligning the point cloud on the photos etc. I am not sure Scenekit or Metal can make a difference on this points. Probably you will make your computations on the CPU instead of GPU. For my self, I would go with Metal, I wouldn't want get the whole package related to game development.

I was considering porting pixelstruct to SceneKit as an academic exercise, but perhaps upgrading your code to fit SceneKit may be the easier option.

Let me know which one is easier; you can alter the OSPointCloudView.swift and PointCloudShader.metal files to present your point cloud model on the screen. As far as I can tell, it will be a nice challenge to display a photo and a point cloud on the same CAMetalLayer.

Why SceneKit? It's pretty impressive. Check out the slide show from 2013, if you haven't already: https://developer.apple.com/library/content/samplecode/SceneKit_Slides_WWDC2013/Introduction/Intro.html

I heard it is pretty impressive, but I never got a chance to take a closer look.

I've dug around GitHub for point cloud code and it's pretty light on; not many people are attempting to use meshes and such here. I wonder why... is the performance bad? https://github.com/FernandoDoming/CodinGame/blob/c9d88791b514d27f258d32f7dc56c4f12cf0f8fc/Attractors/Attractors/GameViewController.swift

I think meshes are mostly used by game developers, who use dedicated game engines for that purpose. The majority of the iOS community still lives in 2D, so most developers don't even bother to learn the 3D basics.

I noticed one of the methods, OSScanManager / startSingleFrameOperations: I set a breakpoint, but it doesn't hit. Did you attempt to use the camera on device?

It is my bad; I was still trying to figure out what I was doing, so I coded poorly. The reason that breakpoint is not hit is that the manager is buggy at the moment. To make it work, add prepareTestData(named: "boxes2") under prepareTestData(named: "boxes1") in OSCameraFrameProviderSwift, which is the camera-simulator class. The manager needs consecutive frames to compute the camera transformation between them, but the transformation computation logic is buggy right now, so I removed the second set of test data to give a better presentation on the screen; otherwise the user gets confused. In short, I left the manager in a fairly bad shape, so it is completely broken :)
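For reference, a hypothetical sketch of the change described above; OSCameraFrameProviderSwift and prepareTestData(named:) come from the project, but the class body shown here is assumed:

```swift
// Hypothetical sketch only: the class name and prepareTestData(named:) exist in the
// project, everything else here is filled in for illustration.
final class OSCameraFrameProviderSwiftSketch {
    func prepareTestData(named name: String) {
        // In the real class this loads the bundled test frames ("boxes1", "boxes2", ...).
    }

    func loadTestFrames() {
        prepareTestData(named: "boxes1")
        // Adding the second frame gives OSScanManager the consecutive pair it needs,
        // so startSingleFrameOperations is reached and the breakpoint hits.
        prepareTestData(named: "boxes2")
    }
}
```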

You may be interested in some work being done on depth perception using monocular images. Check this out: http://www.terraai.org/depth/index.html

I always appreciate computer-vision-related articles. Thanks!

johndpope commented 7 years ago

You're right about the architecture of the CPU computations for the point cloud; pixelstruct leverages this software: https://github.com/snavely/bundler_sfm

johndpope commented 7 years ago

FYI, I found this code: https://github.com/FernandoDoming/CodinGame/blob/c9d88791b514d27f258d32f7dc56c4f12cf0f8fc/Attractors/Attractors/GameViewController.swift

ismailbozk commented 7 years ago

I guess "primitiveType: .point" is the same thing that Metal calls "MTLPrimitiveType.point".
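As a side-by-side illustration, here is a hedged SceneKit sketch (hypothetical code, not from either project) of building a point cloud geometry with the .point primitive, which corresponds to drawing with MTLPrimitiveType.point in Metal:

```swift
import SceneKit

// Hypothetical sketch of a SceneKit point cloud; `points` is an assumed array of positions.
func makePointCloudNode(points: [SCNVector3]) -> SCNNode {
    let vertexSource = SCNGeometrySource(vertices: points)
    // One index per vertex; .point asks SceneKit to rasterize each vertex as a point
    // rather than assembling triangles.
    let indices = (0..<points.count).map { Int32($0) }
    let element = SCNGeometryElement(indices: indices, primitiveType: .point)
    let geometry = SCNGeometry(sources: [vertexSource], elements: [element])
    return SCNNode(geometry: geometry)
}
```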

johndpope commented 7 years ago

I think the good thing with SceneKit (even if you throw out all the game stuff; most of that logic is in GameKit) is that you get some first-class touch/motion capabilities. Check out the physically based rendering capabilities they recently added: https://www.youtube.com/watch?v=xtl-zOdoD7Y
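For context, a small hypothetical sketch of the physically based rendering mentioned above; the texture asset name is an assumption, not from either project:

```swift
import SceneKit
import UIKit

// Hypothetical sketch: SceneKit's physically based lighting model with assumed assets.
let material = SCNMaterial()
material.lightingModel = .physicallyBased
material.diffuse.contents = UIImage(named: "albedo.png")  // assumed bundled texture
material.metalness.contents = 0.1
material.roughness.contents = 0.6
```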

ismailbozk commented 7 years ago

I presume SceneKit can be handy when you want to build an application fast. Keep me updated on your development progress; I would really like to see what SceneKit is capable of.

Also, thanks a lot for the resources and articles.

johndpope commented 7 years ago

So, I've been digging deeper, and it turns out the STL/PLY formats are first-class citizens in SceneKit, so with one line you can import complex vertex data sets: SCNScene *scene = [SCNScene sceneNamed:@"visualization_-_aerial.stl"];
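A rough Swift equivalent of that one-liner, assuming the STL asset is bundled with the app:

```swift
import SceneKit

// Hedged sketch: load the bundled asset in one line; the asset name is taken from the
// Objective-C example above and is assumed to be in the app bundle.
let scene = SCNScene(named: "visualization_-_aerial.stl")

// A hypothetical SCNView (`sceneView`) could then display the imported mesh:
// sceneView.scene = scene
// sceneView.allowsCameraControl = true
```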


Did you consider coercing your depth data into an STL for other uses?

You should check out the WWDC slides example from 2014.

ismailbozk commented 7 years ago

Hi John,

I was using the .ply format as the output. Since the Kinect only provides a 2D RGB array and a 2D depth array as input, I had to compute the point cloud from these two inputs. The .stl format seems to carry face info on top of the vertices; that is a different topic for me to cover, and it is not easy to compute. I tried a couple of methods to triangulate the point clouds. Check out: http://pointclouds.org/documentation/tutorials/greedy_projection.php
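For illustration, a rough sketch (not the project's code) of the per-pixel back-projection that turns a depth array into 3D points; fx, fy, cx, cy are assumed camera intrinsics:

```swift
import simd

// Rough sketch of back-projecting one depth pixel (u, v) to a 3D point.
// fx, fy are assumed focal lengths in pixels; cx, cy the assumed principal point;
// depth is the measured distance along the camera's Z axis.
func backProject(u: Float, v: Float, depth: Float,
                 fx: Float, fy: Float, cx: Float, cy: Float) -> SIMD3<Float> {
    let x = (u - cx) * depth / fx
    let y = (v - cy) * depth / fy
    return SIMD3<Float>(x, y, depth)
}
```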

But I failed and moved on to higher-priority objectives, like reducing the error in point cloud registration. I am still hoping to optimise the point clouds and create a fully triangulated mesh out of camera images. I was also hoping Apple would provide an open API for the iPhone 7 Plus stereo camera; hopefully, next year they will.

ismailbozk commented 7 years ago

How is your project going? I hope everything is going well.