asousa / DepthPrediction

A tool to predict the depth field of a 2-dimensional image
82 stars · 38 forks

Only coarse regressor implemented? #3

Open cjnolet opened 8 years ago

cjnolet commented 8 years ago

Reading the other issue, it appears this was done as a sort of school project. I'm currently doing some initial research to see what it would take to implement this for my own needs, and I stumbled upon this implementation. Looking at your CNN Caffe prototxt model code, I notice you only implemented one of the two CNNs described in the paper. Was this just because the other CNN required a custom loss layer, while the unary CNN made use of the common logistic loss?

Also,

Could you guys note, perhaps, some of the "gotchas" that you found while working on your implementation that were not immediately apparent from reading the paper?

Thanks so much for publishing your project!

asousa commented 8 years ago

Hi Corey,

Yep, this was definitely done for a class project — CS231n here at Stanford, which was sort of a grad-level crash course on convnets.

The Australian group who wrote the paper we started from managed to get some nice results; our plan was to toss together their neural net in Caffe, train, tweak, improve. We never got past that first step though — I think Caffe's gotten a lot more robust since we did the project.

Yeah, one of the CNNs we wanted to try required a custom loss layer, plus some more-complicated work with a graph-cut algorithm which we really didn't have time to tackle. The thing we struggled with most was trying to do regression (inferring a continuous depth), as opposed to binning and classifying. Additionally, the problem is pretty poorly defined — there's no nice clean label for the depth of an image. I think the best you can hope for is the neural net detecting relative depth between two adjacent superpixels, and then attempting to minimize across the whole image (graph cut).
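(For anyone reading along: a minimal sketch of the binning idea mentioned above — turning continuous depth targets into discrete classes so the net can classify instead of regress. The bin count, depth range, and log spacing here are my own assumptions, not values from our project.)

```python
import numpy as np

# Hypothetical binning scheme: log-spaced bins, since depth error
# tends to grow with distance. All constants are illustrative.
NUM_BINS = 16
DEPTH_MIN, DEPTH_MAX = 0.5, 10.0  # metres; dataset-dependent assumption

bin_edges = np.logspace(np.log10(DEPTH_MIN), np.log10(DEPTH_MAX), NUM_BINS + 1)
bin_centers = np.sqrt(bin_edges[:-1] * bin_edges[1:])  # geometric midpoints

def depth_to_class(depth):
    """Map a continuous depth (e.g. per superpixel) to a bin index in [0, NUM_BINS)."""
    idx = np.digitize(depth, bin_edges) - 1
    return int(np.clip(idx, 0, NUM_BINS - 1))

def class_to_depth(idx):
    """Invert the binning: recover an approximate depth from a class index."""
    return float(bin_centers[idx])
```

With this, a softmax-plus-logistic-loss head over NUM_BINS classes replaces the regression head, at the cost of quantization error bounded by the bin width.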

But our #1 gotcha for the project was that Stanford is on the quarter system, and we tried to bang the whole project out in about 3 weeks while learning a bunch of new tools.

A side project I had in mind was to train for automatic colorization of black and white images, with the hope that inferring a color of a superpixel would be better-constrained than a depth (and you can quantize and bin the color space).
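(The "quantize and bin the color space" step could look something like this — a hypothetical sketch with coarse per-channel RGB levels; real colorization work usually bins a perceptual space like Lab instead, and the level count here is just an assumption.)

```python
import numpy as np

# Hypothetical color quantization: LEVELS coarse steps per RGB channel,
# so colorization becomes classification over LEVELS**3 fixed color bins.
LEVELS = 4  # 4**3 = 64 color classes (illustrative choice)

def rgb_to_class(rgb):
    """Map an (r, g, b) triple in [0, 255] to one of LEVELS**3 class indices."""
    r, g, b = (np.asarray(rgb) * LEVELS // 256).astype(int)
    return int(r * LEVELS**2 + g * LEVELS + b)

def class_to_rgb(idx):
    """Recover the bin-center color for a class index."""
    step = 256 // LEVELS
    r, rem = divmod(idx, LEVELS**2)
    g, b = divmod(rem, LEVELS)
    return tuple(int(c * step + step // 2) for c in (r, g, b))
```

The appeal is exactly what's described above: a superpixel's color class is a well-posed, bounded label, unlike its absolute depth.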

Sorry the code isn’t a grab-and-go solution, but good luck with your future work!

Austin
