Pax1601 closed 8 years ago
You would need 3-d convolutions I think? Current convolutions are 2-d, over images. But I think you'd be convolving over time too, is that right?
I think this will be quite hard to shoe-horn into DeepCL, which was originally intended to have a very strictly limited scope of handling Go boards, and as such handles only square images. It could be upgraded to handle non-square videos, but that would be a fair amount of work.
If I were in your position, I might look at porting the similar layers from cuda torch across to cl torch. That should be fairly straightforward to do. I can probably handle that myself if you are interested. cl torch is at https://github.com/hughperkins/clnn . Let me know if this could be interesting to you.
Indeed it requires 3-d convolutions, but as you can read on the webpage I linked, it also provides a Caffe model where 3 consecutive grayscale frames are merged into a single RGB frame. Do you think such a model may work?
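To make the frame-merging idea concrete, here is a minimal sketch in plain Python. It assumes the three grayscale frames have the same size and are simply placed into the three channels in time order; the channel order actually used by the linked Caffe model is not stated in this thread, and `merge_frames` is a hypothetical helper name, not part of any library.

```python
def merge_frames(frames):
    """Merge three equally sized grayscale frames into one 3-channel image.

    frames: list of 3 frames, each a list of rows of pixel values.
    Returns an image where each pixel is a 3-tuple, one value per frame.
    Assumption: channel order follows frame order (earliest frame first).
    """
    if len(frames) != 3:
        raise ValueError("expected exactly 3 frames")
    height = len(frames[0])
    width = len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same size")
    # Each output pixel collects the values at (y, x) from the three frames.
    return [
        [tuple(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Three tiny 2x2 "frames" standing in for consecutive video frames.
frame_a = [[1, 2], [3, 4]]
frame_b = [[5, 6], [7, 8]]
frame_c = [[9, 10], [11, 12]]
merged = merge_frames([frame_a, frame_b, frame_c])
print(merged[0][0])  # (1, 5, 9)
```

The result has the same layout as an ordinary RGB image, which is why a network with a standard 3-plane input could consume it.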
My problem is portability. The final program must run on a drone, and right now I'm not sure about the architecture of the GPU. The only information I have is that it should run OpenCL. I'd like to be more precise, but I'm asking you this question because I will require optical flow computation for a university project and I still don't exactly know what machine the code will run on.
2d convolution takes a stack of 2d images, and convolves them together, using an arbitrary number of filters, to give a number of output 2d images equal to the number of filters. Each filter is 3d: taking a stack of 2d images.
It sounds like your model will have 3 incoming image planes, is that right? In which case, it's just a standard convolution. When I say '2d', I mean that each image is 2d, but the convolution filters are 3d: each one takes a stack of input images. It's plausible that for video one would need stacks of 4d filters, not 3d as I implied earlier.
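As an illustration of the point about filter shapes, here is a minimal pure-Python sketch of a "2d" convolution over a stack of input planes. It is not DeepCL's implementation: it assumes stride 1, no padding, and no kernel flip (cross-correlation, as is common in CNN libraries), and `conv2d` is a hypothetical name chosen for this example.

```python
def conv2d(planes, filters):
    """Stride-1, unpadded convolution of a stack of 2D planes.

    planes:  [in_planes][height][width]
    filters: [num_filters][in_planes][k][k]  (each filter is 3D)
    returns: [num_filters][height - k + 1][width - k + 1]
    """
    in_planes = len(planes)
    height, width = len(planes[0]), len(planes[0][0])
    outputs = []
    for filt in filters:
        if len(filt) != in_planes:
            raise ValueError("filter depth must match number of input planes")
        k = len(filt[0])
        out = []
        for y in range(height - k + 1):
            row = []
            for x in range(width - k + 1):
                total = 0.0
                # Sum over every input plane and every position in the window.
                for c in range(in_planes):
                    for dy in range(k):
                        for dx in range(k):
                            total += planes[c][y + dy][x + dx] * filt[c][dy][dx]
                row.append(total)
            out.append(row)
        outputs.append(out)
    return outputs


# 3 input planes of 4x4 (e.g. three merged frames), 2 filters of 3x3x3.
planes = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
filters = [
    [[[1.0] * 3 for _ in range(3)] for _ in range(3)],
    [[[0.5] * 3 for _ in range(3)] for _ in range(3)],
]
out = conv2d(planes, filters)
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 2
print(out[0][0][0], out[1][0][0])  # 27.0 13.5
```

There is one output plane per filter, and each filter spans all input planes. For true video convolution the filters would gain one more axis (time), which is the 4d case mentioned above.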
DeepCL has the following requirements to run:
Hi there,
I'm sorry if this is not the best way to contact you. I'm not really reporting an issue; rather, I'd like to ask a question. Do you believe that DeepCL could be suitable for estimating optical flow from video frames, as done, for example, in http://damienteney.info/cnnFlow.htm? That project is very good, but it is based on MatConvNet, which only supports CUDA computations, while I need an OpenCL-based solution.
Thank you for your time.
Best regards,
Davide