torch / cunn


GPU implementation of L_p feature pooling #477

Closed · wickedfoo closed this 7 years ago

wickedfoo commented 7 years ago

This diff contains the GPU kernels and calling code for FeatureLPPooling from the fbcunn library, which we are contributing to cunn itself, converted to compile with modern cutorch/cunn and with half, float, and double support.

https://github.com/facebook/fbcunn/blob/master/src/FeatureLPPooling.cu
https://github.com/facebook/fbcunn/blob/master/src/FeatureLPPoolingHost.cpp

@soumith said he'd add the rest from here.
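For readers unfamiliar with the module, the operation these kernels implement can be sketched roughly as follows: L_p pooling over the feature dimension computes, for each window of `width` features taken at steps of `stride`, the value `(sum |x_k|^p)^(1/p)`. This is a minimal NumPy reference sketch, not the fbcunn API; the function name and signature are hypothetical.

```python
import numpy as np

def feature_lp_pool(x, width, stride, p):
    """Naive reference for L_p pooling over the feature (first) dimension:
    y[j] = (sum_{k=0}^{width-1} |x[j*stride + k]|**p) ** (1/p).
    Hypothetical helper for illustration only, not the fbcunn interface."""
    num_features = x.shape[0]
    num_out = (num_features - width) // stride + 1
    out = np.empty((num_out,) + x.shape[1:])
    for j in range(num_out):
        window = x[j * stride : j * stride + width]
        out[j] = np.power(np.sum(np.power(np.abs(window), p), axis=0), 1.0 / p)
    return out

# e.g. width=2, stride=2, p=2 reduces pairs of features to their 2-norm:
y = feature_lp_pool(np.array([3.0, 4.0, 0.0, 5.0]), width=2, stride=2, p=2.0)
```

Note that p = 1 gives (absolute-value) sum pooling and large p approaches max pooling, which is why the module takes the power as a parameter.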

wickedfoo commented 7 years ago

Missing some updates, one min

wickedfoo commented 7 years ago

ok, updated with the header too, and changed `float` -> `accreal`

wickedfoo commented 7 years ago

Please advise if there's anything else that needs to be added on the cunn side. There was no CPU implementation, only this Lua wrapper in fbcunn for the module:

https://github.com/facebook/fbcunn/blob/master/fbcunn/FeatureLPPooling.lua

wickedfoo commented 7 years ago

I'm going to write a simple CPU version of this for torch/nn; that shouldn't be too hard.

soumith commented 7 years ago

I think we can make do with the Lua version, don't waste your time.

wickedfoo commented 7 years ago

There's no implementation of the algorithm in Lua either, just the module file that forwards to the cunn version for CudaTensors. Is that good enough?

soumith commented 7 years ago

Oh, I see. Yeah, it needs a CPU or pure-Lua implementation.

wickedfoo commented 7 years ago

I added a CPU implementation in https://github.com/torch/nn/pull/1259

I'll update this cunn pull request with a test of the GPU code that compares against the CPU implementation
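The kind of cross-check such a test performs can be sketched like this: run the same random input through two independent implementations of the pooling and compare within a floating-point tolerance. Both sides below are CPU stand-ins (a naive loop vs. a vectorized version); the actual test in this PR compares the CUDA kernel against the torch/nn CPU code, and all names here are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lp_pool_loop(x, width, stride, p):
    # Naive loop: one output per feature window (reference side of the check).
    n_out = (len(x) - width) // stride + 1
    return np.array([
        (np.abs(x[j * stride : j * stride + width]) ** p).sum() ** (1.0 / p)
        for j in range(n_out)
    ])

def lp_pool_vec(x, width, stride, p):
    # Vectorized stand-in for the implementation under test.
    windows = sliding_window_view(x, width)[::stride]
    return (np.abs(windows) ** p).sum(axis=-1) ** (1.0 / p)

# Same random float32 input through both paths, then tolerance comparison,
# mirroring a GPU-vs-CPU correctness test.
rng = np.random.default_rng(0)
x = rng.standard_normal(64).astype(np.float32)
a = lp_pool_loop(x, width=4, stride=2, p=2.0)
b = lp_pool_vec(x, width=4, stride=2, p=2.0)
```

A loose absolute tolerance (rather than exact equality) matters here because the GPU may accumulate in a different order, and half/float precision diverges quickly from a double-precision reference.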

wickedfoo commented 7 years ago

This diff has been updated with a test, bugs fixed and it works, compared against the CPU version. This is good to go, as is https://github.com/torch/nn/pull/1259.

wickedfoo commented 7 years ago

ping... can this be merged?