shuluoshu closed this issue 7 years ago
Back when I was writing the mc-cnn code, nn.SpatialConvolution1_fw, which is just one matrix-matrix multiply, was much faster than cudnn.SpatialConvolution with a 1x1 kernel. Today this probably isn't the case anymore, and I would use cudnn instead. But you are right, they are the same thing.
@jzbontar Thanks for your timely reply, but what I really want to know is why a 1×1 conv isn't used in both the training process and the testing process. I mean, why is nn.Linear used in the training process? And can I replace nn.Linear with a 1×1 conv in the training process? Thanks so much!
Yes, you can replace nn.Linear with 1x1 conv. I would benchmark to make sure it's not slower, though.
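For anyone who wants to see why the swap is safe, here is a minimal sketch in plain Python (not the Torch modules discussed in this thread; `linear`, `conv1x1`, and the toy weights are made up for illustration). It assumes the linear layer is applied independently to the channel vector at each pixel, and checks that a 1x1 convolution with the same weights gives the same output.

```python
# Illustrative stand-ins only: these are NOT nn.Linear / cudnn.SpatialConvolution.

def linear(x, weight, bias):
    """Fully connected layer: x has c_in values, weight is c_out x c_in."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def conv1x1(image, weight, bias):
    """1x1 convolution: image[c_in][h][w] -> out[c_out][h][w].
    Each output pixel depends only on the input channels at that same pixel."""
    c_in, h, w = len(image), len(image[0]), len(image[0][0])
    return [[[sum(weight[o][c] * image[c][y][x] for c in range(c_in)) + bias[o]
              for x in range(w)]
             for y in range(h)]
            for o in range(len(weight))]

def linear_per_pixel(image, weight, bias):
    """Apply the same linear layer to the channel vector at every pixel."""
    c_in, h, w = len(image), len(image[0]), len(image[0][0])
    out = [[[0] * w for _ in range(h)] for _ in range(len(weight))]
    for y in range(h):
        for x in range(w):
            pixel = [image[c][y][x] for c in range(c_in)]
            for o, value in enumerate(linear(pixel, weight, bias)):
                out[o][y][x] = value
    return out

# Toy data: 2 input channels, 2x2 image, 3 output channels.
image = [[[1, 2], [3, 4]],
         [[5, 6], [7, 8]]]
weight = [[1, 0], [0, 1], [1, -1]]
bias = [0, 10, 1]

same = conv1x1(image, weight, bias) == linear_per_pixel(image, weight, bias)
print(same)
```

This only shows that the two compute the same values. Which one is faster on a given GPU and library version is a separate question, so benchmarking is still worth doing.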
Hi @jzbontar, I notice that you use nn.Linear in the training network (slow) and replace it with nn.SpatialConvolution1_fw (which I guess is the same as a 1x1 conv) in the test process. However, I wonder why you don't just use cudnn.SpatialConvolution with a 1x1 kernel in both the training process and the testing process? Would using a 1x1 conv (with cudnn) throughout affect the performance, or would it just speed up training? Thanks a lot!