ShaoqingRen / SPP_net

SPP_net : Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Use the network #13

Closed lyjh closed 9 years ago

lyjh commented 9 years ago

Hi,

After we train and fine-tune the model, how do we actually run object detection on a raw image? What preprocessing do we have to do before feeding the image into the network?

In the fine-tuning network definition file, the input_dim values are 128, 12800, 1, 1. What does each of these dimensions represent? I understand that 128 is the batch_size, but I can't figure out where the other numbers come from.

If you could provide a demo showing how to use the network for detection, as R-CNN does, that would be very helpful.

Thanks.

ShaoqingRen commented 9 years ago

12800 is the fc6 input size: 12800 = (1*1 + 2*2 + 3*3 + 6*6) * 256
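As a quick check of that arithmetic, here is a minimal Python sketch. It assumes the four terms are the bin counts of 1x1, 2x2, 3x3 and 6x6 pyramid levels, that 256 is the number of channels in the last convolutional feature map, and that the input_dim values follow the usual Caffe N x C x H x W blob order. The function and variable names are illustrative only and do not come from the repository.

```python
# Hypothetical helper, not part of SPP_net: computes the length of the
# pooled feature vector that is fed into fc6.
def spp_output_dim(pyramid_levels, channels):
    # Each level n contributes an n x n grid of bins, and every bin
    # yields one pooled value per channel.
    bins = sum(n * n for n in pyramid_levels)
    return bins * channels

# Values taken from the reply above: levels 1, 2, 3, 6 and 256 channels.
fc6_input = spp_output_dim([1, 2, 3, 6], 256)

# Assumed blob layout (N, C, H, W): batch of 128 pooled vectors,
# each stored as 12800 "channels" with spatial size 1 x 1.
blob_shape = (128, fc6_input, 1, 1)

print(fc6_input)   # 12800
print(blob_shape)  # (128, 12800, 1, 1)
```

Under these assumptions, the fine-tuning network takes already-pooled features as input rather than raw images, which would explain why the last two dimensions are both 1.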

The SPP demo has been updated. :)