LinHungShi / GCNetwork

117 stars · 39 forks

can only test with certain image size #9

Open YoYo000 opened 7 years ago

YoYo000 commented 7 years ago

Hi Lin, thanks for your work for implementing the GCNetwork.

I tried to run test.py on the driving dataset. With the original image size, memory runs out with a ResourceExhaustedError, so I downsampled the images by a factor of two (480 * 270). However, the following error occurs:

InvalidArgumentError (see above for traceback): Incompatible shapes: [1,12,18,30,64] vs. [1,12,17,30,64]
[[Node: add_17/add = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](activation_32/Relu, activation_28/Relu)]]
[[Node: model_2/lambda_4/Squeeze/_969 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_6294_model_2/lambda_4/Squeeze", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Is this related to the downsampling in the 3D convolutions? Hope you can help.

LinHungShi commented 7 years ago

Yes, this problem arises from a dimension inconsistency. Since there are a total of four down-sampling layers, the height and width (and also the disparity) must be divisible by 32. Currently, the input must also be small enough (e.g. 256 x 256 x 3) to fit in memory.
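For anyone hitting the same shape mismatch: a simple workaround is to pad the input up to the next multiple of 32 before feeding it to the network. This is a hypothetical helper (`pad_to_multiple` is not part of the GCNetwork repo), shown only to illustrate the divisibility constraint described above:

```python
import numpy as np

def pad_to_multiple(img, multiple=32):
    """Zero-pad an H x W x C image so height and width are divisible
    by `multiple`, as required when the net has repeated 2x down-sampling."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # amount needed to reach the next multiple
    pad_w = (-w) % multiple
    return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")

# The half-resolution driving frames from the question: 270 is not divisible by 32.
img = np.zeros((270, 480, 3), dtype=np.float32)
padded = pad_to_multiple(img)
print(padded.shape)  # (288, 480, 3)
```

The padded regions can be cropped back out of the predicted disparity map afterwards.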

YoYo000 commented 7 years ago

Thanks!

One suggestion for improving the accuracy: the disparity range of Scene Flow is relatively large, and most stereo pairs have a maximum disparity of ~200. In this case, if the training input is 256x256, there is actually very little overlap between the left and right input patches. Changing the training input to 512x128 works much better, and I can get a loss of around 2.

I am hoping to train with an input size of 512x256 on a 1080 Ti (11GB). The original paper uses a Titan X (12GB), but I am not sure whether training is really restricted by this 1GB difference......
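One way to reason about feasibility is to estimate the size of the 4D cost volume, which is the dominant single tensor in GC-Net-style architectures. This is only a hedged back-of-envelope sketch: it assumes the paper's setup of a half-resolution volume with max disparity 192 and 64 concatenated feature channels, and it ignores the intermediate activations stored for backprop, which usually dominate actual memory use:

```python
def cost_volume_bytes(h, w, max_disp=192, channels=64, dtype_bytes=4):
    """Approximate memory of a GC-Net-style cost volume built at half
    resolution: (H/2) x (W/2) x (D/2) x C float32 values.
    Back-of-envelope only; real training memory is several times larger."""
    return (h // 2) * (w // 2) * (max_disp // 2) * channels * dtype_bytes

gb = cost_volume_bytes(256, 512) / 2**30
print(f"{gb:.2f} GiB")  # 0.75 GiB for a single 512x256 cost volume
```

So the cost volume itself fits easily in 11GB; whether the full training graph does depends on batch size and the 3D conv activations, which is hard to predict from the 1GB card difference alone.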