Closed MASTERCHIEF343 closed 4 years ago
Hi, firstly, you are right, but what implementations are you referring to? Could you please give more details?
Secondly, we haven't tuned the batch size; it is just set for faster training (in terms of epochs) according to our GPUs (four 32G V100s).
Thanks for your reply. I'm referring to PSMNet: they use trilinear interpolation to upsample the cost volume (1/4H x 1/4W x 1/4D ---> H x W x D) and then compute the disparity, so the regression is over the full 192-disparity range. But if we compute the disparity at the small resolution, the disparity range is also small. So when we upsample, can we recover it well? I just don't know whether there is any difference.
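To make the PSMNet-style ordering concrete, here is a minimal sketch (not PSMNet's actual code; the tensor sizes are made up for illustration) of upsampling the cost volume first and then regressing disparity, so the disparity dimension grows from 48 to 192 before the soft argmin:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 1/4-resolution cost volume: 48 disparity hypotheses (192 / 4)
# over a 16 x 32 (H/4 x W/4) grid.
cost = torch.randn(1, 1, 48, 16, 32)  # (N, C, D/4, H/4, W/4)

# PSMNet-style: trilinearly upsample the cost volume to full resolution first,
# so the disparity dimension grows from 48 to 192 *before* regression.
cost_up = F.interpolate(cost, size=(192, 64, 128), mode='trilinear',
                        align_corners=False)

# Soft-argmin regression over the recovered 192-disparity range.
prob = F.softmax(-cost_up.squeeze(1), dim=1)                 # (N, 192, H, W)
disp_values = torch.arange(192, dtype=prob.dtype).view(1, 192, 1, 1)
disp = torch.sum(prob * disp_values, dim=1)                  # (N, H, W)
```

Because the interpolation stretches the disparity axis itself, the regressed values naturally live in [0, 191] without any extra scaling step.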
Directly upsampling the low-resolution disparity may lose some details compared with upsampling the cost volume, but we also have refinement modules to obtain the disparity prediction at the original resolution. Also, upsampling a 2D disparity map is much more memory efficient than upsampling a 3D volume.
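The memory argument is easy to quantify with some back-of-the-envelope arithmetic (the KITTI-like size below is just an illustrative assumption, not a figure from the paper):

```python
# Rough per-image memory comparison for upsampling to a hypothetical full
# resolution of H x W = 576 x 960 with D = 192 disparity hypotheses.
H, W, D = 576, 960, 192
bytes_per_float = 4

# Upsampling a 3D cost volume produces D x H x W values.
cost_volume_mb = D * H * W * bytes_per_float / 1024**2

# Upsampling a 2D disparity map produces only H x W values.
disparity_mb = H * W * bytes_per_float / 1024**2

print(f"cost volume: {cost_volume_mb:.1f} MB, disparity map: {disparity_mb:.1f} MB")
```

The 3D volume costs D times more memory (192x here), before even counting the softmax and regression intermediates computed at full resolution.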
Hmm, I'm still confused. For example, at 1/4H x 1/4W we set the disparity range to 48, so after upsampling the values are still in 0-47. But in your network it works (maybe the network itself can amplify the disparity range). I just can't figure this out. Maybe I'm wrong.
In PSMNet, they upsample first, which in my opinion includes recovering the disparity range, and then compute the disparity. Have you ever tested the network without any refinement modules?
When upsampling the disparity, a scale factor is also multiplied in to account for the change in disparity scale (see https://github.com/haofeixu/aanet/blob/master/model.py#L97-L98). I believe this is what you are currently missing.
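A minimal sketch of that idea (illustrative tensor sizes, not the actual aanet code): disparity is measured in pixels, so enlarging the image width by 4x must also multiply the disparity values by 4, which maps the 0-47 range back to roughly 0-188:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 1/4-resolution disparity map with values in [0, 47].
disp_quarter = torch.rand(1, 1, 64, 128) * 47.0

# Width ratio between full and quarter resolution.
scale = 4

# Bilinearly upsample the 2D map, then rescale the values themselves,
# since disparity is expressed in pixels of the (now 4x wider) image.
disp_full = F.interpolate(disp_quarter, scale_factor=scale,
                          mode='bilinear', align_corners=False) * scale
```

Without the final `* scale`, the upsampled map would still be capped near 47 even though the image is at full resolution, which is exactly the range mismatch described above.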
Hi, thanks for your code and paper. I have a question about the disparity before the loss calculation. In the paper you said you first upsample the disparity to the original resolution and then use it to compute the training loss. So you first compute disparity at three different resolutions and then upsample each to the original resolution, i.e. the same size as the reference (left) image. Am I right? If so, I just wonder whether there is anything wrong with this implementation. Another question is how you decided on the batch size, because it is really large.