Open zyfsa opened 6 years ago
@zyfsa Hi, did you download SceneFlow and train from scratch?
hi, @JiaRenChang (I write simplified Chinese and you write traditional; hopefully we can still understand each other, haha). I downloaded SceneFlow and trained from scratch to get a pretrained model, then finetuned. Finetuning helps to some extent, but the result stops improving after 30 epochs, with an error rate of about 3.5% on the KITTI 2015 training set. So I'd like to know your intermediate results for comparison. Also, during SceneFlow training you excluded points with disparity > 192, and on KITTI you excluded points < 0; that step is important, right? I did the same. Can you help me understand why I'm getting these results?
@zyfsa Then how does your trained model perform on the SceneFlow test set? Does it match the results in the paper?
Not quite; my EPE is not as low as yours. Running this pretrained model directly on KITTI 2015 gives an error rate of 20%; after finetuning on the whole KITTI training set the error rate is 3.5%. I don't know whether you have run the same test, and if so, what your results were. Also, your final disparity map comes from a loss trained at 1/4 resolution and is then upsampled directly to the original size, yet the accuracy is still that high; impressive. One more question: is excluding certain points from the loss really important for training?
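The 1/4-resolution upsampling mentioned above can be sketched like this. This is an illustrative numpy sketch assuming nearest-neighbor upsampling, not code from the repository; the key point is that the disparity values, not just the spatial grid, must be multiplied by the scale factor.

```python
import numpy as np

def upsample_disparity(disp_low, factor=4):
    """Nearest-neighbor upsample of a low-resolution disparity map.

    A shift of d pixels at 1/factor resolution corresponds to factor*d
    pixels at full resolution, so the disparity values are multiplied
    by the same factor as the spatial grid. (Illustrative sketch; the
    actual model may interpolate differently.)
    """
    up = np.repeat(np.repeat(disp_low, factor, axis=0), factor, axis=1)
    return up * factor
```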
Excluding points > 192 or < 0 is important. Since we compute disparity by regression and the maximum disparity we can estimate is 192, disparities beyond that range would only interfere with training. Also, what image preprocessing did you use, and did you do any augmentation?
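The masking described above can be sketched as computing the smooth L1 loss only over valid pixels. This is a minimal numpy illustration of the idea, not the repository's PyTorch training code:

```python
import numpy as np

def masked_smooth_l1(pred, gt, max_disp=192):
    """Smooth L1 loss over valid ground-truth pixels only.

    Pixels with disparity <= 0 or >= max_disp are excluded from the
    loss, mirroring the masking discussed above.
    """
    mask = (gt > 0) & (gt < max_disp)
    diff = np.abs(pred[mask] - gt[mask])
    # Standard smooth L1: quadratic below 1, linear above.
    loss = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return loss.mean()
```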
I didn't do augmentation; I only normalized the data to the range -1 to 1.
Another question: besides the smooth L1 loss, did you use weight decay during training? Does it have an effect?
Hi, for preprocessing we do color normalization, using the mean/variance from ImageNet. During training we randomly crop the images to (256, 512). We don't use weight decay.
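The preprocessing described above can be sketched as follows. This is an illustrative numpy version, assuming images already scaled to [0, 1]; the repository itself applies these steps through torchvision transforms.

```python
import numpy as np

# ImageNet statistics commonly used for color normalization
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def preprocess(img, crop_h=256, crop_w=512, rng=None):
    """Normalize an HxWx3 image in [0, 1] with the ImageNet mean/std,
    then take a random (crop_h, crop_w) crop, as described above."""
    rng = rng or np.random.default_rng()
    img = (img - IMAGENET_MEAN) / IMAGENET_STD
    h, w, _ = img.shape
    y = rng.integers(0, h - crop_h + 1)
    x = rng.integers(0, w - crop_w + 1)
    return img[y:y + crop_h, x:x + crop_w]
```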
@JiaRenChang I'd like to ask one more question; I'm switching to English so that others can follow.
Would you please tell me whether you have tried normalizing the ground-truth disparity with respect to the image width? In other words, having the network regress disparity/image.size[1]
as the target. Do you think this would bring better generalization for inputs of different sizes?
Thanks a lot!
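The normalization being asked about amounts to regressing a width-relative target instead of raw pixels, then rescaling at inference. A minimal sketch of the idea (hypothetical helper names, not code from the repository):

```python
def normalize_disparity(disp_px, img_width):
    # Width-relative target the network would regress instead of pixels.
    return disp_px / img_width

def denormalize_disparity(disp_rel, img_width):
    # Scale the normalized prediction back to pixel units at inference.
    return disp_rel * img_width
```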
We usually scale disparities when we scale the image sizes. For example, if we rescale an image to half size, the disparity values should also be halved, because the correspondences become closer.
Yes, I get that, though it's probably not the answer to my question. Thanks for your reply anyway.
@JiamingSuen Do you have an answer to your question now?
I'm afraid not; it seems few people have tried it. In my experiments it did not improve the results much, but that is far from a solid conclusion.
hello,
my problem is that when I finetune on the KITTI 2015 training set and then test on all of the KITTI 2015 training images, the result barely changes after 30 epochs; the error rate is about 3.4%. Do you have any other special training method? I am Chinese; we can chat on WeChat if you want. Thank you! Please help; I am stuck here and the performance is poor.