feihuzhang / DSMNet

Domain-invariant Stereo Matching Networks
MIT License

Some question about Table 3 in the paper #6

Open gallenszl opened 4 years ago

gallenszl commented 4 years ago

Hello, we would like to know how you selected the KITTI, Middlebury, and ETH3D validation sets in Table 3, so that we can make a direct comparison in our paper. Did you just use all the training images from KITTI, Middlebury, and ETH3D?

feihuzhang commented 4 years ago

It uses all images of the training datasets for evaluation.

gallenszl commented 4 years ago

Thank you for your reply. By the way, did you use the 13 additional datasets with GT when evaluating Middlebury? And the threshold error rate is just the D1_all metric, right?

feihuzhang commented 4 years ago

Yes, it uses the new Middlebury training set, which has high-resolution images. D1_all is used for evaluation.

gallenszl commented 4 years ago

So you used 23 images from Middlebury 2014 (10 + 13) to evaluate your model, rather than the 15 training images provided at https://vision.middlebury.edu/stereo/submit3/, or maybe both (15 + 13)?

gallenszl commented 4 years ago

hello?

skmhrk1209 commented 4 years ago

Hi,

I'm a little confused about the D1-all metric.

D1_all is defined as mean(error >= max(3, 0.05 * GT)), but it is computed as mean(error > 3) at: https://github.com/feihuzhang/DSMNet/blob/d7124e07840b073550810a5c45d1a5181db6b728/evaluation.py#L202
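For what it's worth, here is a minimal NumPy sketch of the two quantities being compared (the official KITTI D1-all with its relative term versus a plain 3-pixel threshold); this is only an illustration, not the repository's code:

```python
import numpy as np

def d1_all(disp_est, disp_gt, abs_thresh=3.0, rel_thresh=0.05):
    """Official KITTI D1-all: a pixel counts as erroneous when its error
    exceeds both 3 px and 5% of the ground-truth disparity."""
    valid = disp_gt > 0                      # KITTI marks invalid pixels with 0
    err = np.abs(disp_est - disp_gt)[valid]
    gt = disp_gt[valid]
    return np.mean((err > abs_thresh) & (err > rel_thresh * gt))

def px_error(disp_est, disp_gt, thresh=3.0):
    """Plain absolute-threshold error rate: mean(|error| > thresh)."""
    valid = disp_gt > 0
    err = np.abs(disp_est - disp_gt)[valid]
    return np.mean(err > thresh)
```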

mli0603 commented 4 years ago

Is Table 3 in the paper reporting the 3-pixel error?

gallenszl commented 4 years ago

Is Table 3 in the paper reporting the 3-pixel error?

I'm also a little confused about this. Is the three-pixel error rate (D1_all) used to evaluate all four datasets?

ZhiboRao commented 4 years ago

Dear author,

I have always considered GA-Net a very, very good piece of work, and it is an article I have long admired, praised, and recommended. I believe this DSM paper is also excellent work. However, I think Table 3 is not rigorous (the ETH3D part).

We directly used the GA-Net model you provided (the 10-epoch checkpoint) to test on the ETH3D training set. We found that the results are much better than those reported in the DSM paper (the website only reports 2-pixel and 4-pixel results, so we can conclude the 3-pixel result must lie between them). (screenshots attached)

In addition, we also used the Scene Flow pre-trained model released with GWC-Net to test the relevant data, and we found that the KITTI and ETH3D results should be much better than those given in the DSM paper. (screenshots attached)

Our group has no stake in GWC-Net. We are simply curious why the results are so different.

Best, Jack Rao

feihuzhang commented 4 years ago

Dear author, ... I think Table 3 is not rigorous (the ETH3D part). ... We are simply curious why the results are so different. ... Best, Jack Rao

Dear Jack, we use a 1-pixel threshold error rate for ETH3D, a 2-pixel threshold for Middlebury, and a 3-pixel threshold for KITTI. Thanks for your question.

feihuzhang commented 4 years ago

So you used 23 images from Middlebury 2014 (10 + 13) to evaluate your model, rather than the 15 training images provided at https://vision.middlebury.edu/stereo/submit3/, or maybe both (15 + 13)?

The 15 training images are used.

feihuzhang commented 4 years ago

Hi,

I'm a little confused about the D1-all metric.

D1_all is defined as mean(error >= max(3, 0.05 * GT)), but it is computed as mean(error > 3) at: https://github.com/feihuzhang/DSMNet/blob/d7124e07840b073550810a5c45d1a5181db6b728/evaluation.py#L202

We use a 3-pixel threshold for KITTI, a 2-pixel threshold for Middlebury, and a 1-pixel threshold for ETH3D.
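So the Table 3 protocol, as described here, reduces to an absolute-threshold bad-pixel rate with a per-dataset threshold. A hedged sketch (the dataset keys and helper name are mine, not from the repository):

```python
import numpy as np

# Thresholds as stated above; the keys are illustrative.
THRESHOLDS = {"kitti": 3.0, "middlebury": 2.0, "eth3d": 1.0}

def bad_pixel_rate(disp_est, disp_gt, dataset):
    """Fraction of valid pixels whose absolute disparity error exceeds
    the dataset-specific threshold (3 / 2 / 1 px)."""
    valid = disp_gt > 0
    err = np.abs(disp_est - disp_gt)[valid]
    return np.mean(err > THRESHOLDS[dataset])
```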

feihuzhang commented 4 years ago

@RaoHaocheng I don't know how you retrained GWC-Net. But the ETH3D dataset is grayscale, while all the models are trained on RGB color images. The evaluation tests color-to-gray generalization.

ZhiboRao commented 4 years ago

@RaoHaocheng I don't know how you retrained GWC-Net. But the ETH3D dataset is grayscale, while all the models are trained on RGB color images. The evaluation tests color-to-gray generalization.

@feihuzhang We didn't retrain any model, including GWC-Net or GA-Net. We only used the released source code and pre-trained models (downloaded from GitHub) to get the ETH3D results.

We know that the ETH3D dataset is grayscale and that the images can be used as r = g = b. We find that the models are not sensitive to ETH3D's gray style. If you rerun your GA-Net code on the ETH3D training set, you will get the same results as the screenshots we provided.
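For context, feeding ETH3D's grayscale images to an RGB-trained network usually just means replicating the single channel three times. A minimal sketch of that preprocessing step (assuming a PIL/NumPy loading path, not the repository's data loader):

```python
import numpy as np
from PIL import Image

def load_as_rgb(path):
    """Load an image; if it is single-channel (as in ETH3D),
    replicate it so that r = g = b before normalization."""
    img = np.asarray(Image.open(path))
    if img.ndim == 2:                       # grayscale -> 3 identical channels
        img = np.stack([img, img, img], axis=-1)
    return img
```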

We only added ETH3D file lists for GA-Net, then changed the data path in predict.py, and finally used png2pfm to produce the final results for uploading. (The model is the one from your Google Drive.) (screenshot attached)
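For anyone reproducing this step, the PNG-to-PFM conversion can be done in a few lines. A sketch of a little-endian PFM writer (the function name is mine, not the png2pfm script mentioned above):

```python
import numpy as np

def write_pfm(path, disparity):
    """Write a float32 disparity map as a single-channel PFM file
    (the format ETH3D expects). PFM stores rows bottom-to-top and
    uses a negative scale factor to indicate little-endian data."""
    data = np.flipud(disparity.astype(np.float32))
    with open(path, "wb") as f:
        f.write(b"Pf\n")
        f.write(f"{data.shape[1]} {data.shape[0]}\n".encode("ascii"))
        f.write(b"-1.0\n")
        f.write(data.tobytes())
```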

We did the same for GWC-Net (only changing the data loader).

Not only GA-Net and GWC-Net: my own junk network (NLCA-Net, based on GC-Net) also works on ETH3D, getting almost the same result as GWC-Net, though worse than GA-Net.

We think the models are not sensitive to ETH3D's gray style.

Sarah20187 commented 2 years ago

@feihuzhang Same here. I have computed the 3-pixel error on KITTI 2015 with the pre-trained GWC-Net provided by its author, and it is around 10, rather than 22.7. Could you help me with that? Is it possible to provide the test code?
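In case it helps with such comparisons, here is a minimal sketch of a 3-pixel error computation on one KITTI 2015 image, assuming both the prediction and the ground truth are stored in KITTI's 16-bit PNG format (disparity scaled by 256, 0 marking invalid pixels); paths and names are illustrative:

```python
import numpy as np
from PIL import Image

def kitti_3px_error(pred_path, gt_path):
    """3-pixel error for one KITTI 2015 image, assuming both files use
    KITTI's uint16 PNG encoding (disparity * 256, 0 = invalid)."""
    pred = np.asarray(Image.open(pred_path), dtype=np.float32) / 256.0
    gt = np.asarray(Image.open(gt_path), dtype=np.float32) / 256.0
    valid = gt > 0
    err = np.abs(pred - gt)[valid]
    return np.mean(err > 3.0)
```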