sshan-zhao / ACMNet

Adaptive Context-Aware Multi-Modal Network for Depth Completion

The difference between the final version and the arXiv version #4

Closed · graycrown closed this issue 3 years ago

graycrown commented 3 years ago

I noticed that the previous version of your work achieved RMSE=744 on the KITTI online leaderboard, but the new submission got RMSE=732. Can you tell us the key changes you made to improve the performance, e.g., adjusting the network structure, using some training tricks, or applying some data pre-/post-processing methods?

Looking forward to your reply. Thanks!

sshan-zhao commented 3 years ago

> I noticed that the previous version of your work achieved RMSE=744 on the KITTI online leaderboard, but the new submission got RMSE=732. Can you tell us the key changes you made to improve the performance?

Hi, you can find the reason in run_test.sh and run_eval.sh. In the new submission, I use a flipping operation at test time to improve the performance, inspired by GuideNet. The details can be found at https://github.com/sshan-zhao/ACMNet/blob/71dd7b2ce5e937c3299c45a158e8369deb152046/models/test_model.py#L41. In short, this operation is a trick to improve performance.
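For readers unfamiliar with the trick: this is horizontal-flip test-time augmentation, i.e., predict on the input and on its mirrored copy, un-flip the second prediction, and average the two depth maps. A minimal PyTorch sketch, assuming a model that takes an RGB image and a sparse depth map (the function name predict_with_flip and the two-argument call signature are illustrative; the repository's network also takes the camera intrinsics):

```python
import torch

@torch.no_grad()
def predict_with_flip(model, img, sparse):
    # Prediction on the original input (NCHW tensors; dim 3 is width).
    pred = model(img, sparse)
    # Prediction on the horizontally mirrored input, flipped back
    # to the original orientation.
    pred_flipped = model(img.flip(3), sparse.flip(3)).flip(3)
    # Average the two depth maps.
    return 0.5 * (pred + pred_flipped)
```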

graycrown commented 3 years ago

> Hi, you can find the reason in run_test.sh and run_eval.sh. In the new submission, I use a flipping operation at test time to improve the performance, inspired by GuideNet. The details can be found at https://github.com/sshan-zhao/ACMNet/blob/71dd7b2ce5e937c3299c45a158e8369deb152046/models/test_model.py#L41.

After reading the source code, I found that the flip operation is applied randomly in the testing and validation stages, so isn't the prediction result unstable? In my opinion, the flip operation should be fixed: all inputs should be flipped, or none of them, rather than flipping only some of the inputs.

sshan-zhao commented 3 years ago

Actually, it is not done randomly. Instead, you can set --flip_input to achieve it. Can you point out the related code?


graycrown commented 3 years ago

After re-reading your source code, I found that training and testing share the same data-augmentation code for the flip operation:

```python
if self.flip:
    flip_prob = random.random()
else:
    flip_prob = 0.0

if img is not None:
    img = img.astype(np.float32)
    if flip_prob >= 0.5:
        img = img[:, ::-1, :]
    img = img / 255.0
    img = np.transpose(img, (2, 0, 1))

if flip_prob >= 0.5:
    s_depth = s_depth[:, ::-1]
    if gt_depth is not None:
        gt_depth = gt_depth[:, ::-1]
```

So when flip_input = True is set, all inputs will be randomly flipped. If I did not understand your code correctly, please let me know.

sshan-zhao commented 3 years ago

> After re-reading your source code, I found that training and testing share the same data-augmentation code for the flip operation. [...] So when flip_input = True is set, all inputs will be randomly flipped. If I did not understand your code correctly, please let me know.

First, I already pointed out the code showing how to use --flip_input in my first answer. Second, the flip augmentation is activated automatically during training only (you can disable it with --no_flip), and we do not use data augmentation during inference; see https://github.com/sshan-zhao/ACMNet/blob/master/data/__init__.py.
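To make the phase dependence concrete, here is a minimal sketch of that gating, mirroring the dataloader code quoted above (the function name maybe_flip and the arguments training and no_flip are illustrative assumptions, not the repository's exact interface):

```python
import random

def maybe_flip(img, s_depth, gt_depth, training, no_flip):
    # Draw one probability per sample so that the RGB image, the sparse
    # depth and the ground truth are flipped together or not at all.
    flip_prob = random.random() if (training and not no_flip) else 0.0
    if flip_prob >= 0.5:
        img = img[:, ::-1, :]        # flip the width axis (HWC layout)
        s_depth = s_depth[:, ::-1]   # flip the width axis (HW layout)
        if gt_depth is not None:
            gt_depth = gt_depth[:, ::-1]
    return img, s_depth, gt_depth
```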

graycrown commented 3 years ago

> First, I already pointed out the code showing how to use --flip_input in my first answer. Second, the flip augmentation is activated automatically during training only (you can disable it with --no_flip), and we do not use data augmentation during inference; see https://github.com/sshan-zhao/ACMNet/blob/master/data/__init__.py.

Sorry, my fault. I misunderstood the meaning of "True" in the data-augmentation transform RandomImgAugment(True, True, Image.BICUBIC): there, "True" actually means "no flip in the dataloader". The flip operation is instead performed in the forward pass of models/test_model.py:

```python
if self.opt.flip_input:
    # According to https://github.com/kakaxi314/GuideNet,
    # this operation might be helpful to reduce the error greatly.
    input_s = torch.cat([self.sparse, self.sparse.flip(3)], 0)
    input_i = torch.cat([self.img, self.img.flip(3)], 0)
    input_K = torch.cat([self.K, self.K], 0)
```

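For completeness, the counterpart step after this concatenation (not shown in the snippet) would be to split the doubled batch, un-flip the mirrored half, and average. A hedged sketch, assuming the network returns a [2N, 1, H, W] depth tensor and using illustrative variable names:

```python
out = model(input_i, input_s, input_K)      # assumed output shape: [2N, 1, H, W]
pred, pred_mirror = out.chunk(2, dim=0)     # split original / mirrored halves
depth = 0.5 * (pred + pred_mirror.flip(3))  # un-flip the mirrored half, average
```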

Anyway, thanks for your patience. I will try this trick in our work.