Closed graycrown closed 3 years ago
I noticed that the previous version of your work achieved RMSE=744 on the KITTI online leaderboard, but the new submission got RMSE=732. Can you tell us the key changes you made to improve the performance? E.g., adjusting the network structure, using some training tricks, or some data pre-/post-processing methods.
Looking forward to your reply. Thanks.
Hi, you can find the reason in run_test.sh and run_eval.sh. In the new submission, I use a flipping operation to improve the performance, inspired by GuideNet. The details can be found in https://github.com/sshan-zhao/ACMNet/blob/71dd7b2ce5e937c3299c45a158e8369deb152046/models/test_model.py#L41. In fact, this operation is just a trick to improve the performance.
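For reference, this is the standard test-time flip-augmentation idea: predict on the input and on its horizontally flipped copy, flip the second prediction back, and average the two. A minimal sketch, assuming a generic `model(img, sparse)` callable (illustrative names, not the exact ACMNet code):

```python
import torch

def predict_with_flip(model, img, sparse):
    # img, sparse: (N, C, H, W) tensors; dim 3 is the width axis
    pred = model(img, sparse)                       # prediction on the original input
    pred_flip = model(img.flip(3), sparse.flip(3))  # prediction on the flipped input
    return 0.5 * (pred + pred_flip.flip(3))         # flip back, then average
```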
After reading the source code, I found that the flip operation is applied randomly in the testing and validation stages, so isn't the prediction result unstable? In my opinion, the flip operation should be deterministic. In other words, all inputs should be flipped, or none of them, rather than flipping only some inputs.
Actually, it is not done randomly. Instead, you can set `--flip_input` to achieve it. Can you point out the related code?
After re-reading your source code, I found that training and testing share the same data-augmentation code, located at the flip operation:
```python
if self.flip:
    flip_prob = random.random()
else:
    flip_prob = 0.0

if img is not None:
    img = img.astype(np.float32)
    if flip_prob >= 0.5:
        img = img[:, ::-1, :]
    img = img / 255.0
    img = np.transpose(img, (2, 0, 1))

if flip_prob >= 0.5:
    s_depth = s_depth[:, ::-1]
    if gt_depth is not None:
        gt_depth = gt_depth[:, ::-1]
```
So when you set `flip_input = True`, all inputs will be flipped randomly. If I did not understand your code correctly, please let me know.
Firstly, I have already pointed out how to use `--flip_input` in my first answer. Secondly, the data augmentation is activated automatically during training (you can disable it with `--no_flip`), and we do not use data augmentation during inference; see https://github.com/sshan-zhao/ACMNet/blob/master/data/__init__.py.
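To illustrate the split, here is a minimal sketch of how such a flip switch is typically wired (the `KittiDataset` and `no_flip` names here are hypothetical, not the actual ACMNet classes):

```python
import random

class KittiDataset:
    def __init__(self, phase, no_flip=False):
        # Random horizontal flip is a training-time augmentation only:
        # on by default for training (disable with --no_flip),
        # while validation/test data is never flipped by the loader.
        self.flip = (phase == 'train') and not no_flip

    def flip_this_sample(self):
        # hypothetical helper: decide per sample whether to flip
        return self.flip and random.random() >= 0.5

train_set = KittiDataset('train')  # augmentation active
test_set = KittiDataset('test')    # no flipping at inference
```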
Sorry, my fault. I misunderstood the meaning of "True" in the data augmentation call `RandomImgAugment(True, True, Image.BICUBIC)`. Actually, "True" there means "no flip in the dataloader"; the flip operation is instead performed at
```python
if self.opt.flip_input:
    # according to https://github.com/kakaxi314/GuideNet,
    # this operation might be helpful to reduce the error greatly
    input_s = torch.cat([self.sparse, self.sparse.flip(3)], 0)
    input_i = torch.cat([self.img, self.img.flip(3)], 0)
    input_K = torch.cat([self.K, self.K], 0)
```
in the forward pass.
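If I read it correctly, the doubled batch then goes through the network once, and afterwards the flipped half is flipped back and averaged with the original half. A sketch of that step (assuming the network output `pred` has shape (2N, 1, H, W), matching the concatenated inputs):

```python
# pred: network output on the concatenated batch, shape (2N, 1, H, W)
n = pred.shape[0] // 2
pred_orig, pred_flip = pred[:n], pred[n:]
# undo the horizontal flip on the second half, then average the two estimates
pred_final = 0.5 * (pred_orig + pred_flip.flip(3))
```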
Anyway, thanks for your patience. I will try this trick in our work.