the accuracy of validation is much higher than accuracy of test

Project-MONAI / research-contributions

Implementations of recent research prototypes/demonstrations using MONAI.

https://monai.io/

Apache License 2.0

the accuracy of validation is much higher than accuracy of test #200

Open beihaiyou opened 1 year ago

beihaiyou commented 1 year ago

I modified get_loader() so that validation and testing use the same dataloader, but the accuracy is much lower during testing. Could you please tell me how to get the same high accuracy in testing as in validation? I use the same data for both.

mingxin-zheng commented 1 year ago

Hi @beihaiyou, the situation is not clear to me. What dataset and MONAI components are you using?

HaoWang111 commented 1 year ago

Hi @beihaiyou, I had the same problem when I was working on a left atrial segmentation task. Since this is single-organ segmentation, I chose out_channel = 2, one channel for foreground and one for background. Comparing the validation and testing code, I found that in the validation phase the reported Dice was the average of the foreground and background Dice, while in the testing phase I only calculated the foreground Dice. Background Dice is usually very high, so averaging it in inflates the validation number, which explained the large difference in accuracy.
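The effect described above can be reproduced on toy data. The following is a minimal sketch in plain Python, not the repository's evaluation code: the helpers `dice` and `mean_dice`, the `include_background` parameter of this helper, and the 20-voxel example are all assumptions made for illustration.

```python
def dice(pred, target, label):
    """Dice score of one class label over two flat label sequences."""
    p = [x == label for x in pred]
    t = [x == label for x in target]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    # Convention assumed here: a class absent from both inputs scores 1.0.
    return 2.0 * intersection / denom if denom else 1.0


def mean_dice(pred, target, num_classes=2, include_background=True):
    """Average per-class Dice, optionally skipping class 0 (background)."""
    first = 0 if include_background else 1
    scores = [dice(pred, target, c) for c in range(first, num_classes)]
    return sum(scores) / len(scores)


# Hypothetical 1-D "volume": 16 background voxels, 4 foreground voxels.
target = [0] * 16 + [1] * 4
# Prediction with the foreground shifted by two voxels.
pred = [0] * 14 + [1] * 4 + [0] * 2

with_bg = mean_dice(pred, target, include_background=True)
fg_only = mean_dice(pred, target, include_background=False)
print(with_bg)  # 0.6875 (background 0.875 averaged with foreground 0.5)
print(fg_only)  # 0.5
```

The same prediction scores 0.6875 or 0.5 depending only on whether the background class is averaged in, so the two phases need to use the same convention before their numbers can be compared.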

beihaiyou commented 1 year ago

Thank you so much, my problem is the same as yours. Additionally, I found that the validation accuracy function is called with the keyword "include_background=True", which may be the reason. If it is convenient, we could connect on WeChat to discuss further. My email is liangzh20172021@163.com

2259978541 commented 1 year ago

Hello, when I tested with the BTCV dataset and the pretrained weights you provided, the test accuracy only reached 0.77.

YeFengyu819 commented 3 months ago

Hello, I have the same problem: during testing, the Dice score is much lower than in validation. Changing "include_background=True" to "include_background=False" does improve things, but the test Dice is still about 10 points lower than the validation Dice. To rule out a problem with the data split itself, I made the test set identical to the validation set, and the gap is still there. I use the MRI liver data from the open-source Liver HCC segment dataset and set roi_x, roi_y, roi_z to 64, 64, 64. The dataset did not go through any preprocessing (such as resampling or resizing the 3D images) before training, validation, and testing. I am not sure whether the drop in Dice is caused by the sliding-window inference or by the change of roi_x. Do you have any idea? Thank you!
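When checking whether the sliding window plays a role, it can help to see how a window of size roi tiles one axis of a volume. The sketch below is a generic illustration, not MONAI's implementation: the helper `window_starts`, the overlap values, and the axis lengths are assumptions and do not come from the dataset above.

```python
def window_starts(length, roi, overlap=0.5):
    """Start indices of sliding windows of size `roi` along one axis.

    The last window is clamped so that it ends exactly at `length`.
    """
    if length <= roi:
        # The axis fits in a single window (padding would be needed if shorter).
        return [0]
    step = max(1, int(roi * (1 - overlap)))
    starts = list(range(0, length - roi, step))
    starts.append(length - roi)
    return starts


# Hypothetical axis of 160 voxels, roi of 64, 50% overlap.
starts = window_starts(160, 64, overlap=0.5)
print(starts)  # [0, 32, 64, 96]

# Every voxel on the axis is covered by at least one window.
covered = set()
for s in starts:
    covered.update(range(s, s + 64))
print(len(covered) == 160)  # True
```

If validation and testing use different roi sizes or overlaps, the windows land in different places and the stitched predictions can differ, so it may be worth confirming that both phases pass the same settings to the inference call.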