askerlee / segtran

Medical Image Segmentation using Squeeze-and-Expansion Transformers

Different results on zero-shot learning #40

Closed: boqchen closed this issue 2 years ago

boqchen commented 2 years ago

Hi. Thx for the great work. I was trying to reproduce the baseline. I used REFUGE as the source domain and trained on the REFUGE train and validation splits, then tested this model on RIM-ONE w/o any adaptation. To simplify the task I only did disk segmentation, i.e., I treated both cup and disk as the disk. In this setting, the upper bound I got on RIM-ONE (i.e., trained and tested on RIM-ONE) is ~0.89 Dice, and the lower bound (trained on REFUGE, tested on RIM-ONE) is only ~0.55. Compared with the results in the paper, this is a big gap: the upper bound is lower than the few-shot learning results, and the lower bound is much lower than the zero-shot learning results. I was wondering if you could provide more detail on data preprocessing, training settings, etc.
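For reference, the "cup and disk both count as disk" setup above amounts to binarizing the label map before scoring. A minimal numpy sketch (the label convention 0 = background, 1 = disk, 2 = cup is an assumption, not taken from this repo):

```python
import numpy as np

def merge_cup_into_disk(mask):
    # Assumed labels: 0 = background, 1 = optic disk, 2 = optic cup.
    # Treating the cup as part of the disk yields one binary disk mask.
    return (mask > 0).astype(np.uint8)

def dice(pred, gt, eps=1e-6):
    # Standard Dice coefficient on binary masks: 2|P ∩ G| / (|P| + |G|).
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)
```

With this, a prediction that only misses part of the disk rim still scores high, so the ~0.55 zero-shot Dice really does indicate a large domain gap rather than a scoring artifact.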

askerlee commented 2 years ago

I actually observed something similar: sometimes, with domain adaptation, the results are better than training on the target domain directly. I think the reason is that the source domain (here REFUGE) contains many more images than the target domain (RIM-ONE), so the model learns many more variations of fundus images (regardless of domain). But due to some statistical differences (e.g., mismatched batchnorm statistics), performance on the target domain drops sharply. In that case, few-shot adaptation can be very effective and push performance beyond training on the target domain (RIM-ONE) alone. However, this is not always what I observed; sometimes training on the target domain does perform better. I didn't report these inconsistent results in the paper due to length limitations (they would need many more experiments and explanation to address).
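One cheap way to probe the batchnorm mismatch mentioned above is AdaBN-style adaptation: keep all learned weights frozen and only re-estimate the BN running mean/variance on unlabeled target-domain images. A minimal numpy sketch of the statistics update (this helper and its names are illustrative, not part of this repo; in PyTorch you would instead run forward passes in `train()` mode with gradients disabled):

```python
import numpy as np

def adapt_bn_stats(bn_stats, target_batches, momentum=0.1):
    # bn_stats: dict with per-channel "running_mean" and "running_var"
    # (each shaped [C]); target_batches: iterable of arrays [N, C, H, W].
    # Replays the exponential-moving-average update BN performs in train
    # mode, but using target-domain batches only.
    mean = bn_stats["running_mean"].astype(np.float64).copy()
    var = bn_stats["running_var"].astype(np.float64).copy()
    for x in target_batches:
        batch_mean = x.mean(axis=(0, 2, 3))   # per-channel mean
        batch_var = x.var(axis=(0, 2, 3))     # per-channel variance
        mean = (1 - momentum) * mean + momentum * batch_mean
        var = (1 - momentum) * var + momentum * batch_var
    return {"running_mean": mean, "running_var": var}
```

If zero-shot Dice improves noticeably after only the BN statistics are refreshed, that supports the explanation that the drop comes from statistics mismatch rather than from the learned features themselves.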

boqchen commented 2 years ago

Thx for the insights. It helps a lot. 😊