About the experiments in Sec. 4.2: how to balance the ISBI 2012 dataset

Coldmooon commented 7 years ago

Hello, I'd like to replicate the results of the experiments in Sec. 4.2 and cite TI-Pooling in my paper. However, I found that the ratio of white samples (label = 255) to black samples (label = 0) significantly affects the final classification accuracy. Could you describe in detail how you balanced the ISBI 2012 dataset? I'd appreciate it if you could provide the balancing script or the balanced version of the dataset used in the paper.

dlaptev commented 7 years ago

Hi @Coldmooon, thanks for your interest. Yes, you are right: this dataset is poorly balanced, and the training data needs to be resampled.

Unfortunately, I could not find the exact scripts I used for the data pre-processing. If I recall correctly, for training we take all the available membrane patches (label = 0) and randomly sample an equal number of non-membrane patches (label = 1). At the image boundaries we use mirroring.
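For anyone trying to reproduce this, the procedure described above could be sketched roughly as follows. This is not the original script (which was not found), only a minimal NumPy reconstruction under assumptions: the function name `sample_balanced_patches`, the patch size, the random seed, and the use of `np.pad` with `mode="reflect"` for the mirroring are all illustrative choices, not values from the paper.

```python
import numpy as np

def sample_balanced_patches(image, labels, patch_size=32, membrane_label=0, seed=0):
    """Hypothetical helper: one patch per selected pixel, classes balanced 1:1.

    image  : 2-D array, one slice of the stack
    labels : 2-D array of the same shape; membrane pixels equal membrane_label
    """
    rng = np.random.default_rng(seed)
    half = patch_size // 2
    # Mirror the image at its borders so that pixels near the edge
    # still get a full-sized patch (assumed form of the "mirroring").
    padded = np.pad(image, half, mode="reflect")

    membrane = np.argwhere(labels == membrane_label)
    other = np.argwhere(labels != membrane_label)

    # Keep every membrane pixel and draw the same number of non-membrane
    # pixels at random (with replacement only if there are too few).
    n = len(membrane)
    picked = other[rng.choice(len(other), size=n, replace=len(other) < n)]

    centers = np.concatenate([membrane, picked])
    # Targets follow the convention above: 0 = membrane, 1 = non-membrane.
    targets = np.concatenate([np.zeros(n, dtype=np.int64),
                              np.ones(n, dtype=np.int64)])

    # In padded coordinates the patch around original pixel (r, c)
    # starts at (r, c) and spans patch_size pixels in each direction.
    patches = np.stack([padded[r:r + patch_size, c:c + patch_size]
                        for r, c in centers])

    # Shuffle so the two classes are interleaved.
    order = rng.permutation(len(centers))
    return patches[order], targets[order]
```

Because the non-membrane set is selected with `labels != membrane_label`, the same sketch works whether the non-membrane pixels are stored as 1 or as 255 in the label images.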

One additional tip if you want to play with this dataset: the network topology we used is quite sub-optimal, as we basically just reused what we had for MNIST. I would at least try a more CIFAR-like architecture.

Coldmooon commented 7 years ago

Many thanks, that's clear enough.

