binli123 / dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
MIT License

Train/Test Split of Camelyon16 #22

Closed by gokberkgul 3 years ago

gokberkgul commented 3 years ago

Hi,

In the code, I noticed that you split the Camelyon16 dataset as follows:

train_path = bags_path.iloc[0:int(len(bags_path)*(1-args.split)), :]
test_path = bags_path.iloc[int(len(bags_path)*(1-args.split)):, :]

With the default split argument of 0.2, this gives 320 training and 80 test WSIs. Were the AUC and accuracy results reported in the paper computed with this split, or with the standard Camelyon16 split of 270 training/130 test WSIs? If you used 320 training slides, was the SimCLR part also trained on 320 slides, or on 270?
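To make the arithmetic concrete, here is a minimal sketch of the slicing quoted above. The 400-row table and its column names are hypothetical stand-ins for the real bag list, and `fractional_split` is an illustrative helper, not a function from the repository.

```python
import pandas as pd


def fractional_split(bags_path: pd.DataFrame, split: float):
    """Put the first (1 - split) share of rows in train and the rest in test."""
    cut = int(len(bags_path) * (1 - split))
    return bags_path.iloc[:cut, :], bags_path.iloc[cut:, :]


# Hypothetical stand-in for the bag list: 400 rows, one per WSI.
bags_path = pd.DataFrame(
    {"path": [f"slide_{i:03d}" for i in range(400)], "label": [0] * 400}
)

train_path, test_path = fractional_split(bags_path, split=0.2)
print(len(train_path), len(test_path))
```

Because the cut is positional, which slides land in the test set depends entirely on the row order of the table, not on the official Camelyon16 train/test assignment.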

binli123 commented 3 years ago

The results in the paper are based on the 270 training/130 test split, and the SimCLR embedder was trained on the 270 training slides, so the accuracy reported in the paper is a few percent lower. The uploaded embedder weights were also trained on the 270 training slides.
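For anyone who wants to evaluate on the official split instead of the fractional one, one possible approach is to split by slide name. This is only a sketch under assumptions: it presumes the bag list keeps the Camelyon16 file naming (training slides named `normal_*` or `tumor_*`, test slides named `test_*`) and that the slide path is stored in a column called `path`. Both the column name and the `official_split` helper are hypothetical and not part of the repository.

```python
import pandas as pd


def official_split(bags_path: pd.DataFrame, path_col: str = "path"):
    """Split rows by slide-name prefix rather than by position.

    Assumes (hypothetically) that `path_col` holds a path whose last
    component starts with "test_" for official test slides.
    """
    names = bags_path[path_col].astype(str).str.rsplit("/", n=1).str[-1]
    is_test = names.str.startswith("test_")
    return bags_path[~is_test], bags_path[is_test]


# Tiny illustrative table; real paths and labels will differ.
bags = pd.DataFrame(
    {
        "path": [
            "datasets/Camelyon16/normal_001.csv",
            "datasets/Camelyon16/tumor_001.csv",
            "datasets/Camelyon16/test_001.csv",
        ],
        "label": [0, 1, 1],
    }
)

train_bags, test_bags = official_split(bags)
print(len(train_bags), len(test_bags))
```

Unlike the positional slice, this does not depend on the row order of the table, so shuffling the bag list does not move slides between train and test.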