binli123 / dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
MIT License

SimCLR training vs test set configuration #39

Open Bontempogianpaolo1 opened 2 years ago

Bontempogianpaolo1 commented 2 years ago

Hi @binli123 ,

I'm trying to replicate your results on Camelyon16, without success. I set the number of classes to 1 and also tried the published weights for computing the features on both the training and test sets. Even with that I still obtain only about 0.70 AUC. So I started thinking about how my data organization differs from yours. I downloaded the data from https://ftp.cngb.org/pub/gigadb/pub/10.5524/100001_101000/100439/CAMELYON16/ where the data is divided into training and test sets. I used a threshold of 25 for filtering out background, and I used only the training set for training the self-supervised model. After that, even with the model you published on Drive, I extracted features with the compute_feats script for both training and test sets (especially with the fusion option). Finally, I modified train_tcga to use them as the sources for the training and test sets (270/130 bags).

If, instead, I use the features you precomputed, the MIL model works. So the problem could be in how I split the data or how I extract the embeddings. What am I missing?

binli123 commented 2 years ago

Could you check out the CSV files containing the features and labels?

Bontempogianpaolo1 commented 2 years ago

The CSV seems correct... Here are some screenshots of the embeddings extracted using your pretrained model model_v2.pth, found at https://drive.google.com/drive/folders/1_mumfTU3GJRtjfcJK_M0fWm048sYYFqi, on patches extracted with a threshold of 19:

camelyon.csv (screenshot)

normal143.csv (screenshot)

However, comparing your features with mine, the number of rows is different. So is it possible that the number of patches is influencing the results? Here are the patch counts for 5 slides at different background thresholds:

| Slide name | th=19 | th=25 | your features |
| --- | --- | --- | --- |
| tumor_108 | 29905 | 402 | 23263 |
| test_124 | 6693 | 3001 | 2402 |
| tumor_095 | 39960 | 1002 | 31791 |
| normal_137 | 33396 | 505 | 23443 |
| tumor_076 | 61670 | 42057 | 19708 |
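Counts like these can be gathered without opening each folder by hand. A minimal stdlib sketch, assuming the tiler writes one sub-directory of patch files per slide (the layout and the `.jpeg` extension are assumptions; adjust the glob to your tiler's actual output):

```python
from pathlib import Path

def patch_counts(patch_root: str, ext: str = "jpeg") -> dict:
    """Count patch files per slide, assuming one sub-directory per slide."""
    root = Path(patch_root)
    return {
        d.name: sum(1 for _ in d.glob(f"*.{ext}"))
        for d in sorted(root.iterdir())
        if d.is_dir()
    }
```

Running this on the output folders for each threshold makes mismatches like the table above easy to spot.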

Maybe the image quality is not correct for your embedder? Here is an example of a patch extracted at level=0, magnification=20:

(screenshots of example patches)

With this configuration the MIL training remains under 0.70 AUC. Thanks in advance for your reply.

binli123 commented 2 years ago


The feature values look strange. There are some abnormal values > 10. Did you use BatchNorm or InstanceNorm consistently between training and feature computation?
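Out-of-range embeddings like this can be flagged quickly before training. A small sketch (a hypothetical helper, not part of the repo; it assumes the feature CSV contains plain comma-separated numbers and skips any non-numeric cells such as file names):

```python
import csv

def max_abs_value(csv_path: str, skip_header: bool = False) -> float:
    """Return the largest absolute value among numeric cells in a CSV."""
    worst = 0.0
    with open(csv_path, newline="") as f:
        rows = csv.reader(f)
        if skip_header:
            next(rows, None)
        for row in rows:
            for cell in row:
                try:
                    worst = max(worst, abs(float(cell)))
                except ValueError:
                    pass  # ignore non-numeric cells
    return worst
```

If this returns values well above 10 for your feature files but not for the published ones, the embedder forward pass (e.g. the normalization layers) is the first place to look.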

Bontempogianpaolo1 commented 2 years ago

I took your embedder directly, without training, and passed it to the compute_feats script with InstanceNorm2d, since it is the default parameter.

binli123 commented 2 years ago

model_v2.pth

Have you tried model_v0.pth and model_v1.pth? Did they also not work?

Bontempogianpaolo1 commented 2 years ago

Not yet... I considered the v2 model to be the best one.

Bontempogianpaolo1 commented 2 years ago

Screenshot of features using model-v0:

(screenshot)

Screenshot of features using model-v1:

(screenshot)
binli123 commented 2 years ago


Those are very different from mine. There should not be values > 10; they should all be around the same scale. If you are using a newer GPU card, please make sure CUDA >= 11.0, not 10.2.

Bontempogianpaolo1 commented 2 years ago

Sorry, Excel made some errors during the visualization. The real screenshots are these:

model-v0:

(screenshot)

model-v1:

(screenshot)

So all the numbers seem to be on the same scale.

binli123 commented 2 years ago

Does your normal_141_42_54.jpg look like the attached one? My feature CSV using v2: normal_141.csv (attached)

Bontempogianpaolo1 commented 2 years ago

I don't have it... What parameters did you use for the deepzoom_tiler.py script in the case of Camelyon?

This is my normal_141 48_112.jpeg:

(screenshot)

And this is my tumor_047 101_546.jpeg:

(screenshot)

Mine seems to have a higher magnification, maybe?

binli123 commented 2 years ago

It turns out that Camelyon16 consists of mixed magnifications, so after experimenting, the correct configuration is: `python deepzoom_tiler.py -m 1 -b 20 -d Camelyon16-pilot -v tif`
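As a sanity check on what that configuration produces: each deep-zoom level below the base downsamples by a factor of 2, so the effective magnification of the extracted patches can be sketched as (a back-of-the-envelope helper; the level-halving behavior is inferred from the thread, where `-b 20 -m 1` yields patches in a "10" folder):

```python
def effective_magnification(base_mag: float, level: int) -> float:
    """Magnification after `level` deep-zoom downsamples of the base
    (each level halves the resolution)."""
    return base_mag / (2 ** level)
```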

Bontempogianpaolo1 commented 2 years ago

In this way the magnification becomes 10x, right? Is your embedder trained at this magnification? Since it is inside the folder called x20, I didn't expect it.

binli123 commented 2 years ago


I think it is still 20x, because the base magnification has ~0.25 micron/pixel, which corresponds to 40x for the Aperio scanner (the FDA standard). A 20x magnification corresponds to ~0.5 micron/pixel. Camelyon16 uses a mixture of magnifications with different micron/pixel values.

(screenshot: table of the Camelyon16 scanners' micron/pixel resolutions)

Notice how their 20x and 40x scanners have almost the same micron/pixel? You would call RUMC's "20x" a "40x" image under UMCU's convention. So it is better to just use the FDA standard.
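Under that convention, nominal magnification scales inversely with microns-per-pixel. A minimal sketch of the conversion (the 10/mpp rule is just the anchor from above, 0.25 um/pixel ≈ 40x, extrapolated; real scanners only approximate it):

```python
def nominal_magnification(mpp_um: float) -> float:
    """Nominal objective power from resolution, anchored at
    0.25 um/pixel ~ 40x (Aperio/FDA convention)."""
    return 10.0 / mpp_um
```

So a slide reporting ~0.5 um/pixel is treated as 20x regardless of what the scanner vendor calls it.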

Bontempogianpaolo1 commented 2 years ago

Ok! I'm trying it now, and inside the "temp" folder the patches are stored in a "10" folder (I imagine it refers to the magnification). Anyway, thank you very much for your replies! I'll run the entire pipeline again with these new patches and report the results as soon as possible.

Bontempogianpaolo1 commented 2 years ago

It worked!! But I still have problems :( ... I'm opening a new issue for that, since it is not related to the dataset but to the embedder.