Open yangsenwxy opened 2 years ago
https://drive.google.com/drive/folders/1_mumfTU3GJRtjfcJK_M0fWm048sYYFqi
There are several model weights trained using only the training data. I also tested SimCLR pre-training on both the training set and the testing set; the difference in the results is minor. What batch size did you use? Please make sure the batch size is at least 512 and that you train for enough iterations to get a useful embedder from SimCLR, as pointed out in their paper. A bigger batch size and longer training lead to a better embedder, and both have a large impact on the performance of the downstream task. The best embedder we obtained was trained for 2 months because of the large number of patches.
Also, we are not the only ones who have had success with self-supervised learning on Camelyon16: https://arxiv.org/pdf/2012.03583.pdf also shows that very high results can be obtained.
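To illustrate why batch size matters for SimCLR, here is a minimal pure-Python sketch of the NT-Xent contrastive loss. It is not the training code used for the weights above; the function names and the temperature default of 0.5 are assumptions chosen for illustration. With a batch of N patches there are 2N augmented views, so each view is contrasted against 2(N - 1) negatives, and a larger batch gives every anchor more negatives to separate from.

```python
import math


def negatives_per_anchor(batch_size: int) -> int:
    """Number of in-batch negatives each augmented view sees in SimCLR."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # 2N views per batch, minus the anchor itself and its positive partner.
    return 2 * (batch_size - 1)


def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss for N positive pairs (z1[i], z2[i]).

    z1 and z2 are lists of non-zero embedding vectors (lists of floats).
    The temperature default is an illustrative assumption.
    """
    if len(z1) != len(z2):
        raise ValueError("z1 and z2 must contain the same number of vectors")

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    n = len(z1)
    views = [normalize(v) for v in z1] + [normalize(v) for v in z2]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # the other view of the same patch
        # Cosine similarity of the anchor to every view, scaled by temperature.
        sims = [
            sum(a * b for a, b in zip(views[i], views[k])) / temperature
            for k in range(2 * n)
        ]
        # The denominator covers the positive and all negatives, not the anchor.
        denom = sum(math.exp(sims[k]) for k in range(2 * n) if k != i)
        total += -(sims[pos] - math.log(denom))
    return total / (2 * n)
```

For example, `negatives_per_anchor(512)` gives 1022 negatives per view, while a batch of 32 gives only 62.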
Thank you very much. I found that the features you extracted reach only 0.86 when I train directly with the CLAM method.
Hi, are those weights trained using tcga data?
@raycaohmu
Camelyon16 weights: https://drive.google.com/drive/folders/1_mumfTU3GJRtjfcJK_M0fWm048sYYFqi
- see folder names for magnifications
TCGA-lung weights: https://drive.google.com/drive/folders/1Rn_VpgM82VEfnjiVjDbObbBFHvs0V1OE
- magnification: low=2.5x, high=10x
- pre-training: v0 for 3 days, v1 for 2 weeks (better results)
Hi @GeorgeBatch,
I have seen the previous discussion on the magnification change for TCGA-lung patches. Could I please verify that when the above pre-trained model is specified as,
- magnification: low=2.5x, high=10x
this is only for 20x patches of the whole dataset? (so the pre-trained model is trained on 20x,5x (for 40x images) and 10x,2.5x (for 20x images))
Many thanks in advance. Piumi.
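The arithmetic behind this question can be sketched as follows. It assumes, as the question does but without confirmation from the maintainers, that patches are taken at fixed downsample factors of 2 and 8 relative to each slide's base scan magnification; the function name and the default factors are illustrative only.

```python
def magnifications_at_factors(base_mag: float, factors=(2, 8)):
    """Effective magnifications when a slide scanned at base_mag is
    downsampled by each factor. The default factors are an assumption
    inferred from the numbers in the question (40/20 = 2, 40/5 = 8)."""
    if base_mag <= 0:
        raise ValueError("base_mag must be positive")
    if any(f <= 0 for f in factors):
        raise ValueError("downsample factors must be positive")
    return tuple(base_mag / f for f in factors)


# Slides scanned at 40x would give (high, low) = (20x, 5x).
print(magnifications_at_factors(40.0))
# Slides scanned at 20x would give (high, low) = (10x, 2.5x).
print(magnifications_at_factors(20.0))
```

Under this assumption, a single pre-trained model labelled "low=2.5x, high=10x" would match only the slides scanned at 20x, which is what the question asks to confirm.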
Hi @PiumiDS,
> this is only for 20x patches of the whole dataset? (so the pre-trained model is trained on 20x,5x (for 40x images) and 10x,2.5x (for 20x images))
I am afraid I do not know the answer to your question myself, so we will both need to wait for @binli123's answer.
George
I have a question: does your SimCLR pre-training include all of the Camelyon16 data (training set and test set)? I believe your feature extractor is faulty because it leaks information from the test set. When I pre-trained on the training set only, I could not reach results this high. I think you should check this problem carefully, since it may be why your results are so high.
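One way to rule out this kind of leakage is to filter the patch list by slide before SimCLR pre-training. The sketch below is a hedged example, not the repository's pipeline: it assumes patches are stored as `<root>/<slide_id>/<patch file>` (a hypothetical layout) and that Camelyon16 test slides are identified by a `test_` prefix in the slide name.

```python
from pathlib import PurePosixPath

# Assumption: Camelyon16 test slides are named test_XXX, while training
# slides are named normal_XXX or tumor_XXX.
TEST_PREFIX = "test_"


def slide_id(patch_path: str) -> str:
    """Slide name taken from the patch's parent folder (hypothetical layout)."""
    return PurePosixPath(patch_path).parent.name


def training_only(patch_paths):
    """Keep only patches whose slide is not a test slide."""
    return [p for p in patch_paths if not slide_id(p).startswith(TEST_PREFIX)]


patches = [
    "patches/normal_001/0_0.jpeg",
    "patches/tumor_012/3_7.jpeg",
    "patches/test_040/1_2.jpeg",
]
print(training_only(patches))
```

Comparing an embedder pre-trained on this filtered list with one pre-trained on all patches shows directly how much, if at all, the test slides contribute to the downstream result.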