binli123 / dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
MIT License

Pre-trained Weights #21

Closed GeorgeBatch closed 2 years ago

GeorgeBatch commented 2 years ago

Hi Bin Li,

Thank you very much for such a thorough explanation of how everything works in the README! Also, thank you for making so many things public (representations, weights, and even some of the data).

Could you please explain which of the weights in the TCGA folder corresponds to which magnification? As I understand, one should be for the 5x and the other for the 20x magnification embedder network, but I do not understand which one is which.

Many thanks, George

binli123 commented 2 years ago

Those are all for high-mag. I uploaded low-mag weights to the same folder.

GeorgeBatch commented 2 years ago

Thank you very much! What's the difference between v0 and v1 models?

binli123 commented 2 years ago

They are the same model trained for different lengths of time using SimCLR (~3 days vs. ~2 weeks). You can monitor the loss via TensorBoard and stop training once the loss stops changing much. The recommended batch size is at least 512 in order to obtain a good embedder. For TCGA, you could also train the embedder without considering MIL first (using the slide label as the patch label) and then use the embedder to compute features for MIL. I have also added an option for using an ImageNet pre-trained CNN to compute features.
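As a rough illustration of "stop once the loss stops changing much", the check below compares the mean loss over the most recent epochs with the mean over the epochs just before them. This is only a sketch and not part of the repository: the function name `loss_has_plateaued`, the `window` size, and the `rel_tol` threshold are all hypothetical choices, and the loss values are assumed to be collected per epoch by your own training loop.

```python
def loss_has_plateaued(losses, window=5, rel_tol=0.01):
    """Return True when the mean loss over the last `window` epochs improved
    by less than `rel_tol` (relative) over the preceding `window` epochs.

    `losses` is a list of per-epoch loss values (hypothetical input; how
    they are logged is up to the training script).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(losses) < 2 * window:
        # Not enough history yet to compare two windows.
        return False
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous == 0:
        # Nothing left to improve on.
        return True
    relative_improvement = (previous - recent) / abs(previous)
    return relative_improvement < rel_tol


if __name__ == "__main__":
    still_improving = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
    flat = [5.0] * 10
    print(loss_has_plateaued(still_improving))
    print(loss_has_plateaued(flat))
```

Averaging over a window rather than comparing single epochs makes the check less sensitive to epoch-to-epoch noise; the threshold would still need tuning for a contrastive loss like the one SimCLR uses.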

GeorgeBatch commented 2 years ago

Thank you very much for the explanation! Just to clarify, was v0 trained for 3 days, and v1 - for 2 weeks?

binli123 commented 2 years ago

Hi George,

I remember that v0 led to worse accuracy, so v0 should be the one with the shorter training time. Also, I recently checked, and the patches for TCGA seem to have been cropped from 10x / 2.5x instead of 20x / 5x, so that was possibly a misstatement in the paper. I still need to confirm the actual magnifications: Camelyon16 seems to have been scanned with a different type of scanner, while TCGA mostly consists of 40x Aperio scans, although some slides seem to come from other scanner types with different magnification levels. I have uploaded the patches I cropped for the experiments in case you need them: https://drive.google.com/file/d/17zCn-WRNzxxxh8kkdBTbDLDZy0XZ3RIu/view?usp=sharing . You could place these patches in the WSI folder and start with compute_feats and so on.

Best, Bin

GeorgeBatch commented 2 years ago

Hi Bin,

Thank you very much! Please let me know when you check the magnifications you used. I think 5x and 20x would make more sense for the lung histology images.

Thank you very much for sharing the files. What magnification of the cropped patches have you uploaded to the Google Drive folder?

Many thanks, George

binli123 commented 2 years ago

Those are exactly the patches I used for the experiments in the paper. I thought they were 20x / 5x, but when I recently compared them with the 20x view in a slide viewer, they appear to be 10x / 2.5x (not 1.25x).
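For readers trying to reason about the magnification pairs discussed here, the arithmetic relating a target magnification to a slide's base magnification can be sketched as follows. This is an illustrative helper, not code from the repository: the function names are hypothetical, the 40x base is taken from the remark above that TCGA mostly consists of 40x scans, and the 224-pixel patch size is an assumption chosen only for the example.

```python
def downsample_factor(base_magnification, target_magnification):
    """How many base-level pixels map to one pixel at the target magnification,
    along each axis."""
    if base_magnification <= 0 or target_magnification <= 0:
        raise ValueError("magnifications must be positive")
    if target_magnification > base_magnification:
        raise ValueError("target magnification cannot exceed the base magnification")
    return base_magnification / target_magnification


def source_patch_size(patch_size, base_magnification, target_magnification):
    """Side length, in base-level pixels, of the region covered by one patch
    of `patch_size` pixels at the target magnification."""
    factor = downsample_factor(base_magnification, target_magnification)
    return round(patch_size * factor)


if __name__ == "__main__":
    base = 40.0      # assumption: slide scanned at 40x
    patch = 224      # assumption: illustrative patch size in pixels
    for target in (20.0, 10.0, 5.0, 2.5):
        print(target, downsample_factor(base, target),
              source_patch_size(patch, base, target))
```

Under these assumptions, a patch at 10x covers twice the side length of tissue that a same-sized patch at 20x does, which is why comparing a patch against the 20x view in a slide viewer can reveal which magnification it was actually cropped from.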

GeorgeBatch commented 2 years ago

Thank you for both the clarification and for sharing the patches!