binli123 / dsmil-wsi

DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
MIT License
332 stars 84 forks

Requests for multiscale features #49

Closed Linhao-Qu closed 1 year ago

Linhao-Qu commented 1 year ago

Hello, thank you very much for releasing your source code. I have read your paper carefully and found it very inspiring. However, I would like to ask you to open-source (or send me) the multi-scale features of the TCGA and Camelyon16 data used in your paper; only a single-scale version of them is currently available in the repository. Thank you very much, and I look forward to your reply!

sky1652169058 commented 1 year ago

I hope for that too!

binli123 commented 1 year ago

https://drive.google.com/drive/folders/1wHyaZkpgVGSoxPpaCeUCAFGhxcS9ZZEA?usp=sharing

Linhao-Qu commented 1 year ago

Thanks a lot for your kind offer! Sorry to trouble you again, but could you please also share the 5x features of TCGA? Your work is truly inspiring to us, and making these features available would be a valuable contribution to the community.

binli123 commented 1 year ago

https://drive.google.com/drive/folders/1v0ZEgSIYgriYuRn_O2p0t3RXdFHtiA61?usp=sharing

Linhao-Qu commented 1 year ago

Hello! We really appreciate your help and contributions! I notice that there are two different folders in the Google Drive link (cat and fuse). Could you please tell me what they mean? I also notice that one contains 512-dimensional features while the other contains 1024-dimensional features. I checked the paper and found that you concatenated two 512-dimensional features to obtain the multi-scale mixed features. However, I find a big difference between the newly open-sourced features and the 20x Camelyon features you open-sourced before. Could you tell me how to use these features? We would like to run experiments with the features at each of the two magnifications from your paper separately. Really looking forward to your reply. @binli123

binli123 commented 1 year ago

Regarding 'fuse' and 'cat', you can find them in the code: https://github.com/binli123/dsmil-wsi/blob/5216150fe7a1021c94ab1a2af6c942073bd37c4f/compute_feats.py#L110-L113

The multiscale features of Camelyon16 were computed with model-v2 in this link (you can also find this in the README).

The features downloaded directly from the script were computed with model-v0 in the above link. The difference is the training batch size and duration: model-v0 used a batch size of 4096 for about 500 epochs (this was obtained after the paper), while model-v2 used a batch size of 512 for about 50 epochs.
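For intuition, the difference between the two folders can be sketched as follows. This is a minimal illustration based only on the dimensions discussed in this thread (512 dimensions per scale, 1024 dimensions after concatenation); the variable names are hypothetical, and the exact 'fuse' operation should be checked against the linked lines of compute_feats.py:

```python
import numpy as np

# Hypothetical per-patch features at two magnifications. The array names and
# the averaging used for 'fuse' are illustrative assumptions, not the exact
# operations in compute_feats.py.
feats_low = np.random.randn(100, 512).astype(np.float32)   # lower-magnification scale
feats_high = np.random.randn(100, 512).astype(np.float32)  # higher-magnification scale

# 'cat': concatenate the two 512-dim feature sets -> 1024-dim features,
# matching the 1024-dimensional folder mentioned above.
feats_cat = np.concatenate([feats_low, feats_high], axis=1)
print(feats_cat.shape)  # (100, 1024)

# 'fuse': combine the two scales while keeping 512 dimensions, shown here as
# an element-wise average (an assumption; see the linked code for the real op).
feats_fuse = 0.5 * (feats_low + feats_high)
print(feats_fuse.shape)  # (100, 512)
```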

Linhao-Qu commented 1 year ago

Thanks for your help! We trained DSMIL with both the Camelyon16 20x features downloaded directly from the script and the 20x features split out of the multiscale features (the first 512 dimensions). We found that the former reaches a very high AUC (above 0.97), while the latter reaches AUC = 0.905, which is similar to the value reported in the paper. It is possible that I made some mistake, but I could not find the problem. Have you encountered this situation before? By the way, I split the training and testing sets according to the file paths. We will use the former features for further experiments. Thank you again for your contributions! @binli123
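The splitting step described above can be written as a one-line slice. A minimal sketch, assuming the multiscale features were produced by concatenation ('cat'); which half of the 1024 dimensions corresponds to the 20x scale is an assumption worth verifying against the concatenation order in compute_feats.py:

```python
import numpy as np

# Stand-in for one loaded 1024-dim multiscale bag (e.g. one WSI's patch
# features); in practice this would come from the shared feature files.
multiscale = np.random.randn(200, 1024).astype(np.float32)

# Assumed layout: [first scale | second scale]. Which half holds the 20x
# features depends on the order in which the scales were concatenated.
feats_first = multiscale[:, :512]
feats_second = multiscale[:, 512:]
print(feats_first.shape, feats_second.shape)  # (200, 512) (200, 512)
```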

binli123 commented 1 year ago

I think I explained this in my previous answer: the multiscale features were computed using model-v2, while the 20x features downloaded from the script were computed using model-v0; the two models were trained with different settings and for different durations.