Request about Release the pre-processed Dataset

liuquande / FedDG-ELCFS

[CVPR'21] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Request about Release the pre-processed Dataset #3

Closed xxliang99 closed 3 years ago

xxliang99 commented 3 years ago

Dear Author, I ran into a problem at step 2, "Prepare the dataset": I do not know what exactly the .npy files contain (e.g. are they converted directly from the raw images, produced by some pre-processing, or something else?), and there is no reference. Could you please release the re-organized dataset so that we can use it directly for training? I mean the "/Dataset" directory with the clientxxx folders and the .npy files properly placed in them. The dataset for either task would help a lot. Thank you very much for your attention. Also, congratulations!

lichen14 commented 3 years ago

Sorry to say that I ran into the same question while reproducing the results. Following references [52, 10, 40], I searched for the raw datasets for the optic segmentation task: DRISHTI-GS1, RIM-ONE, and REFUGE. However, the correspondence between these datasets and the sites A~D used in the paper is unclear. For example, I can guess that site A is DRISHTI-GS1, but RIM-ONE has been released in three versions, with 169, 455, and 159 samples respectively. So which one is site B? Similarly, I have not yet worked out which datasets sites C and D come from. I therefore have the same problem as @VivianLiang1108, and I hope the author can answer it or give the mapping. If you could directly provide the preprocessed dataset, we would be grateful! @liuquande

xxliang99 commented 3 years ago

@lichen14 Line 63 of train_ELCFS.py reads "slice_num = np.array([101, 159, 400, 400])", and the code that follows shows that client_weight depends on the number of samples from each client. From this we can infer that site B is RIM-ONE v3, while C and D remain unknown (but with 400 samples each). The REFUGE support team rejected my application for some reason, and I still cannot find other sources online. If you have access to REFUGE and know how many samples it contains, my guess might help. Hope it works :) (And I would be extremely grateful for an accessible download link for REFUGE.)
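
To make the inference above concrete, here is a minimal sketch of sample-count-proportional client weighting. Only the `slice_num` array is quoted from train_ELCFS.py; the helper `client_weights`, its `unseen_site_idx` parameter, and the choice to exclude a held-out site are assumptions for illustration and may not match the repository's actual code.

```python
import numpy as np

# Per-site sample counts quoted from train_ELCFS.py (sites A, B, C, D).
slice_num = np.array([101, 159, 400, 400])


def client_weights(counts, unseen_site_idx):
    """Weight each source site by its share of the training samples.

    Hypothetical helper: the held-out (unseen) site is assumed to take
    no part in training and therefore receives no weight.
    """
    source_idx = [i for i in range(len(counts)) if i != unseen_site_idx]
    source_counts = counts[source_idx].astype(float)
    return source_idx, source_counts / source_counts.sum()


# Example: hold out site A (index 0) and weight sites B, C, D.
idx, weights = client_weights(slice_num, unseen_site_idx=0)
print(idx, weights)
```

Under this reading, the second entry (159) matches the size of one of the three RIM-ONE releases mentioned earlier in the thread, which is what points to site B.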

liuquande commented 3 years ago

Hi Vivian and lichen,

For the fundus datasets, we directly downloaded the data from Fundus. Detailed information on each dataset can be found in the Supplementary of our arXiv paper. Among these data, the samples of site A are from the Drishti-GS [52] dataset; the samples of site B are from the RIM-ONE-r3 [10] dataset; the samples of sites C and D are from the training and testing sets of the REFUGE [40] dataset.

Also thanks @VivianLiang1108 for the clarification.

Moreover, the input ".npy" files in the data loader actually hold these data. During preprocessing, we simply convert each sample from ".png/.jpg" to ".npy" format to speed up data loading and federated training.
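
For readers who want to reproduce that conversion step, a rough sketch follows. It assumes Pillow and NumPy are installed; the function name `convert_images_to_npy`, the directory arguments, and the choice to store each image as an RGB `uint8` array are illustrative assumptions, not the authors' script. The exact array layout the data loader expects (for example, how masks are stored) is not stated in this thread.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def convert_images_to_npy(src_dir, dst_dir, extensions=(".png", ".jpg", ".jpeg")):
    """Save every image in src_dir as a .npy array in dst_dir.

    Hypothetical helper: images are stored as H x W x 3 uint8 arrays,
    which may differ from the layout the original loader expects.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src_dir.iterdir()):
        if path.suffix.lower() not in extensions:
            continue  # skip anything that is not a supported image file
        with Image.open(path) as img:
            array = np.asarray(img.convert("RGB"))
        out_path = dst_dir / (path.stem + ".npy")
        np.save(out_path, array)
        written.append(out_path)
    return written
```

A call such as `convert_images_to_npy("raw/siteA", "Dataset/client1")` would then fill one client folder; both paths here are placeholders.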