YijinHuang / Lesion-based-Contrastive-Learning

This is the official implementation of the paper Lesion-based Contrastive Learning for Diabetic Retinopathy Grading from Fundus Images.

About experiment details. #2

Closed · hellohawaii closed 2 years ago

hellohawaii commented 2 years ago

Dear author:

Your paper is excellent and inspiring. I wonder how you set the batch size when running SimCLR on the Kaggle dataset; I cannot find this detail in your paper. And by the way, how many GPUs are used for your experiments?

Thank you in advance!

YijinHuang commented 2 years ago

Hi, thank you for your interest. For SimCLR (128 × 128), the batch size is set to the same value as for Lesion-based CL, i.e., 768. For SimCLR (224 × 224), the batch size is set to 256 due to CUDA memory limitations. Only one GPU, a TITAN RTX with 24 GiB of memory, is used for training.
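A hypothetical summary of these settings as a config dict; the names are illustrative, not taken from this repository's code:

```python
# Pre-training settings as described in the reply above (names are illustrative).
pretrain_settings = {
    'lesion_cl':  {'input_size': 128, 'batch_size': 768},
    'simclr_128': {'input_size': 128, 'batch_size': 768},  # matches Lesion-based CL
    'simclr_224': {'input_size': 224, 'batch_size': 256},  # reduced to fit 24 GiB
}
num_gpus = 1  # a single TITAN RTX (24 GiB)
```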

hellohawaii commented 2 years ago

Thank you very much!

By the way, how does your model do linear evaluation? Do I need to run detection first to get candidate lesion patches and then train the linear layer? If so, how do I determine the labels of the patches?

Sorry to bother you, but I struggled to figure this out from your paper and the pytorch-classification repo you provided.

YijinHuang commented 2 years ago

You're welcome.

Detection is only involved before the training of Lesion-based CL, where it provides the lesion patches for CL. In the evaluation phase, for both transfer capacity evaluation and linear evaluation, we initialize the network with the parameters of the model pre-trained by Lesion-based CL. The classification network takes entire images as input and outputs the corresponding DR scores. You can think of this procedure as ordinary transfer learning. The difference between transfer capacity evaluation and linear evaluation is that in linear evaluation the convolutional layers are frozen and only the linear layer is fine-tuned.
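A minimal sketch of this linear-evaluation setup, assuming a torchvision ResNet-50 backbone; the checkpoint path and state-dict layout are hypothetical, not this repository's exact code:

```python
import torch
from torchvision import models

model = models.resnet50(num_classes=5)  # 5 DR grades
state = torch.load('lesion_cl_pretrained.pth', map_location='cpu')  # hypothetical file
model.load_state_dict(state, strict=False)  # the fc head is trained from scratch

# Linear evaluation: freeze everything except the final linear layer.
# For transfer-capacity evaluation, skip this loop and fine-tune all layers.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith('fc.')

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01, momentum=0.9
)
```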

Note that the input sizes for CL and for evaluation (classification) are different. In the CL phase, the patch size is 128 × 128, while in the evaluation (classification) phase, the image size is 512 × 512. In this way, we align the pixel size of the input for CL with that of the downstream task, and thus the parameters from patch-based training can be transferred to the model that classifies entire images.
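A quick check (assuming a standard torchvision ResNet) that the same convolutional weights accept both input sizes: ResNet ends in adaptive average pooling, which is what makes the patch-to-image transfer possible.

```python
import torch
from torchvision import models

net = models.resnet50().eval()
net(torch.randn(1, 3, 128, 128))  # CL phase: 128 x 128 lesion patches
net(torch.randn(1, 3, 512, 512))  # evaluation phase: 512 x 512 fundus images
```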

If you have further questions, please feel free to ask me.

hellohawaii commented 2 years ago

Thank you very much. I do not have any more questions.

hellohawaii commented 2 years ago

Hi, sorry to bother you again, but I have one more question. On page 8 of your paper, you wrote: "However, crop-and-resize transformation may critically change the pixel size of the input. SimCLR (224 × 224) experiments are conducted based on the consideration that aligning the pixel size of the input for CL and that for downstream tasks may achieve better performance".

By "aligning the pixel size", do you mean that when applying data augmentation for SimCLR, RandomCrop is used instead of RandomResizedCrop?Thank you in advance.

hellohawaii commented 2 years ago

Hi, could you please provide more details about the experiments on SimCLR? I got a much lower kappa (0.25) when trying to reproduce the baseline performance reported in your paper. To be more specific, my questions are:

1. Did you use the same data augmentation for SimCLR as for Lesion-based CL?
3. What is the difference between SimCLR (224 × 224) and SimCLR (128 × 128)? I wonder why SimCLR (224 × 224) can align the pixel size while SimCLR (128 × 128) cannot. I suppose the input image size for linear evaluation in both SimCLR settings is 512, the same as for Lesion-based CL.

YijinHuang commented 2 years ago

Due to space limitations in the paper, we did not provide many details about the experiments on SimCLR. Therefore, if you have any questions about them, please feel free to ask me.

  1. No. RandomResizedCrop is still applied for data augmentation, but to align the pixel size, cropping with a larger scale and ratio is used. In Lesion-based CL, the cropping scale and ratio are [0.8, 1.2], while in plain SimCLR, [0.3, 1.2] is applied. In this way, we expect the pixel size of more of the patches fed to CL to be close to that of the original images (see the sketch after this list).
  2. Except for the cropping, all other data augmentation operations are the same as the Lesion-based CL.
  3. Yes, all settings of the evaluation for SimCLR are the same as for Lesion-based CL, for a fair comparison. It is difficult to align the pixel size of the patches for SimCLR (128 × 128) with the original images because (1) the input size of SimCLR (128 × 128) is much smaller than the original image size (512 × 512), and (2) as mentioned in the first point, we already increase the cropping scale and ratio to align the pixel size, but no performance improvement is observed when we increase them further.
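A sketch of the cropping difference from point 1, using torchvision's RandomResizedCrop. The scale values mirror the numbers above; whether the ratio range also differs between the two settings is my assumption, and the rest of the augmentation pipeline is omitted.

```python
from torchvision import transforms

# Lesion-based CL: mild cropping, keeping patch pixel size close to native.
lesion_cl_crop = transforms.RandomResizedCrop(128, scale=(0.8, 1.2), ratio=(0.8, 1.2))

# Plain SimCLR: a wider scale range, closer to the standard recipe
# (ratio range assumed to match Lesion-based CL here).
simclr_crop = transforms.RandomResizedCrop(224, scale=(0.3, 1.2), ratio=(0.8, 1.2))
```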

By the way, did you process the dataset using the code we provided in the tools directory?

hellohawaii commented 2 years ago

Thanks for your patient reply.

I used my own preprocessing code, which crops out the black boundary. For fundus images that are not a whole circle, like this "| ( ) |" (compared with a fundus image like this "| O |"), my code leaves some black boundary on the shorter side, while the tool you provided stretches the shorter side to fit the image, making it oval.
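A minimal sketch of this boundary-cropping idea, assuming the retina is the only bright region; the threshold and function name are illustrative, not the preprocessing code from this repository's tools:

```python
import numpy as np
from PIL import Image

def crop_black_boundary(path, threshold=10):
    img = np.asarray(Image.open(path).convert('RGB'))
    mask = img.mean(axis=2) > threshold                    # non-background pixels
    rows, cols = np.any(mask, axis=1), np.any(mask, axis=0)
    top, bottom = rows.argmax(), len(rows) - rows[::-1].argmax()
    left, right = cols.argmax(), len(cols) - cols[::-1].argmax()
    # For non-circular fundus photos ("| ( ) |"), this bounding-box crop keeps
    # the aspect ratio and leaves some black boundary instead of stretching.
    return Image.fromarray(img[top:bottom, left:right])
```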

I will try to reproduce your results with the details you provided and let you know if I finally make it. Thank you again for your help!

YijinHuang commented 2 years ago

That's fine, but if you use PIL to save your cropped images, remember to set quality=100, subsampling=0 when calling the save function. The quality of the input images significantly affects the grading performance. Moreover, I have uploaded the configuration file for training the classification network to pytorch-classification; it might help you.
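Following this tip, a one-line Pillow example (the file names are hypothetical): saving at maximum JPEG quality and without chroma subsampling avoids compression artifacts that would hurt grading.

```python
from PIL import Image

img = Image.open('fundus_cropped.png')  # hypothetical file name
img.convert('RGB').save('fundus_cropped.jpg', quality=100, subsampling=0)
```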