PathologyDataScience / NuCLS

NuCLS: A scalable crowdsourcing, deep learning approach and dataset for nucleus classification, localization and segmentation
MIT License

Please provide me some details regarding the implementation #3

Closed: zunairaR closed this issue 3 years ago

zunairaR commented 3 years ago

Hey! I found this dataset and implementation very useful and am trying to implement it, but there are a few things I don't understand. Please clarify my queries; I'd highly appreciate a quick reply.

Query 1: As I understand it, input images to the model are resized to 300x300. Can you please explain how they are cropped for images of different sizes?

Query 2: Does the function https://github.com/CancerDataScience/NuCLS/blob/c41cf71e1088012ee3fce681cc37744790645a5d/nucls_model/torchvision_detection_utils/transforms.py#L365 implement both the color normalization and the augmentation reported in the paper?

Waiting for a quick reply. Thank you in advance.

kheffah commented 3 years ago

Hello @zunairaR, Thank you for your interest, I'm very happy that you found the dataset and modeling approach useful. Apologies for missing your earlier message. Here's the answer to your questions:

1- The images are not resized, but randomly cropped on the fly. Every time an image is loaded, a different 300x300 pixel region is cropped as a form of augmentation (during training). This is done using this line: https://github.com/CancerDataScience/NuCLS/blob/c41cf71e1088012ee3fce681cc37744790645a5d/nucls_model/DataLoadingUtils.py#L469. That being said, please note that MaskRCNN can actually accommodate images of different sizes even at training time, since all of the fully connected layers are applied per object, not per image. If you take a look at datasets like MS COCO, you will notice that the images have varying sizes.

2- The function you alluded to only does the color augmentation, which is the most important step when training. The color normalization was done for all the images before training using the histomicstk library's deconvolution_based_normalization() function. See here for an example: https://digitalslidearchive.github.io/HistomicsTK/examples/color_normalization_and_augmentation.html
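To make the random cropping in point 1 concrete, here is a minimal sketch in plain Python. It is not the NuCLS code: `random_crop_box` and `crop_boxes` are hypothetical helper names, and the handling of images smaller than the crop (clamping the window to the image) is an assumption made for illustration.

```python
import random


def random_crop_box(width, height, crop_size=300, rng=random):
    """Pick a random crop window inside a width x height image.

    Returns (x0, y0, x1, y1). If the image is smaller than crop_size
    along an axis, the window is clamped to the image extent
    (an assumption of this sketch).
    """
    cw = min(crop_size, width)
    ch = min(crop_size, height)
    x0 = rng.randint(0, width - cw)   # randint is inclusive on both ends
    y0 = rng.randint(0, height - ch)
    return x0, y0, x0 + cw, y0 + ch


def crop_boxes(boxes, window):
    """Clip (x0, y0, x1, y1) boxes to the window and shift them into
    crop coordinates; boxes that fall entirely outside are dropped."""
    wx0, wy0, wx1, wy1 = window
    kept = []
    for x0, y0, x1, y1 in boxes:
        nx0, ny0 = max(x0, wx0), max(y0, wy0)
        nx1, ny1 = min(x1, wx1), min(y1, wy1)
        if nx1 > nx0 and ny1 > ny0:
            kept.append((nx0 - wx0, ny0 - wy0, nx1 - wx0, ny1 - wy0))
    return kept


# A new window is drawn each time the image is loaded.
window = random_crop_box(512, 400)
nuclei = crop_boxes([(10, 10, 40, 40), (350, 200, 380, 230)], window)
```

Because the window is redrawn on every load, the model sees a different region of the same field of view in each epoch, which is what makes the crop act as augmentation.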

I hope this answers your questions. Let me know if you have any further questions.

Cheers,

zunairaR commented 3 years ago

Thank you so much for your reply. Could you also tell me whether the provided images have already undergone stain normalization, or do I have to do it explicitly before training?

Thank you

Wsalam


kheffah commented 3 years ago

@zunairaR The provided images are not color normalized. You can normalize them yourself before training if you like, or you may even decide to train without color normalization and rely on color augmentation alone.

kheffah commented 3 years ago

Hopefully I've answered your questions; if not, feel free to re-open this issue.

zunairaR commented 3 years ago

Thank you so much @kheffah for answering my queries. One thing that is still not clear to me is the "resizing with an aspect ratio" part. Are you resizing the images during inference? Secondly, how is "digitally increasing magnification beyond 40x objective" achieved? I'm sorry if I'm asking the same things again, but there are a few things I'm not clear about. Thanks in advance.

kheffah commented 3 years ago

@zunairaR You're welcome, of course. Resizing in NuCLS is done using a scale factor, instead of resizing to a fixed size. So for example, using a scale factor of 2.0, each image would be resized to twice its original size (during both training and inference). This parameter controls this behavior: https://github.com/CancerDataScience/NuCLS/blob/main/nucls_model/MaskRCNN.py#L167. "Digitally increasing magnification" is just another way of saying the same thing: even though the slide is scanned at 40x magnification, we upsample the image by a scale factor during training and inference.
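The arithmetic behind scale-factor resizing can be sketched as follows. This is an illustration only, not the NuCLS implementation: the function names are hypothetical, and treating the result as an "effective magnification" of scan magnification times scale factor is an assumption about how to read "digitally increasing magnification".

```python
def scaled_size(width, height, scale_factor):
    """Output size when both sides are multiplied by the same factor,
    so the aspect ratio is preserved."""
    return round(width * scale_factor), round(height * scale_factor)


def scale_boxes(boxes, scale_factor):
    """Scale (x0, y0, x1, y1) boxes so they stay aligned with the
    resized image."""
    return [tuple(c * scale_factor for c in box) for box in boxes]


def effective_magnification(scan_magnification, scale_factor):
    """Nominal magnification after digital upsampling (assumed to be a
    simple product for this sketch)."""
    return scan_magnification * scale_factor


print(scaled_size(300, 250, 2.0))          # (600, 500)
print(scale_boxes([(10, 20, 30, 40)], 2.0))
print(effective_magnification(40, 2.0))    # 80.0
```

Unlike resizing to a fixed size, a scale factor keeps images of different sizes at the same relative resolution, which fits the earlier point that the model accepts inputs of varying size.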

zunairaR commented 3 years ago

Thank you for the clarification.
