cvlab-stonybrook / SAMPath

Repository for "SAM-Path: A Segment Anything Model for Semantic Segmentation in Digital Pathology" (MedAGI2023, MICCAI2023 workshop)

How is the overlapping ratio in image cropping determined for the BCSS dataset? #3

Closed windygoo closed 11 months ago

jingweizhang-xyz commented 11 months ago

First of all, we try not to overlap images. However, because image sizes vary in the BCSS dataset, some overlap is needed, and we did not set a constant overlapping ratio. For each image, we first calculate how many patches it should be divided into by rounding up. The overlap between adjacent patches is then calculated by spreading the excess pixels evenly. E.g., for an image with width 3000 and a required patch size of 1024, we divide it into ceil(3000 / 1024) = 3 patches. The excess is 1024 * 3 - 3000 = 72 pixels. Dividing by the two overlapping regions between the 3 patches, the overlap between adjacent patches is 36 pixels. This results in 3 patches: [0, 1024), [988, 2012), [1976, 3000) (0-indexed, end-exclusive).
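The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the described scheme along one axis, not the code from the repository: the function name `tile_intervals`, the 0-indexed end-exclusive intervals, and the handling of images smaller than one patch are all assumptions.

```python
import math


def tile_intervals(length: int, patch: int) -> list[tuple[int, int]]:
    """Split [0, length) into ceil(length / patch) patches of size `patch`,
    spreading the required overlap evenly between neighbouring patches.

    Hypothetical helper; intervals are 0-indexed and end-exclusive.
    """
    if length <= patch:
        # Assumption: an image no larger than one patch yields a single crop.
        return [(0, length)]

    n = math.ceil(length / patch)      # number of patches (rounded up)
    excess = n * patch - length        # pixels that must be shared
    overlap = excess / (n - 1)         # average overlap per adjacent pair
    stride = patch - overlap           # distance between patch starts

    starts = [round(i * stride) for i in range(n)]
    starts[-1] = length - patch        # pin the last patch to the image edge
    return [(s, s + patch) for s in starts]


print(tile_intervals(3000, 1024))
```

For a width of 3000 and a patch size of 1024 this gives an excess of 72 pixels, an overlap of 36 pixels per adjacent pair, and a stride of 988. When the excess does not divide evenly, rounding makes the individual overlaps differ by at most one pixel.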

windygoo commented 11 months ago

Thanks! I get it.

I have two more questions:

  1. How is the fold ID determined for each image?
  2. What category does each number in the mask files represent?

jingweizhang-xyz commented 11 months ago

  1. For the BCSS dataset, the test images (fold=-1) are the officially provided ones. The training and validation folds are randomly assigned.
  2. We use class 0 to represent regions that are not taken into consideration, because the BCSS dataset contains some unlabeled regions. These regions are excluded when calculating losses or metrics. The original BCSS dataset contains more than 20 classes; following some previous works, we use only the 4 major classes and merge the rest into the "other" class. The mask values are:
     0: unlabeled
     1: other (gray)
     2: tumour (red)
     3: stroma (green)
     4: inflammatory (purple)
     5: necrosis (blue)

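As a rough illustration of the mask encoding described above, the mapping and the "exclude class 0 from metrics" rule could look like the sketch below. The names `BCSS_CLASSES`, `IGNORE_INDEX`, and `masked_pixel_accuracy` are hypothetical and not taken from the repository, and plain Python lists stand in for flattened mask arrays.

```python
# Mask values as listed in the comment above.
BCSS_CLASSES = {
    0: "unlabeled",
    1: "other",
    2: "tumour",
    3: "stroma",
    4: "inflammatory",
    5: "necrosis",
}

# Assumption: pixels with this label are skipped in losses and metrics.
IGNORE_INDEX = 0


def masked_pixel_accuracy(pred, target, ignore_index=IGNORE_INDEX):
    """Pixel accuracy over flattened label sequences, skipping every
    position whose ground-truth label equals `ignore_index`."""
    correct = 0
    counted = 0
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled region: contributes nothing
        counted += 1
        correct += int(p == t)
    # No labeled pixels at all: the accuracy is undefined.
    return correct / counted if counted else float("nan")


# The second pixel is unlabeled in the ground truth, so only 3 pixels count.
print(masked_pixel_accuracy([2, 3, 1, 5], [2, 0, 1, 4]))
```

Many training frameworks expose the same idea through an ignore-label option on the loss, so the unlabeled pixels never produce gradients.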
windygoo commented 11 months ago

Thanks very much! I get it.

windygoo commented 11 months ago

Hmm, I have some other questions about the data pre-processing. Could you add the pre-processing code to this repo so that I can check it myself?

jingweizhang-xyz commented 11 months ago

Here is the preprocessing code. If you have any questions, please let me know, as some of the code is from a previous project. preprocess.zip

windygoo commented 11 months ago

Thanks!