Closed windygoo closed 11 months ago
First of all, we try not to overlap images. But because image sizes vary in the BCSS dataset, we need to overlap some patches, and we did not set a constant overlap ratio. For each image, we first calculate how many patches it can be divided into by taking the ceiling. The overlap (in pixels) for each image is then obtained by spreading the excess evenly. E.g., for an image of width 3000 and a patch size of 1024, we divide it into ceil(3000 / 1024) = 3 patches. The excess is 1024 * 3 - 3000 = 72 pixels. Spread over the two overlap regions between the 3 patches, the overlap per junction is 36 pixels. This results in 3 patches: [1, 1024], [989, 2012], [1977, 3000].
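The splitting step above can be sketched in a few lines of Python. This is not the repo's actual code — the function name and 1-indexed inclusive spans are just chosen to match the worked example in this thread:

```python
import math

def patch_spans(width, patch_size):
    """Split a 1-D extent into fixed-size patches with evenly spread overlap.

    Returns (start, end) pixel spans, 1-indexed inclusive, matching the
    example in the thread. Illustrative sketch, not the repo's code.
    """
    n = math.ceil(width / patch_size)      # number of patches (upper bound)
    if n == 1:
        return [(1, width)]                # image fits in a single patch
    excess = n * patch_size - width        # total pixels to absorb as overlap
    overlap = excess / (n - 1)             # average overlap per junction
    spans = []
    for i in range(n):
        start = round(i * (patch_size - overlap)) + 1
        spans.append((start, start + patch_size - 1))
    # Pin the last patch to the image edge to absorb rounding drift
    spans[-1] = (width - patch_size + 1, width)
    return spans
```

For the example in the thread, `patch_spans(3000, 1024)` reproduces the three spans `[1, 1024]`, `[989, 2012]`, `[1977, 3000]`.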
Thanks! I get it.
I have two more questions:
- For the BCSS dataset, the test images (fold=-1) are the officially provided ones. The train and validation images are randomly assigned.
- We use class 0 to represent regions not taken into consideration, because there are some unlabeled regions in the BCSS dataset. These regions are removed when calculating losses or metrics. The original BCSS dataset contains more than 20 classes; following some previous works, we used only 4 major classes and merged the rest into the "other" class:
  - 0: unlabeled
  - 1: other (gray)
  - 2: tumour (red)
  - 3: stroma (green)
  - 4: inflammatory (purple)
  - 5: necrosis (blue)
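The "removed when calculating losses or metrics" point can be sketched with a masked metric. This is only an illustration of the idea — the function name and signature are made up here, not taken from the repo:

```python
import numpy as np

def pixel_accuracy(pred, target, ignore_index=0):
    """Pixel accuracy over labeled pixels only.

    Class 0 marks unlabeled regions in the masks; those pixels are
    excluded before computing the metric, as described in the thread.
    Illustrative sketch, not the repo's actual metric code.
    """
    valid = target != ignore_index          # boolean mask of labeled pixels
    if not valid.any():
        return 0.0                          # no labeled pixels at all
    return float((pred[valid] == target[valid]).mean())
```

The same effect for the loss is commonly achieved with an ignore index in the loss function (e.g. PyTorch's `CrossEntropyLoss(ignore_index=0)`), so unlabeled pixels contribute no gradient.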
Thanks very much! I get it.
Emmm, I have some other questions about the data pre-processing. I am wondering if you could incorporate the pre-processing file into this repo so that I can check it myself.
Here is the preprocessing code. If you have any questions, please let me know, as some of the code is from a previous project. preprocess.zip
Thanks!