mahmoodlab / CLAM

Data-efficient and weakly supervised computational pathology on whole slide images - Nature Biomedical Engineering
http://clam.mahmoodlab.org
GNU General Public License v3.0
1.12k stars 362 forks

magnitude parameter #145

Open Bontempogianpaolo1 opened 2 years ago

Bontempogianpaolo1 commented 2 years ago

Hi,

Thank you for your work, it is amazing! Is it possible to specify the actual magnitude level of the WSI instead of the relative level? I'm working on a dataset with a different magnitude scale, and I have the feeling that using the patch level hyperparameter doesn't guarantee the same scale on every slide. Since I really need this feature, I'd be happy to help with it.

Thank you!

fedshyvana commented 2 years ago

Hi, could you clarify what you mean by magnitude level? You can access standard metadata such as microns per pixel (mpp) and magnification from the OpenSlide object, if that's what you mean.
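For reference, a minimal sketch of reading those values. It assumes the slide exposes the standard OpenSlide property keys `openslide.mpp-x`, `openslide.mpp-y` and `openslide.objective-power`, which not every scanner format populates. `read_scale_metadata` is a hypothetical helper, not part of CLAM, and it accepts any mapping so it can be tried without a slide file.

```python
from typing import Mapping, Optional, Tuple


def read_scale_metadata(
    properties: Mapping[str, str],
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (mpp_x, mpp_y, objective_power) from an OpenSlide-style properties mapping.

    Missing keys yield None, since some formats do not report them.
    """

    def as_float(key: str) -> Optional[float]:
        value = properties.get(key)
        return float(value) if value is not None else None

    return (
        as_float("openslide.mpp-x"),
        as_float("openslide.mpp-y"),
        as_float("openslide.objective-power"),
    )


# With a real slide this would be called on openslide.OpenSlide(path).properties.
# The values below are made up for illustration.
example = {
    "openslide.mpp-x": "0.2520",
    "openslide.mpp-y": "0.2520",
    "openslide.objective-power": "40",
}
print(read_scale_metadata(example))
```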

Bontempogianpaolo1 commented 2 years ago

Does CLAM consider these parameters? Looking at the code, it seems to consider only the patch_level parameter. Am I wrong?

fedshyvana commented 2 years ago

Correct, it does not. Our typical workflow just specifies the desired magnification level by setting the patch_level parameter. Without modifying the codebase, if your slides have different resolutions and therefore require different patch_levels, you should be able to compute the right level for each slide in advance (e.g. by accessing the metadata) and then pass that list of patch_levels to the script via the --process_list argument.
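A rough sketch of that precomputation, under stated assumptions: the level-0 mpp and the level downsamples of each slide have already been read from its metadata, and the process list is a CSV with `slide_id` and `patch_level` columns. Those column names are an assumption here and should be checked against a process list generated by the script itself. `closest_level` and `build_process_list` are hypothetical helpers, not CLAM functions.

```python
import csv
import io
from typing import Dict, Sequence, Tuple


def closest_level(level0_mpp: float, level_downsamples: Sequence[float], target_mpp: float) -> int:
    """Index of the pyramid level whose downsample is closest to target_mpp / level0_mpp."""
    wanted = target_mpp / level0_mpp
    return min(range(len(level_downsamples)), key=lambda i: abs(level_downsamples[i] - wanted))


def build_process_list(slides: Dict[str, Tuple[float, Sequence[float]]], target_mpp: float) -> str:
    """Return CSV text with one patch_level per slide (column names are assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["slide_id", "patch_level"])
    for slide_id, (level0_mpp, downsamples) in slides.items():
        writer.writerow([slide_id, closest_level(level0_mpp, downsamples, target_mpp)])
    return buffer.getvalue()


# Made-up slides: one scanned at 0.25 mpp, one at 0.50 mpp at level 0.
slides = {
    "slide_a.svs": (0.25, [1.0, 2.0, 4.0]),
    "slide_b.svs": (0.50, [1.0, 2.0, 4.0]),
}
print(build_process_list(slides, target_mpp=0.5))
```

In this example the two slides end up with different patch_levels (1 and 0) even though both are patched at 0.5 mpp, which is the situation the question describes.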

fedshyvana commented 2 years ago

I agree that a better way might be for the script to automatically work out which patch_level to use, and give the user the option of just specifying the desired magnification level.

clemsgrs commented 1 year ago

Hi @Bontempogianpaolo1, not sure if it's still relevant, but I've implemented the functionality you were looking for, as it happened that I also needed it. Basically, I've created an additional method for the WholeSlideImage class that returns the level whose pixel spacing (in microns) is the closest to the target value passed by the user:

def get_best_level_for_spacing(self, target_spacing: float):
    # tiff.XResolution / tiff.YResolution are in pixels per resolution unit;
    # this assumes the unit is centimeters, so dividing by 10000 gives pixels per micron
    x_res = float(self.wsi.properties['tiff.XResolution']) / 10000
    y_res = float(self.wsi.properties['tiff.YResolution']) / 10000
    # level-0 pixel spacing in microns per pixel
    x_spacing, y_spacing = 1 / x_res, 1 / y_res
    downsample_x, downsample_y = target_spacing / x_spacing, target_spacing / y_spacing
    # get_best_level_for_downsample just chooses the largest level with a downsample less than user's downsample
    # see https://github.com/openslide/openslide/issues/274
    # so I made my own version of get_best_level_for_downsample
    level_x = self.get_best_level_for_downsample_custom(downsample_x)
    level_y = self.get_best_level_for_downsample_custom(downsample_y)
    # both axes must agree on the level
    assert level_x == level_y
    return level_x

This function relies on another method, get_best_level_for_downsample_custom, which is a quick workaround for the behaviour of OpenSlide's default get_best_level_for_downsample method:

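def get_best_level_for_downsample_custom(self, downsample):
    # requires `import numpy as np`; returns the level whose downsample is closest to the requested one
    return int(np.argmin([abs(x - downsample) for x in self.wsi.level_downsamples]))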

Then you should be good to go by simply adding a target_spacing argument to the self.process_contour method and adding the following line at the beginning of that same method:

patch_level = self.get_best_level_for_spacing(target_spacing)

Remark: this works if

For example, all my slides were acquired with an Aperio scanner, whose pixel spacing at 40x is 0.25 µm/pixel. Hence, by passing target_spacing = 0.50, I'm patching all slides at 20x.

Feel free to reach out if you have additional questions.

GeorgeBatch commented 7 months ago

There is another prerequisite. The resolution you want to get should be available at one of the levels.

Some slides (e.g. in TCGA) have downsample factors [1, 4, 16, 64] for levels [0, 1, 2, 3]. With your code and your example, the microns-per-pixel (mpp) values available are therefore [0.25, 1, 4, 16]. To get 256-pixel tiles at 0.5 mpp, you would need to extract a 512-pixel tile at 0.25 mpp and then resize it down to 256. So you still need to address this in the process list.
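The arithmetic above can be sketched as follows. This is only an illustration, not CLAM code: `plan_tile_read` is a hypothetical helper that picks the coarsest level that is still at least as fine as the target spacing, and reports how many pixels to read at that level so that resizing yields the requested tile size.

```python
from typing import Sequence, Tuple


def plan_tile_read(
    level0_mpp: float,
    level_downsamples: Sequence[float],
    target_mpp: float,
    tile_size: int,
) -> Tuple[int, int, float]:
    """Return (level, read_size, level_mpp) for a tile of tile_size pixels at target_mpp.

    read_size is the side length, in pixels at the chosen level, that covers the
    same physical area as the final tile; it must then be resized to tile_size.
    """
    wanted = target_mpp / level0_mpp
    # Keep levels that do not downsample more than requested; the small
    # tolerance absorbs non-integer downsamples such as 4.0001.
    candidates = [i for i, d in enumerate(level_downsamples) if d <= wanted * (1 + 1e-6)]
    if not candidates:
        raise ValueError("target spacing is finer than level 0")
    level = max(candidates, key=lambda i: level_downsamples[i])
    level_mpp = level0_mpp * level_downsamples[level]
    read_size = round(tile_size * target_mpp / level_mpp)
    return level, read_size, level_mpp


# The TCGA-style pyramid from the comment above: 0.25 mpp at level 0.
print(plan_tile_read(0.25, [1, 4, 16, 64], target_mpp=0.5, tile_size=256))
```

When read_size equals tile_size, the level already matches the target spacing and no resizing is needed; otherwise the region read from the slide has to be downscaled to tile_size after extraction.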