This repository accompanies Zingman et al., A comparative evaluation of image-to-image translation methods for stain transfer in histopathology (MIDL, 2023). The paper analyses the strengths and weaknesses of image-to-image translation methods for stain transfer in histopathology, allowing a rational choice of the most suitable approach.
```
├── README.md               <- The top-level README for developers using this project.
│
├── config.yaml             <- Default yaml configuration file for inference experiments.
│
├── data                    <- Directory for input data.
│
├── models                  <- Trained and serialized models.
│
├── results                 <- Generated images and metrics are saved here.
│
├── requirements.txt        <- The requirements file for reproducing the analysis environment, e.g. generated with `pip freeze > requirements.txt`.
│
├── src                     <- Source code for use in this project.
│   │
│   ├── data_processing     <- Data manipulation functionalities.
│   │
│   ├── modelling           <- Model architectures and definitions.
│   │
│   ├── tests               <- Tests directory.
│   │
│   └── utils               <- Utilities used by other scripts.
│
├── main.py                 <- Main execution file, used for generating fake images and for calculating metrics.
│
├── metric_calculation.py   <- Calculates FID, WD, and SSIM metrics for generated images.
│
└── visualizer.py           <- Tool for visualizing generated fake images and inspecting metrics.
```

1. Create a working environment, e.g. with conda, and activate it. Using virtualenv is not recommended, as there are problems with the spams library.
2. Clone this repository: `git clone https://github.com/Boehringer-Ingelheim/stain-transfer`
3. Install spams: `conda install -c conda-forge python-spams`
4. Install the required modules: `pip install -r requirements.txt`

Download the trained models and the test or validation histopathological dataset from https://osf.io/byf27/. Trained models can be saved in the `models/` folder. H&E stained samples and Masson's trichrome stained samples from the dataset can be saved in the `data/he/` and `data/masson/` folders.

Run `python main.py --conf config_example_he2mt.yaml`, or use a different yaml configuration file available in the root folder of the project. This generates artificially stained samples (images of tissue artificially stained with Masson's trichrome, created from images of H&E stained tissue). A detailed description of the fields in the configuration file is available in `config.yaml`.

Run `python metric_calculation.py --real_source data/he/ --real_target data/masson/ --fakes results/masson_fake/`. The given paths correspond to the paths in `config_example_he2mt.yaml`. This generates an Excel file with the computed FID, WD, and SSIM metrics, which evaluate the quality of the created images of artificially stained tissue.

All parameters for generating fakes are inside the `generate` key in the yaml. It contains a `models` key, which specifies the models to use and their associated weights. If `weights` are not provided, the default weights for each model are loaded, based on the `a2b` key, which specifies the direction of domain translation. The default weights are located in the `models` folder, and each file is named `model_direction.pth`, where `model` is the name of the model in lowercase and `direction` is either `he2mt` or `mt2he`. For example, each of the following configurations uses the same model and weights file:

```yaml
generate:
  models:
    names: [ cyclegan ]
    weights: [ models/cyclegan_he2mt.pth ]
  a2b: true
```

```yaml
generate:
  models:
    names: [ cyclegan ]
  a2b: true
```

but this one uses a different weights file:

```yaml
generate:
  models:
    names: [ cyclegan ]
    weights: [ retrained.pth ]
  a2b: true
```

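The default-weights naming rule described above can be sketched in a few lines of Python. This is only an illustration, not the repository's code: the function name `default_weights_path` and its arguments are hypothetical, and it assumes that `a2b: true` corresponds to the `he2mt` direction, as the equivalent configurations above suggest.

```python
from pathlib import Path


def default_weights_path(model_name: str, a2b: bool, models_dir: str = "models") -> Path:
    """Hypothetical helper: build the default weights path for a model.

    Assumes a2b=True means H&E -> Masson's trichrome (he2mt).
    """
    direction = "he2mt" if a2b else "mt2he"
    # The model name is lowercased, as described in the naming rule.
    return Path(models_dir) / f"{model_name.lower()}_{direction}.pth"


print(default_weights_path("cyclegan", a2b=True).as_posix())  # models/cyclegan_he2mt.pth
```
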
You can generate fakes for a given `data_path` using multiple models at once:

```yaml
generate:
  models:
    names: [ cyclegan, cut, munit, macenko, pix2pix, vahadane ]
  a2b: true
  data_path: data/processed/HE/
```

Available models are:
For munit, drit, colorstat, macenko and vahadane there are different inference configurations available.

For munit and drit (the higher the number, the higher the precedence):

1. Default mode.
2. A precomputed tensor provided through the `target_tensor` key. This precomputed tensor can be, for example, the average style/attribute of all images in the target domain.
3. Tensors computed on the fly from the images in `target_path`. The number of target images used for each translation will be the same as `batch_size`.

For colorstat, macenko and vahadane (the higher the number, the higher the precedence):

1. Default mode.
2. A precomputed tensor provided through the `target_tensor` key.
3. Tensors computed on the fly from the images in `target_path`, with the number of images to be considered in each translation given in `target_samples`.

You can use inference modes 1 and 3 (default mode and computing tensors on the fly) with several of these models, and other models, from a single configuration:

```yaml
generate:
  models:
    names: [ cyclegan, cut, munit, macenko, pix2pix, vahadane ]
  a2b: true
  data_path: data/processed/HE/
  target_path: data/processed/masson_trichrome
  target_samples: 2
```

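The precedence rule for the inference modes can be sketched as follows. This is a hypothetical helper, not the project's actual logic: the name `select_inference_mode` is invented for illustration, and it assumes that a mode is selected simply by the presence of the `target_path` or `target_tensor` key under `generate`.

```python
def select_inference_mode(generate_cfg: dict) -> int:
    """Hypothetical helper: return the inference mode number (1, 2 or 3).

    A higher number takes precedence over a lower one.
    """
    if generate_cfg.get("target_path"):
        return 3  # tensors computed on the fly from target images
    if generate_cfg.get("target_tensor"):
        return 2  # precomputed tensor
    return 1      # default mode


cfg = {"data_path": "data/processed/HE/", "target_path": "data/processed/masson_trichrome"}
print(select_inference_mode(cfg))  # 3
```
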
If you are using inference mode 2 (precomputed tensors) for any of these models, you cannot generate fakes from a single configuration file that includes any two of these models. The first configuration below is wrong, since the target tensor is a style tensor for munit and would fail for macenko and vahadane. The second one is valid.

```yaml
generate:
  models:
    names: [ cut, munit, macenko, pix2pix, vahadane ]
  a2b: true
  data_path: data/processed/HE/
  target_tensor: precomputed_munit_style_tensor.pth
```

```yaml
generate:
  models:
    names: [ cut, munit, pix2pix ]
  a2b: true
  data_path: data/processed/HE/
  target_tensor: precomputed_munit_style_tensor.pth
```

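A configuration could be checked for this restriction before it is run. The sketch below is an assumption-laden illustration: `check_target_tensor` and `TENSOR_MODELS` are hypothetical names that do not come from the repository, and the check only looks at the `target_tensor` key, ignoring any interaction with `target_path`.

```python
# Models that consume a target tensor, as listed in the text above.
TENSOR_MODELS = {"munit", "drit", "colorstat", "macenko", "vahadane"}


def check_target_tensor(generate_cfg: dict) -> None:
    """Hypothetical check: raise if one target_tensor is shared by several models."""
    names = generate_cfg.get("models", {}).get("names", [])
    users = [name for name in names if name in TENSOR_MODELS]
    if generate_cfg.get("target_tensor") and len(users) > 1:
        raise ValueError(
            f"target_tensor cannot be shared by several models: {users}"
        )


valid = {
    "models": {"names": ["cut", "munit", "pix2pix"]},
    "target_tensor": "precomputed_munit_style_tensor.pth",
}
check_target_tensor(valid)  # passes silently
```
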
There is also the option to precompute average tensors for all images in a given path. To do so, specify one of the following models in the configuration yaml:

- For the images in `data_path`, computes the average mean and std.
- For the images in `data_path`, computes the average stain matrix and the 99th percentile of the concentration matrix, using the macenko method.
- For the images in `data_path`, computes the average style. The default munit weights will be used if `weights` is not specified in the configuration. When using this model, the `a2b` key should be specified to indicate whether `data_path` contains images from domain A or B.

Computed average tensors will be saved in the _results/avtensors folder if no `results_path` is provided in the configuration. Each model will have its own sub-folder with computed tensors. The computed tensors can then be set as `target_tensor` when generating fakes.
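
To make the "average mean and std" idea concrete, here is a minimal NumPy sketch. It is not the repository's implementation: the function name `average_color_stats` is hypothetical, and it assumes that per-channel statistics are computed for each image and then averaged across images. The colour space the project actually uses is not stated here, so the sketch works on whatever channels it is given.

```python
import numpy as np


def average_color_stats(images):
    """Hypothetical helper: average per-channel mean and std over images.

    images: iterable of H x W x C arrays.
    Returns two arrays of shape (C,): the averaged means and averaged stds.
    """
    means, stds = [], []
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        pixels = arr.reshape(-1, arr.shape[-1])  # one row per pixel
        means.append(pixels.mean(axis=0))
        stds.append(pixels.std(axis=0))
    return np.mean(means, axis=0), np.mean(stds, axis=0)


rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(8, 8, 3)) for _ in range(4)]
mean, std = average_color_stats(fake_images)
print(mean.shape, std.shape)  # (3,) (3,)
```
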
Use `metric_calculation.py` to compute the metrics, e.g.:

```bash
python metric_calculation.py --real_source data/he/ --real_target data/masson/ --fakes results/masson_fake/ --device 0
```

Provide the following required arguments:
A csv with SSIM, FID and WD will be generated.
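
For intuition about one of the reported metrics, the standard SSIM formula can be sketched with NumPy. This is a simplified, single-window version computed over the whole image. Common SSIM implementations average the same expression over local windows, and the script in this repository may differ in its details. The function name `global_ssim` is hypothetical.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM computed once over the whole image (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilising constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(np.float64)
print(global_ssim(image, image))          # identical images give 1.0
print(global_ssim(image, 255.0 - image))  # an inverted image scores lower
```
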
Performance of different image-to-image translation methods on the validation dataset (for details, see the paper A comparative evaluation of image-to-image translation methods for stain transfer in histopathology).

Model | FID $\downarrow$ | WD $\times 10^4$ $\downarrow$ | SSIM $\uparrow$ |
---|---|---|---|
CycleGAN | 16.33 | 1.46 | 0.951 |
CUT | 17.10 | 1.60 | 0.914 |
MUNIT | 19.20 | 1.61 | 0.871 |
StainGAN | 19.59 | 3.27 | 0.952 |
UNIT | 20.23 | 2.54 | 0.940 |
UTOM | 20.64 | 2.32 | 0.952 |
DRIT | 22.83 | 2.06 | 0.915 |
Pix2Pix | 48.47 | 8.42 | 0.998 |
StainNet | 50.49 | 11.41 | 0.972 |
ColorStat | 62.13 | 9.60 | 0.974 |
Macenko | 70.39 | 12.90 | 0.926 |
Vahadane | 76.55 | 15.14 | 0.911 |
@inproceedings{zingman2024comparative,
title={A comparative evaluation of image-to-image translation methods for stain transfer in histopathology},
author={Zingman, Igor and Frayle, Sergio and Tankoyeu, Ivan and Sukhanov, Sergey and Heinemann, Fabian},
booktitle={Medical Imaging with Deep Learning},
pages={1509--1525},
year={2024},
organization={PMLR}
}