spreka / biomagdsb


Intro

This repository contains the codes to run the nuclei segmentation pipeline of the BIOMAG group inspired by Kaggle's Data Science Bowl 2018 competition

Some resulting masks obtained by our method:

[figure: example segmentation masks produced by the pipeline]

Prerequisites

Please see requirements.txt. On Linux it can be run directly as a bash script; alternatively, you can copy the install commands into the console corresponding to your system (command prompt on Windows, terminal on Linux) and execute them.

    git clone https://github.com/matterport/Mask_RCNN.git
    cd Mask_RCNN
    git checkout 53afbae5c5159b5a10ecd024a72b883a2b058314

Data

Our method expects images to be 8-bit, 3-channel RGB images in .png format. See our script to convert your images.
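
If you want to check your images before running the pipeline, a minimal sketch along the following lines (using Pillow and NumPy; this is not the repository's conversion script, and the folder name is a placeholder) reports files that are not 8-bit, 3-channel RGB .png:

    # Minimal sketch (not the repository's own script): report images in a folder
    # that are not 8-bit, 3-channel RGB .png files. "input_images" is a placeholder
    # folder name and is assumed to contain only image files.
    import os

    import numpy as np
    from PIL import Image

    src_dir = "input_images"

    for name in sorted(os.listdir(src_dir)):
        arr = np.asarray(Image.open(os.path.join(src_dir, name)))
        ok = (
            name.lower().endswith(".png")
            and arr.dtype == np.uint8
            and arr.ndim == 3
            and arr.shape[2] == 3
        )
        if not ok:
            print(f"needs conversion: {name} (dtype={arr.dtype}, shape={arr.shape})")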

Prediction

Download our pre-trained models from our Google Drive.

You can choose either full prediction with post-processing or fast prediction; the former takes longer to complete and requires more VRAM.

Full prediction pipeline with post-processing

Predicts nuclei first with a presegmenter Mask R-CNN model, estimates cell sizes, predicts with multiple U-Net models and ensembles the results, then uses all of the above in a final post-processing step to refine the contours. To predict nuclei on your images, please edit either

and specify the following three directories with their full paths on your system:

Note: pre-processing scripts are provided to convert your test images. See further details in the documentation.

Fast prediction

Predicts nuclei with a presegmenter Mask R-CNN model that generalizes and performs well across varying image types. It produces results quickly, and these can be further improved with the post-processing option above. To predict fast, please follow the steps of the "Full prediction pipeline with post-processing" section for either of the files:

See further details in the documentation.

Custom validation

To use your own folder of images as validation, please run the following script according to your operating system:

See further details in the documentation.

Training

Obtain our pre-trained classifier pretrainedDistanceLearner.mat for training by either:

WARNING: it is possible to overwrite our provided trained models in this step. See documentation for details.

We include a .mat file with the validation image names we used for the Kaggle DSB2018 competition. If you would like to use your own images for this purpose, see Custom validation above.
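
If you want to inspect which image names such a .mat file contains, a minimal sketch with SciPy is shown below; the file name is a placeholder, the actual file is the one shipped in the repository:

    # Minimal sketch: inspect a validation .mat file with SciPy.
    from scipy.io import loadmat

    # Placeholder file name -- use the validation .mat file from the repository.
    mat = loadmat("validation_names.mat")
    for key, value in mat.items():
        if not key.startswith("__"):   # skip MATLAB header entries
            print(key, value)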

WARNING: training will overwrite the U-Net models we provide; we advise making a copy of them first. They are located at the following relative path: \kaggle_workflow\unet\
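
As a convenience, a minimal sketch for making that backup copy could look like this (the backup folder name is our own placeholder, not part of the pipeline):

    # Minimal sketch: back up the provided U-Net models before training
    # overwrites them. The backup folder name is a placeholder.
    import os
    import shutil

    unet_dir = os.path.join("kaggle_workflow", "unet")
    backup_dir = unet_dir + "_backup"  # placeholder backup location

    if not os.path.exists(backup_dir):
        shutil.copytree(unet_dir, backup_dir)
        print(f"Copied {unet_dir} to {backup_dir}")
    else:
        print(f"Backup already exists at {backup_dir}")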

To train on your own images, please run the following script according to your operating system:

NOTE: on Windows you need to edit start_training.bat and set your Python virtual environment path as indicated before running the script. The script opens a second command prompt that runs the required pix2pix server; it must remain open until all pix2pix execution has finished, which is indicated by the message "STYLE TRANSFER DONE:" in the command prompt.

See further details in the documentation.

Parameter search for post-processing

A generally optimal set of parameters is provided in the scripts as defaults. However, you can run our parameter optimizer to find the best fit for your image set.

To find the best parameters for your images, please run the following script according to your operating system:

and see the resulting parameters in the text file \kaggle_workflow\outputsValidation\paramsearch\paramsearchresult.txt
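
To view the result without opening the file manually, a minimal sketch like the following prints its contents; the file's internal format is whatever the optimizer writes:

    # Minimal sketch: print the parameters found by the parameter search.
    from pathlib import Path

    result_file = Path("kaggle_workflow", "outputsValidation", "paramsearch", "paramsearchresult.txt")
    print(result_file.read_text())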

See further details in the documentation.

Prepare style transfer input for single experiment

To prepare style transfer on your own images coming from the same experiment, please run the following script according to your operating system:

After this, you should run these training scripts instead of the ones above:

as these scripts use the single-experiment data for style transfer learning.

WARNING: if you do not provide your own mask folder for this step, the default is \kaggle_workflow\outputs\presegment, which is created by the fast segmentation step of our pipeline. Please run that step first to avoid 'file not found' errors.

NOTE: This option should only be used if all your images come from the same experiment. If you provide mixed data, subsequent style transfer learning will result in flawed models and failed synthetic images.
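
Before running the preparation step with the default mask folder, a minimal sketch like the one below can verify that the presegment output exists; the check itself is our own addition, not part of the pipeline:

    # Minimal sketch: check that the default mask folder produced by the fast
    # prediction step exists and is not empty before preparing style transfer.
    import os

    preseg_dir = os.path.join("kaggle_workflow", "outputs", "presegment")
    masks = os.listdir(preseg_dir) if os.path.isdir(preseg_dir) else []

    if not masks:
        raise SystemExit(
            f"'{preseg_dir}' is missing or empty -- run the fast prediction step "
            "first, or provide your own mask folder."
        )
    print(f"Found {len(masks)} files in {preseg_dir}.")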

Preprocess test images

If your test images are 16-bit, you may want to convert them to 8-bit, 3-channel images with either
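
For reference, a minimal conversion sketch using Pillow and NumPy (not the repository's own script; folder names are placeholders) might look like this:

    # Minimal sketch (not the repository's conversion script): convert 16-bit test
    # images to 8-bit, 3-channel RGB .png files. Folder names are placeholders.
    import os

    import numpy as np
    from PIL import Image

    src_dir = "test_images_16bit"   # placeholder input folder
    dst_dir = "test_images_8bit"    # placeholder output folder
    os.makedirs(dst_dir, exist_ok=True)

    for name in os.listdir(src_dir):
        arr = np.asarray(Image.open(os.path.join(src_dir, name)), dtype=np.float32)
        # Rescale intensities to the 0-255 range and force three channels.
        arr = arr - arr.min()
        arr = (arr / max(arr.max(), 1.0) * 255.0).astype(np.uint8)
        Image.fromarray(arr).convert("RGB").save(
            os.path.join(dst_dir, os.path.splitext(name)[0] + ".png")
        )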

Citation

Please cite our paper if you use our method:

Reka Hollandi, Abel Szkalisity, Timea Toth, Ervin Tasnadi, Csaba Molnar, Botond Mathe, Istvan Grexa, Jozsef Molnar, Arpad Balind, Mate Gorbe, Maria Kovacs, Ede Migh, Allen Goodman, Tamas Balassa, Krisztian Koos, Wenyu Wang, Juan Carlos Caicedo, Norbert Bara, Ferenc Kovacs, Lassi Paavolainen, Tivadar Danka, Andras Kriston, Anne Elizabeth Carpenter, Kevin Smith, Peter Horvath (2020): “nucleAIzer: a parameter-free deep learning framework for nucleus segmentation using image style transfer”, Cell Systems, Volume 10, Issue 5, 20 May 2020, Pages 453-458.e6