This repository contains the code to replicate the results reported in the preprint and the publication. It also includes additional topologies reported in the Ph.D. thesis.
The trained models used to produce the reported results are provided here. They should be decompressed in the root of the repository.
The correction of attenuation effects in Positron Emission Tomography (PET) imaging is fundamental to obtaining a correct radiotracer distribution. However, direct measurement of this attenuation map is not error-free and normally results in an additional ionizing radiation dose to the patient.
This model obtains the whole-body attenuation map using a 3D U-Net generative adversarial network. The network is trained to learn the mapping from non-attenuation-corrected 18F-fluorodeoxyglucose PET images to a synthetic Computed Tomography (sCT) image and also to label the tissue of each input voxel. The sCT image is further refined using an adversarial training scheme to recover higher-frequency details and lost structures using context information.
This work is trained and tested on publicly available datasets containing PET images from different scanners, with different radiotracer administration protocols and reconstruction modalities.
The 3D U-Net and GAN topologies were tested on 133 samples from 8 distinct datasets. The network achieves a mean absolute error of 103 ± 18 HU and a peak signal-to-noise ratio of 18.6 ± 1.5 dB.
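For reference, the two reported metrics can be computed voxel-wise between the synthetic and the reference CT volumes. The following is a minimal sketch, not the repository's evaluation code; the function names and the default PSNR data range are assumptions:

```python
import numpy as np

def mae_hu(sct, ct):
    """Mean absolute error in Hounsfield units between the synthetic and reference CT volumes."""
    return np.mean(np.abs(sct.astype(np.float64) - ct.astype(np.float64)))

def psnr_db(sct, ct, data_range=None):
    """Peak signal-to-noise ratio in dB; the data range defaults to the span of the reference CT."""
    sct = sct.astype(np.float64)
    ct = ct.astype(np.float64)
    if data_range is None:
        data_range = ct.max() - ct.min()
    mse = np.mean((sct - ct) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```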
The image shows the central coronal and sagittal cuts of the AMC-009 sample of the NSCLC Radiogenomics test dataset. Each column represents, respectively, the Non-Attenuation-Corrected PET (NAC-PET) input, the sCT of the 3D U-Net, the sCT from the GAN-refined 3D U-Net, and the objective CT or ground truth. The bone tissue is presented using a 3D projection in the following image:
It can be seen that the proposed method recovers good-quality attenuation maps from unseen data coming from different scanners, patients, and lesions, showing that the technique can be used on multiple sources. The method also performs with accuracy comparable to other approaches, showing its suitability for PET image correction.
The network topology is composed of a 3D U-Net generator and a convolutional critic (or discriminator). An additional segmentation branch is used to regularize the training. Nevertheless, the adversarial loss gradient flow is limited to the last part of the network.
The top branch is used for label segmentation (auxiliary task) and the bottom branch for artificial CT generation. The bottom output is the synthetic CT refined by the GAN layers and built from the 3D U-Net outputs. All convolutional operations use a filter of size 3×3×3 except the output layer, which uses a 1×1×1 filter. The network possesses 5 resolution levels, each composed of two convolutional layers with a 3×3×3 filter shape and Rectified Linear Unit (ReLU) activation. Instead of convolutional resampling, the resolution changes are performed using trilinear up- or down-sampling. After each convolutional layer, voxel normalization along feature maps is applied. Also, at each convolutional layer, a scaling factor based on He's initialization is applied to the filter kernel.
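As a rough illustration of one resolution level, the sketch below shows the two 3×3×3 convolutions with ReLU, the runtime He-based kernel scaling, and the voxel (per-feature) normalization in TensorFlow/Keras. This is a minimal sketch, not the layer implementation used in the notebooks; the layer names, filter counts, and initializer details are hypothetical:

```python
import numpy as np
import tensorflow as tf

def pixel_norm(x, eps=1e-8):
    # Voxel-wise normalization along feature maps: divide each voxel's feature vector by its RMS.
    return x * tf.math.rsqrt(tf.reduce_mean(tf.square(x), axis=-1, keepdims=True) + eps)

class ScaledConv3D(tf.keras.layers.Layer):
    """3x3x3 convolution whose kernel is rescaled at runtime by a He-based factor
    (the 'equalized learning rate' trick used in progressive GANs)."""
    def __init__(self, filters, kernel_size=3, **kwargs):
        super().__init__(**kwargs)
        self.filters = filters
        self.kernel_size = kernel_size

    def build(self, input_shape):
        fan_in = (self.kernel_size ** 3) * int(input_shape[-1])
        self.scale = np.sqrt(2.0 / fan_in)  # He scaling factor applied to the kernel at runtime
        self.kernel = self.add_weight(
            name="kernel",
            shape=(self.kernel_size,) * 3 + (int(input_shape[-1]), self.filters),
            initializer=tf.random_normal_initializer(0.0, 1.0))
        self.bias = self.add_weight(name="bias", shape=(self.filters,), initializer="zeros")

    def call(self, x):
        x = tf.nn.conv3d(x, self.kernel * self.scale, strides=[1, 1, 1, 1, 1], padding="SAME")
        return tf.nn.bias_add(x, self.bias)

def unet_level(x, filters):
    # One resolution level: two scaled 3x3x3 convolutions, each followed by ReLU and voxel normalization.
    for _ in range(2):
        x = ScaledConv3D(filters)(x)
        x = tf.nn.relu(x)
        x = pixel_norm(x)
    return x
```

Resolution changes between such levels would then be performed with trilinear up- or down-sampling rather than strided convolutions, as described above.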
The discriminator or critic is a 3D convolutional network:
The critic or discriminator network is a fully convolutional network with ReLU activation in all layers except the last one, which has no activation. The input of this network is a two-channel volume composed of the NAC-PET volume and the real CT or sCT image. The output of the network is a value proportional to the quality of the generated image. The network is composed of 4 resolution levels with two convolutional layers per level. Each convolution has a 3×3×3 filter size and ReLU activation. No batch or pixel normalization is applied. The last two layers of the critic are a flatten operation followed by a single dense layer with linear output.
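A minimal Keras sketch of such a critic is shown below; the input shape, filter counts, and the use of average pooling for downsampling are assumptions, not the exact values used in the repository:

```python
import tensorflow as tf

def build_critic(input_shape=(64, 64, 128, 2), base_filters=32):
    """Sketch of the critic: 4 resolution levels, two 3x3x3 ReLU convolutions per level,
    no normalization, then a flatten and a single linear dense output."""
    inp = tf.keras.Input(shape=input_shape)  # channel 0: NAC-PET, channel 1: real CT or sCT
    x = inp
    filters = base_filters
    for level in range(4):
        for _ in range(2):
            x = tf.keras.layers.Conv3D(filters, 3, padding="same", activation="relu")(x)
        if level < 3:
            x = tf.keras.layers.AveragePooling3D(pool_size=2)(x)  # move to the next (coarser) level
            filters *= 2
    x = tf.keras.layers.Flatten()(x)
    out = tf.keras.layers.Dense(1)(x)  # linear output: a scalar quality score
    return tf.keras.Model(inp, out)
```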
The progressive growing GAN (ProGAN) is based on a 3D U-Net. Its topology is similar to that of the GAN; the main difference is the training procedure. The ProGAN is trained starting from low resolutions and is incrementally expanded until reaching the desired resolution. The generator topology can be described as:
and the discriminator or critic:
Each coloured section is a block that grows over the previous structure. The neural network (NN) starts with the structure shown in red; once this resolution level is trained, this connection is eliminated and the next resolution level is connected (the intermediate-resolution structure is shown in green). This process continues until the final resolution structure is added, which yields the desired resolution. The blocks marked with Aux. are only used during the training of their specific resolution and are then discarded.
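The growth step in progressive GANs is usually implemented with a fade-in: while a new resolution block is trained, its output is blended with the upsampled output of the previous block. The sketch below illustrates that blending; the exact schedule and upsampling used in this repository may differ:

```python
import tensorflow as tf

def fade_in(prev_out, new_out, alpha):
    """Blend the upsampled output of the previously trained (coarser) block with the output
    of the newly attached block. `alpha` ramps from 0 to 1 while the new resolution level is
    being trained; once it reaches 1 the old branch can be dropped."""
    up = tf.keras.layers.UpSampling3D(size=2)(prev_out)  # nearest-neighbour upsampling as a stand-in
    return alpha * new_out + (1.0 - alpha) * up
```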
The training of the model requires:
The training times will largely depend on the GPU and CPU speed. A solid-state storage unit is advised but not required. A GPU is not required to run the test notebooks, but it will speed up the process. Testing requires at least 8 GB of CPU RAM. A Docker image is provided to run the code, here; after obtaining it, start the Jupyter notebook server with:
docker run -u $UID:$UID --gpus all -v /PATH/TO/REPOSITORY:/tf/iAR-PET -p 5000:8888 -it ar-pet jupyter notebook --ip=0.0.0.0 --allow-root
and then opening in the browser: http://localhost:5000/tree/
Alternatively, the code can be run in an environment with Python 3.6.8, Jupyter notebook support, and the following libraries:
The work is based on public datasets from The Cancer Imaging Archive (TCIA). Within ./datasets/DICOM/ you will find a ".csv" file and a ".tcia" file for each data source:
The data can be downloaded by opening the ".tcia" file with the NBIADataRetriever. The data must be downloaded into the ./datasets/DICOM/ folder with the "Descriptive Directory Name" option. The ".tcia" file was tested using NBIADataRetriever v3.6, build "2020-04-03_15-44". The ".csv" file contains the same information as the ".tcia" file.
After a successful download the directory should look like this:
The Generate_train_and_validation_dataset notebook is found in the ./datasets/DICOM/ folder. It must be executed to generate the training files. This process is very long: it takes approximately 5 hours to run on a 4x3.6 GHz CPU. It will produce a large output, approximately 120 GB when using the largest resolution of 256x256x512 voxels. The process supports other datasets or larger selections of the HNSCC dataset (the dataset is updated from time to time). If using samples outside the provided list, it is advised to inspect the output for defective samples. The output is a series of ".tfrecord" files at different data resolutions:
The training of the different models is done using three specific notebooks:
The notebooks are prepared to use multiple GPUs, as the memory requirements of the model are large. However, they should also work on single-GPU set-ups (if the GPU memory is larger than 6 GB).
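For reference, multi-GPU training in TensorFlow is typically set up with a distribution strategy, as in the hedged sketch below; the toy model here is only a stand-in for the generator built inside the training notebooks:

```python
import tensorflow as tf

# All visible GPUs are mirrored; on a single-GPU machine this reduces to one replica.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Toy stand-in for the actual generator, which is constructed in the training notebooks.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv3D(8, 3, padding="same", activation="relu",
                               input_shape=(32, 32, 64, 1)),
        tf.keras.layers.Conv3D(1, 1, padding="same"),
    ])
    model.compile(optimizer="adam", loss="mae")  # illustrative loss; the real training uses more terms
```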
To compute the testing metrics, the ./Test_Dataset.ipynb notebook should be executed. The notebook tests a selected model using a selected test dataset; the model and dataset selection is done inside the notebook. After execution it will produce the following outputs under ./test_outputs/SELECTED_MODEL/SELECTED_DATASET/: