
AdversarialPoseNet-2DMedical

Yu Chen's Adversarial-PoseNet for landmark localization in 2D medical images (lower extremities)

PyTorch implementation of Chen et al.'s "Adversarial PoseNet" for landmark localization on medical data. The method was proposed by Yu Chen, Chunhua Shen, Xiu-Shen Wei, Lingqiao Liu, and Jian Yang in *Adversarial PoseNet: A Structure-aware Convolutional Network for Human Pose Estimation*.

Goal

The goal of this work is to investigate the role of adversarial learning for localizing 6 landmarks in the lower extremities, using a dataset of 660 X-ray images, by implicitly incorporating priors about the structure of the lower-extremity pose during network training. For this analysis, an established generative adversarial network architecture that predicts heatmaps is used. The network is trained and tested on X-ray images of the lower extremities and evaluated in terms of localization accuracy (within a 10 mm tolerance). Occlusions in medical images arise from causes such as prosthetics on the bone joints or a restricted field of view, making landmark localization harder. Under these conditions, however, human vision can still predict near-accurate poses by exploiting the geometric orientation and inter-connectivity of the bone joints in the image.
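Since the network regresses one heatmap per landmark rather than raw coordinates, each ground-truth landmark is typically rendered as a 2D Gaussian. A minimal sketch of this step, assuming an illustrative image size, sigma, and function name (not the repository's actual values):

```python
import numpy as np

def make_heatmap(center_xy, shape=(256, 256), sigma=2.0):
    """Render a single landmark as a 2D Gaussian heatmap."""
    h, w = shape
    xs = np.arange(w)[None, :]
    ys = np.arange(h)[:, None]
    cx, cy = center_xy
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Stack one heatmap per landmark -> a (6, H, W) training target
landmarks = [(40, 60), (80, 120), (130, 30), (60, 200), (200, 180), (100, 100)]
target = np.stack([make_heatmap(p) for p in landmarks])
```

Each channel peaks at 1.0 exactly at its landmark's pixel, which is what the generator is trained to reproduce.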

Model Architecture:

1. Generator architecture:

Network training:

Adversarial training:

Pose discriminator training:

where p_real is the ground-truth label for the real heatmaps; all of its entries are set to 1. p_fake is the label for the generated (fake) heatmaps and has size [1 x 6]; each entry of p_fake is either 0 or 1: 0 if the predicted keypoint is incorrectly localized, 1 if it is accurately localized.

Confidence discriminator training:

where c_fake is the ground-truth confidence label for the fake heatmaps. While training the confidence network, the real heatmaps are labelled with a 1 x 6 unit vector c_real (6 is the number of body parts). The confidence of a fake (predicted) heatmap should be high when it is close to the ground truth and low otherwise, so each entry of c_fake is either 0 or 1: 0 if the predicted keypoint is incorrectly localized, 1 if the generator localized it accurately.
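Both discriminator targets are 1 x 6 binary vectors built the same way: an entry is 1 when the generator's predicted heatmap peak falls within some tolerance of the ground-truth landmark. A hedged sketch of how p_fake / c_fake could be constructed (the function names and the 10-pixel tolerance are illustrative assumptions):

```python
import numpy as np

def argmax_2d(heatmap):
    """(x, y) coordinates of the peak of a single heatmap."""
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array([x, y], dtype=float)

def fake_labels(pred_heatmaps, gt_coords, tol_px=10.0):
    """1x6 binary vector: 1 where the predicted peak lies within
    tol_px of the ground-truth landmark, else 0."""
    labels = []
    for hm, gt in zip(pred_heatmaps, gt_coords):
        dist = np.linalg.norm(argmax_2d(hm) - np.asarray(gt, dtype=float))
        labels.append(1.0 if dist < tol_px else 0.0)
    return np.array(labels)
```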

Generator training (multi-tasking):

Task 1:

Task 2:

Sample input images (left) and their corresponding ground-truth heatmaps (right):

Results Visualization

The results of this implementation:

Adversarial PoseNet:

Stacked hourglass network (supervised setup):

Localization rate (percentage of correct keypoints) within 10 mm on the test set:

Please note: the localization rate (percentage of correct keypoints) within 20 mm was on average 98% across all six landmarks.

Metric used:

- Euclidean distance(predicted coordinates, ground-truth coordinates) < 10 mm
- Euclidean distance(predicted coordinates, ground-truth coordinates) < 20 mm
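This metric can be sketched as follows; the function name and the pixel-to-millimetre spacing `px_per_mm` are assumptions for illustration, since the dataset's actual pixel spacing is not stated here:

```python
import numpy as np

def localization_rate(pred, gt, px_per_mm=1.0, tol_mm=10.0):
    """Percentage of keypoints whose Euclidean error is below tol_mm.
    pred, gt: (N, 6, 2) arrays of (x, y) coordinates in pixels."""
    dists_mm = np.linalg.norm(pred - gt, axis=-1) / px_per_mm
    return 100.0 * (dists_mm < tol_mm).mean()
```

Passing `tol_mm=20.0` instead yields the 20 mm rate quoted above.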

For more information, refer to:

Main Prerequisites

Getting Started

Installation

Training and Test Details

To train a model, run any of the .sh files whose name starts with "train", for example:

trainmodelmedical-exp-22.sh 
During training, one can see how the network is learning on batches of input samples by looking inside the trainingImages/ folder, and can also visualize the network's area of interest while it localizes keypoints, using the localization maps of the last convolutional layers as shown below:

 <img src="https://github.com/abhishekdiphu/Automatic-keypoint-localization-in-2dmedical-images/raw/main/readmeimages/superim.png" width="300px"/>
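One common way to obtain such localization maps is a PyTorch forward hook on the last convolutional layer. The sketch below uses a toy stand-in model, not the repository's actual generator, to show the mechanism:

```python
import torch
import torch.nn as nn

# Toy stand-in for the generator's trunk; the real network differs.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 6, 3, padding=1),  # final conv: one map per landmark
)

activations = {}
def save_activation(module, inputs, output):
    activations["last_conv"] = output.detach()

# Capture the final convolution's output on every forward pass
model[-1].register_forward_hook(save_activation)

x = torch.randn(1, 1, 64, 64)
_ = model(x)
fmap = activations["last_conv"][0]  # (6, 64, 64)
# Normalize each map to [0, 1] before overlaying it on the input image
lo = fmap.amin(dim=(1, 2), keepdim=True)
hi = fmap.amax(dim=(1, 2), keepdim=True)
fmap = (fmap - lo) / (hi - lo + 1e-8)
```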

Models are saved to `./trainmodel/` (the directory can be changed via `--modelName`).

To test the model, run:
```bash
test.sh
```

Datasets

The dataset is not publicly available; it was obtained from the authors of the paper "Detection and Localization of Landmarks in the Lower Extremities Using an Automatically Learned Conditional Random Field".

Reference