Open Kinpzz opened 7 years ago
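This note covers a paper on semantic segmentation of the lungs and heart. The authors argue that organ segmentation is an important step toward computer-aided diagnosis systems for chest X-rays (CXR): organ regions carry rich structural information that can be used to diagnose many conditions. CXR is very common because of its low radiation dose and low cost, which creates a heavy workload for radiologists, so the work has practical value. It is also challenging: CXRs are 2D grayscale images, and the public datasets are small (mostly a few hundred images), so network models trained on large-scale datasets cannot be applied directly. The authors therefore propose the SCAN framework, which borrows the idea of GANs (generative adversarial networks). It consists of a segmentation network and a critic network that play a zero-sum game and are trained one at a time, in alternation, on the public datasets JSRT and Montgomery. Both are deep networks built from FCN, VGG-based (modified VGG) and residual-block components. The model has low data dependence (no large-scale dataset needed) and few parameters, and it achieves high accuracy (human-expert level), high efficiency (<1 s) and strong transferability (good generalization), surpassing the field's state-of-the-art registration-based approach.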
Registration-based approach: to build a lung model for a test patient, it finds patients in an existing database that are most similar to the test patient and performs linear deformation of their lung profiles based on key point matching. (Comparison-based method; key point matching.)
Semantic segmentation aims to assign a pre-defined class to each pixel.
We note that there is a growing body of recent works that apply neural networks end-to-end on CXR images [25, 34]. These models directly output clinical targets such as disease labels without well-defined intermediate outputs to aid interpretability. Furthermore, they generally require a large number of CXR images for training, which is not readily available for many clinical tasks involving CXR images. (Limitations of existing work: labels are output directly with no intermediate outputs to aid interpretation, and large amounts of training data are required.)
The authors adapt FCNs to gray-scale CXR images under the stringent constraint of a very limited training dataset of 247 images. The design departs from the usual VGG architecture and can be trained without transfer learning from existing models or datasets. (Method: FCN plus adversarial training; it needs only a small amount of training data and does not depend on existing models or databases.)
Adversarial training was first proposed in the Generative Adversarial Network (GAN).
Use the critic to learn these higher order structures and guide the segmentation network to generate masks more consistent with the learned global structures.
![figure3]()
$$ \min_S \max_D \lbrace J(S,D):=\sum_{i=1}^N J_s(S(x_i), y_i) - \lambda [J_d(D(x_i, y_i), 1) + J_d(D(x_i, S(x_i)),0)] \rbrace$$
The formula above can be split into the following two stages:
Train the critic network by minimizing the following objective with respect to $D$ for a fixed $S$: $$ \sum_{i=1}^N [J_d(D(x_i, y_i), 1) + J_d(D(x_i, S(x_i)),0)] $$ Compared with Eq. (1), the minus sign is dropped, so the maximization over $D$ becomes a minimization.
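Given a fixed $D$, we train the segmentation network by minimizing the following objective with respect to $S$: $$ \sum_{i=1}^N [J_s(S(x_i),y_i) + \lambda J_d(D(x_i,S(x_i)),1)] $$ The target here is 1 rather than 0: minimizing Eq. (1) over $S$ gives the term $-\lambda J_d(D(x_i,S(x_i)),0)$, and $J_d(D(x_i,S(x_i)),1)$ is used in its place.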
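The relation between Eq. (1) and the critic stage can be checked numerically. The sketch below is only an illustration, not the authors' code: it assumes $J_d$ is binary cross-entropy on a critic score in (0, 1), treats the per-sample segmentation losses $J_s$ as given numbers, and uses hypothetical function names.

```python
import math

def bce(p, target, eps=1e-7):
    """J_d: binary cross-entropy between a critic score p and a 0/1 target (assumed form)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def critic_objective(d_real, d_fake):
    """Sum of J_d(D(x_i, y_i), 1) + J_d(D(x_i, S(x_i)), 0), minimized over D."""
    return sum(bce(r, 1) + bce(f, 0) for r, f in zip(d_real, d_fake))

def minimax_value(seg_losses, d_real, d_fake, lam):
    """J(S, D) from Eq. (1): segmentation losses minus lambda times the critic term."""
    return sum(seg_losses) - lam * critic_objective(d_real, d_fake)

# Hypothetical critic scores for two images: D(x, y) on real masks, D(x, S(x)) on predictions.
seg_losses = [0.30, 0.25]
weak_critic = critic_objective([0.6, 0.6], [0.4, 0.4])
sharp_critic = critic_objective([0.99, 0.99], [0.01, 0.01])
```

With $S$ fixed, the sharper critic has the lower critic objective and therefore the higher $J(S, D)$, which is why maximizing Eq. (1) over $D$ and minimizing the critic objective are the same step.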
References
Uses two publicly available datasets with at least lung field annotations.
ChestX-ray8
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
Keyword
Weak supervision: the training images carry no pixel-level annotations, only image-level class labels.
Background
Modern hospitals' PACS (Picture Archiving and Communication Systems) hold a tremendous number of X-ray imaging studies accompanied by radiological reports (i.e., loosely labeled). Open question: how can this type of hospital-size knowledge database be used to build large-scale, high-precision computer-aided diagnosis (CAD) systems?
State-of-the-art object detection and segmentation
Dataset
Object Detection
Main limitation of recent notable work
All proposed methods are evaluated on small-to-middle scale problems of (at most) several hundred patients. The performance of deep learning techniques remains unclear when scaled up to tens of thousands of patient studies. Shortcoming of current work: small sample sizes and scarce data.
Related Work
There have been recent efforts on creating openly available annotated medical image databases.
Motivation
Main Work
Constructing Database
ChestX-ray8
Labeling Disease Names by Text Mining (label extraction)
Tools
Noise (the tools above produce noisy labels)
Eliminate noisy labels by ruling out negated pathological statements and uncertain mentions of findings and diseases, e.g., "suggesting obstructive lung disease". Regular expressions cannot capture the various syntactic constructions involving multiple subjects. For example, in "clear of A and B" a regular expression marks A as negated but not B.
Improvement over the previous tools: work at the syntactic level and use syntactic dependency information. Rules are defined on the dependency graph, using the dependency labels and the direction of the edges between words.
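The "clear of A and B" case can be made concrete with a toy example. This is a rough sketch, not the paper's rule set: the sentence, the hand-written dependency edges and the edge labels (`prep_of`, `conj_and`) are assumptions for illustration, and a real pipeline would take the edges from a dependency parser.

```python
import re

# Hypothetical report sentence in which both findings are negated.
sentence = "lungs are clear of effusion and pneumothorax"

# Naive regular expression: only the first finding after the trigger is caught.
naive = re.findall(r"clear of (\w+)", sentence)

# Hand-written dependency edges (head, label, dependent) for the same sentence.
edges = [
    ("clear", "nsubj", "lungs"),
    ("clear", "cop", "are"),
    ("clear", "prep_of", "effusion"),
    ("effusion", "conj_and", "pneumothorax"),
]

def negated_findings(edges, trigger="clear"):
    """Follow trigger -prep_of-> X, then spread the negation along conj_* edges."""
    negated = [dep for head, label, dep in edges
               if head == trigger and label == "prep_of"]
    frontier = list(negated)
    while frontier:
        node = frontier.pop()
        for head, label, dep in edges:
            if head == node and label.startswith("conj") and dep not in negated:
                negated.append(dep)
                frontier.append(dep)
    return negated

by_graph = negated_findings(edges)
```

Here `naive` contains only the first finding, while the graph rule uses the edge label and direction to reach the conjoined one as well.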
Steps
Quality Control
Using the OpenI API, retrieve a total of 3851 unique radiology reports for validation. Performance improves considerably over MetaMap.
Processing Chest X-ray Images
Bounding Box for Pathologies
Unified DCNN Framework
Multi-label Setup
8-dimensional label vector $$ y = [y_1, ..., y_c, ..., y_C],\ y_c \in \lbrace 0, 1 \rbrace,\ C = 8 $$ for each image. $y_c$ indicates the presence of the corresponding pathology. Normal: [0, 0, 0, 0, 0, 0, 0, 0]
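A minimal sketch of this encoding follows. The notes do not list the eight classes, so the names and their order below are an assumption taken from the ChestX-ray8 paper, and `encode_labels` is a hypothetical helper.

```python
# Assumed class list and order (from the ChestX-ray8 paper, not from these notes).
PATHOLOGIES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
    "Mass", "Nodule", "Pneumonia", "Pneumothorax",
]

def encode_labels(findings):
    """Map a collection of mined disease names to the C = 8 binary label vector y."""
    unknown = set(findings) - set(PATHOLOGIES)
    if unknown:
        raise ValueError(f"unknown pathology names: {sorted(unknown)}")
    return [1 if name in findings else 0 for name in PATHOLOGIES]

normal = encode_labels([])                        # no finding: all zeros
multi = encode_labels(["Effusion", "Mass"])       # two pathologies in one image
```

An image can carry several ones at once, which is what makes this a multi-label rather than a multi-class setup.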
Transition Layer
Transforms the activations from previous layers into an output of uniform dimension, $S \times S \times D$, $S \in \lbrace 8, 16, 32 \rbrace$. $D$ is the dimension of the features at spatial location $(i, j)$, $i,j \in \lbrace 1, ..., S \rbrace$, and can vary across model settings, e.g., D=1024 for GoogLeNet and D=2048 for ResNet.
Convolutional layers convert the outputs of the different pre-trained models to an $S \times S \times D$ output.
Multi-label Classification Loss Layer
The loss function takes the balance between positive and negative samples into account.
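One way to realize this balancing is the weighted cross-entropy loss (W-CEL). The sketch below follows my reading of the ChestX-ray8 paper, which these notes do not spell out: positive terms are weighted by $\beta_P = (|P|+|N|)/|P|$ and negative terms by $\beta_N = (|P|+|N|)/|N|$, where $|P|$ and $|N|$ count the ones and zeros among the labels of a batch. The function name and the pure-Python form are illustrative.

```python
import math

def weighted_cross_entropy(probs, labels, eps=1e-7):
    """W-CEL over a batch; probs and labels are lists of per-image lists of equal shape."""
    n_pos = sum(1 for row in labels for y in row if y == 1)
    n_neg = sum(1 for row in labels for y in row if y == 0)
    total = n_pos + n_neg
    # Rare classes get the larger weight; guard against an empty class.
    beta_p = total / n_pos if n_pos else 0.0
    beta_n = total / n_neg if n_neg else 0.0
    loss = 0.0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            if y == 1:
                loss += -beta_p * math.log(p)
            else:
                loss += -beta_n * math.log(1.0 - p)
    return loss

balanced = weighted_cross_entropy([[0.5, 0.5]], [[1, 0]])
skewed = weighted_cross_entropy([[0.5, 0.5, 0.5, 0.5]], [[1, 0, 0, 0]])
```

In the skewed batch the single positive label receives weight 4 and each negative 4/3, so the positives and negatives contribute equally to the loss despite the 1:3 imbalance.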
Global Pooling Layer
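Global pooling layers replace the fully connected and softmax layers, which reduces the parameter count and guards against overfitting. The design combines global max pooling with global LSE pooling, trading off between max and average pooling.
Prediction Layer
The prediction layer converts the output of the global pooling layer to dimension $1 \times C$, and ROC curves are used to compare the effect of different thresholds. ROC references: http://blog.csdn.net/pipisorry/article/details/51788927?locationNum=1&fps=1 http://blog.csdn.net/taoyanqi8932/article/details/54409314?locationNum=5&fps=1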
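The max/average trade-off can be sketched with Log-Sum-Exp (LSE) pooling. The form used below, $x_p = x^* + \frac{1}{r}\log\left[\frac{1}{N}\sum_{i,j}\exp(r(x_{ij} - x^*))\right]$ with $x^* = \max_{i,j} x_{ij}$ and $N$ the number of spatial locations, is my recollection of the ChestX-ray8 paper; the hyperparameter $r$ and the function name are not given in these notes.

```python
import math

def lse_pool(activation_map, r):
    """Global LSE pooling of one 2D activation map (list of rows); r must be positive."""
    values = [v for row in activation_map for v in row]
    x_max = max(values)
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    mean_exp = sum(math.exp(r * (v - x_max)) for v in values) / len(values)
    return x_max + math.log(mean_exp) / r

toy_map = [[0.0, 1.0], [2.0, 3.0]]    # mean 1.5, max 3.0
near_average = lse_pool(toy_map, 0.001)
near_max = lse_pool(toy_map, 100.0)
```

A small `r` pushes the result toward average pooling and a large `r` toward max pooling, so a single parameter moves between the two behaviors.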
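As a small illustration of threshold selection with ROC, the sketch below sweeps a threshold over the predicted scores of one pathology and integrates the curve with the trapezoid rule. The scores and labels are made up, the function names are hypothetical, and the code assumes both classes are present.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs from sweeping the decision threshold over every distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up scores for four images of one class (1 = pathology present).
scores = [0.9, 0.8, 0.3, 0.1]
perfect = auc(roc_points(scores, [1, 1, 0, 0]))
mixed = auc(roc_points(scores, [1, 0, 1, 0]))
```

Each point on the curve corresponds to one threshold, so choosing a threshold amounts to choosing an operating point with an acceptable TPR/FPR balance.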
Heat map
Bounding Box Generation:
![Figure 4]()
Experiments
CNN
Performance
ROC
AUC
Different pooling strategies
W-CEL
Disease Localization