NorbertZheng / read-papers

My paper reading notes.

Sik-Ho Tsang | Review -- UNETR: Transformers for 3D Medical Image Segmentation. #78


NorbertZheng commented 1 year ago

Sik-Ho Tsang. Review — UNETR: Transformers for 3D Medical Image Segmentation.

NorbertZheng commented 1 year ago

Overview

UNETR: ViT as Encoder, CNN as Decoder

[Image: UNETR consists of a transformer encoder that directly utilizes 3D patches and is connected to a CNN-based decoder via skip connections.]

UNETR: Transformers for 3D Medical Image Segmentation, UNETR, by NVIDIA and Vanderbilt University, 2022 WACV, Over 340 Citations. Medical Imaging, Medical Image Analysis, Image Segmentation, U-Net, Transformer, Vision Transformer, ViT

Biomedical Image Segmentation 2015 … 2021 [Expanded U-Net] [3-D RU-Net] [nnU-Net] [TransUNet] [CoTr] [TransBTS] [Swin-Unet]

NorbertZheng commented 1 year ago

UNEt TRansformers (UNETR)

3D ViT as Backbone
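
A minimal sketch of how a 3D volume can be turned into a token sequence for a ViT-style encoder, assuming non-overlapping cubic patches of side P that are flattened and linearly projected. The function names, the toy 32×32×32 volume, and the embedding size K=8 are illustrative assumptions, not values from the paper; random arrays stand in for learned weights.

```python
import numpy as np

def volume_to_patch_tokens(volume, patch_size):
    """Split a (H, W, D, C) volume into flattened, non-overlapping 3D patches.

    Returns an array of shape (N, P**3 * C) with N = (H/P) * (W/P) * (D/P).
    """
    H, W, D, C = volume.shape
    P = patch_size
    if H % P or W % P or D % P:
        raise ValueError("each spatial dimension must be divisible by patch_size")
    h, w, d = H // P, W // P, D // P
    # Expose the patch grid: (h, P, w, P, d, P, C)
    x = volume.reshape(h, P, w, P, d, P, C)
    # Put the grid axes first, then the within-patch axes: (h, w, d, P, P, P, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(h * w * d, P * P * P * C)

def embed_patches(tokens, projection, pos_embedding):
    """Linear projection to K dimensions plus a positional embedding."""
    return tokens @ projection + pos_embedding

# Toy example (sizes are assumptions chosen to keep the arrays small).
rng = np.random.default_rng(0)
volume = rng.standard_normal((32, 32, 32, 1))
P, K = 16, 8
tokens = volume_to_patch_tokens(volume, P)                        # shape (8, 4096)
projection = rng.standard_normal((tokens.shape[1], K)) * 0.02    # stand-in for learned weights
pos_embedding = rng.standard_normal((tokens.shape[0], K)) * 0.02  # stand-in for learned positions
embedded = embed_patches(tokens, projection, pos_embedding)      # shape (8, 8)
```

Since N = (H/P) * (W/P) * (D/P), halving the patch side multiplies the sequence length by 8, which is one reason patch resolution affects both accuracy and cost.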

NorbertZheng commented 1 year ago

U-Net-Like Encoder Decoder Architecture
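
A rough sketch of the U-Net-like skip path, assuming the encoder emits a sequence of N tokens of width K that must be reshaped back into a 3D grid before a convolutional decoder can use it. The helper names are hypothetical, and nearest-neighbour upsampling is swapped in here as a stand-in for the learned upsampling layers of a real decoder.

```python
import numpy as np

def tokens_to_feature_map(tokens, grid_shape):
    """Reshape (N, K) tokens into a (h, w, d, K) feature map.

    Assumes tokens are ordered row-major over the (h, w, d) patch grid.
    """
    h, w, d = grid_shape
    n, k = tokens.shape
    if n != h * w * d:
        raise ValueError("token count does not match the patch grid")
    return tokens.reshape(h, w, d, k)

def upsample_nearest(feature_map, factor=2):
    """Nearest-neighbour upsampling of the three spatial axes.

    Stand-in for a learned upsampling layer (an assumption of this sketch).
    """
    for axis in range(3):
        feature_map = np.repeat(feature_map, factor, axis=axis)
    return feature_map

def merge_skip(decoder_features, skip_features):
    """U-Net-style skip connection: concatenate along the channel axis."""
    if decoder_features.shape[:3] != skip_features.shape[:3]:
        raise ValueError("spatial shapes must match before concatenation")
    return np.concatenate([decoder_features, skip_features], axis=-1)

# Toy example: 8 tokens of width 4 on a 2x2x2 patch grid.
toy_tokens = np.arange(8 * 4, dtype=float).reshape(8, 4)
fmap = tokens_to_feature_map(toy_tokens, (2, 2, 2))   # (2, 2, 2, 4)
upsampled = upsample_nearest(fmap)                    # (4, 4, 4, 4)
skip = np.zeros((4, 4, 4, 3))                         # hypothetical higher-resolution skip features
merged = merge_skip(upsampled, skip)                  # (4, 4, 4, 7)
```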


NorbertZheng commented 1 year ago

Loss Function

The loss function is a combination of soft Dice loss and cross-entropy loss.
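
A minimal NumPy sketch of a combined soft Dice and cross-entropy loss. The exact equation is not reproduced in these notes, so the form below is an assumption: a per-class soft Dice term with squared sums in the denominator, averaged over classes, plus a voxel-averaged cross-entropy term. The function name and the `eps` stabilizer are illustrative.

```python
import numpy as np

def soft_dice_ce_loss(probs, onehot, eps=1e-8):
    """Soft Dice loss plus cross-entropy loss.

    probs:  (I, J) predicted class probabilities for I voxels and J classes.
    onehot: (I, J) one-hot encoded ground truth.
    """
    num_voxels = probs.shape[0]
    # Per-class soft Dice score with squared terms in the denominator (assumed form).
    intersection = (probs * onehot).sum(axis=0)
    denominator = (probs ** 2).sum(axis=0) + (onehot ** 2).sum(axis=0)
    dice_per_class = 2.0 * intersection / (denominator + eps)
    dice_loss = 1.0 - dice_per_class.mean()
    # Cross-entropy averaged over voxels; eps guards against log(0).
    ce_loss = -(onehot * np.log(probs + eps)).sum() / num_voxels
    return dice_loss + ce_loss

# Toy example: 4 voxels, 2 classes.
target = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
perfect = target.copy()
uniform = np.full_like(target, 0.5)

loss_perfect = soft_dice_ce_loss(perfect, target)   # close to 0
loss_uniform = soft_dice_ce_loss(uniform, target)   # clearly larger
```

The Dice term responds to region overlap per class, while the cross-entropy term penalizes each voxel individually, so summing them gives a gradient signal at both levels.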

NorbertZheng commented 1 year ago

Results

BTCV

[Image: Quantitative comparisons of segmentation performance on the BTCV test set. The top and bottom sections show the benchmarks of the Standard and Free Competitions, respectively.]

[Image: Qualitative comparison of different baselines in BTCV cross-validation.]

NorbertZheng commented 1 year ago

MSD

[Image: Quantitative comparisons of segmentation performance on the brain tumor and spleen segmentation tasks of the MSD dataset.]


NorbertZheng commented 1 year ago

Ablation Studies

[Image: Effect of the decoder architecture on segmentation performance. NUP, PUP and MLA denote Naive UpSampling, Progressive UpSampling and Multi-scale Aggregation.]

[Image: Effect of patch resolution on segmentation performance.]

[Image: Comparison of the number of parameters, FLOPs and average inference time for various models in the BTCV experiments.]