DebeshJha / PVTFormer

Liver segmentation using Deep Learning on LiTS 2017 Dataset
https://arxiv.org/pdf/2401.09630.pdf
15 stars 3 forks source link
colonoscopy encoder-decoder medical-image-segmentation polyp-segmentation pvtv2 transformer unet-image-segmentation

CT Liver Segmentation Via PVT-based Encoding and Refined Decoding

Overview

PVTFormer is a novel encoder-decoder framework designed for precise liver segmentation from CT scans. At its core, it utilizes the Pyramid Vision Transformer (PVT v2) as a pretrained encoder, enhancing the segmentation process with its unique ability to handle variable-resolution input images and produce multi-scale representations. Our approach includes a novel hierarchical decoding strategy that incorporates specialized upscaling in the Up block with effective multi-scale feature fusion in the Decoder. This approach significantly enhances the network's ability to delineate detailed semantic features, which is vital for precise liver segmentation.

Key Features

-Encoder-Decoder Framework: Incorporates PVT v2 for efficient and rich feature extraction.

-Hierarchical Decoding Strategy: Enhances semantic features for high-quality segmentation masks.

-Efficient Feature Processing: Combines residual learning with Transformer mechanisms for optimal feature representation.

-High Performance Metrics: Achieves impressive dice coefficients and mean IoUs, outperforming state-of-the-art methods.

PVTFormer Architecture

Architecture Advantages:

Applications

PVTFormer is highly effective for healthy liver segmentation, with potential applications in other medical imaging areas. It represents a significant advancement in medical image segmentation, offering a robust solution for accurate diagnosis and treatment planning.

Uses of PVTFormer:

Dataset

LiTS dataset

Results

Qualitative results comparison of the SOTA methods

Quantitative results comparison of the SOTA methods

Citation

Please cite our paper if you find the work useful:

@inproceedings{jha2024ct,
  title={CT Liver Segmentation via PVT-based Encoding and Refined Decoding},
  author={Jha, Debesh and Tomar, Nikhil Kumar and Biswas, Koushik and Durak, Gorkem and Medetalibeyoglu, Alpay and Antalek, Matthew and Velichko, Yury and Ladner, Daniela and Borhani, Amir and Bagci, Ulas},
  bookarticle={Proceedings of the International Symposium on Biomedical Imaging (ISBI)},
  year={2024}
}

Contact

Please contact debesh.jha@northwestern.edu for any further questions.