Abstract
X-ray computed tomography (CT) plays a key role in digitizing three-dimensional structures for a wide range of medical and industrial applications. Traditional CT systems often rely on standard circular and helical scan trajectories, which may not be optimal for challenging scenarios involving large objects, complex structures, or resource constraints. In response to these challenges, we are exploring the potential of twin robotic CT systems, which offer the flexibility to acquire projections from arbitrary views around the object of interest. Ensuring complete and mathematically sound reconstructions becomes critical in such systems. In this work, we present an integer programming-based CT trajectory optimization method. Utilizing discrete data completeness conditions, we formulate an optimization problem to select an optimized set of projections. This approach enforces data completeness and considers absorption-based metrics for reliability evaluation. We compare our method with an equidistant circular CT trajectory and a greedy approach. While greedy already performs well in some cases, we provide a way to improve greedy-based projection selection using an integer optimization approach. Our approach improves CT trajectories and quantifies the optimality of the solution in terms of an optimality gap.
Keyword: clinical
Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration
Abstract
The rising global prevalence of skin conditions, some of which can escalate to life-threatening stages if not timely diagnosed and treated, presents a significant healthcare challenge. This issue is particularly acute in remote areas where limited access to healthcare often results in delayed treatment, allowing skin diseases to advance to more critical stages. One of the primary challenges in diagnosing skin diseases is their low inter-class variations, as many exhibit similar visual characteristics, making accurate classification challenging. This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information. This approach mimics the diagnostic process employed by medical professionals. A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction. This component plays a crucial role in refining visual details and enhancing feature extraction, leading to improved differentiation between classes and, consequently, elevating the overall effectiveness of the model. The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures. The results of these experiments not only demonstrate the effectiveness of the proposed method but also its potential applicability under-resourced healthcare environments.
BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion
Abstract
Breast cancer is a significant health concern affecting millions of women worldwide. Accurate survival risk stratification plays a crucial role in guiding personalised treatment decisions and improving patient outcomes. Here we present BioFusionNet, a deep learning framework that fuses image-derived features with genetic and clinical data to achieve a holistic patient profile and perform survival risk stratification of ER+ breast cancer patients. We employ multiple self-supervised feature extractors, namely DINO and MoCoV3, pretrained on histopathology patches to capture detailed histopathological image features. We then utilise a variational autoencoder (VAE) to fuse these features, and harness the latent space of the VAE to feed into a self-attention network, generating patient-level features. Next, we develop a co-dual-cross-attention mechanism to combine the histopathological features with genetic data, enabling the model to capture the interplay between them. Additionally, clinical data is incorporated using a feed-forward network (FFN), further enhancing predictive performance and achieving comprehensive multimodal feature integration. Furthermore, we introduce a weighted Cox loss function, specifically designed to handle imbalanced survival data, which is a common challenge in the field. The proposed model achieves a mean concordance index (C-index) of 0.77 and a time-dependent area under the curve (AUC) of 0.84, outperforming state-of-the-art methods. It predicts risk (high versus low) with prognostic significance for overall survival (OS) in univariate analysis (HR=2.99, 95% CI: 1.88--4.78, p<0.005), and maintains independent significance in multivariate analysis incorporating standard clinicopathological variables (HR=2.91, 95% CI: 1.80--4.68, p<0.005). The proposed method not only improves model performance but also addresses a critical gap in handling imbalanced data.
Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning
Authors: Chia-Ling Tsai, Hui-Yun Su, Shen-Feng Sung, Wei-Yang Lin, Ying-Ying Su, Tzu-Hsien Yang, Man-Lin Mai
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Stroke is a common disabling neurological condition that affects about one-quarter of the adult population over age 25; more than half of patients still have poor outcomes, such as permanent functional dependence or even death, after the onset of acute stroke. The aim of this study is to investigate the efficacy of diffusion-weighted MRI modalities combining with structured health profile on predicting the functional outcome to facilitate early intervention. A deep fusion learning network is proposed with two-stage training: the first stage focuses on cross-modality representation learning and the second stage on classification. Supervised contrastive learning is exploited to learn discriminative features that separate the two classes of patients from embeddings of individual modalities and from the fused multimodal embedding. The network takes as the input DWI and ADC images, and structured health profile data. The outcome is the prediction of the patient needing long-term care at 3 months after the onset of stroke. Trained and evaluated with a dataset of 3297 patients, our proposed fusion model achieves 0.87, 0.80 and 80.45% for AUC, F1-score and accuracy, respectively, outperforming existing models that consolidate both imaging and structured data in the medical domain. If trained with comprehensive clinical variables, including NIHSS and comorbidities, the gain from images on making accurate prediction is not considered substantial, but significant. However, diffusion-weighted MRI can replace NIHSS to achieve comparable level of accuracy combining with other readily available clinical variables for better generalization.
DABS-LS: Deep Atlas-Based Segmentation Using Regional Level Set Self-Supervision
Authors: Hannah G. Mason, Jack H. Noble
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Cochlear implants (CIs) are neural prosthetics used to treat patients with severe-to-profound hearing loss. Patient-specific modeling of CI stimulation of the auditory nerve fiber (ANFs) can help audiologists improve the CI programming. These models require localization of the ANFs relative to surrounding anatomy and the CI. Localization is challenging because the ANFs are so small they are not directly visible in clinical imaging. In this work, we hypothesize the position of the ANFs can be accurately inferred from the location of the internal auditory canal (IAC), which has high contrast in CT, since the ANFs pass through this canal between the cochlea and the brain. Inspired by VoxelMorph, in this paper we propose a deep atlas-based IAC segmentation network. We create a single atlas in which the IAC and ANFs are pre-localized. Our network is trained to produce deformation fields (DFs) mapping coordinates from the atlas to new target volumes and that accurately segment the IAC. We hypothesize that DFs that accurately segment the IAC in target images will also facilitate accurate atlas-based localization of the ANFs. As opposed to VoxelMorph, which aims to produce DFs that accurately register the entire volume, our novel contribution is an entirely self-supervised training scheme that aims to produce DFs that accurately segment the target structure. This self-supervision is facilitated using a regional level set (LS) inspired loss function. We call our method Deep Atlas Based Segmentation using Level Sets (DABS-LS). Results show that DABS-LS outperforms VoxelMorph for IAC segmentation. Tests with publicly available datasets for trachea and kidney segmentation also show significant improvement in segmentation accuracy, demonstrating the generalizability of the method.
Semi-weakly-supervised neural network training for medical image registration
Authors: Yiwen Li, Yunguan Fu, Iani J.M.B. Gayo, Qianye Yang, Zhe Min, Shaheer U. Saeed, Wen Yan, Yipei Wang, J. Alison Noble, Mark Emberton, Matthew J. Clarkson, Dean C. Barratt, Victor A. Prisacariu, Yipeng Hu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Abstract
For training registration networks, weak supervision from segmented corresponding regions-of-interest (ROIs) have been proven effective for (a) supplementing unsupervised methods, and (b) being used independently in registration tasks in which unsupervised losses are unavailable or ineffective. This correspondence-informing supervision entails cost in annotation that requires significant specialised effort. This paper describes a semi-weakly-supervised registration pipeline that improves the model performance, when only a small corresponding-ROI-labelled dataset is available, by exploiting unlabelled image pairs. We examine two types of augmentation methods by perturbation on network weights and image resampling, such that consistency-based unsupervised losses can be applied on unlabelled data. The novel WarpDDF and RegCut approaches are proposed to allow commutative perturbation between an image pair and the predicted spatial transformation (i.e. respective input and output of registration networks), distinct from existing perturbation methods for classification or segmentation. Experiments using 589 male pelvic MR images, labelled with eight anatomical ROIs, show the improvement in registration performance and the ablated contributions from the individual strategies. Furthermore, this study attempts to construct one of the first computational atlases for pelvic structures, enabled by registering inter-subject MRs, and quantifies the significant differences due to the proposed semi-weak supervision with a discussion on the potential clinical use of example atlas-derived statistics.
Keyword: biomedical
There is no result
Keyword: radiology
There is no result
Keyword: radiography
There is no result
Keyword: medical
CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes
Authors: Ishan Rajendrakumar Dave, Tristan de Blegiers, Chen Chen, Mubarak Shah
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Malaria is a major health issue worldwide, and its diagnosis requires scalable solutions that can work effectively with low-cost microscopes (LCM). Deep learning-based methods have shown success in computer-aided diagnosis from microscopic images. However, these methods need annotated images that show cells affected by malaria parasites and their life stages. Annotating images from LCM significantly increases the burden on medical experts compared to annotating images from high-cost microscopes (HCM). For this reason, a practical solution would be trained on HCM images which should generalize well on LCM images during testing. While earlier methods adopted a multi-stage learning process, they did not offer an end-to-end approach. In this work, we present an end-to-end learning framework, named CodaMal (Contrastive Domain Adpation for Malaria). In order to bridge the gap between HCM (training) and LCM (testing), we propose a domain adaptive contrastive loss. It reduces the domain shift by promoting similarity between the representations of HCM and its corresponding LCM image, without imposing an additional annotation burden. In addition, the training objective includes object detection objectives with carefully designed augmentations, ensuring the accurate detection of malaria parasites. On the publicly available large-scale M5-dataset, our proposed method shows a significant improvement of 16% over the state-of-the-art methods in terms of the mean average precision metric (mAP), provides 21x speed up during inference, and requires only half learnable parameters than the prior methods. Our code is publicly available.
Real-Time Model-Based Quantitative Ultrasound and Radar
Authors: Tom Sharon, Yonina C. Eldar
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Ultrasound and radar signals are highly beneficial for medical imaging as they are non-invasive and non-ionizing. Traditional imaging techniques have limitations in terms of contrast and physical interpretation. Quantitative medical imaging can display various physical properties such as speed of sound, density, conductivity, and relative permittivity. This makes it useful for a wider range of applications, including improving cancer detection, diagnosing fatty liver, and fast stroke imaging. However, current quantitative imaging techniques that estimate physical properties from received signals, such as Full Waveform Inversion, are time-consuming and tend to converge to local minima, making them unsuitable for medical imaging. To address these challenges, we propose a neural network based on the physical model of wave propagation, which defines the relationship between the received signals and physical properties. Our network can reconstruct multiple physical properties in less than one second for complex and realistic scenarios, using data from only eight elements. We demonstrate the effectiveness of our approach for both radar and ultrasound signals.
U$^2$MRPD: Unsupervised undersampled MRI reconstruction by prompting a large latent diffusion model
Authors: Ziqi Gao, S. Kevin Zhou
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Implicit visual knowledge in a large latent diffusion model (LLDM) pre-trained on natural images is rich and hypothetically universal to natural and medical images. To test this hypothesis, we introduce a novel framework for Unsupervised Undersampled MRI Reconstruction by Prompting a pre-trained large latent Diffusion model ( U$^2$MRPD). Existing data-driven, supervised undersampled MRI reconstruction networks are typically of limited generalizability and adaptability toward diverse data acquisition scenarios; yet U$^2$MRPD supports image-specific MRI reconstruction by prompting an LLDM with an MRSampler tailored for complex-valued MRI images. With any single-source or diverse-source MRI dataset, U$^2$MRPD's performance is further boosted by an MRAdapter while keeping the generative image priors intact. Experiments on multiple datasets show that U$^2$MRPD achieves comparable or better performance than supervised and MRI diffusion methods on in-domain datasets while demonstrating the best generalizability on out-of-domain datasets. To the best of our knowledge, U$^2$MRPD is the {\bf first} unsupervised method that demonstrates the universal prowess of a LLDM, %trained on magnitude-only natural images in medical imaging, attaining the best adaptability for both MRI database-free and database-available scenarios and generalizability towards out-of-domain data.
Selective Prediction for Semantic Segmentation using Post-Hoc Confidence Estimation and Its Performance under Distribution Shift
Authors: Bruno Laboissiere Camargos Borges, Bruno Machado Pacheco, Danilo Silva
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Semantic segmentation plays a crucial role in various computer vision applications, yet its efficacy is often hindered by the lack of high-quality labeled data. To address this challenge, a common strategy is to leverage models trained on data from different populations, such as publicly available datasets. This approach, however, leads to the distribution shift problem, presenting a reduced performance on the population of interest. In scenarios where model errors can have significant consequences, selective prediction methods offer a means to mitigate risks and reduce reliance on expert supervision. This paper investigates selective prediction for semantic segmentation in low-resource settings, thus focusing on post-hoc confidence estimators applied to pre-trained models operating under distribution shift. We propose a novel image-level confidence measure tailored for semantic segmentation and demonstrate its effectiveness through experiments on three medical imaging tasks. Our findings show that post-hoc confidence estimators offer a cost-effective approach to reducing the impacts of distribution shift.
In-Vivo Hyperspectral Human Brain Image Database for Brain Cancer Detection
Authors: H. Fabelo, S. Ortega, A. Szolna, D. Bulters, J.F. Pineiro, S. Kabwama, A. Shanahan, H. Bulstrode, S. Bisshopp, B.R. Kiran, D. Ravi, R. Lazcano, D. Madronal, C. Sosa, C. Espino, M. Marquez, M. De la Luz Plaza, R. Camacho, D. Carrera, M. Hernandez, G.M. Callico, J. Morera, B. Stanciulescu, G.Z. Yang, R. Salvador, E. Juarez, C. Sanz, R. Sarmiento
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Abstract
The use of hyperspectral imaging for medical applications is becoming more common in recent years. One of the main obstacles that researchers find when developing hyperspectral algorithms for medical applications is the lack of specific, publicly available, and hyperspectral medical data. The work described in this paper was developed within the framework of the European project HELICoiD (HypErspectraL Imaging Cancer Detection), which had as a main goal the application of hyperspectral imaging to the delineation of brain tumors in real-time during neurosurgical operations. In this paper, the methodology followed to generate the first hyperspectral database of in-vivo human brain tissues is presented. Data was acquired employing a customized hyperspectral acquisition system capable of capturing information in the Visual and Near InfraRed (VNIR) range from 400 to 1000 nm. Repeatability was assessed for the cases where two images of the same scene were captured consecutively. The analysis reveals that the system works more efficiently in the spectral range between 450 and 900 nm. A total of 36 hyperspectral images from 22 different patients were obtained. From these data, more than 300 000 spectral signatures were labeled employing a semi-automatic methodology based on the spectral angle mapper algorithm. Four different classes were defined: normal tissue, tumor tissue, blood vessel, and background elements. All the hyperspectral data has been made available in a public repository.
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation
Authors: Ziyang Wang, Chao Ma
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Medical image segmentation is increasingly reliant on deep learning techniques, yet the promising performance often come with high annotation costs. This paper introduces Weak-Mamba-UNet, an innovative weakly-supervised learning (WSL) framework that leverages the capabilities of Convolutional Neural Network (CNN), Vision Transformer (ViT), and the cutting-edge Visual Mamba (VMamba) architecture for medical image segmentation, especially when dealing with scribble-based annotations. The proposed WSL strategy incorporates three distinct architecture but same symmetrical encoder-decoder networks: a CNN-based UNet for detailed local feature extraction, a Swin Transformer-based SwinUNet for comprehensive global context understanding, and a VMamba-based Mamba-UNet for efficient long-range dependency modeling. The key concept of this framework is a collaborative and cross-supervisory mechanism that employs pseudo labels to facilitate iterative learning and refinement across the networks. The effectiveness of Weak-Mamba-UNet is validated on a publicly available MRI cardiac segmentation dataset with processed scribble annotations, where it surpasses the performance of a similar WSL framework utilizing only UNet or SwinUNet. This highlights its potential in scenarios with sparse or imprecise annotations. The source code is made publicly accessible.
Keyword: chest
There is no result
Keyword: x-ray
Integer Optimization of CT Trajectories using a Discrete Data Completeness Formulation
Keyword: clinical
Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration
BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion
Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning
DABS-LS: Deep Atlas-Based Segmentation Using Regional Level Set Self-Supervision
Semi-weakly-supervised neural network training for medical image registration
Keyword: biomedical
There is no result
Keyword: radiology
There is no result
Keyword: radiography
There is no result
Keyword: medical
CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes
Real-Time Model-Based Quantitative Ultrasound and Radar
U$^2$MRPD: Unsupervised undersampled MRI reconstruction by prompting a large latent diffusion model
Selective Prediction for Semantic Segmentation using Post-Hoc Confidence Estimation and Its Performance under Distribution Shift
In-Vivo Hyperspectral Human Brain Image Database for Brain Cancer Detection
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation
Keyword: chexpert
There is no result