Abstract
In medical imaging, inverse problems aim to infer high-quality images from incomplete, noisy measurements, with the objective of minimizing expenses and risks to patients in clinical settings. Diffusion models have recently emerged as a promising approach to such practical challenges, proving particularly useful for zero-shot inference of images from partially acquired measurements in Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). A central challenge in this approach, however, is how to guide an unconditional prediction to conform to the measurement information. Existing methods rely on deficient projection or inefficient posterior score approximation guidance, which often leads to suboptimal performance. In this paper, we propose Bi-level Guided Diffusion Models (BGDM), a zero-shot imaging framework that efficiently steers the initial unconditional prediction through a bi-level guidance strategy. Specifically, BGDM first approximates an inner-level conditional posterior mean as an initial measurement-consistent reference point and then solves an outer-level proximal optimization objective to reinforce the measurement consistency. Our experimental findings, using publicly available MRI and CT medical datasets, reveal that BGDM is more effective and efficient than the baselines, faithfully generating high-fidelity medical images and substantially reducing hallucinatory artifacts in cases of severe degradation.
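The abstract does not state the outer-level objective, so the following is only a generic sketch of a proximal data-consistency step of the kind it describes. It assumes a real-valued linear measurement operator A, measurements y, a reference point x_ref (standing in for the inner-level estimate), and a weight lam; all of these names, and the quadratic objective itself, are illustrative assumptions and not the authors' formulation.

```python
import numpy as np

def proximal_data_consistency(A, y, x_ref, lam):
    """Solve argmin_x 0.5*||y - A x||^2 + 0.5*lam*||x - x_ref||^2.

    Assumed (hypothetical) objective: a quadratic data term plus a
    proximity term that keeps x close to the reference point x_ref.
    The minimizer satisfies (A^T A + lam*I) x = A^T y + lam*x_ref.
    """
    n = A.shape[1]
    lhs = A.T @ A + lam * np.eye(n)   # symmetric positive definite for lam > 0
    rhs = A.T @ y + lam * x_ref
    return np.linalg.solve(lhs, rhs)
```

By construction the result is at least as consistent with y as x_ref is, and a larger lam keeps it closer to x_ref. For complex-valued operators such as undersampled Fourier sampling in MRI, the transpose would need to be a conjugate transpose.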
Keyword: medical visualization
There is no result
Keyword: interactive volume
There is no result
Keyword: rendering
There is no result
Keyword: cinematic rendering
There is no result
Keyword: volume data
There is no result
Keyword: remote visualization
There is no result
Keyword: direct volume rendering
There is no result
Keyword: mobile device
There is no result
Keyword: transfer function
There is no result
Keyword: retrieval
There is no result
Keyword: video retrieval
There is no result
Keyword: mobile
There is no result
Keyword: smartphone
There is no result
Keyword: medical volume data
There is no result
Keyword: webgpu
There is no result
Keyword: webgl
There is no result
Keyword: pre-rendering
There is no result
Keyword: prerendering
There is no result
Keyword: motion prediction
There is no result
Keyword: incremental learning
There is no result
Keyword: svm incremental
There is no result
Keyword: nerf
There is no result
Keyword: multiorgan
There is no result
Keyword: multi-organ
There is no result
Keyword: multi organ
There is no result
Keyword: SAM
Mitigating analytical variability in fMRI results with style transfer
Abstract
We propose a novel approach to improve the reproducibility of neuroimaging results by converting statistic maps across different functional MRI pipelines. We assume that pipelines can be considered a style component of the data and propose to use different generative models, including Diffusion Models (DM), to convert data between pipelines. We design a new DM-based unsupervised multi-domain image-to-image transition framework and constrain the generation of 3D fMRI statistic maps using the latent space of an auxiliary classifier that distinguishes statistic maps from different pipelines. We extend traditional sampling techniques used in DM to improve the transition performance. Our experiments demonstrate that our proposed methods are successful: pipelines can indeed be transferred, providing an important source of data augmentation for future medical studies.
Deep Phase Coded Image Prior
Authors: Nimrod Shabtay, Eli Schwartz, Raja Giryes
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Phase-coded imaging is a computational imaging method designed to tackle tasks such as passive depth estimation and extended depth of field (EDOF) using depth cues inserted during image capture. Most of the current deep learning-based methods for depth estimation or all-in-focus imaging require a training dataset with high-quality depth maps and an optimal focus point at infinity for all-in-focus images. Such datasets are difficult to create, usually synthetic, and require external graphic programs. We propose a new method named "Deep Phase Coded Image Prior" (DPCIP) for jointly recovering the depth map and all-in-focus image from a coded-phase image using solely the captured image and the optical information of the imaging system. Our approach does not depend on any specific dataset and surpasses prior supervised techniques utilizing the same imaging system. This improvement is achieved through the utilization of a problem formulation based on implicit neural representation (INR) and deep image prior (DIP). Due to our zero-shot method, we overcome the barrier of acquiring accurate ground-truth data of depth maps and all-in-focus images for each new phase-coded system introduced. This allows focusing mainly on developing the imaging system, and not on ground-truth data collection.
Real-GDSR: Real-World Guided DSM Super-Resolution via Edge-Enhancing Residual Network
Authors: Daniel Panangian, Ksenia Bittner
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Abstract
A low-resolution digital surface model (DSM) features distinctive attributes impacted by noise, sensor limitations and data acquisition conditions, which cannot be replicated using simple interpolation methods like bicubic. As a result, super-resolution models trained on synthetic data do not perform effectively on real data. Training a model on pairs of real low- and high-resolution DSMs is also a challenge because of the lack of information. On the other hand, other imaging modalities of the same scene can be used to enrich the information needed for large-scale super-resolution. In this work, we introduce a novel methodology to address the intricacies of real-world DSM super-resolution, named REAL-GDSR, breaking down this ill-posed problem into two steps. The first step uses a residual local refinement network. This approach departs from conventional methods that are trained to directly predict height values instead of the differences (residuals) and that use large receptive fields in their networks. The second step introduces a diffusion-based technique that enhances the results on a global scale, with a primary focus on smoothing and edge preservation. Our experiments underscore the effectiveness of the proposed method. We conduct a comprehensive evaluation, comparing it to recent state-of-the-art techniques in the domain of real-world DSM super-resolution (SR). Our approach consistently outperforms these existing methods, as evidenced through qualitative and quantitative assessments.
Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling
Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off between efficiency and accuracy, with higher downsampling factors further impairing segmentation outcomes. Preserving information during downsampling is especially critical for medical image segmentation tasks. To tackle this challenge, we introduce a novel method named Edge-preserving Probabilistic Downsampling (EPD). It utilizes class uncertainty within a local window to produce soft labels, with the window size dictating the downsampling factor. This enables a network to produce quality predictions at low resolutions. Beyond preserving edge details in labels more effectively than conventional nearest-neighbor downsampling, a similar algorithm applied to images surpasses bilinear interpolation in image downsampling, enhancing overall performance. Our method significantly improved Intersection over Union (IoU) by 2.85%, 8.65%, and 11.89% when downsampling data to 1/2, 1/4, and 1/8, respectively, compared to conventional interpolation methods.
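One plausible reading of "class uncertainty within a local window" is the fraction of pixels of each class inside every non-overlapping window. The sketch below illustrates that reading only; the function name, the use of class frequencies, and the non-overlapping windows are assumptions, and the paper's exact formulation is not given in the abstract.

```python
import numpy as np

def soft_label_downsample(labels, num_classes, window):
    """Downsample an integer label map into per-window class frequencies.

    labels: (H, W) array of class indices in [0, num_classes).
    window: side length of each non-overlapping window; it is also the
            downsampling factor along each axis.
    Returns an (H // window, W // window, num_classes) array whose last
    axis sums to 1 and holds the class proportions in each window.
    """
    labels = np.asarray(labels)
    h, w = labels.shape
    if h % window or w % window:
        raise ValueError("label map size must be divisible by window")
    one_hot = np.eye(num_classes, dtype=np.float64)[labels]  # (H, W, C)
    blocks = one_hot.reshape(h // window, window, w // window, window, num_classes)
    return blocks.mean(axis=(1, 3))  # average over the pixels of each window
```

A window that straddles a boundary gets a mixed label such as (0.75, 0.25, 0), whereas nearest-neighbor downsampling would keep a single class and discard the minority one.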
Torque-Minimizing Control Allocation for Overactuated Quadrupedal Locomotion
Abstract
In this paper, we improve upon a method for optimal control of quadrupedal robots which utilizes a full-order model of the system. The original method utilizes offline nonlinear optimal control to synthesize a control scheme which exponentially orbitally stabilizes the closed-loop system. However, it is not able to handle the overactuated phases which frequently occur during quadrupedal locomotion as a result of the multi-contact nature of the system. We propose a modified method, which handles overactuated gait phases in a way that utilizes the full range of available actuators to minimize torque expenditure without requiring output trajectories to be modified. It is shown that the system under the proposed controller exhibits the same properties, i.e. exponential orbital stability, with the same or lower point-wise torque magnitude. A simulation study demonstrates that the reduction in torque may in certain cases be substantial.
Zak-OTFS for Integration of Sensing and Communication
Authors: Muhammad Ubadah, Saif Khan Mohammed, Ronny Hadani, Shachar Kons, Ananthanarayanan Chockalingam, Robert Calderbank
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
Abstract
The Zak-OTFS input/output (I/O) relation is predictable and non-fading when the delay and Doppler periods are greater than the effective channel delay and Doppler spreads, a condition which we refer to as the crystallization condition. The filter taps can simply be read off from the response to a single Zak-OTFS point (impulse) pulsone waveform, and the I/O relation can be reconstructed for a sampled system that operates under finite duration and bandwidth constraints. Predictability opens up the possibility of a model-free mode of operation. The time-domain realization of a Zak-OTFS point pulsone is a pulse train modulated by a tone, hence the name, pulsone. The Peak-to-Average Power Ratio (PAPR) of a pulsone is about 15 dB, and we describe a general method for constructing a spread pulsone for which the time-domain realization has a PAPR of about 6 dB. We construct the spread pulsone by applying a type of discrete spreading filter to a Zak-OTFS point pulsone. The self-ambiguity function of the point pulsone is supported on the period lattice $\Lambda_{p}$, and by applying a discrete chirp filter, we obtain a spread pulsone with a self-ambiguity function that is supported on a rotated lattice $\Lambda^{*}$. We show that if the channel satisfies the crystallization conditions with respect to $\Lambda^{*}$ then the effective DD domain filter taps can simply be read off from the cross-ambiguity between the channel response to the spread pulsone and the transmitted spread pulsone. If, in addition, the channel satisfies the crystallization conditions with respect to the period lattice $\Lambda_{p}$, then in an OTFS frame consisting of a spread pilot pulsone and point data pulsones, after cancelling the received signal corresponding to the spread pulsone, we can recover the channel response to any data pulsone.
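To see why a pulse train modulated by a tone has a high PAPR, consider an idealized discrete-time model with no pulse shaping: N unit-energy-normalized impulses spaced M samples apart, each rotated by a tone phase. The sketch below is such a toy model; M, N, k0 and l0 are hypothetical parameters chosen for illustration, and a practical pulsone with pulse shaping would behave somewhat differently.

```python
import numpy as np

def point_pulsone(M, N, k0=0, l0=0):
    """Idealized discrete point pulsone (assumed toy model).

    N impulses spaced M samples apart, starting at delay index k0 (< M),
    with the d-th impulse rotated by exp(2j*pi*d*l0/N), i.e. a pulse
    train modulated by a tone. Scaled to unit total energy.
    """
    x = np.zeros(M * N, dtype=np.complex128)
    d = np.arange(N)
    x[k0 + d * M] = np.exp(2j * np.pi * d * l0 / N) / np.sqrt(N)
    return x

def papr_db(x):
    """Peak-to-average power ratio of a sampled signal, in dB."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

# Example with assumed values M = N = 32
x = point_pulsone(32, 32, k0=3, l0=5)
print(round(papr_db(x), 2))
```

In this toy model only one sample in every M is nonzero, so the PAPR equals M, that is 10·log10(M) dB. For the assumed M = 32 this is close to the 15 dB order of magnitude quoted above, and it suggests why spreading the energy more evenly over time lowers the PAPR.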
Keyword: volume render
There is no result
Keyword: volumetric render
There is no result
Keyword: remote render
There is no result
Keyword: hybrid render
There is no result
Keyword: raycast
There is no result
Keyword: medical imaging
Bi-level Guided Diffusion Models for Zero-Shot Medical Imaging Inverse Problems
Keyword: medical visualization
There is no result
Keyword: interactive volume
There is no result
Keyword: rendering
There is no result
Keyword: cinematic rendering
There is no result
Keyword: volume data
There is no result
Keyword: remote visualization
There is no result
Keyword: direct volume rendering
There is no result
Keyword: mobile device
There is no result
Keyword: transfer function
There is no result
Keyword: retrieval
There is no result
Keyword: video retrieval
There is no result
Keyword: mobile
There is no result
Keyword: smartphone
There is no result
Keyword: medical volume data
There is no result
Keyword: webgpu
There is no result
Keyword: webgl
There is no result
Keyword: pre-rendering
There is no result
Keyword: prerendering
There is no result
Keyword: motion prediction
There is no result
Keyword: incremental learning
There is no result
Keyword: svm incremental
There is no result
Keyword: nerf
There is no result
Keyword: multiorgan
There is no result
Keyword: multi-organ
There is no result
Keyword: multi organ
There is no result
Keyword: SAM
Mitigating analytical variability in fMRI results with style transfer
Deep Phase Coded Image Prior
Real-GDSR: Real-World Guided DSM Super-Resolution via Edge-Enhancing Residual Network
Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling
Torque-Minimizing Control Allocation for Overactuated Quadrupedal Locomotion
Zak-OTFS for Integration of Sensing and Communication