Abstract
Style transfer is a promising approach to close the sim-to-real gap in medical endoscopy. Rendering endoscopic videos by traversing preoperative scans (such as MRI or CT) can generate realistic simulations as well as ground truth camera poses and depth maps. Although image-to-image (I2I) translation models such as CycleGAN perform well on single images, they are unsuitable for video-to-video synthesis because they lack temporal consistency, which results in artifacts between frames. We propose MeshBrush, a neural mesh stylization method to synthesize temporally consistent videos with differentiable rendering. MeshBrush uses the underlying geometry of patient imaging data while leveraging existing I2I methods. With learned per-vertex textures, the stylized mesh guarantees consistency while producing high-fidelity outputs. We demonstrate that mesh stylization is a promising approach for creating realistic simulations for downstream tasks such as training and preoperative planning. Although our method is designed for and tested on ureteroscopy, its components are transferable to general endoscopic and laparoscopic procedures.
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures
Authors: Yannick Kirchhoff, Maximilian R. Rokuss, Saikat Roy, Balint Kovacs, Constantin Ulrich, Tassilo Wald, Maximilian Zenk, Philipp Vollmuth, Jens Kleesiek, Fabian Isensee, Klaus Maier-Hein
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Accurately segmenting thin tubular structures, such as vessels, nerves, roads or concrete cracks, is a crucial task in computer vision. Standard deep learning-based segmentation loss functions, such as Dice or Cross-Entropy, focus on volumetric overlap, often at the expense of preserving structural connectivity or topology. This can lead to segmentation errors that adversely affect downstream tasks, including flow calculation, navigation, and structural inspection. Although current topology-focused losses mark an improvement, they introduce significant computational and memory overheads. This is particularly relevant for 3D data, rendering these losses infeasible for larger volumes as well as increasingly important multi-class segmentation problems. To mitigate this, we propose a novel Skeleton Recall Loss, which effectively addresses these challenges by circumventing intensive GPU-based calculations with inexpensive CPU operations. It demonstrates overall superior performance to current state-of-the-art approaches on five public datasets for topology-preserving segmentation, while substantially reducing computational overheads by more than 90%. In doing so, we introduce the first multi-class capable loss function for thin structure segmentation, excelling in both efficiency and efficacy for topology-preservation.
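The abstract does not state the loss formula. As a rough illustration only, the sketch below assumes the loss is a soft recall of the predicted foreground probabilities measured on a skeleton mask of the ground truth that was computed beforehand (for example on the CPU during data loading). The function name `skeleton_recall_loss`, the flat-list inputs, and the `eps` term are hypothetical choices for this sketch, not the authors' implementation.

```python
def skeleton_recall_loss(pred_probs, skeleton_mask, eps=1e-8):
    """Soft-recall style loss on a precomputed skeleton (illustrative assumption).

    pred_probs:    flat list of predicted foreground probabilities in [0, 1]
    skeleton_mask: flat list of 0/1 values marking ground-truth skeleton voxels
    Returns 1 - (probability mass on the skeleton / number of skeleton voxels).
    """
    if len(pred_probs) != len(skeleton_mask):
        raise ValueError("prediction and skeleton must have the same length")
    # Probability mass the prediction places on skeleton voxels.
    overlap = sum(p * s for p, s in zip(pred_probs, skeleton_mask))
    # Number of skeleton voxels that should be recalled.
    total = sum(skeleton_mask)
    return 1.0 - overlap / (total + eps)


if __name__ == "__main__":
    # Half of the skeleton is covered by the prediction.
    print(skeleton_recall_loss([1.0, 0.0, 0.0, 0.0], [1, 1, 0, 0]))
```

Under this assumption the skeleton is extracted once per training sample outside the training graph, so the only differentiable work is an element-wise product and two sums.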
Keyword: cinematic rendering
There is no result
Keyword: volume data
There is no result
Keyword: remote visualization
There is no result
Keyword: direct volume rendering
There is no result
Keyword: mobile device
Age-of-Information-Aware Distributed Task Offloading and Resource Allocation in Mobile Edge Computing Networks
Authors: Minwoo Kim, Jonggyu Jang, Youngchol Choi, Hyun Jong Yang
Abstract
The growth in artificial intelligence (AI) technology has attracted substantial interest in age-of-information (AoI)-aware task offloading in mobile edge computing (MEC), namely, minimizing service latency. The use of MEC systems poses an additional problem arising from the limited battery resources of mobile devices (MDs). This paper tackles AoI-aware distributed task offloading optimization, where user association (UA), resource allocation (RA), full-task offloading, and the battery of MDs are jointly considered. In existing studies, joint optimization of overall task offloading and UA is seldom considered due to the complexity of combinatorial optimization problems, and in cases where it is considered, linear objective functions such as power consumption are adopted. Our objective includes all major components contributing to users' quality of experience, including AoI and energy consumption. To achieve this, we first formulate an NP-hard combinatorial problem, where the objective function comprises three elements: communication latency, computation latency, and battery usage. We derive a closed-form RA solution of the problem; next, we provide a distributed pricing-based UA solution. We simulate the proposed algorithm for various vision and language AI tasks. Our numerical results show that the proposed method Pareto-dominates baseline methods. More specifically, the proposed method achieves 1.62 times smaller AoI with 41.2% less energy consumption than the baseline methods.
Keyword: transfer function
There is no result
Keyword: retrieval
There is no result
Keyword: video retrieval
There is no result
Keyword: mobile
Age-of-Information-Aware Distributed Task Offloading and Resource Allocation in Mobile Edge Computing Networks
Keyword: smartphone
There is no result
Keyword: medical volume data
There is no result
Keyword: webgpu
There is no result
Keyword: webgl
There is no result
Keyword: pre-rendering
There is no result
Keyword: prerendering
There is no result
Keyword: motion prediction
There is no result
Keyword: incremental learning
There is no result
Keyword: svm incremental
There is no result
Keyword: nerf
There is no result
Keyword: multiorgan
There is no result
Keyword: multi-organ
There is no result
Keyword: multi organ
There is no result
Keyword: SAM
Risk-averse Learning with Non-Stationary Distributions
Authors: Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
Abstract
Considering non-stationary environments in online optimization enables a decision-maker to effectively adapt to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize a risk-averse objective function using the Conditional Value at Risk (CVaR) as the risk measure. Due to the difficulty in obtaining the exact CVaR gradient, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient using the sampled values. To facilitate the regret analysis, we use a variation metric based on Wasserstein distance to capture time-varying distributions. Given that the distribution variation is sub-linear in the total number of episodes, we show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, theoretical results suggest that increasing the number of samples leads to a reduction in the dynamic regret bounds until the sampling number reaches a specific limit. Finally, we provide numerical experiments of dynamic pricing in a parking lot to illustrate the efficacy of the designed algorithm.
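For readers unfamiliar with CVaR, the sketch below shows one common empirical estimate computed from sampled cost values: the average of the worst (1 - alpha) fraction of the samples. It illustrates the risk measure only, not the paper's zeroth-order gradient estimator. The function name `empirical_cvar` and the convention that larger costs are worse and that alpha is the confidence level are assumptions made for this example.

```python
import math


def empirical_cvar(costs, alpha):
    """Empirical CVaR at level alpha: mean of the worst (1 - alpha) share of costs.

    costs: list of sampled cost values (larger means worse; an assumed convention)
    alpha: confidence level in [0, 1); alpha = 0 reduces to the plain mean
    """
    if not costs:
        raise ValueError("need at least one sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    # Number of tail samples to average; always keep at least one.
    k = max(1, math.ceil((1.0 - alpha) * len(costs)))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    print(empirical_cvar(samples, 0.5))  # average of the two largest costs
```

With few samples the tail average rests on very few values, which is consistent with the abstract's remark that more samples per iteration tighten the regret bounds up to a limit.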
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks
Abstract
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a multi-hop wireless network with statistically identical agents. Agents cache the most recent samples from others and communicate over wireless collision channels governed by an underlying graph topology. Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies, considering both oblivious policies (where decision-making is independent of the physical processes) and non-oblivious policies (where decision-making depends on the physical processes). We prove that in oblivious policies, minimizing estimation error is equivalent to minimizing the age of information. The complexity of the problem, especially the multi-dimensional action spaces and arbitrary network topologies, makes theoretical methods for finding optimal transmission policies intractable. We optimize the policies using a graphical multi-agent reinforcement learning framework, where each agent employs a permutation-equivariant graph neural network architecture. Theoretically, we prove that our proposed framework exhibits desirable transferability properties, allowing transmission policies trained on small- or moderate-size networks to be executed effectively on large-scale topologies. Numerical experiments demonstrate that (i) our proposed framework outperforms state-of-the-art baselines; (ii) the trained policies are transferable to larger networks, and their performance gains increase with the number of agents; (iii) the training procedure withstands non-stationarity even if we utilize independent learning techniques; and (iv) recurrence is pivotal in both independent learning and centralized training with decentralized execution, and improves the resilience to non-stationarity in independent learning.
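As a reminder of the quantity being minimized, the sketch below computes the age of information at a receiver using the standard definition: the current time minus the generation time of the freshest sample delivered so far. The function name `age_of_information` and the representation of updates as `(generated_at, delivered_at)` pairs are assumptions for this example and are not taken from the paper.

```python
def age_of_information(updates, t):
    """Age of information at time t.

    updates: list of (generated_at, delivered_at) pairs for samples sent to the receiver
    t:       time at which the age is evaluated
    Returns t minus the generation time of the freshest delivered sample,
    or None if nothing has been delivered by time t.
    """
    # Only samples that have already arrived can refresh the receiver's cache.
    delivered = [generated for generated, arrived in updates if arrived <= t]
    if not delivered:
        return None
    return t - max(delivered)


if __name__ == "__main__":
    # The sample generated at time 2 arrives late, after the one generated at time 3.
    log = [(0.0, 1.0), (2.0, 5.0), (3.0, 4.0)]
    print(age_of_information(log, 4.5))
```

Note that a sample arriving out of order (generated earlier but delivered later than a fresher one) does not reduce the age, since only the newest generation time counts.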
Riemannian Covariance Fitting for Direction-of-Arrival Estimation
Authors: Joseph S. Picard, Amitay Bar, Ronen Talmon
Abstract
Covariance fitting (CF) is a comprehensive approach for direction of arrival (DoA) estimation, consolidating many common solutions. Standard practice is to use Euclidean criteria for CF, disregarding the intrinsic Hermitian positive-definite (HPD) geometry of the spatial covariance matrices. We assert that this oversight leads to inherent limitations. In this paper, as a remedy, we present a comprehensive study of the use of various Riemannian metrics of HPD matrices in CF. We focus on the advantages of the Affine-Invariant (AI) and the Log-Euclidean (LE) Riemannian metrics. Consequently, we propose a new practical beamformer based on the LE metric and analytically derive its spatial characteristics, such as the beamwidth and sidelobe attenuation, under noisy conditions. Comparing these features to those of classical beamformers shows a significant advantage. In addition, we demonstrate, both theoretically and experimentally, the LE beamformer's robustness in scenarios with small sample sizes and in the presence of noise, interference, and multipath channels.
Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models
Abstract
Deep learning-based medical image processing algorithms require representative data during development. In particular, surgical data might be difficult to obtain, and high-quality public datasets are limited. To overcome this limitation and augment datasets, a widely adopted solution is the generation of synthetic images. In this work, we employ conditional diffusion models to generate knee radiographs from contour and bone segmentations. We present two distinct strategies that incorporate the segmentation as a condition into the sampling process and the training process, namely conditional sampling and conditional training. The results demonstrate that both methods can generate realistic images while adhering to the conditioning segmentation. The conditional training method outperforms the conditional sampling method and the conventional U-Net.
Keyword: volume render
There is no result
Keyword: volumetric render
There is no result
Keyword: remote render
There is no result
Keyword: hybrid render
There is no result
Keyword: raycast
There is no result
Keyword: medical imaging
There is no result
Keyword: medical visualization
There is no result
Keyword: interactive volume
There is no result
Keyword: rendering
MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy
Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures
Keyword: cinematic rendering
There is no result
Keyword: volume data
There is no result
Keyword: remote visualization
There is no result
Keyword: direct volume rendering
There is no result
Keyword: mobile device
Age-of-Information-Aware Distributed Task Offloading and Resource Allocation in Mobile Edge Computing Networks
Keyword: transfer function
There is no result
Keyword: retrieval
There is no result
Keyword: video retrieval
There is no result
Keyword: mobile
Age-of-Information-Aware Distributed Task Offloading and Resource Allocation in Mobile Edge Computing Networks
Keyword: smartphone
There is no result
Keyword: medical volume data
There is no result
Keyword: webgpu
There is no result
Keyword: webgl
There is no result
Keyword: pre-rendering
There is no result
Keyword: prerendering
There is no result
Keyword: motion prediction
There is no result
Keyword: incremental learning
There is no result
Keyword: svm incremental
There is no result
Keyword: nerf
There is no result
Keyword: multiorgan
There is no result
Keyword: multi-organ
There is no result
Keyword: multi organ
There is no result
Keyword: SAM
Risk-averse Learning with Non-Stationary Distributions
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks
Riemannian Covariance Fitting for Direction-of-Arrival Estimation
Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models