Abstract
Recently, U-shaped networks have dominated the field of medical image segmentation due to their simple and easily tuned structure. However, existing U-shaped segmentation networks: 1) mostly focus on designing complex self-attention modules to compensate for the lack of long-range dependencies in convolution operations, which increases the overall number of parameters and the computational complexity of the network; 2) simply fuse the features of the encoder and decoder, ignoring the connection between their spatial locations. In this paper, we rethink the above problems and build a lightweight medical image segmentation network, called SegNetr. Specifically, we introduce a novel SegNetr block that can perform local-global interactions dynamically at any stage and with only linear complexity. At the same time, we design a general information retention skip connection (IRSC) to preserve the spatial location information of encoder features and achieve accurate fusion with the decoder features. We validate the effectiveness of SegNetr on four mainstream medical image segmentation datasets, with 59% fewer parameters and 76% fewer GFLOPs than vanilla U-Net, while achieving segmentation performance comparable to state-of-the-art methods. Notably, the components proposed in this paper can be applied to other U-shaped networks to improve their segmentation performance.
Power-Aperture Resource Allocation for a MPAR with Communications Capabilities
Authors: Augusto Aubry, Antonio De Maio, Luca Pallotta
Abstract
Multifunction phased array radars (MPARs) exploit the intrinsic flexibility of their active electronically steered array (ESA) to perform, at the same time, a multitude of operations, such as search, tracking, fire control, classification, and communications. This paper aims at addressing the MPAR resource allocation so as to satisfy the quality of service (QoS) demanded by both line of sight (LOS) and non line of sight (NLOS) search operations along with communications tasks. To this end, the ranges at which the cumulative detection probability and the channel capacity per bandwidth reach a desired value are introduced as task quality metrics for the search and communication functions, respectively. Then, to quantify the satisfaction level of each task, for each of them a bespoke utility function is defined to map the associated quality metric into the corresponding perceived utility. Hence, assigning different priority weights to each task, the resource allocation problem, in terms of radar power aperture (PAP) specification, is formulated as a constrained optimization problem whose solution optimizes the global radar QoS. Several simulations are conducted in scenarios of practical interest to prove the effectiveness of the approach.
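The quality metrics named in this abstract can be illustrated with a minimal sketch. Only the names of the metrics (cumulative detection probability, channel capacity per bandwidth, priority weights) come from the abstract; everything else here is an assumption made for illustration: the inverse-square SNR model for the communication link, the independence of scans, the weighted-sum form of the global QoS, and all function and parameter names. It is not the authors' formulation.

```python
import math


def capacity_per_bandwidth(snr):
    """Shannon capacity per unit bandwidth (bit/s/Hz) for a linear SNR."""
    return math.log2(1.0 + snr)


def comm_range_for_capacity(target_capacity, snr_at_ref, ref_range=1.0):
    """Range at which the capacity per bandwidth drops to target_capacity.

    ASSUMPTION: one-way free-space link, SNR(R) = snr_at_ref * (ref_range / R)**2.
    """
    required_snr = 2.0 ** target_capacity - 1.0
    return ref_range * math.sqrt(snr_at_ref / required_snr)


def cumulative_detection_probability(single_scan_pds):
    """Probability of at least one detection, ASSUMING independent scans."""
    miss = 1.0
    for pd in single_scan_pds:
        miss *= 1.0 - pd
    return 1.0 - miss


def global_qos(utilities, weights):
    """ASSUMED form of the global QoS: priority-weighted mean of task utilities."""
    return sum(w * u for w, u in zip(weights, utilities)) / sum(weights)
```

Under these assumptions, `comm_range_for_capacity(2.0, 1023.0)` gives the range (in units of `ref_range`) at which a link with a linear SNR of 1023 at the reference range still supports 2 bit/s/Hz.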
Topology-Aware Loss for Aorta and Great Vessel Segmentation in Computed Tomography Images
Authors: Seher Ozcelik, Sinan Unver, Ilke Ali Gurses, Rustu Turkay, Cigdem Gunduz-Demir
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Segmentation networks are not explicitly forced to learn global invariants of an image, such as the shape of an object and the geometry between multiple objects, when they are trained with a standard loss function. On the other hand, incorporating such invariants into network training may help improve performance for various segmentation tasks when they are intrinsic characteristics of the objects to be segmented. One example is the segmentation of the aorta and great vessels in computed tomography (CT) images, where vessels are found in a particular geometry in the body due to human anatomy and mostly appear as round objects on a 2D CT image. This paper addresses this issue by introducing a new topology-aware loss function that penalizes topology dissimilarities between the ground truth and prediction through persistent homology. Different from the previously suggested segmentation network designs, which apply the threshold filtration on a likelihood function of the prediction map and the Betti numbers of the ground truth, this paper proposes to apply the Vietoris-Rips filtration to obtain persistence diagrams of both ground truth and prediction maps and to calculate the dissimilarity with the Wasserstein distance between the corresponding persistence diagrams. The use of this filtration has the advantage of modeling shape and geometry at the same time, which may not happen when the threshold filtration is applied. Our experiments on 4327 CT images of 24 subjects reveal that the proposed topology-aware loss function leads to better results than its counterparts, indicating the effectiveness of this approach.
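The idea of comparing persistence diagrams with a Wasserstein distance can be sketched on a toy example. This is an illustrative simplification, not the authors' loss: it handles only 0-dimensional homology (connected components) of a Vietoris-Rips filtration, where every class is born at scale 0 and the death times are the minimum-spanning-tree edge lengths, and it finds the optimal matching by brute force, so it is only usable on a handful of points. The function names and the choice of the 1-Wasserstein distance with an L∞ ground metric are assumptions.

```python
import itertools
import math


def h0_deaths(points):
    """Death times of the 0-dimensional Vietoris-Rips classes of a point cloud.

    Every component is born at scale 0 and dies when an edge first merges it
    into another component, so the finite deaths are exactly the edge lengths
    picked by Kruskal's minimum-spanning-tree algorithm.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # the single component that never dies is omitted


def wasserstein_h0(deaths_a, deaths_b):
    """1-Wasserstein distance between two H0 diagrams with points (0, death).

    Each diagram is padded with diagonal slots so that any point may be
    matched to the diagonal at cost death / 2 (its L-infinity distance to it).
    Brute force over all matchings: toy sizes only.
    """
    n, m = len(deaths_a), len(deaths_b)

    def cost(i, j):
        a_is_point, b_is_point = i < n, j < m
        if a_is_point and b_is_point:
            return abs(deaths_a[i] - deaths_b[j])
        if a_is_point:
            return deaths_a[i] / 2
        if b_is_point:
            return deaths_b[j] / 2
        return 0.0  # diagonal matched to diagonal

    return min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in itertools.permutations(range(n + m))
    )
```

For example, with hypothetical point sets `truth = [(0, 0), (1, 0), (3, 0)]` and `pred = [(0, 0), (1, 0), (2, 0)]`, calling `wasserstein_h0(h0_deaths(truth), h0_deaths(pred))` compares how the two clouds merge into components as the scale grows. A practical implementation would use a persistent homology library and an assignment solver rather than enumerating permutations.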
Recovering implicit pitch contours from formants in whispered speech
Abstract
Whispered speech is characterised by a noise-like excitation that results in the absence of a fundamental frequency (f0). Considering that prosodic phenomena such as intonation are perceived through f0 variation, the perception of whispered prosody is relatively difficult. At the same time, studies have shown that speakers do attempt to produce intonation when whispering and that prosodic variability is being transmitted, suggesting that intonation "survives" in whispered formant structure. In this paper, we aim to estimate the way in which formant contours correlate with an "implicit" pitch contour in whisper, using a machine learning model. We propose a two-step method: using a parallel corpus, we first transform the whispered formants into their phonated equivalents using a denoising autoencoder. We then analyse the formant contours to predict phonated pitch contour variation. We observe that our method is effective in establishing a relationship between whispered and phonated formants and in uncovering implicit pitch contours in whisper.
Keyword: volume render
There is no result
Keyword: volumetric render
There is no result
Keyword: remote render
There is no result
Keyword: hybrid render
There is no result
Keyword: raycast
There is no result
Keyword: medical imaging
There is no result
Keyword: medical visualization
There is no result
Keyword: interactive volume
There is no result
Keyword: rendering
There is no result
Keyword: cinematic rendering
There is no result
Keyword: volume data
There is no result
Keyword: remote visualization
There is no result
Keyword: direct volume rendering
There is no result
Keyword: mobile device
There is no result
Keyword: transfer function
There is no result
Keyword: retrieval
There is no result
Keyword: video retrieval
There is no result
Keyword: mobile
There is no result
Keyword: smartphone
There is no result
Keyword: medical volume data
There is no result
Keyword: webgpu
There is no result
Keyword: webgl
There is no result
Keyword: pre-rendering
There is no result
Keyword: prerendering
There is no result
Keyword: motion prediction
There is no result
Keyword: incremental learning
There is no result
Keyword: svm incremental
There is no result
Keyword: nerf
There is no result
Keyword: multiorgan
There is no result
Keyword: multi-organ
There is no result
Keyword: multi organ
There is no result
Keyword: SAM
SegNetr: Rethinking the local-global interactions and skip connections in U-shaped networks
Power-Aperture Resource Allocation for a MPAR with Communications Capabilities
Topology-Aware Loss for Aorta and Great Vessel Segmentation in Computed Tomography Images
Recovering implicit pitch contours from formants in whispered speech