【EESS】New submissions for Mon, 6 May 24

Keyword: volume render

There is no result

Keyword: volumetric render

There is no result

Keyword: remote render

There is no result

Keyword: hybrid render

There is no result

Keyword: raycast

There is no result

Keyword: medical imaging

Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics

Authors: Rucha Deshpande, Varun A. Kelkar, Dimitrios Gotsis, Prabhat Kc, Rongping Zeng, Kyle J. Myers, Frank J. Brooks, Mark A. Anastasio
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
Arxiv link: https://arxiv.org/abs/2405.01822
Pdf link: https://arxiv.org/pdf/2405.01822
Abstract The findings of the 2023 AAPM Grand Challenge on Deep Generative Modeling for Learning Medical Image Statistics are reported in this Special Report. The goal of this challenge was to promote the development of deep generative models (DGMs) for medical imaging and to emphasize the need for their domain-relevant assessment via the analysis of relevant image statistics. As part of this Grand Challenge, a training dataset was developed based on 3D anthropomorphic breast phantoms from the VICTRE virtual imaging toolbox. A two-stage evaluation procedure consisting of a preliminary check for memorization and image quality (based on the Frechet Inception distance (FID)), and a second stage evaluating the reproducibility of image statistics corresponding to domain-relevant radiomic features was developed. A summary measure was employed to rank the submissions. Additional analyses of submissions were performed to assess DGM performance specific to individual feature families, and to identify various artifacts. 58 submissions from 12 unique users were received for this Challenge. The top-ranked submission employed a conditional latent diffusion model, whereas the joint runners-up employed a generative adversarial network, followed by another network for image superresolution. We observed that the overall ranking of the top 9 submissions according to our evaluation method (i) did not match the FID-based ranking, and (ii) differed with respect to individual feature families. Another important finding from our additional analyses was that different DGMs demonstrated similar kinds of artifacts. This Grand Challenge highlighted the need for domain-specific evaluation to further DGM design as well as deployment. It also demonstrated that the specification of a DGM may differ depending on its intended use.

Keyword: medical visualization

There is no result

Keyword: interactive volume

There is no result

Keyword: rendering

There is no result

Keyword: cinematic rendering

There is no result

Keyword: volume data

There is no result

Keyword: remote visualization

There is no result

Keyword: direct volume rendering

There is no result

Keyword: mobile device

There is no result

Keyword: transfer function

There is no result

Keyword: retrieval

Analysing PolSAR data from vegetation by using the subaperture decomposition approach
Authors: J. David Ballester-Berman
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2405.02007
Pdf link: https://arxiv.org/pdf/2405.02007
Abstract A common assumption in radar remote sensing studies for vegetation is that radar returns originate from a target made up by a set of uniformly distributed isotropic scatterers. Nonetheless, several studies in the literature have noted that orientation effects and heterogeneities have a noticeable impact on backscattering signatures according to the specific vegetation type and sensor frequency. In this paper we have employed the subaperture decomposition technique (i.e. a time-frequency analysis) and the 3-D Barakat degree of polarisation to assess the variation of the volume backscattering power as a function of the azimuth look angle. Three different datasets, i.e. multi-frequency indoor acquisitions over short vegetation samples, and P-band airborne data and L-band satellite data over boreal and tropical forest, respectively, have been employed in this study. We have argued that although depolarising effects may only be sensed through a small portion of the synthetic aperture, they can lead to overestimated retrievals of the volume scattering for the full resolution image. This has direct implications for the existing model-based and model-free polarimetric SAR decompositions.
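For reference, the 3-D Barakat degree of polarisation named in the abstract above is commonly written as follows for a 3×3 polarimetric coherency (or covariance) matrix. The symbols T and m are notational assumptions here, since the abstract does not give the paper's own notation.

```latex
m = \sqrt{1 - \frac{27\,\det(\mathbf{T})}{\left(\operatorname{tr}\mathbf{T}\right)^{3}}},
\qquad 0 \le m \le 1
```

With this definition, m = 0 when T is proportional to the identity (a fully depolarised return) and m = 1 when T has rank one (a fully polarised return).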

Keyword: video retrieval

There is no result

Keyword: mobile

Reinforcement Learning control strategies for Electric Vehicles and Renewable energy sources Virtual Power Plants
Authors: Francesco Maldonato, Izgh Hadachi
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2405.01889
Pdf link: https://arxiv.org/pdf/2405.01889
Abstract The increasing demand for direct electric energy in the grid is also tied to the increase of Electric Vehicle (EV) usage in the cities, which eventually will totally substitute combustion engine Vehicles. Nevertheless, this high amount of energy required, which is stored in the EV batteries, is not always used and it can constitute a virtual power plant on its own. Bidirectional EVs equipped with batteries connected to the grid can therefore charge or discharge energy depending on public needs, producing a smart shift of energy where and when needed. EVs employed as mobile storage devices can add resilience and supply/demand balance benefits to specific loads, in many cases as part of a Microgrid (MG). Depending on the direction of the energy transfer, EVs can provide backup power to households through vehicle-to-house (V2H) charging, or storing unused renewable power through renewable-to-vehicle (RE2V) charging. V2H and RE2V solutions can complement renewable power sources like solar photovoltaic (PV) panels and wind turbines (WT), which fluctuate over time, increasing the self-consumption and autarky. The concept of distributed energy resources (DERs) is becoming more and more present and requires new solutions for the integration of multiple complementary resources with variable supply over time. The development of these ideas is coupled with the growth of new AI techniques that will potentially be the managing core of such systems. Machine learning techniques can model the energy grid environment in such a flexible way that constant optimization is possible. This fascinating working principle introduces the wider concept of an interconnected, shared, decentralized grid of energy. This research on Reinforcement Learning control strategies for Electric Vehicles and Renewable energy sources Virtual Power Plants focuses on providing solutions for such energy supply optimization models.
Enhancing NLoS RIS-Aided Localization with Optimization and Machine Learning
Authors: Rafael A. Aguiar, Nuno Paulino, Luís M. Pessoa
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2405.01928
Pdf link: https://arxiv.org/pdf/2405.01928
Abstract This paper introduces two machine learning optimization algorithms to significantly enhance position estimation in Reconfigurable Intelligent Surface (RIS) aided localization for mobile user equipment in Non-Line-of-Sight conditions. Leveraging the strengths of these algorithms, we present two methods capable of achieving extremely high accuracy, reaching sub-centimeter or even sub-millimeter levels at 3.5 GHz. The simulation results highlight the potential of these approaches, showing significant improvements in indoor mobile localization. The demonstrated precision and reliability of the proposed methods offer new opportunities for practical applications in real-world scenarios, particularly in Non-Line-of-Sight indoor localization. By evaluating four optimization techniques, we determine that a combination of a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) results in localization errors under 30 cm in 90 % of the cases, and under 5 mm for close to 85 % of cases when considering a simulated room of 10 m by 10 m where two of the walls are equipped with RIS tiles.
Multipath-based SLAM with Cooperation and Map Fusion
Authors: Erik Leitinger, Lukas Wielandner, Alexander Venus, Klaus Witrisal
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2405.02126
Pdf link: https://arxiv.org/pdf/2405.02126
Abstract Multipath-based simultaneous localization and mapping (MP-SLAM) is a promising approach in wireless networks for obtaining position information of transmitters and receivers as well as information on the propagation environment. MP-SLAM models specular reflections of radio frequency (RF) signals at flat surfaces as virtual anchors (VAs), the mirror images of base stations (BSs). Conventional methods for MP-SLAM consider a single mobile terminal (MT) which has to be localized. The availability of additional MTs paves the way for utilizing additional information in the scenario. Specifically, enabling MTs to exchange information allows for data fusion over different observations of VAs made by different MTs. Furthermore, cooperative localization becomes possible in addition to multipath-based localization. Utilizing this additional information enables more robust mapping and higher localization accuracy.
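The virtual-anchor idea in the abstract above (a VA is the mirror image of a base station across a flat reflecting surface) can be illustrated with a small geometric sketch. This is not the authors' algorithm: the function name `mirror_across_wall` and the example coordinates are hypothetical, and the wall is assumed to be an ideal infinite plane given by a point on it and a normal vector.

```python
import math
from typing import Sequence, Tuple


def mirror_across_wall(
    bs: Sequence[float],
    wall_point: Sequence[float],
    wall_normal: Sequence[float],
) -> Tuple[float, ...]:
    """Return the mirror image of position `bs` across a flat wall.

    The wall is the plane (a line in 2-D) through `wall_point` with
    normal `wall_normal`; the normal does not need to be unit length.
    """
    norm_sq = sum(n * n for n in wall_normal)
    if norm_sq == 0:
        raise ValueError("wall_normal must be non-zero")
    # Signed offset of the base station from the wall along the normal,
    # scaled so that multiplying by the normal gives the offset vector.
    offset = sum(
        (b - w) * n for b, w, n in zip(bs, wall_point, wall_normal)
    ) / norm_sq
    # Reflect: move twice the offset back through the wall.
    return tuple(b - 2.0 * offset * n for b, n in zip(bs, wall_normal))


# Hypothetical 2-D example: a wall along x = 5, a base station and a terminal.
base_station = (2.0, 3.0)
mobile_terminal = (4.0, 7.0)
virtual_anchor = mirror_across_wall(base_station, (5.0, 0.0), (1.0, 0.0))

# The straight-line distance from the virtual anchor to the terminal equals
# the length of the specular path base station -> wall -> terminal.
reflected_path_length = math.dist(virtual_anchor, mobile_terminal)
```

In this example the specular reflection point on the wall is (5, 6), and the two path segments add up to the same length as the direct line from the virtual anchor to the terminal, which is why a reflected path can be treated as a line-of-sight path from a VA.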

Keyword: smartphone

There is no result

Keyword: medical volume data

There is no result

Keyword: webgpu

There is no result

Keyword: webgl

There is no result

Keyword: pre-rendering

There is no result

Keyword: prerendering

There is no result

Keyword: motion prediction

There is no result

Keyword: incremental learning

There is no result

Keyword: svm incremental

There is no result

Keyword: nerf

There is no result

Keyword: multiorgan

There is no result

Keyword: multi-organ

There is no result

Keyword: multi organ

There is no result

Keyword: SAM

Deep Learning Descriptor Hybridization with Feature Reduction for Accurate Cervical Cancer Colposcopy Image Classification
Authors: Saurabh Saini, Kapil Ahuja, Siddartha Chennareddy, Karthik Boddupalli
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2405.01600
Pdf link: https://arxiv.org/pdf/2405.01600
Abstract Cervical cancer stands as a predominant cause of female mortality, underscoring the need for regular screenings to enable early diagnosis and preemptive treatment of pre-cancerous conditions. The transformation zone in the cervix, where cellular differentiation occurs, plays a critical role in the detection of abnormalities. Colposcopy has emerged as a pivotal tool in cervical cancer prevention since it provides a meticulous examination of cervical abnormalities. However, challenges in visual evaluation necessitate the development of Computer Aided Diagnosis (CAD) systems. We propose a novel CAD system that combines the strengths of various deep-learning descriptors (ResNet50, ResNet101, and ResNet152) with appropriate feature normalization (min-max) as well as a feature reduction technique (LDA). The combination of different descriptors ensures that all the features (low-level like edges and colour, high-level like shape and texture) are captured, feature normalization prevents biased learning, and feature reduction avoids overfitting. We do experiments on the IARC dataset provided by WHO. The dataset is initially segmented and balanced. Our approach achieves exceptional performance in the range of 97%-100% for both the normal-abnormal and the type classification. A competitive approach for type classification on the same dataset achieved 81%-91% performance.
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
Authors: Zongyang Du, Junchen Lu, Kun Zhou, Lakshmish Kaushik, Berrak Sisman
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Arxiv link: https://arxiv.org/abs/2405.01730
Pdf link: https://arxiv.org/pdf/2405.01730
Abstract Expressive voice conversion (VC) conducts speaker identity conversion for emotional speakers by jointly converting speaker identity and emotional style. Emotional style modeling for arbitrary speakers in expressive VC has not been extensively explored. Previous approaches have relied on vocoders for speech reconstruction, which makes speech quality heavily dependent on the performance of vocoders. A major challenge of expressive VC lies in emotion prosody modeling. To address these challenges, this paper proposes a fully end-to-end expressive VC framework based on a conditional denoising diffusion probabilistic model (DDPM). We utilize speech units derived from self-supervised speech models as content conditioning, along with deep features extracted from speech emotion recognition and speaker verification systems to model emotional style and speaker identity. Objective and subjective evaluations show the effectiveness of our framework. Codes and samples are publicly available.
RF Chain-Free mmWave Transmission: Modeling and Experimental Verification
Authors: M.Yaser Yağan, Ibrahim Hökelek, Ali E. Pusane, Ali Görçin
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2405.01931
Pdf link: https://arxiv.org/pdf/2405.01931
Abstract The utilization of millimeter wave frequency bands is expected to become prevalent in upcoming communication systems. However, generating and transmitting communication signals over these frequencies is not as straightforward as in sub-6 GHz frequencies due to complex transceiver structures. As an alternative to conventional transmitter architectures, this paper investigates the implementation of time-modulated arrays to effectively modulate and transmit high-quality communication signals at millimeter wave frequencies. By exploiting the array structures and analog beamformers, which are the fundamental components of millimeter wave transmitters, secure and low-cost transmission can be achieved. However, harmonics of theoretically infinite bandwidth arise as a fundamental problem in this approach. Thus, this paper presents a frequency analysis tool for the time-modulated arrays with hardware impairments and shows how controlling the sampling period can reduce the harmonics. Furthermore, the derived results are experimentally verified at 25 GHz with two important remarks. First, the phase error of received signals can be reduced by 32% using the proposed architecture. Second, the harmonics can be significantly suppressed by the correct choice of sampling period for the given hardware.
Analysing PolSAR data from vegetation by using the subaperture decomposition approach
Authors: J. David Ballester-Berman
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2405.02007
Pdf link: https://arxiv.org/pdf/2405.02007
Abstract A common assumption in radar remote sensing studies for vegetation is that radar returns originate from a target made up by a set of uniformly distributed isotropic scatterers. Nonetheless, several studies in the literature have noted that orientation effects and heterogeneities have a noticeable impact on backscattering signatures according to the specific vegetation type and sensor frequency. In this paper we have employed the subaperture decomposition technique (i.e. a time-frequency analysis) and the 3-D Barakat degree of polarisation to assess the variation of the volume backscattering power as a function of the azimuth look angle. Three different datasets, i.e. multi-frequency indoor acquisitions over short vegetation samples, and P-band airborne data and L-band satellite data over boreal and tropical forest, respectively, have been employed in this study. We have argued that although depolarising effects may only be sensed through a small portion of the synthetic aperture, they can lead to overestimated retrievals of the volume scattering for the full resolution image. This has direct implications for the existing model-based and model-free polarimetric SAR decompositions.
Physics-informed generative neural networks for RF propagation prediction with application to indoor body perception
Authors: Federica Fieramosca, Vittorio Rampa, Michele D'Amico, Stefano Savazzi
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2405.02131
Pdf link: https://arxiv.org/pdf/2405.02131
Abstract Electromagnetic (EM) body models designed to predict Radio-Frequency (RF) propagation are time-consuming, which prevents their adoption in strict real-time computational imaging problems, such as human body localization and sensing. Physics-informed Generative Neural Network (GNN) models have been recently proposed to reproduce EM effects, namely to simulate or reconstruct missing data or samples by incorporating relevant EM principles and constraints. The paper discusses a Variational Auto-Encoder (VAE) model which is trained to reproduce the effects of human motions on the EM field and incorporate EM body diffraction principles. The proposed physics-informed generative neural network models are verified against both classical diffraction-based EM tools and full-wave EM body simulations.
Reference-Free Image Quality Metric for Degradation and Reconstruction Artifacts
Authors: Han Cui, Alfredo De Goyeneche, Efrat Shimron, Boyuan Ma, Michael Lustig
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2405.02208
Pdf link: https://arxiv.org/pdf/2405.02208
Abstract Image Quality Assessment (IQA) is essential in various Computer Vision tasks such as image deblurring and super-resolution. However, most IQA methods require reference images, which are not always available. While there are some reference-free IQA metrics, they have limitations in simulating human perception and discerning subtle image quality variations. We hypothesize that the JPEG quality factor is representative of image quality, and a well-trained neural network can learn to accurately evaluate image quality without requiring a clean reference, as it can recognize image degradation artifacts based on prior knowledge. Thus, we developed a reference-free quality evaluation network, dubbed "Quality Factor (QF) Predictor", which does not require any reference. Our QF Predictor is a lightweight, fully convolutional network comprising seven layers. The model is trained in a self-supervised manner: it receives a JPEG-compressed image patch with a random QF as input and is trained to accurately predict the corresponding QF. We demonstrate the versatility of the model by applying it to various tasks. First, our QF Predictor can generalize to measure the severity of various image artifacts, such as Gaussian blur and Gaussian noise. Second, we show that the QF Predictor can be trained to predict the undersampling rate of images reconstructed from Magnetic Resonance Imaging (MRI) data.

Yukeaaa / arxiv-daily

【EESS】New submissions for Mon, 6 May 24 #1390
