【EESS】New submissions for Mon, 1 Apr 24

Keyword: volume render

There is no result

Keyword: volumetric render

There is no result

Keyword: remote render

There is no result

Keyword: hybrid render

There is no result

Keyword: raycast

There is no result

Keyword: medical imaging

There is no result

Keyword: medical visualization

There is no result

Keyword: interactive volume

There is no result

Keyword: rendering

SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image

Authors: Yunhao Li, Xiaodong Wang, Ping Wang, Xin Yuan, Peidong Liu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2403.20018
Pdf link: https://arxiv.org/pdf/2403.20018
Abstract In this paper, we explore the potential of Snapshot Compressive Imaging (SCI) technique for recovering the underlying 3D scene representation from a single temporal compressed image. SCI is a cost-effective method that enables the recording of high-dimensional data, such as hyperspectral or temporal information, into a single image using low-cost 2D imaging sensors. To achieve this, a series of specially designed 2D masks are usually employed, which not only reduces storage requirements but also offers potential privacy protection. Inspired by this, to take one step further, our approach builds upon the powerful 3D scene representation capabilities of neural radiance fields (NeRF). Specifically, we formulate the physical imaging process of SCI as part of the training of NeRF, allowing us to exploit its impressive performance in capturing complex scene structures. To assess the effectiveness of our method, we conduct extensive evaluations using both synthetic data and real data captured by our SCI system. Extensive experimental results demonstrate that our proposed approach surpasses the state-of-the-art methods in terms of image reconstruction and novel view image synthesis. Moreover, our method also exhibits the ability to restore high frame-rate multi-view consistent images by leveraging SCI and the rendering capabilities of NeRF. The code is available at https://github.com/WU-CVGL/SCINeRF.
Keyword: cinematic rendering

There is no result

Keyword: volume data

There is no result

Keyword: remote visualization

There is no result

Keyword: direct volume rendering

There is no result

Keyword: mobile device

There is no result

Keyword: transfer function

There is no result

Keyword: retrieval

There is no result

Keyword: video retrieval

There is no result

Keyword: mobile

UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation
Authors: Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2403.20035
Pdf link: https://arxiv.org/pdf/2403.20035
Abstract Traditionally, to improve the segmentation performance of models, most approaches prefer to add more complex modules. This is not suitable for the medical field, especially for mobile medical devices, where computationally heavy models are impractical in real clinical environments due to computational resource constraints. Recently, state-space models (SSMs), represented by Mamba, have become a strong competitor to traditional CNNs and Transformers. In this paper, we deeply explore the key elements of parameter influence in Mamba and propose an UltraLight Vision Mamba UNet (UltraLight VM-UNet) based on this. Specifically, we propose a method for processing features in parallel Vision Mamba, named PVM Layer, which achieves excellent performance with the lowest computational load while keeping the overall number of processing channels constant. We conducted comparisons and ablation experiments with several state-of-the-art lightweight models on three public skin lesion datasets and demonstrated that the UltraLight VM-UNet remains equally competitive in performance with only 0.049M parameters and 0.060 GFLOPs. In addition, this study deeply explores the key elements of parameter influence in Mamba, which will lay a theoretical foundation for Mamba to possibly become a new mainstream module for lightweighting in the future. The code is available from https://github.com/wurenkai/UltraLight-VM-UNet.
Keyword: smartphone

There is no result

Keyword: medical volume data

There is no result

Keyword: webgpu

There is no result

Keyword: webgl

There is no result

Keyword: pre-rendering

There is no result

Keyword: prerendering

There is no result

Keyword: motion prediction

There is no result

Keyword: incremental learning

There is no result

Keyword: svm incremental

There is no result

Keyword: nerf

SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image
Authors: Yunhao Li, Xiaodong Wang, Ping Wang, Xin Yuan, Peidong Liu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2403.20018
Pdf link: https://arxiv.org/pdf/2403.20018
Abstract In this paper, we explore the potential of Snapshot Compressive Imaging (SCI) technique for recovering the underlying 3D scene representation from a single temporal compressed image. SCI is a cost-effective method that enables the recording of high-dimensional data, such as hyperspectral or temporal information, into a single image using low-cost 2D imaging sensors. To achieve this, a series of specially designed 2D masks are usually employed, which not only reduces storage requirements but also offers potential privacy protection. Inspired by this, to take one step further, our approach builds upon the powerful 3D scene representation capabilities of neural radiance fields (NeRF). Specifically, we formulate the physical imaging process of SCI as part of the training of NeRF, allowing us to exploit its impressive performance in capturing complex scene structures. To assess the effectiveness of our method, we conduct extensive evaluations using both synthetic data and real data captured by our SCI system. Extensive experimental results demonstrate that our proposed approach surpasses the state-of-the-art methods in terms of image reconstruction and novel view image synthesis. Moreover, our method also exhibits the ability to restore high frame-rate multi-view consistent images by leveraging SCI and the rendering capabilities of NeRF. The code is available at https://github.com/WU-CVGL/SCINeRF.
Keyword: multiorgan

There is no result

Keyword: multi-organ

There is no result

Keyword: multi organ

There is no result

Keyword: SAM

Keeping Up With the Winner! Targeted Advertisement to Communities in Social Networks
Authors: Shailaja Mallick, Vishwaraj Doshi, Do Young Eun
Subjects: Systems and Control (eess.SY); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2403.19903
Pdf link: https://arxiv.org/pdf/2403.19903
Abstract When a new product enters a market already dominated by an existing product, will it survive along with this dominant product? Most of the existing works have shown the coexistence of two competing products spreading/being adopted on overlaid graphs with the same set of users. However, when it comes to the survival of a weaker product on the same graph, it has been established that the stronger one dominates the market and wipes out the other. This paper takes a step towards narrowing this gap so that a new/weaker product can also survive along with its competitor with a positive market share. Specifically, we identify a locally optimal set of users to induce a community that is targeted with advertisements by the product-launching company under a given budget constraint. To this end, we model the system as competing Susceptible-Infected-Susceptible (SIS) epidemics and employ perturbation techniques to quantify and attain a positive market share in a cost-efficient manner. Our extensive simulation results with a real-world graph dataset show that with our choice of target users, a new product can establish itself with a positive market share, whereas it would otherwise be dominated and eventually wiped out of the competitive market under the same budget constraint.
Fractional Delay Alignment Modulation for Spatially Sparse Wireless Communications
Authors: Zhiwen Zhou, Zhiqiang Xiao, Yong Zeng
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2403.19951
Pdf link: https://arxiv.org/pdf/2403.19951
Abstract Delay alignment modulation (DAM) is a novel transmission technique for wireless systems with high spatial resolution that leverages delay compensation and path-based beamforming to mitigate inter-symbol interference (ISI) without resorting to complex channel equalization or multi-carrier transmission. However, most existing studies on DAM consider a simplified scenario by assuming that the channel multi-path delays are integer multiples of the signal sampling interval. This paper investigates DAM for the more general and practical scenarios with fractional multi-path delays. We first analyze the impact of fractional multi-path delays on the existing DAM design, termed integer DAM (iDAM), which can only achieve delay compensations that are integer multiples of the sampling interval. It is revealed that the existence of fractional multi-path delays makes it impossible for iDAM to achieve perfect delay alignment. To address this issue, we propose a more generic DAM design called fractional DAM (fDAM), which achieves fractional delay pre-compensation via upsampling and fractional delay filtering. By leveraging the Farrow filter structure, the proposed approach can eliminate ISI without real-time computation of filter coefficients, as typically required in traditional channel equalization techniques. Simulation results demonstrate that the proposed fDAM outperforms the existing iDAM and orthogonal frequency division multiplexing (OFDM) in terms of symbol error rate (SER) and spectral efficiency, while maintaining a peak-to-average power ratio (PAPR) comparable to that of iDAM, which is considerably lower than that of OFDM.
Multi-task Magnetic Resonance Imaging Reconstruction using Meta-learning
Authors: Wanyu Bian, Albert Jang, Fang Liu
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2403.19966
Pdf link: https://arxiv.org/pdf/2403.19966
Abstract Using single-task deep learning methods to reconstruct Magnetic Resonance Imaging (MRI) data acquired with different imaging sequences is inherently challenging. The trained deep learning model typically lacks generalizability, and the dissimilarity among image datasets with different types of contrast leads to suboptimal learning performance. This paper proposes a meta-learning approach to efficiently learn image features from multiple MR image datasets. Our algorithm can perform multi-task learning to simultaneously reconstruct MR images acquired using different imaging sequences with different image contrasts. The experimental results demonstrate the ability of our new meta-learning reconstruction method to successfully reconstruct highly undersampled k-space data from multiple MRI datasets simultaneously, outperforming other compelling reconstruction methods previously developed for single-task learning.
A multi-stage semi-supervised learning for ankle fracture classification on CT images
Authors: Hongzhi Liu, Guicheng Li, Jiacheng Nie, Hui Tang, Chunfeng Yang, Qianjin Feng, Hailin Xu, Yang Chen
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2403.19983
Pdf link: https://arxiv.org/pdf/2403.19983
Abstract Because of the complicated mechanism of ankle injury, it is very difficult to diagnose ankle fractures in the clinic. In order to simplify the process of fracture diagnosis, an automatic diagnosis model of ankle fracture is proposed. Firstly, a tibia-fibula segmentation network is proposed for the joint tibiofibular region of the ankle joint, and the corresponding segmentation dataset is established on the basis of fracture data. Secondly, an image registration method is used to register the bone segmentation mask with the normal bone mask. Finally, a semi-supervised classifier is constructed to make full use of a large amount of unlabeled data to classify ankle fractures. Experiments show that the proposed method can segment fractures with fracture lines accurately and has better performance than the general method. At the same time, this method outperforms classification networks on several metrics.
Nonparametric Bellman Mappings for Reinforcement Learning: Application to Robust Adaptive Filtering
Authors: Yuki Akiyama, Minh Vu, Konstantinos Slavakis
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2403.20020
Pdf link: https://arxiv.org/pdf/2403.20020
Abstract This paper designs novel nonparametric Bellman mappings in reproducing kernel Hilbert spaces (RKHSs) for reinforcement learning (RL). The proposed mappings benefit from the rich approximating properties of RKHSs, adopt no assumptions on the statistics of the data owing to their nonparametric nature, require no knowledge of the transition probabilities of Markov decision processes, and may operate without any training data. Moreover, they allow for sampling on-the-fly via the design of trajectory samples, re-use past test data via experience replay, effect dimensionality reduction by random Fourier features, and enable computationally lightweight operations to fit into efficient online or time-adaptive learning. The paper also offers a variational framework to design the free parameters of the proposed Bellman mappings, and shows that appropriate choices of those parameters yield several popular Bellman-mapping designs. As an application, the proposed mappings are employed to offer a novel solution to the problem of countering outliers in adaptive filtering. More specifically, with no prior information on the statistics of the outliers and no training data, a policy-iteration algorithm is introduced to select online, per time instance, the "optimal" coefficient p in the least-mean-p-power-error method. Numerical tests on synthetic data showcase, in most of the cases, the superior performance of the proposed solution over several RL and non-RL schemes.
UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation
Authors: Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2403.20035
Pdf link: https://arxiv.org/pdf/2403.20035
Abstract Traditionally, to improve the segmentation performance of models, most approaches prefer to add more complex modules. This is not suitable for the medical field, especially for mobile medical devices, where computationally heavy models are impractical in real clinical environments due to computational resource constraints. Recently, state-space models (SSMs), represented by Mamba, have become a strong competitor to traditional CNNs and Transformers. In this paper, we deeply explore the key elements of parameter influence in Mamba and propose an UltraLight Vision Mamba UNet (UltraLight VM-UNet) based on this. Specifically, we propose a method for processing features in parallel Vision Mamba, named PVM Layer, which achieves excellent performance with the lowest computational load while keeping the overall number of processing channels constant. We conducted comparisons and ablation experiments with several state-of-the-art lightweight models on three public skin lesion datasets and demonstrated that the UltraLight VM-UNet remains equally competitive in performance with only 0.049M parameters and 0.060 GFLOPs. In addition, this study deeply explores the key elements of parameter influence in Mamba, which will lay a theoretical foundation for Mamba to possibly become a new mainstream module for lightweighting in the future. The code is available from https://github.com/wurenkai/UltraLight-VM-UNet.
Exploring Pathological Speech Quality Assessment with ASR-Powered Wav2Vec2 in Data-Scarce Context
Authors: Tuan Nguyen, Corinne Fredouille, Alain Ghio, Mathieu Balaguer, Virginie Woisard
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
Arxiv link: https://arxiv.org/abs/2403.20184
Pdf link: https://arxiv.org/pdf/2403.20184
Abstract Automatic speech quality assessment has attracted growing attention as an alternative or support to traditional perceptual clinical evaluation. However, most research so far only achieves good results on simple tasks such as binary classification, largely due to data scarcity. To deal with this challenge, current works tend to segment patients' audio files into many samples to augment the datasets. Nevertheless, this approach has limitations, as it indirectly relates overall audio scores to individual segments. This paper introduces a novel approach where the system learns at the audio level instead of segments despite data scarcity. This paper proposes to use the pre-trained Wav2Vec2 architecture for both SSL and ASR as a feature extractor in speech assessment. Carried out on the HNC dataset, our ASR-driven approach established a new baseline compared with other approaches, obtaining average $MSE=0.73$ and $MSE=1.15$ for the prediction of intelligibility and severity scores respectively, using only 95 training samples. It shows that the ASR-based Wav2Vec2 model brings the best results and may indicate a strong correlation between ASR and speech quality assessment. We also measure its ability on variable segment durations and speech content, exploring factors influencing its decision.
Evolving Semantic Communication with Generative Model
Authors: Shunpu Tang, Qianqian Yang, Deniz Gündüz, Zhaoyang Zhang
Subjects: Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2403.20237
Pdf link: https://arxiv.org/pdf/2403.20237
Abstract Recently, learning-based semantic communication (SemCom) has emerged as a promising approach in the upcoming 6G network and researchers have made remarkable efforts in this field. However, existing works have yet to fully explore the advantages of the evolving nature of learning-based systems, where knowledge accumulated during transmission has the potential to enhance system performance. In this paper, we explore an evolving semantic communication system for image transmission, referred to as ESemCom, with the capability to continuously enhance transmission efficiency. The system features a novel channel-aware semantic encoder that utilizes a pre-trained Semantic StyleGAN to extract the channel-correlated latent variables consisting of several semantic vectors from the input images, which can be directly transmitted over a noisy channel without further channel coding. Moreover, we introduce a semantic caching mechanism that dynamically stores the transmitted semantic vectors in the local caching memory of both the transmitter and receiver. The cached semantic vectors are then exploited to eliminate the need to transmit similar codes in subsequent transmissions, thus further reducing communication overhead. Simulation results highlight the evolving performance of the proposed system in terms of transmission efficiency, achieving superior perceptual quality with an average bandwidth compression ratio (BCR) of 1/192 for a sequence of 100 testing images compared to DeepJSCC and Inverse JSCC with the same BCR. The code for this paper is available at https://github.com/recusant7/GAN_SeCom.

Yukeaaa / arxiv-daily

【EESS】New submissions for Mon, 1 Apr 24 #1340
