【EESS】New submissions for Wed, 17 Apr 24

Keyword: volume render

There is no result

Keyword: volumetric render

There is no result

Keyword: remote render

There is no result

Keyword: hybrid render

There is no result

Keyword: raycast

There is no result

Keyword: medical imaging

Distributed Federated Learning-Based Deep Learning Model for Privacy MRI Brain Tumor Detection

Authors: Lisang Zhou, Meng Wang, Ning Zhou
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2404.10026
Pdf link: https://arxiv.org/pdf/2404.10026
Abstract Distributed training can facilitate the processing of large medical image datasets, and improve the accuracy and efficiency of disease diagnosis while protecting patient privacy, which is crucial for achieving efficient medical image analysis and accelerating medical research progress. This paper presents an innovative approach to medical image classification, leveraging Federated Learning (FL) to address the dual challenges of data privacy and efficient disease diagnosis. Traditional Centralized Machine Learning models, despite their widespread use in medical imaging for tasks such as disease diagnosis, raise significant privacy concerns due to the sensitive nature of patient data. As an alternative, FL emerges as a promising solution by allowing the training of a collective global model across local clients without centralizing the data, thus preserving privacy. Focusing on the application of FL in Magnetic Resonance Imaging (MRI) brain tumor detection, this study demonstrates the effectiveness of the Federated Learning framework coupled with EfficientNet-B0 and the FedAvg algorithm in enhancing both privacy and diagnostic accuracy. Through a meticulous selection of preprocessing methods, algorithms, and hyperparameters, and a comparative analysis of various Convolutional Neural Network (CNN) architectures, the research uncovers optimal strategies for image classification. The experimental results reveal that EfficientNet-B0 outperforms other models like ResNet in handling data heterogeneity and achieving higher accuracy and lower loss, highlighting the potential of FL in overcoming the limitations of traditional models. The study underscores the significance of addressing data heterogeneity and proposes further research directions for broadening the applicability of FL in medical image analysis.
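
Note: the aggregation step of the FedAvg algorithm named in the abstract above can be sketched as a sample-count-weighted average of client parameters. This is a minimal illustration, not the authors' code. The function name `fedavg`, the dict-of-lists weight layout, and the two example clients are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical weight layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def fedavg(client_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, each weighted by its local sample count."""
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's weights.
    first_weights, _ = client_updates[0]
    global_weights = {
        name: [0.0] * len(values) for name, values in first_weights.items()
    }

    # Accumulate each client's contribution, scaled by its share of the data.
    for weights, n in client_updates:
        share = n / total
        for name, values in weights.items():
            for i, v in enumerate(values):
                global_weights[name][i] += share * v
    return global_weights


# Two made-up clients: (local weights, number of local training samples).
clients = [
    ({"w": [1.0, 2.0], "b": [0.0]}, 100),
    ({"w": [3.0, 4.0], "b": [1.0]}, 300),
]
global_model = fedavg(clients)
print(global_model)
```

In a full training loop the server would send `global_model` back to the clients, each client would train locally for a few epochs, and the aggregation would repeat each round. Only weights leave the client, never the images.
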
RapidVol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans
Authors: Mark C. Eid, Pak-Hei Yeung, Madeleine K. Wyburd, João F. Henriques, Ana I.L. Namburete
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2404.10766
Pdf link: https://arxiv.org/pdf/2404.10766
Abstract Two-dimensional (2D) freehand ultrasonography is one of the most commonly used medical imaging modalities, particularly in obstetrics and gynaecology. However, it only captures 2D cross-sectional views of inherently 3D anatomies, losing valuable contextual information. As an alternative to requiring costly and complex 3D ultrasound scanners, 3D volumes can be constructed from 2D scans using machine learning. However, this usually requires long computational time. Here, we propose RapidVol: a neural representation framework to speed up slice-to-volume ultrasound reconstruction. We use tensor-rank decomposition to decompose the typical 3D volume into sets of tri-planes, and store those instead, as well as a small neural network. A set of 2D ultrasound scans, with their ground truth (or estimated) 3D position and orientation (pose), is all that is required to form a complete 3D reconstruction. Reconstructions are formed from real fetal brain scans, and then evaluated by requesting novel cross-sectional views. When compared to prior approaches based on fully implicit representation (e.g. neural radiance fields), our method is over 3x quicker, 46% more accurate, and if given inaccurate poses is more robust. Further speed-up is also possible by reconstructing from a structural prior rather than from scratch.
Keyword: medical visualization

There is no result

Keyword: interactive volume

There is no result

Keyword: rendering

There is no result

Keyword: cinematic rendering

There is no result

Keyword: volume data

There is no result

Keyword: remote visualization

There is no result

Keyword: direct volume rendering

There is no result

Keyword: mobile device

There is no result

Keyword: transfer function

Dynamic Complex-Frequency Control of Grid-Forming Converters
Authors: Roger Domingo-Enrich, Xiuqiang He, Verena Häberle, Florian Dörfler
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2404.10071
Pdf link: https://arxiv.org/pdf/2404.10071
Abstract Complex droop control, alternatively known as dispatchable virtual oscillator control (dVOC), stands out for its unique capabilities in synchronization and voltage stabilization among existing control strategies for grid-forming converters. Complex droop control leverages the novel concept of "complex frequency", thereby establishing a coupled connection between active and reactive power inputs and frequency and rate-of-change-of-voltage outputs. However, its reliance on static droop gains limits its ability to exhibit crucial dynamic response behaviors required in future power systems. To address this limitation, this paper introduces dynamic complex-frequency control, upgrading static droop gains with dynamic transfer functions to enhance the richness and flexibility in dynamic responses for frequency and voltage control. Unlike existing approaches, the complex-frequency control framework treats frequency and voltage dynamics collectively, ensuring small-signal stability for frequency synchronization and voltage stabilization simultaneously. The control framework is validated through detailed numerical case studies on the IEEE nine-bus system, also showcasing its applicability in multi-converter setups.
Keyword: retrieval

There is no result

Keyword: video retrieval

There is no result

Keyword: mobile

There is no result

Keyword: smartphone

Wireless Earphone-based Real-Time Monitoring of Breathing Exercises: A Deep Learning Approach
Authors: Hassam Khan Wazir, Zaid Waghoo, Vikram Kapila
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2404.10310
Pdf link: https://arxiv.org/pdf/2404.10310
Abstract Several therapy routines require deep breathing exercises as a key component and patients undergoing such therapies must perform these exercises regularly. Assessing the outcome of a therapy and tailoring its course necessitates monitoring a patient's compliance with the therapy. While therapy compliance monitoring is routine in a clinical environment, it is challenging to do in an at-home setting. This is so because a home setting lacks access to specialized equipment and skilled professionals needed to effectively monitor the performance of a therapy routine by a patient. For some types of therapies, these challenges can be addressed with the use of consumer-grade hardware, such as earphones and smartphones, as practical solutions. To accurately monitor breathing exercises using wireless earphones, this paper proposes a framework that has the potential for assessing a patient's compliance with an at-home therapy. The proposed system performs real-time detection of breathing phases and channels with high accuracy by processing a 500 ms audio signal through two convolutional neural networks. The first network, called a channel classifier, distinguishes between nasal and oral breathing, and a pause. The second network, called a phase classifier, determines whether the audio segment is from inhalation or exhalation. According to k-fold cross-validation, the channel and phase classifiers achieved a maximum F1 score of 97.99% and 89.46%, respectively. The results demonstrate the potential of using commodity earphones for real-time breathing channel and phase detection for breathing therapy compliance monitoring.
Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs
Authors: Georgy Perevozchikov, Nancy Mehta, Mahmoud Afifi, Radu Timofte
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2404.10700
Pdf link: https://arxiv.org/pdf/2404.10700
Abstract Modern smartphone camera quality heavily relies on the image signal processor (ISP) to enhance captured raw images, utilizing carefully designed modules to produce final output images encoded in a standard color space (e.g., sRGB). Neural-based end-to-end learnable ISPs offer promising advancements, potentially replacing traditional ISPs with their ability to adapt without requiring extensive tuning for each new camera model, as is often the case for nearly every module in traditional ISPs. However, the key challenge with the recent learning-based ISPs is the need to collect large paired datasets for each distinct camera model due to the influence of intrinsic camera characteristics on the formation of input raw images. This paper tackles this challenge by introducing a novel method for unpaired learning of raw-to-raw translation across diverse cameras. Specifically, we propose Rawformer, an unsupervised Transformer-based encoder-decoder method for raw-to-raw translation. It accurately maps raw images captured by a certain camera to the target camera, facilitating the generalization of learnable ISPs to new unseen cameras. Our method demonstrates superior performance on real camera datasets, achieving higher accuracy compared to previous state-of-the-art techniques, and preserving a more robust correlation between the original and translated raw images.
Keyword: medical volume data

There is no result

Keyword: webgpu

There is no result

Keyword: webgl

There is no result

Keyword: pre-rendering

There is no result

Keyword: prerendering

There is no result

Keyword: motion prediction

There is no result

Keyword: incremental learning

There is no result

Keyword: svm incremental

There is no result

Keyword: nerf

There is no result

Keyword: multiorgan

There is no result

Keyword: multi-organ

There is no result

Keyword: multi organ

There is no result

Keyword: SAM

Learning and Optimization for Price-based Demand Response of Electric Vehicle Charging
Authors: Chengyang Gu, Yuxin Pan, Ruohong Liu, Yize Chen
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2404.10311
Pdf link: https://arxiv.org/pdf/2404.10311
Abstract In the context of charging electric vehicles (EVs), the price-based demand response (PBDR) is becoming increasingly significant for charging load management. Such response usually encourages cost-sensitive customers to adjust their energy demand in response to changes in price for financial incentives. Thus, to model and optimize EV charging, it is important for the charging station operator to model the PBDR patterns of EV customers by precisely predicting charging demands given price signals. Then the operator refers to these demands to optimize the charging station power allocation policy. The standard pipeline involves offline fitting of a PBDR function based on historical EV charging records, followed by applying estimated EV demands in downstream charging station operation optimization. In this work, we propose a new decision-focused end-to-end framework for PBDR modeling that combines prediction errors and downstream optimization cost errors in the model learning stage. We evaluate the effectiveness of our method on a simulation of charging station operation with synthetic PBDR patterns of EV customers, and experimental results demonstrate that this framework can provide a more reliable prediction model for the ultimate optimization process, leading to more effective optimization solutions in terms of cost savings and charging station operation objectives with only a few training samples.
Adapting SAM for Surgical Instrument Tracking and Segmentation in Endoscopic Submucosal Dissection Videos
Authors: Jieming Yu, Long Bai, Guankun Wang, An Wang, Xiaoxiao Yang, Huxin Gao, Hongliang Ren
Subjects: Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2404.10640
Pdf link: https://arxiv.org/pdf/2404.10640
Abstract The precise tracking and segmentation of surgical instruments have led to a remarkable enhancement in the efficiency of surgical procedures. However, the challenge lies in achieving accurate segmentation of surgical instruments while minimizing the need for manual annotation and reducing the time required for the segmentation process. To tackle this, we propose a novel framework for surgical instrument segmentation and tracking. Specifically, with a tiny subset of frames for segmentation, we ensure accurate segmentation across the entire surgical video. Our method adopts a two-stage approach to efficiently segment videos. Initially, we utilize the Segment-Anything (SAM) model, which has been fine-tuned using Low-Rank Adaptation (LoRA) on the EndoVis17 Dataset. The fine-tuned SAM model is applied to segment the initial frames of the video accurately. Subsequently, we deploy the XMem++ tracking algorithm to follow the annotated frames, thereby facilitating the segmentation of the entire video sequence. This workflow enables us to precisely segment and track objects within the video. Through extensive evaluation on the in-distribution dataset (EndoVis17) and the out-of-distribution datasets (EndoVis18 and the endoscopic submucosal dissection surgery (ESD) dataset), our framework demonstrates exceptional accuracy and robustness, thus showcasing its potential to advance automated robotic-assisted surgery.
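
Note: the Low-Rank Adaptation (LoRA) fine-tuning mentioned in the abstract above keeps a pretrained weight matrix W0 frozen and learns a low-rank update, so the effective weight is W0 + (alpha / r) * B * A. The sketch below shows only that general formulation in plain Python. It is not the authors' code and does not reflect their rank, scaling, or choice of adapted layers. The helper names `matmul` and `lora_effective_weight` and the toy matrices are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain nested-list matrix product of x (m x k) and y (k x n)."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def lora_effective_weight(w0: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W0 + (alpha / r) * B @ A, where r is the adapter rank.

    w0 is (out x in) and stays frozen; b is (out x r) and a is (r x in),
    and only b and a would be trained.
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w0[i][j] + scale * delta[i][j] for j in range(len(w0[0]))]
        for i in range(len(w0))
    ]


# Toy 2x2 example with a rank-1 adapter (all values made up).
w0 = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [0.0]]
a = [[0.5, 0.5]]
adapted = lora_effective_weight(w0, b, a, alpha=2.0)
print(adapted)

# With B initialised to zeros, the adapted weight equals the frozen weight.
untouched = lora_effective_weight(w0, [[0.0], [0.0]], a, alpha=2.0)
print(untouched)
```

Because only B and A are trained, the number of trainable parameters per adapted layer is r * (in + out) rather than in * out, which is what makes fine-tuning a large model such as SAM on a small dataset practical.
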
AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation
Authors: Zexin Li, Yiyang Lin, Zijie Fang, Shuyan Li, Xiu Li
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2404.10714
Pdf link: https://arxiv.org/pdf/2404.10714
Abstract Different types of staining highlight different structures in organs, thereby assisting in diagnosis. However, due to the impossibility of repeated staining, we cannot obtain different types of stained slides of the same tissue area. Translating the slide that is easy to obtain (e.g., H&E) to slides of staining types difficult to obtain (e.g., MT, PAS) is a promising way to solve this problem. However, some regions are closely connected to other regions, and to maintain this connection, they often have complex structures and are difficult to translate, which may lead to wrong translations. In this paper, we propose the Attention-Based Varifocal Generative Adversarial Network (AV-GAN), which solves multiple problems in pathologic image translation tasks, such as uneven translation difficulty in different regions, mutual interference of multiple resolution information, and nuclear deformation. Specifically, we develop an Attention-Based Key Region Selection Module, which can attend to regions with higher translation difficulty. We then develop a Varifocal Module to translate these regions at multiple resolutions. Experimental results show that our proposed AV-GAN outperforms existing image translation methods with two virtual kidney tissue staining tasks and improves FID values by 15.9 and 4.16 respectively in the H&E-MT and H&E-PAS tasks.

Yukeaaa / arxiv-daily

【EESS】New submissions for Wed, 17 Apr 24 #1370
