New submissions for Mon, 11 Sep 23

Keyword: sgd

There is no result

Keyword: optimization

A recommender for the management of chronic pain in patients undergoing spinal cord stimulation

Authors: Tigran Tchrakian, Mykhaylo Zayats, Alessandra Pascale, Dat Huynh, Pritish Parida, Carla Agurto Rios, Sergiy Zhuk, Jeffrey L. Rogers, ENVISION Studies Physician Author Group, Boston Scientific Research Scientists Consortium
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2309.03918
Pdf link: https://arxiv.org/pdf/2309.03918
Abstract Spinal cord stimulation (SCS) is a therapeutic approach used for the management of chronic pain. It involves the delivery of electrical impulses to the spinal cord via an implanted device, which, when given suitable stimulus parameters, can mask or block pain signals. Selection of optimal stimulation parameters usually happens in the clinic under the care of a provider, whereas at-home SCS optimization is managed by the patient. In this paper, we propose a recommender system for the management of pain in chronic pain patients undergoing SCS. In particular, we use a contextual multi-armed bandit (CMAB) approach to develop a system that recommends SCS settings to patients with the aim of improving their condition. These recommendations, sent directly to patients through a digital health ecosystem and combined with a patient monitoring system, close the therapeutic loop around a chronic pain patient over their entire patient journey. We evaluated the system in a cohort of SCS-implanted ENVISION study subjects (Clinicaltrials.gov ID: NCT03240588) using a combination of quality of life metrics and Patient States (PS), a novel measure of holistic outcomes. SCS recommendations provided statistically significant improvement in clinical outcomes (pain and/or QoL) in 85% of all subjects (N=21). Among subjects in moderate PS (N=7) prior to receiving recommendations, 100% showed statistically significant improvements and 5/7 had improved PS dwell time. This analysis suggests SCS patients may benefit from SCS recommendations, resulting in additional clinical improvement on top of benefits already received from SCS therapy.
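As a rough illustration of the contextual multi-armed bandit idea mentioned in this abstract, the sketch below keeps one running mean reward per (context, arm) pair and chooses arms epsilon-greedily. It is not the authors' system: the class name, the discrete string contexts, the arm labels, and the epsilon-greedy rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.means = defaultdict(float)   # (context, arm) -> mean observed reward

    def recommend(self, context):
        # Explore with probability epsilon, otherwise exploit the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental mean update, so no reward history needs to be stored.
        key = (context, arm)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]
```

In a loop, one would call `recommend(context)`, observe a reward for the chosen arm (here a hypothetical outcome score), and feed it back through `update`. Practical CMAB systems usually replace the table with a model that generalizes across contexts.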
Automatic Algorithm Selection for Pseudo-Boolean Optimization with Given Computational Time Limits
Authors: Catalina Pezo, Dorit Hochbaum, Julio Godoy, Roberto Asin-Acha
Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
Arxiv link: https://arxiv.org/abs/2309.03924
Pdf link: https://arxiv.org/pdf/2309.03924
Abstract Machine learning (ML) techniques have been proposed to automatically select the best solver from a portfolio of solvers, based on predicted performance. These techniques have been applied to various problems, such as Boolean Satisfiability, Traveling Salesperson, Graph Coloring, and others. These methods, known as meta-solvers, take an instance of a problem and a portfolio of solvers as input. They then predict the best-performing solver and execute it to deliver a solution. Typically, the quality of the solution improves with a longer computational time. This has led to the development of anytime selectors, which consider both the instance and a user-prescribed computational time limit. Anytime meta-solvers predict the best-performing solver within the specified time limit. Constructing an anytime meta-solver is considerably more challenging than building a meta-solver without the "anytime" feature. In this study, we focus on the task of designing anytime meta-solvers for the NP-hard optimization problem of Pseudo-Boolean Optimization (PBO), which generalizes Satisfiability and Maximum Satisfiability problems. The effectiveness of our approach is demonstrated via extensive empirical study in which our anytime meta-solver improves dramatically on the performance of Mixed Integer Programming solver Gurobi, which is the best-performing single solver in the portfolio. For example, out of all instances and time limits for which Gurobi failed to find feasible solutions, our meta-solver identified feasible solutions for 47% of these.
Improving Resnet-9 Generalization Trained on Small Datasets
Authors: Omar Mohamed Awad, Habib Hajimolahoseini, Michael Lim, Gurpreet Gosal, Walid Ahmed, Yang Liu, Gordon Deng
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.03965
Pdf link: https://arxiv.org/pdf/2309.03965
Abstract This paper presents our proposed approach that won the first prize at the ICLR competition on Hardware Aware Efficient Training. The challenge is to achieve the highest possible accuracy in an image classification task in less than 10 minutes. The training is done on a small dataset of 5000 images picked randomly from the CIFAR-10 dataset. The evaluation is performed by the competition organizers on a secret dataset with 1000 images of the same size. Our approach includes applying a series of techniques for improving the generalization of ResNet-9, including sharpness-aware optimization, label smoothing, gradient centralization, input patch whitening, and meta-learning-based training. Our experiments show that ResNet-9 can achieve an accuracy of 88% while trained only on a 10% subset of the CIFAR-10 dataset in less than 10 minutes.
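Of the techniques listed in this abstract, label smoothing is the simplest to show in isolation. The sketch below uses the common formulation in which the one-hot target is mixed with a uniform distribution; the function names and the default smoothing value of 0.1 are assumptions, not details taken from the paper.

```python
import math


def smoothed_targets(label, num_classes, smoothing=0.1):
    """Mix the one-hot target for `label` with a uniform distribution over classes."""
    uniform = smoothing / num_classes
    return [
        (1.0 - smoothing) + uniform if k == label else uniform
        for k in range(num_classes)
    ]


def cross_entropy(logits, targets):
    """Cross-entropy between soft targets and softmax(logits), via a stable log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, targets))
```

The soft targets still sum to one, so the loss stays a proper cross-entropy. The only change from the hard-label case is that the model is no longer rewarded for pushing the true-class probability all the way to 1.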
One-to-Multiple Clean-Label Image Camouflage (OmClic) based Backdoor Attack on Deep Learning
Authors: Guohong Wang, Hua Ma, Yansong Gao, Alsharif Abuadbba, Zhi Zhang, Wei Kang, Said F. Al-Sarawi, Gongxuan Zhang, Derek Abbott
Subjects: Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2309.04036
Pdf link: https://arxiv.org/pdf/2309.04036
Abstract Image camouflage has been utilized to create clean-label poisoned images for implanting a backdoor into a DL model. However, there exists a crucial limitation: one attack/poisoned image can only fit a single input size of the DL model, which greatly increases the attack budget when attacking multiple commonly adopted input sizes of DL models. This work proposes OmClic, which constructively crafts an attack image through camouflaging that can fit multiple DL models' input sizes simultaneously. Thus, through OmClic, we are able to always implant a backdoor regardless of which common input size is chosen by the user to train the DL model given the same attack budget (i.e., a fraction of the poisoning rate). With our camouflaging algorithm formulated as a multi-objective optimization, M=5 input sizes can be concurrently targeted with one attack image, whose artifacts remain almost visually imperceptible. Extensive evaluations validate that the proposed OmClic can reliably succeed in various settings using diverse types of images. Further experiments on OmClic-based backdoor insertion into DL models show that high backdoor performance (i.e., attack success rate and clean data accuracy) is achievable no matter which common input size is randomly chosen by the user to train the model. As a result, the OmClic-based backdoor attack budget is reduced by a factor of M compared to the state-of-the-art camouflage-based backdoor attack used as a baseline. Significantly, the same set of OmClic-based poisonous attack images is transferable to different model architectures for backdoor implantation.
Sample-Efficient Co-Design of Robotic Agents Using Multi-fidelity Training on Universal Policy Network
Authors: Kishan R. Nagiredla, Buddhika L. Semage, Thommen G. Karimpanal, Arun Kumar A. V, Santu Rana
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2309.04085
Pdf link: https://arxiv.org/pdf/2309.04085
Abstract Co-design involves simultaneously optimizing the controller and the agent's physical design. Its inherent bi-level optimization formulation necessitates an outer loop design optimization driven by an inner loop control optimization. This can be challenging when the design space is large and each design evaluation involves a data-intensive reinforcement learning process for control optimization. To improve the sample-efficiency, we propose a multi-fidelity-based design exploration strategy based on Hyperband where we tie the controllers learnt across the design spaces through a universal policy learner for warm-starting the subsequent controller learning problems. Further, we recommend a particular way of traversing the Hyperband generated design matrix that ensures that the stochasticity of the Hyperband is reduced the most with the increasing warm starting effect of the universal policy learner as it is strengthened with each new design evaluation. Experiments performed on a wide range of agent design problems demonstrate the superiority of our method compared to the baselines. Additionally, analysis of the optimized designs shows interesting design alterations including design simplifications and non-intuitive alterations that have emerged in the biological world.
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
Authors: Changming Xiao, Qi Yang, Feng Zhou, Changshui Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.04109
Pdf link: https://arxiv.org/pdf/2309.04109
Abstract Diffusion models have revolutionized the field of text-to-image generation recently. The unique way of fusing text and image information contributes to their remarkable capability of generating highly text-related images. From another perspective, these generative models imply clues about the precise correlation between words and pixels. In this work, a simple but effective method is proposed to utilize the attention mechanism in the denoising network of text-to-image diffusion models. Without re-training or inference-time optimization, the semantic grounding of phrases can be attained directly. We evaluate our method on Pascal VOC 2012 and Microsoft COCO 2014 under the weakly-supervised semantic segmentation setting, and our method achieves superior performance to prior methods. In addition, the acquired word-pixel correlation is found to be generalizable for the learned text embedding of customized generation methods, requiring only a few modifications. To validate our discovery, we introduce a new practical task called "personalized referring image segmentation" with a new dataset. Experiments in various situations demonstrate the advantages of our method compared to strong baselines on this task. In summary, our work reveals a novel way to extract the rich multi-modal knowledge hidden in diffusion models for segmentation.
A Two-Stage Training Framework for Joint Speech Compression and Enhancement
Authors: Jiayi Huang, Zeyu Yan, Wenbin Jiang, Fei Wen
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Arxiv link: https://arxiv.org/abs/2309.04132
Pdf link: https://arxiv.org/pdf/2309.04132
Abstract This paper considers the joint compression and enhancement problem for speech signals in the presence of noise. Recently, the SoundStream codec, which relies on end-to-end joint training of an encoder-decoder pair and a residual vector quantizer by a combination of adversarial and reconstruction losses, has shown very promising performance, especially in subjective perception quality. In this work, we provide a theoretical result to show that, to simultaneously achieve low distortion and high perception in the presence of noise, there exists an optimal two-stage optimization procedure for the joint compression and enhancement problem. This procedure first optimizes an encoder-decoder pair using only distortion loss and then fixes the encoder to optimize a perceptual decoder using perception loss. Based on this result, we construct a two-stage training framework for joint compression and enhancement of noisy speech signals. Unlike existing training methods, which are heuristic, the proposed two-stage training method has a theoretical foundation. Finally, experimental results for various noise and bit-rate conditions are provided. The results demonstrate that a codec trained by the proposed framework can outperform SoundStream and other representative codecs in terms of both objective and subjective evaluation metrics. Code is available at https://github.com/jscscloris/SEStream.
Depth Completion with Multiple Balanced Bases and Confidence for Dense Monocular SLAM
Authors: Weijian Xie, Guanyi Chu, Quanhao Qian, Yihao Yu, Hai Li, Danpeng Chen, Shangjin Zhai, Nan Wang, Hujun Bao, Guofeng Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.04145
Pdf link: https://arxiv.org/pdf/2309.04145
Abstract Dense SLAM based on monocular cameras has immense application value in the field of AR/VR, especially when it is performed on a mobile device. In this paper, we propose a novel method that integrates a light-weight depth completion network into a sparse SLAM system using a multi-basis depth representation, so that dense mapping can be performed online even on a mobile phone. Specifically, we present an optimized multi-basis depth completion network, called BBC-Net, tailored to the characteristics of traditional sparse SLAM systems. BBC-Net can predict multiple balanced bases and a confidence map from a monocular image with sparse points generated by off-the-shelf keypoint-based SLAM systems. The final depth is a linear combination of predicted depth bases that can be optimized by tuning the corresponding weights. To seamlessly incorporate the weights into traditional SLAM optimization and ensure efficiency and robustness, we design a set of depth weight factors, which makes our network a versatile plug-in module, facilitating easy integration into various existing sparse SLAM systems and significantly enhancing global depth consistency through bundle adjustment. To verify the portability of our method, we integrate BBC-Net into two representative SLAM systems. The experimental results on various datasets show that the proposed method achieves better performance in monocular dense mapping than the state-of-the-art methods. We provide an online demo running on a mobile phone, which verifies the efficiency and mapping quality of the proposed method in real-world scenarios.
Double RIS-Assisted MIMO Systems Over Spatially Correlated Rician Fading Channels and Finite Scatterers
Authors: Ha An Le, Trinh Van Chien, Van Duc Nguyen, Wan Choi
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2309.04178
Pdf link: https://arxiv.org/pdf/2309.04178
Abstract This paper investigates double RIS-assisted MIMO communication systems over Rician fading channels with finite scatterers, spatial correlation, and the existence of a double-scattering link between the transceiver. First, the statistical information is derived in closed form for the aggregated channels, unveiling various influences of the system and environment on the average channel power gains. Next, we study two active and passive beamforming designs corresponding to two objectives. The first problem maximizes channel capacity by jointly optimizing the active precoding and combining matrices at the transceivers and passive beamforming at the double RISs subject to the transmitting power constraint. In order to tackle the inherently non-convex issue, we propose an efficient alternating optimization (AO) algorithm based on the alternating direction method of multipliers (ADMM). The second problem enhances communication reliability by jointly training the encoder and decoder at the transceivers and the phase shifters at the RISs. Each neural network representing a system entity in an end-to-end learning framework is proposed to minimize the symbol error rate of the detected symbols by controlling the transceiver and the RIS phase shifts. Numerical results verify our analysis and demonstrate the superior improvements of phase shift designs to boost system performance.
Predictive and Robust Robot Assistance for Sequential Manipulation
Authors: Theodoros Stouraitis, Michael Gienger
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2309.04185
Pdf link: https://arxiv.org/pdf/2309.04185
Abstract This paper presents a novel concept to support physically impaired humans in daily object manipulation tasks with a robot. Given a user's manipulation sequence, we propose a predictive model that uniquely casts the user's sequential behavior as well as a robot support intervention into a hierarchical multi-objective optimization problem. A major contribution is the prediction formulation, which allows several different future paths to be considered concurrently. The second contribution is the encoding of a general notion of constancy constraints, which allows dependencies between consecutive or far apart keyframes (in time or space) of a sequential task to be considered. We perform numerical studies, simulations and robot experiments to analyse and evaluate the proposed method in several table top tasks where a robot supports impaired users by predicting their posture and proactively re-arranging objects.
A Tutorial on Distributed Optimization for Cooperative Robotics: from Setups and Algorithms to Toolboxes and Research Directions
Authors: Andrea Testa, Guido Carnevale, Giuseppe Notarstefano
Subjects: Robotics (cs.RO); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2309.04257
Pdf link: https://arxiv.org/pdf/2309.04257
Abstract Several interesting problems in multi-robot systems can be cast in the framework of distributed optimization. Examples include multi-robot task allocation, vehicle routing, target protection and surveillance. While the theoretical analysis of distributed optimization algorithms has received significant attention, its application to cooperative robotics has not been investigated in detail. In this paper, we show how notable scenarios in cooperative robotics can be addressed by suitable distributed optimization setups. Specifically, after a brief introduction on the widely investigated consensus optimization (most suited for data analytics) and on the partition-based setup (matching the graph structure in the optimization), we focus on two distributed settings modeling several scenarios in cooperative robotics, i.e., the so-called constraint-coupled and aggregative optimization frameworks. For each one, we consider use-case applications, and we discuss tailored distributed algorithms with their convergence properties. Then, we review state-of-the-art toolboxes allowing for the implementation of distributed schemes on real networks of robots without central coordinators. For each use case, we discuss their implementation in these toolboxes and provide simulations and real experiments on networks of heterogeneous robots.
Online Submodular Maximization via Online Convex Optimization
Authors: T. Si-Salem, G. Özcan, I. Nikolaou, E. Terzi, S. Ioannidis
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2309.04339
Pdf link: https://arxiv.org/pdf/2309.04339
Abstract We study monotone submodular maximization under general matroid constraints in the online setting. We prove that online optimization of a large class of submodular functions, namely, weighted threshold potential functions, reduces to online convex optimization (OCO). This is precisely because functions in this class admit a concave relaxation; as a result, OCO policies, coupled with an appropriate rounding scheme, can be used to achieve sublinear regret in the combinatorial setting. We show that our reduction extends to many different versions of the online learning problem, including the dynamic regret, bandit, and optimistic-learning settings.
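To make the online convex optimization side of this abstract concrete, the sketch below runs online projected gradient ascent on a concave relaxation. It assumes a single threshold term of the form min(b, <w, x>) and a plain box constraint [0, 1]^n. The matroid polytope and the rounding scheme mentioned in the abstract are deliberately left out, and all function names are hypothetical.

```python
def threshold_potential(x, weights, threshold):
    """Concave relaxation min(threshold, <weights, x>) of one threshold term."""
    return min(threshold, sum(w * xi for w, xi in zip(weights, x)))


def supergradient(x, weights, threshold):
    """A supergradient: the weights below the threshold, zero at or above it."""
    below = sum(w * xi for w, xi in zip(weights, x)) < threshold
    return [w if below else 0.0 for w in weights]


def online_gradient_ascent(rounds, dim, step=0.1):
    """Play x_t, observe (weights, threshold), then take a projected ascent step."""
    x = [0.0] * dim
    total_reward = 0.0
    for weights, threshold in rounds:
        total_reward += threshold_potential(x, weights, threshold)
        grad = supergradient(x, weights, threshold)
        # Projection onto the box [0, 1]^dim is a coordinate-wise clip.
        x = [min(1.0, max(0.0, xi + step * g)) for xi, g in zip(x, grad)]
    return x, total_reward
```

The fractional iterate `x` returned here would still need a rounding step to become a feasible set in the combinatorial problem, which is where the reduction described in the abstract does its real work.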
A Rapid Prototyping Language Workbench for Textual DSLs based on Xtext: Vision and Progress
Authors: Weixing Zhang, Jan-Philipp Steghöfer, Regina Hebig, Daniel Strüber
Subjects: Software Engineering (cs.SE)
Arxiv link: https://arxiv.org/abs/2309.04347
Pdf link: https://arxiv.org/pdf/2309.04347
Abstract Metamodel-based DSL development in language workbenches like Xtext allows language engineers to focus more on metamodels and domain concepts rather than grammar details. However, the grammar generated from metamodels often requires manual modification, which can be tedious and time-consuming. Especially when it comes to rapid prototyping and language evolution, the grammar will be generated repeatedly, which means that language engineers need to repeat such manual modification back and forth. Previous work introduced GrammarOptimizer, which automatically improves the generated grammar using optimization rules. However, the optimization rules need to be configured manually, which lacks user-friendliness and convenience. In this paper, we present our vision for and current progress towards a language workbench that integrates GrammarOptimizer's grammar optimization rules to support rapid prototyping and evolution of metamodel-based languages. It provides a visual configuration of optimization rules and a real-time preview of the effects of grammar optimization to address the limitations of GrammarOptimizer. Furthermore, it supports the inference of a grammar based on examples from model instances and offers a selection of language styles. These features aim to enhance the automation level of metamodel-based DSL development with Xtext and assist language engineers in iterative development and rapid prototyping. Our paper discusses the potential and applications of this language workbench, as well as how it fills the gaps in existing language workbenches.
Memory-Enhanced Dynamic Evolutionary Control of Reconfigurable Intelligent Surfaces
Authors: Francesco Zardi, Giacomo Oliveri, Andrea Massa
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2309.04353
Pdf link: https://arxiv.org/pdf/2309.04353
Abstract An innovative evolutionary method for the dynamic control of reconfigurable intelligent surfaces (RISs) is proposed. It leverages, on the one hand, the exploration capabilities of evolutionary strategies and their effectiveness in dealing with large-scale discrete optimization problems and, on the other hand, the implementation of memory-enhanced search mechanisms to exploit the time/space correlation of communication environments. Without modifying the base station (BS) beamforming strategy and using an accurate description of the meta-atom response to faithfully account for the micro-scale EM interactions, the RIS control (RISC) algorithm maximizes the worst-case throughput across all users without requiring that the Green's partial matrices, from the BS to the RIS and from the RIS to the users, be (separately) known/measured. Representative numerical examples are reported to illustrate the features and to assess the potential of the proposed approach for the RISC.
Data-Driven Batch Localization and SLAM Using Koopman Linearization
Authors: Zi Cong Guo, Frederike Dümbgen, James R. Forbes, Timothy D. Barfoot
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2309.04375
Pdf link: https://arxiv.org/pdf/2309.04375
Abstract We present a framework for model-free batch localization and SLAM. We use lifting functions to map a control-affine system into a high-dimensional space, where both the process model and the measurement model are rendered bilinear. During training, we solve a least-squares problem using groundtruth data to compute the high-dimensional model matrices associated with the lifted system purely from data. At inference time, we solve for the unknown robot trajectory and landmarks through an optimization problem, where constraints are introduced to keep the solution on the manifold of the lifting functions. The problem is efficiently solved using a sequential quadratic program (SQP), where the complexity of an SQP iteration scales linearly with the number of timesteps. Our algorithms, called Reduced Constrained Koopman Linearization Localization (RCKL-Loc) and Reduced Constrained Koopman Linearization SLAM (RCKL-SLAM), are validated experimentally in simulation and on two datasets: one with an indoor mobile robot equipped with a laser rangefinder that measures range to cylindrical landmarks, and one on a golf cart equipped with RFID range sensors. We compare RCKL-Loc and RCKL-SLAM with classic model-based nonlinear batch estimation. While RCKL-Loc and RCKL-SLAM have similar performance compared to their model-based counterparts, they outperform the model-based approaches when the prior model is imperfect, showing the potential benefit of the proposed data-driven technique.
ARRTOC: Adversarially Robust Real-Time Optimization and Control
Authors: Akhil Ahmed, Ehecatl Antonio del Rio-Chanona, Mehmet Mercangoz
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2309.04386
Pdf link: https://arxiv.org/pdf/2309.04386
Abstract Real-Time Optimization (RTO) plays a crucial role in the process operation hierarchy by determining optimal set-points for the lower-level controllers. However, these optimal set-points can become inoperable due to implementation errors, such as disturbances and noise, at the control layers. To address this challenge, in this paper, we present the Adversarially Robust Real-Time Optimization and Control (ARRTOC) algorithm. ARRTOC draws inspiration from adversarial machine learning, offering an online constrained Adversarially Robust Optimization (ARO) solution applied to the RTO layer. This approach identifies set-points that are both optimal and inherently robust to control layer perturbations. By integrating controller design with RTO, ARRTOC enhances overall system performance and robustness. Importantly, ARRTOC maintains versatility through a loose coupling between the RTO and control layers, ensuring compatibility with various controller architectures and RTO algorithms. To validate our claims, we present three case studies: an illustrative example, a bioreactor case study, and a multi-loop evaporator process. Our results demonstrate the effectiveness of ARRTOC in achieving the delicate balance between optimality and operability in RTO and control.
DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields
Authors: Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Arxiv link: https://arxiv.org/abs/2309.04410
Pdf link: https://arxiv.org/pdf/2309.04410
Abstract In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture. Although fine-tuning a pre-trained 3D GAN on the artistic domain can produce reasonable performance, this strategy has limitations in the 3D domain. In particular, fine-tuning can deteriorate the original GAN latent space, which affects subsequent semantic editing, and requires independent optimization and storage for each new style, limiting flexibility and efficient deployment. To overcome these challenges, we propose DeformToon3D, an effective toonification framework tailored for hierarchical 3D GAN. Our approach decomposes 3D toonification into subproblems of geometry and texture stylization to better preserve the original latent space. Specifically, we devise a novel StyleField that predicts conditional 3D deformation to align a real-space NeRF to the style space for geometry stylization. Thanks to the StyleField formulation, which already handles geometry stylization well, texture stylization can be achieved conveniently via adaptive style mixing that injects information of the artistic domain into the decoder of the pre-trained 3D GAN. Due to the unique design, our method enables flexible style degree control and shape-texture-specific style swap. Furthermore, we achieve efficient training without any real-world 2D-3D training pairs but proxy samples synthesized from off-the-shelf 2D toonification models.
Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning
Authors: Mohamadreza Jafaryani, Hamid Sheikhzadeh, Vahid Pourahmadi
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
Arxiv link: https://arxiv.org/abs/2309.04420
Pdf link: https://arxiv.org/pdf/2309.04420
Abstract Typically, voice conversion is regarded as an engineering problem with limited training data. The reliance on massive amounts of data hinders the practical applicability of deep learning approaches, which have been extensively researched in recent years. On the other hand, statistical methods are effective with limited data but have difficulties in modelling complex mapping functions. This paper proposes a voice conversion method that works with limited data and is based on stochastic variational deep kernel learning (SVDKL). At the same time, SVDKL enables the use of deep neural networks' expressive capability as well as the high flexibility of the Gaussian process as a Bayesian and non-parametric method. When the conventional kernel is combined with the deep neural network, it is possible to estimate non-smooth and more complex functions. Furthermore, the model's sparse variational Gaussian process solves the scalability problem and, unlike the exact Gaussian process, allows for the learning of a global mapping function for the entire acoustic space. One of the most important aspects of the proposed scheme is that the model parameters are trained using marginal likelihood optimization, which considers both data fitting and model complexity. Considering the complexity of the model reduces the amount of training data by increasing the resistance to overfitting. To evaluate the proposed scheme, we examined the model's performance with approximately 80 seconds of training data. The results indicated that our method obtained a higher mean opinion score, smaller spectral distortion, and better preference tests than the compared methods.
Comparative Study of Visual SLAM-Based Mobile Robot Localization Using Fiducial Markers
Authors: Jongwon Lee, Su Yeon Choi, David Hanley, Timothy Bretl
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.04441
Pdf link: https://arxiv.org/pdf/2309.04441
Abstract This paper presents a comparative study of three modes for mobile robot localization based on visual SLAM using fiducial markers (i.e., square-shaped artificial landmarks with a black-and-white grid pattern): SLAM, SLAM with a prior map, and localization with a prior map. The reason for comparing the SLAM-based approaches leveraging fiducial markers is that previous work has shown their superior performance over feature-only methods, with less computational burden compared to methods that use both feature and marker detection without compromising the localization performance. The evaluation is conducted using indoor image sequences captured with a hand-held camera containing multiple fiducial markers in the environment. The performance metrics include absolute trajectory error and runtime for the optimization process per frame. In particular, for the last two modes (SLAM and localization with a prior map), we evaluate their performance by perturbing the quality of the prior map to study the extent to which each mode is tolerant to such perturbations. Hardware experiments show consistent trajectory error levels across the three modes, with the localization mode exhibiting the shortest runtime among them. Yet, with map perturbations, SLAM with a prior map maintains performance, while localization mode degrades in both aspects.
A Generalized Stopping Criterion for Real-Time MPC with Guaranteed Stability
Authors: Authors: Kristína Fedorová, Yuning Jiang, Juraj Oravec, Colin N. Jones, Michal Kvasnica
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2309.04444
Pdf link: https://arxiv.org/pdf/2309.04444
Abstract Most of the real-time implementations of the stabilizing optimal control actions suffer from the necessity to provide high computational effort. This paper presents a cutting-edge approach for real-time evaluation of linear-quadratic model predictive control (MPC) that employs a novel generalized stopping criterion, achieving asymptotic stability in the presence of input constraints. The proposed method evaluates a fixed number of iterations independent of the initial condition, eliminating the necessity for computationally expensive methods. We demonstrate the effectiveness of the introduced technique by its implementation of two widely-used first-order optimization methods: the projected gradient descent method (PGDM) and the alternating directions method of multipliers (ADMM). The numerical simulation confirmed a significantly reduced number of iterations, resulting in suboptimality rates of less than 2%, while the effort reductions exceeded 80%. These results nominate the proposed criterion for an efficient real-time implementation method of MPC controllers.
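As a rough illustration of the projected gradient descent method named in this abstract, the sketch below runs a fixed number of iterations on a box-constrained quadratic program, the kind of problem a linear-quadratic MPC with input bounds reduces to. It does not implement the paper's stopping criterion; the function name, the problem data, and the choice of step size are assumptions made for illustration only.

```python
def projected_gradient_descent(H, g, lower, upper, step, num_iters):
    """Minimize 0.5*u'Hu + g'u subject to lower <= u <= upper.

    Runs exactly num_iters iterations (a fixed budget, chosen by the caller).
    H is a symmetric positive definite matrix given as a list of rows;
    step should not exceed 1 / (largest eigenvalue of H).
    """
    n = len(g)
    # Start from the origin, clipped into the feasible box.
    u = [min(max(0.0, lower[i]), upper[i]) for i in range(n)]
    for _ in range(num_iters):
        # Gradient of the quadratic cost: H u + g.
        grad = [sum(H[i][j] * u[j] for j in range(n)) + g[i] for i in range(n)]
        # Gradient step followed by projection onto the box (element-wise clip).
        u = [min(max(u[i] - step * grad[i], lower[i]), upper[i]) for i in range(n)]
    return u


# Hypothetical two-input example with bounds -1 <= u_i <= 1.
H = [[2.0, 1.0], [1.0, 2.0]]
u_opt = projected_gradient_descent(H, [-10.0, 0.0], [-1.0, -1.0], [1.0, 1.0],
                                   step=1.0 / 3.0, num_iters=100)
```

The projection is a simple clip because the feasible set is a box; other constraint sets would need a different projection step.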
Multi-contact Stochastic Predictive Control for Legged Robots with Contact Locations Uncertainty
Authors: Authors: Ahmad Gazar, Majid Khadiv, Andrea Del Prete, Ludovic Righetti
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2309.04469
Pdf link: https://arxiv.org/pdf/2309.04469
Abstract Trajectory optimization under uncertainties is a challenging problem for robots in contact with the environment. Such uncertainties are inevitable due to estimation errors, control imperfections, and model mismatches between planning models used for control and the real robot dynamics. This induces control policies that could violate the contact location constraints by making contact at unintended locations, and consequently lead to unsafe motion plans. This work addresses the problem of robust kino-dynamic whole-body trajectory optimization using stochastic nonlinear model predictive control (SNMPC) by considering additive uncertainties on the model dynamics subject to contact location chance-constraints as a function of the robot's full kinematics. We demonstrate the benefit of using SNMPC over classic nonlinear MPC (NMPC) for whole-body trajectory optimization in terms of contact location constraint satisfaction (safety). We run extensive Monte-Carlo simulations for a quadruped robot performing agile trotting and bounding motions over small stepping stones, where contact location satisfaction becomes critical. Our results show that SNMPC is able to perform all motions safely with a 100% success rate, while NMPC failed in 48.3% of all motions.
Keyword: adam

Sparse-DFT and WHT Precoding with Iterative Detection for Highly Frequency-Selective Channels
Authors: Authors: Roberto Bomfin, Marwa Chafii
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2309.04149
Pdf link: https://arxiv.org/pdf/2309.04149
Abstract Various precoders have been recently studied by the wireless community to combat the channel fading effects. Two prominent precoders are implemented with the discrete Fourier transform (DFT) and Walsh-Hadamard transform (WHT). The WHT precoder is implemented with less complexity since it does not need complex multiplications. Also, spreading can be applied sparsely to decrease the transceiver complexity, leading to sparse DFT (SDFT) and sparse Walsh-Hadamard (SWH). Another relevant topic is the design of iterative receivers that deal with inter-symbol-interference (ISI). In particular, many detectors based on expectation propagation (EP) have been proposed recently for channels with high levels of ISI. An alternative is the maximum a posteriori (MAP) detector, although it leads to infeasibly high complexity in many cases. In this paper, we provide a relatively low-complexity computation of the MAP detector for the SWH. We also propose two feasible methods based on the Log-MAP and Max-Log-MAP. Additionally, the DFT, SDFT and SWH precoders are compared using an EP-based receiver with one-tap FD equalization. Lastly, SWH-Max-Log-MAP is compared to the (S)DFT with EP-based receiver in terms of performance and complexity. The results show that the proposed SWH-Max-Log-MAP has a better performance and complexity trade-off for QPSK and 16-QAM under highly selective channels, but has infeasible complexity for higher QAM orders.
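The remark that the WHT needs no complex multiplications can be illustrated with a textbook fast Walsh-Hadamard transform, which uses only additions and subtractions. This is a generic, unnormalized sketch and not the paper's precoder or detector; the function name `fwht` is chosen here for illustration.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two-length sequence."""
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h positions apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v  # additions and subtractions only
        h *= 2
    return a


spread = fwht([1, 2, 3, 4])
# Applying the transform twice returns the input scaled by its length.
recovered = [v // 4 for v in fwht(spread)]
```

Because the unnormalized transform is its own inverse up to a factor of the block length, the same routine can serve for both spreading and despreading in a sketch like this.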
Keyword: gradient

Improving Resnet-9 Generalization Trained on Small Datasets
Authors: Authors: Omar Mohamed Awad, Habib Hajimolahoseini, Michael Lim, Gurpreet Gosal, Walid Ahmed, Yang Liu, Gordon Deng
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.03965
Pdf link: https://arxiv.org/pdf/2309.03965
Abstract This paper presents our proposed approach that won the first prize at the ICLR competition on Hardware Aware Efficient Training. The challenge is to achieve the highest possible accuracy in an image classification task in less than 10 minutes. The training is done on a small dataset of 5000 images picked randomly from the CIFAR-10 dataset. The evaluation is performed by the competition organizers on a secret dataset with 1000 images of the same size. Our approach includes applying a series of techniques for improving the generalization of ResNet-9, including sharpness aware optimization, label smoothing, gradient centralization, input patch whitening, as well as metalearning based training. Our experiments show that ResNet-9 can achieve an accuracy of 88% while trained only on a 10% subset of the CIFAR-10 dataset in less than 10 minutes.
DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation
Authors: Authors: Pau Mulet Arabi, Alec Flowers, Lukas Mauch, Fabien Cardinaux
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2309.03974
Pdf link: https://arxiv.org/pdf/2309.03974
Abstract Computing gradients of an expectation with respect to the distributional parameters of a discrete distribution is a problem arising in many fields of science and engineering. Typically, this problem is tackled using Reinforce, which frames the problem of gradient estimation as a Monte Carlo simulation. Unfortunately, the Reinforce estimator is especially sensitive to discrepancies between the true probability distribution and the drawn samples, a common issue in low sampling regimes that results in inaccurate gradient estimates. In this paper, we introduce DBsurf, a reinforce-based estimator for discrete distributions that uses a novel sampling procedure to reduce the discrepancy between the samples and the actual distribution. To assess the performance of our estimator, we subject it to a diverse set of tasks. Among existing estimators, DBsurf attains the lowest variance in a least squares problem commonly used in the literature for benchmarking. Furthermore, DBsurf achieves the best results for training variational auto-encoders (VAE) across different datasets and sampling setups. Finally, we apply DBsurf to build a simple and efficient Neural Architecture Search (NAS) algorithm with state-of-the-art performance.
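For context, the baseline Reinforce (score-function) estimator that this abstract builds on can be sketched as follows for a categorical distribution parameterized by softmax logits. This is the plain Monte Carlo estimator, not DBsurf and not its sampling procedure; all function names and the test objective are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]


def exact_grad(logits, f):
    """Exact gradient of E_{x ~ softmax(logits)}[f(x)] with respect to the logits."""
    p = softmax(logits)
    mean_f = sum(pk * f(k) for k, pk in enumerate(p))
    return [pk * (f(k) - mean_f) for k, pk in enumerate(p)]


def reinforce_grad(logits, f, num_samples, rng):
    """Monte Carlo estimate: average of f(x) * d/dlogits log p(x) over samples."""
    p = softmax(logits)
    grad = [0.0] * len(p)
    for _ in range(num_samples):
        x = rng.choices(range(len(p)), weights=p)[0]
        fx = f(x)
        for k in range(len(p)):
            # Score function of a softmax categorical: one_hot(x) - p.
            grad[k] += fx * ((1.0 if k == x else 0.0) - p[k])
    return [g / num_samples for g in grad]


rng = random.Random(0)
estimate = reinforce_grad([0.0, 0.0, 0.0], lambda k: float(k * k), 20000, rng)
reference = exact_grad([0.0, 0.0, 0.0], lambda k: float(k * k))
```

With few samples the drawn categories can deviate noticeably from the true probabilities, which is the source of the inaccuracy the abstract describes; comparing `estimate` with `reference` at small `num_samples` makes this visible.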
Generalized moving least squares vs. radial basis function finite difference methods for approximating surface derivatives
Authors: Authors: Andrew M. Jones, Peter A. Bosler, Paul A. Kuberry, Grady B. Wright
Subjects: Numerical Analysis (math.NA)
Arxiv link: https://arxiv.org/abs/2309.04035
Pdf link: https://arxiv.org/pdf/2309.04035
Abstract Approximating differential operators defined on two-dimensional surfaces is an important problem that arises in many areas of science and engineering. Over the past ten years, localized meshfree methods based on generalized moving least squares (GMLS) and radial basis function finite differences (RBF-FD) have been shown to be effective for this task as they can give high orders of accuracy at low computational cost, and they can be applied to surfaces defined only by point clouds. However, there have yet to be any studies that perform a direct comparison of these methods for approximating surface differential operators (SDOs). The first purpose of this work is to fill that gap. For this comparison, we focus on an RBF-FD method based on polyharmonic spline kernels and polynomials (PHS+Poly) since they are most closely related to the GMLS method. Additionally, we use a relatively new technique for approximating SDOs with RBF-FD called the tangent plane method since it is simpler than previous techniques and natural to use with PHS+Poly RBF-FD. The second purpose of this work is to relate the tangent plane formulation of SDOs to the local coordinate formulation used in GMLS and to show that they are equivalent when the tangent space to the surface is known exactly. The final purpose is to use ideas from the GMLS SDO formulation to derive a new RBF-FD method for approximating the tangent space for a point cloud surface when it is unknown. For the numerical comparisons of the methods, we examine their convergence rates for approximating the surface gradient, divergence, and Laplacian as the point clouds are refined for various parameter choices. We also compare their efficiency in terms of accuracy per computational cost, both when including and excluding setup costs.
Toward Sufficient Spatial-Frequency Interaction for Gradient-aware Underwater Image Enhancement
Authors: Authors: Chen Zhao, Weiling Cai, Chenyu Dong, Ziqi Zeng
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2309.04089
Pdf link: https://arxiv.org/pdf/2309.04089
Abstract Underwater images suffer from complex and diverse degradation, which inevitably affects the performance of underwater visual tasks. However, most existing learning-based underwater image enhancement (UIE) methods mainly restore such degradations in the spatial domain, and rarely pay attention to the Fourier frequency information. In this paper, we develop a novel UIE framework based on spatial-frequency interaction and gradient maps, namely SFGNet, which consists of two stages. Specifically, in the first stage, we propose a dense spatial-frequency fusion network (DSFFNet), mainly including our designed dense Fourier fusion block and dense spatial fusion block, achieving sufficient spatial-frequency interaction by cross connections between these two blocks. In the second stage, we propose a gradient-aware corrector (GAC) to further enhance perceptual details and geometric structures of images using the gradient map. Experimental results on two real-world underwater image datasets show that our approach can successfully enhance underwater images, and achieves competitive performance in visual quality improvement.
Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse
Authors: Authors: Edward A. Small, Jeffrey N. Clark, Christopher J. McWilliams, Kacper Sokol, Jeffrey Chan, Flora D. Salim, Raul Santos-Rodriguez
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)
Arxiv link: https://arxiv.org/abs/2309.04211
Pdf link: https://arxiv.org/pdf/2309.04211
Abstract Counterfactuals operationalised through algorithmic recourse have become a powerful tool to make artificial intelligence systems explainable. Conceptually, given an individual classified as y -- the factual -- we seek actions such that their prediction becomes the desired class y' -- the counterfactual. This process offers algorithmic recourse that is (1) easy to customise and interpret, and (2) directly aligned with the goals of each individual. However, the properties of a "good" counterfactual are still largely debated; it remains an open challenge to effectively locate a counterfactual along with its corresponding recourse. Some strategies use gradient-driven methods, but these offer no guarantees on the feasibility of the recourse and are open to adversarial attacks on carefully created manifolds. This can lead to unfairness and lack of robustness. Other methods are data-driven, which mostly addresses the feasibility problem at the expense of privacy, security and secrecy as they require access to the entire training data set. Here, we introduce LocalFACE, a model-agnostic technique that composes feasible and actionable counterfactual explanations using locally-acquired information at each step of the algorithmic recourse. Our explainer preserves the privacy of users by only leveraging data that it specifically requires to construct actionable algorithmic recourse, and protects the model by offering transparency solely in the regions deemed necessary for the intervention.
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity
Authors: Authors: Jiduan Wu, Anas Barakat, Ilyas Fatkhullin, Niao He
Subjects: Systems and Control (eess.SY); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2309.04272
Pdf link: https://arxiv.org/pdf/2309.04272
Abstract Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control and can be used (i) as a dynamic game formulation for risk-sensitive or robust control, or (ii) as a benchmark setting for multi-agent reinforcement learning with two competing agents in continuous state-control spaces. In contrast to the well-studied single-agent linear quadratic regulator problem, zero-sum LQ games entail solving a challenging nonconvex-nonconcave min-max problem with an objective function that lacks coercivity. Recently, Zhang et al. discovered an implicit regularization property of natural policy gradient methods which is crucial for safety-critical control systems since it preserves the robustness of the controller during learning. Moreover, in the model-free setting where the knowledge of model parameters is not available, Zhang et al. proposed the first polynomial sample complexity algorithm to reach an $\epsilon$-neighborhood of the Nash equilibrium while maintaining the desirable implicit regularization property. In this work, we propose a simpler nested Zeroth-Order (ZO) algorithm improving sample complexity by several orders of magnitude. Our main result guarantees a $\widetilde{\mathcal{O}}(\epsilon^{-3})$ sample complexity under the same assumptions using a single-point ZO estimator. Furthermore, when the estimator is replaced by a two-point estimator, our method enjoys a better $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity. Our key improvements rely on a more sample-efficient nested algorithm design and finer control of the ZO natural gradient estimation error.
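The two-point zeroth-order estimator mentioned in this abstract can be illustrated in its generic form: it estimates a gradient from two function evaluations along a random direction on the unit sphere. The sketch below is that generic estimator applied to a toy quadratic, not the paper's nested natural-gradient algorithm for LQ games; the function names, the smoothing radius, and the test function are assumptions made for illustration.

```python
import math
import random


def two_point_zo_gradient(f, x, radius, rng):
    """One two-point zeroth-order gradient estimate of f at x."""
    d = len(x)
    # Uniform direction on the unit sphere: normalize a Gaussian vector.
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in z))
    u = [v / norm for v in z]
    f_plus = f([xi + radius * ui for xi, ui in zip(x, u)])
    f_minus = f([xi - radius * ui for xi, ui in zip(x, u)])
    # Symmetric finite difference along u, scaled by the dimension d.
    scale = d * (f_plus - f_minus) / (2.0 * radius)
    return [scale * ui for ui in u]


def averaged_zo_gradient(f, x, radius, num_samples, rng):
    """Average several independent two-point estimates to reduce variance."""
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = two_point_zo_gradient(f, x, radius, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]


def sum_of_squares(v):
    return sum(vi * vi for vi in v)


rng = random.Random(0)
# The true gradient of sum_of_squares at this point is [2.0, -4.0, 1.0].
zo_grad = averaged_zo_gradient(sum_of_squares, [1.0, -2.0, 0.5], 0.1, 40000, rng)
```

A single-point estimator would use only one evaluation per direction; in general this gives a noisier estimate, which is consistent with the weaker sample complexity quoted for it in the abstract.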
Graph Neural Networks Use Graphs When They Shouldn't
Authors: Authors: Maya Bechler-Speicher, Ido Amos, Ran Gilad-Bachrach, Amir Globerson
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2309.04332
Pdf link: https://arxiv.org/pdf/2309.04332
Abstract Predictions over graphs play a crucial role in various domains, including social networks, molecular biology, medicine, and more. Graph Neural Networks (GNNs) have emerged as the dominant approach for learning on graph data. Instances of graph labeling problems consist of the graph-structure (i.e., the adjacency matrix), along with node-specific feature vectors. In some cases, this graph-structure is non-informative for the predictive task. For instance, molecular properties such as molar mass depend solely on the constituent atoms (node features), and not on the molecular structure. While GNNs have the ability to ignore the graph-structure in such cases, it is not clear that they will. In this work, we show that GNNs actually tend to overfit the graph-structure in the sense that they use it even when a better solution can be obtained by ignoring it. We examine this phenomenon with respect to different graph distributions and find that regular graphs are more robust to this overfitting. We then provide a theoretical explanation for this phenomenon, via analyzing the implicit bias of gradient-descent-based learning of GNNs in this setting. Finally, based on our empirical and theoretical findings, we propose a graph-editing method to mitigate the tendency of GNNs to overfit graph-structures that should be ignored. We show that this method indeed improves the accuracy of GNNs across multiple benchmarks.
A Generalized Stopping Criterion for Real-Time MPC with Guaranteed Stability
Authors: Authors: Kristína Fedorová, Yuning Jiang, Juraj Oravec, Colin N. Jones, Michal Kvasnica
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2309.04444
Pdf link: https://arxiv.org/pdf/2309.04444
Abstract Most of the real-time implementations of the stabilizing optimal control actions suffer from the necessity to provide high computational effort. This paper presents a cutting-edge approach for real-time evaluation of linear-quadratic model predictive control (MPC) that employs a novel generalized stopping criterion, achieving asymptotic stability in the presence of input constraints. The proposed method evaluates a fixed number of iterations independent of the initial condition, eliminating the necessity for computationally expensive methods. We demonstrate the effectiveness of the introduced technique by its implementation of two widely-used first-order optimization methods: the projected gradient descent method (PGDM) and the alternating directions method of multipliers (ADMM). The numerical simulation confirmed a significantly reduced number of iterations, resulting in suboptimality rates of less than 2%, while the effort reductions exceeded 80%. These results nominate the proposed criterion for an efficient real-time implementation method of MPC controllers.
Keyword: super-resolution

SRN-SZ: Deep Leaning-Based Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks
Authors: Authors: Jinyang Liu, Sheng Di, Sian Jin, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)
Arxiv link: https://arxiv.org/abs/2309.04037
Pdf link: https://arxiv.org/pdf/2309.04037
Abstract The fast growth of computational power and scales of modern super-computing systems have raised great challenges for the management of exascale scientific data. To maintain the usability of scientific data, error-bounded lossy compression has been proposed and developed as an essential technique for the size reduction of scientific data with constrained data distortion. Among the diverse datasets generated by various scientific simulations, certain datasets cannot be effectively compressed by existing error-bounded lossy compressors with traditional techniques. The recent success of Artificial Intelligence has inspired several researchers to integrate neural networks into error-bounded lossy compressors. However, those works still suffer from limited compression ratios and/or extremely low efficiencies. To address those issues and improve the compression on the hard-to-compress datasets, in this paper, we propose SRN-SZ, which is a deep learning-based scientific error-bounded lossy compressor leveraging the hierarchical data grid expansion paradigm implemented by super-resolution neural networks. SRN-SZ applies the most advanced super-resolution network HAT for its compression, which is free of time-consuming per-data training. In experiments compared with various state-of-the-art compressors, SRN-SZ achieves up to 75% compression ratio improvements under the same error bound and up to 80% compression ratio improvements under the same PSNR over the second-best compressor.
