Authors: Authors: Andrew Campbell, Hang Liu, Leah Woldemariam, Anna Scaglione
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Multiagent Systems (cs.MA); Optimization and Control (math.OC)
Abstract
Recent research indicates that frequent model communication stands as a major bottleneck to the efficiency of decentralized machine learning (ML), particularly for large-scale and over-parameterized neural networks (NNs). In this paper, we introduce MALCOM-PSGD, a new decentralized ML algorithm that strategically integrates gradient compression techniques with model sparsification. MALCOM-PSGD leverages proximal stochastic gradient descent to handle the non-smoothness resulting from the $\ell_1$ regularization in model sparsification. Furthermore, we adapt vector source coding and dithering-based quantization for compressed gradient communication of sparsified models. Our analysis shows that decentralized proximal stochastic gradient descent with compressed communication has a convergence rate of $\mathcal{O}\left(\ln(t)/\sqrt{t}\right)$ assuming a diminishing learning rate and where $t$ denotes the number of iterations. Numerical results verify our theoretical findings and demonstrate that our method reduces communication costs by approximately $75\%$ when compared to the state-of-the-art method.
Does Differential Privacy Prevent Backdoor Attacks in Practice?
Authors: Authors: Fereshteh Razmi, Jian Lou, Li Xiong
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Abstract
Differential Privacy (DP) was originally developed to protect privacy. However, it has recently been utilized to secure machine learning (ML) models from poisoning attacks, with DP-SGD receiving substantial attention. Nevertheless, a thorough investigation is required to assess the effectiveness of different DP techniques in preventing backdoor attacks in practice. In this paper, we investigate the effectiveness of DP-SGD and, for the first time in literature, examine PATE in the context of backdoor attacks. We also explore the role of different components of DP algorithms in defending against backdoor attacks and will show that PATE is effective against these attacks due to the bagging structure of the teacher models it employs. Our experiments reveal that hyperparameters and the number of backdoors in the training dataset impact the success of DP algorithms. Additionally, we propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE. We conclude that while Label-DP algorithms generally offer weaker privacy protection, accurate hyper-parameter tuning can make them more effective than DP methods in defending against backdoor attacks while maintaining model accuracy.
Keyword: optimization
Automatic differentiation accelerated shape optimization approaches to photonic inverse design on rectilinear simulation grids
Authors: Authors: Sean Hooten, Peng Sun, Liron Gantz, Marco Fiorentino, Raymond G. Beausoleil, Thomas Van Vaerenbergh
Subjects: Computational Engineering, Finance, and Science (cs.CE); Optics (physics.optics)
Abstract
Shape optimization approaches to inverse design offer low-dimensional, physically-guided parameterizations of structures by representing them as combinations of shape primitives. However, on discretized rectilinear simulation grids, computing the gradient of a user objective via the adjoint variables method requires a sum reduction of the forward/adjoint field solutions and the Jacobian of the simulation material distribution with respect to the structural shape parameters. These shape parameters often perturb large or global parts of the simulation grid resulting in many non-zero Jacobian entries, which are typically computed by finite-difference in practice. Consequently, the gradient calculation can be non-trivial. In this work we propose to accelerate the gradient calculation by invoking automatic differentiation (AutoDiff) in instantiations of structural material distributions. In doing so, we develop extensible differentiable mappings from shape parameters to shape primitives and differentiable effective logic operations (denoted AutoDiffGeo). These AutoDiffGeo definitions may introduce some additional discretization error into the field solutions because they relax notions of sub-pixel smoothing along shape boundaries. However, we show that some mappings (e.g. simple cuboids) can achieve zero error with respect to volumetric averaging strategies. We demonstrate AutoDiff enhanced shape optimization using three integrated photonic examples: a multi-etch blazed grating coupler, a non-adiabatic waveguide transition taper, and a polarization-splitting grating coupler. We find accelerations of the gradient calculation by AutoDiff relative to finite-difference often exceed 50x, resulting in total wall time accelerations of 4x or more on the same hardware with little or no compromise to final device performance. Our code is available open source at https://github.com/smhooten/emopt
Prompt Engineering a Prompt Engineer
Authors: Authors: Qinyuan Ye, Maxamed Axmed, Reid Pryzant, Fereshte Khani
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
Prompt engineering is a challenging yet crucial task for optimizing the performance of large language models (LLMs). It requires complex reasoning to examine the model's errors, hypothesize what is missing or misleading in the current prompt, and communicate the task with clarity. While recent works indicate that LLMs can be meta-prompted to perform automatic prompt engineering, their potentials may not be fully untapped due to the lack of sufficient guidance to elicit complex reasoning capabilities in LLMs in the meta-prompt. In this work, we investigate the problem of "prompt engineering a prompt engineer" -- constructing a meta-prompt that more effectively guides LLMs to perform automatic prompt engineering. We introduce and analyze key components, such as a step-by-step reasoning template and context specification, which lead to improved performance. In addition, inspired by common optimization concepts such as batch size, step size and momentum, we introduce their verbalized counterparts to the meta-prompt and investigate their effects. Our final method, named PE2, finds a prompt that outperforms "let's think step by step" by 6.3% on the MultiArith dataset and 3.1% on the GSM8K dataset. To demonstrate its versatility, we apply PE2 to the Instruction Induction benchmark, a suite of counterfactual tasks, and a lengthy, real-world industrial prompt. In these settings, PE2 achieves strong performance and outperforms prior automatic prompt engineering baselines. Further, we show that PE2 makes meaningful and targeted prompt edits, amends erroneous or incomplete prompts, and presents non-trivial counterfactual reasoning abilities.
ML-based Real-Time Control at the Edge: An Approach Using hls4ml
Authors: Authors: R. Shi, S. Ogrenci, J.M. Arnold, J.R. Berlioz, P. Hanlet, K.J. Hazelwood, M.A. Ibrahim, H. Liu, V.P. Nagaslaev, A. Narayanan 1, D.J. Nicklaus, J. Mitrevski, G. Pradhan, A.L. Saewert, B.A. Schupbach, K. Seiya, M. Thieme, R.M. Thurman-Keup, N.V. Tran
Abstract
This study focuses on implementing a real-time control system for a particle accelerator facility that performs high energy physics experiments. A critical operating parameter in this facility is beam loss, which is the fraction of particles deviating from the accelerated proton beam into a cascade of secondary particles. Accelerators employ a large number of sensors to monitor beam loss. The data from these sensors is monitored by human operators who predict the relative contribution of different sub-systems to the beam loss. Using this information, they engage control interventions. In this paper, we present a controller to track this phenomenon in real-time using edge-Machine Learning (ML) and support control with low latency and high accuracy. We implemented this system on an Intel Arria 10 SoC. Optimizations at the algorithm, high-level synthesis, and interface levels to improve latency and resource usage are presented. Our design implements a neural network, which can predict the main source of beam loss (between two possible causes) at speeds up to 575 frames per second (fps) (average latency of 1.74 ms). The practical deployed system is required to operate at 320 fps, with a 3ms latency requirement, which has been met by our design successfully.
Active Admission Control in a P2P Distributed Environment for Capacity Efficient Livestreaming in Mobile Wireless Networks
Authors: Authors: Andrei Negulescu, Weijia Shang
Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Systems and Control (eess.SY)
Abstract
In this study, the Active Control in an Intelligent and Distributed Environment (ACIDE) media distribution model solution and algorithms are proposed for livestreaming in capacity efficient mobile wireless networks. The elements of the ACIDE model are a base station and a cluster formed by a number of peers able to establish peer to peer communications. The cluster peers are selected from a group of users interested in livestreaming the same media. The ACIDE model solution minimizes the bandwidth allocated to a cluster of n peers such that an uninterrupted media play for all peers is guaranteed. The livestream media is sent to the peers in packages and every media package is divided into n blocks. The blocks are distributed to the n peers of a cluster in two phases, such that the base station bandwidth is utilized during first phase only. The allocated bandwidth, the amount of bandwidth the base station has to allocate to a cluster, is minimized and its lower bound is equal to the bandwidth required for multicasting. In this study, the ACIDE model is used to address the problem of how to find the maximum number of peers n, chosen from a group of N users, that can be admitted to a cluster knowing the given allocated bandwidth, the amount of bandwidth that a base station allocates to a cluster in advance, prior to admitting users. When users become peers of an ACIDE cluster, the network capacity, the total number of users who are able to access live media, increases meaning that network resources are used more efficiently. The problem of finding the maximum number of peers n is addressed as an optimization problem, with the objective of having the entire given allocated bandwidth used by the peers admitted to the cluster. This problem is NP-complete and a non-optimal solution is proposed for peers selection such that all admitted peers play media continuously.
Generative Explanations for Graph Neural Network: Methods and Evaluations
Authors: Authors: Jialin Chen, Kenza Amara, Junchi Yu, Rex Ying
Abstract
Graph Neural Networks (GNNs) achieve state-of-the-art performance in various graph-related tasks. However, the black-box nature often limits their interpretability and trustworthiness. Numerous explainability methods have been proposed to uncover the decision-making logic of GNNs, by generating underlying explanatory substructures. In this paper, we conduct a comprehensive review of the existing explanation methods for GNNs from the perspective of graph generation. Specifically, we propose a unified optimization objective for generative explanation methods, comprising two sub-objectives: Attribution and Information constraints. We further demonstrate their specific manifestations in various generative model architectures and different explanation scenarios. With the unified objective of the explanation problem, we reveal the shared characteristics and distinctions among current methods, laying the foundation for future methodological advancements. Empirical results demonstrate the advantages and limitations of different explainability approaches in terms of explanation performance, efficiency, and generalizability.
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency
Authors: Authors: Azhar Shaikh, Michael Cochez, Denis Diachkov, Michiel de Rijcke, Sahar Yousefi
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
This paper introduces DONUT-hole, a sparse OCR-free visual document understanding (VDU) model that addresses the limitations of its predecessor model, dubbed DONUT. The DONUT model, leveraging a transformer architecture, overcoming the challenges of separate optical character recognition (OCR) and visual semantic understanding (VSU) components. However, its deployment in production environments and edge devices is hindered by high memory and computational demands, particularly in large-scale request services. To overcome these challenges, we propose an optimization strategy based on knowledge distillation and model pruning. Our paradigm to produce DONUT-hole, reduces the model denisty by 54\% while preserving performance. We also achieve a global representational similarity index between DONUT and DONUT-hole based on centered kernel alignment (CKA) metric of 0.79. Moreover, we evaluate the effectiveness of DONUT-hole in the document image key information extraction (KIE) task, highlighting its potential for developing more efficient VDU systems for logistic companies.
Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning
Abstract
Operators of Electric Autonomous Mobility-on-Demand (E-AMoD) fleets need to make several real-time decisions such as matching available cars to ride requests, rebalancing idle cars to areas of high demand, and charging vehicles to ensure sufficient range. While this problem can be posed as a linear program that optimizes flows over a space-charge-time graph, the size of the resulting optimization problem does not allow for real-time implementation in realistic settings. In this work, we present the E-AMoD control problem through the lens of reinforcement learning and propose a graph network-based framework to achieve drastically improved scalability and superior performance over heuristics. Specifically, we adopt a bi-level formulation where we (1) leverage a graph network-based RL agent to specify a desired next state in the space-charge graph, and (2) solve more tractable linear programs to best achieve the desired state while ensuring feasibility. Experiments using real-world data from San Francisco and New York City show that our approach achieves up to 89% of the profits of the theoretically-optimal solution while achieving more than a 100x speedup in computational time. Furthermore, our approach outperforms the best domain-specific heuristics with comparable runtimes, with an increase in profits by up to 3x. Finally, we highlight promising zero-shot transfer capabilities of our learned policy on tasks such as inter-city generalization and service area expansion, thus showing the utility, scalability, and flexibility of our framework.
Generative Modeling of Residuals for Real-Time Risk-Sensitive Safety with Discrete-Time Control Barrier Functions
Authors: Authors: Ryan K. Cosner, Igor Sadalski, Jana K. Woo, Preston Culbertson, Aaron D. Ames
Abstract
A key source of brittleness for robotic systems is the presence of model uncertainty and external disturbances. Most existing approaches to robust control either seek to bound the worst-case disturbance (which results in conservative behavior), or to learn a deterministic dynamics model (which is unable to capture uncertain dynamics or disturbances). This work proposes a different approach: training a state-conditioned generative model to represent the distribution of error residuals between the nominal dynamics and the actual system. In particular we introduce the Online Risk-Informed Optimization controller (ORIO), which uses Discrete-Time Control Barrier Functions, combined with a learned, generative disturbance model, to ensure the safety of the system up to some level of risk. We demonstrate our approach in both simulations and hardware, and show our method can learn a disturbance model that is accurate enough to enable risk-sensitive control of a quadrotor flying aggressively with an unmodelled slung load. We use a conditional variational autoencoder (CVAE) to learn a state-conditioned dynamics residual distribution, and find that the resulting probabilistic safety controller, which can be run at 100Hz on an embedded computer, exhibits less conservative behavior while retaining theoretical safety properties.
Scale-MIA: A Scalable Model Inversion Attack against Secure Federated Learning via Latent Space Reconstruction
Authors: Authors: Shanghao Shi, Ning Wang, Yang Xiao, Chaoyu Zhang, Yi Shi, Y.Thomas Hou, Wenjing Lou
Abstract
Federated learning is known for its capability to safeguard participants' data privacy. However, recently emerged model inversion attacks (MIAs) have shown that a malicious parameter server can reconstruct individual users' local data samples through model updates. The state-of-the-art attacks either rely on computation-intensive search-based optimization processes to recover each input batch, making scaling difficult, or they involve the malicious parameter server adding extra modules before the global model architecture, rendering the attacks too conspicuous and easily detectable. To overcome these limitations, we propose Scale-MIA, a novel MIA capable of efficiently and accurately recovering training samples of clients from the aggregated updates, even when the system is under the protection of a robust secure aggregation protocol. Unlike existing approaches treating models as black boxes, Scale-MIA recognizes the importance of the intricate architecture and inner workings of machine learning models. It identifies the latent space as the critical layer for breaching privacy and decomposes the complex recovery task into an innovative two-step process to reduce computation complexity. The first step involves reconstructing the latent space representations (LSRs) from the aggregated model updates using a closed-form inversion mechanism, leveraging specially crafted adversarial linear layers. In the second step, the whole input batches are recovered from the LSRs by feeding them into a fine-tuned generative decoder. We implemented Scale-MIA on multiple commonly used machine learning models and conducted comprehensive experiments across various settings. The results demonstrate that Scale-MIA achieves excellent recovery performance on different datasets, exhibiting high reconstruction rates, accuracy, and attack efficiency on a larger scale compared to state-of-the-art MIAs.
Interactive Motion Planning for Autonomous Vehicles via Adaptive Interactive MPC
Abstract
This article presents a new optimal control-based interactive motion planning algorithm for an autonomous vehicle interacting with a human-driven vehicle. The ego vehicle solves a joint optimization problem for its motion planning involving costs and coupled constraints of both vehicles and applies its own actions. The non-convex feasible region and lane discipline are handled by introducing integer decision variables and the resulting optimization problem is a mixed-integer quadratic program (MIQP) which is implemented via model predictive control (MPC). Furthermore, the ego vehicle imputes the cost of human-driven neighboring vehicle (NV) using an inverse optimal control method based on Karush-Kuhn-Tucker (KKT) conditions and adapts the joint optimization cost accordingly. We call the algorithm adaptive interactive mixed-integer MPC (aiMPC). Its interaction with human subjects driving the NV in a mandatory lane change scenario is tested in a developed software-and-human-in-the-loop simulator. Results show the effectiveness of the presented algorithm in terms of enhanced mobility of both the vehicles compared to baseline methods.
Diffusion Shape Prior for Wrinkle-Accurate Cloth Registration
Authors: Authors: Jingfan Guo, Fabian Prada, Donglai Xiang, Javier Romero, Chenglei Wu, Hyun Soo Park, Takaaki Shiratori, Shunsuke Saito
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Registering clothes from 4D scans with vertex-accurate correspondence is challenging, yet important for dynamic appearance modeling and physics parameter estimation from real-world data. However, previous methods either rely on texture information, which is not always reliable, or achieve only coarse-level alignment. In this work, we present a novel approach to enabling accurate surface registration of texture-less clothes with large deformation. Our key idea is to effectively leverage a shape prior learned from pre-captured clothing using diffusion models. We also propose a multi-stage guidance scheme based on learned functional maps, which stabilizes registration for large-scale deformation even when they vary significantly from training data. Using high-fidelity real captured clothes, our experiments show that the proposed approach based on diffusion models generalizes better than surface registration with VAE or PCA-based priors, outperforming both optimization-based and learning-based non-rigid registration methods for both interpolation and extrapolation tests.
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization
Authors: Authors: Jared Markowitz, Edward W. Staley
Abstract
To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences. Natural policy gradient methods, including Trust Region Policy Optimization (TRPO), seek to produce monotonic improvement through bounded changes in policy outputs. Proximal Policy Optimization (PPO) is a commonly used, first-order algorithm that instead uses loss clipping to take multiple safe optimization steps per batch of data, replacing the bound on the single step of TRPO with regularization on multiple steps. In this work, we find that the performance of PPO, when applied to continuous action spaces, may be consistently improved through a simple change in objective. Instead of the importance sampling objective of PPO, we instead recommend a basic policy gradient, clipped in an equivalent fashion. While both objectives produce biased gradient estimates with respect to the RL objective, they also both display significantly reduced variance compared to the unbiased off-policy policy gradient. Additionally, we show that (1) the clipped-objective policy gradient (COPG) objective is on average "pessimistic" compared to both the PPO objective and (2) this pessimism promotes enhanced exploration. As a result, we empirically observe that COPG produces improved learning compared to PPO in single-task, constrained, and multi-task learning, without adding significant computational cost or complexity. Compared to TRPO, the COPG approach is seen to offer comparable or superior performance, while retaining the simplicity of a first-order method.
One-shot data-driven design of fractional-order PID controller considering closed-loop stability: fictitious reference signal approach
Abstract
A one-shot data-driven tuning method for a fractional-order proportional-integral-derivative (FOPID) controller is proposed. The proposed method tunes the FOPID controller in the model-reference control formulation. A loss function is defined to evaluate the match between a given reference model and the closed-loop response while explicitly considering the closed-loop stability. A loss function value is based on the fictitious reference signal computed using the input/output data. Model matching is achieved via loss function minimization. The proposed method is simple and practical: it needs only one-shot input/output data of a plant (no plant model required), considers the bounded-input bounded-output stability of the closed-loop system, and automatically determines the appropriate parameter value via optimization. Numerical simulations show that the proposed approach facilitates good control performance, and destabilization can be avoided even if perfect model matching is unachievable.
Abstract
The primary objective of this paper is to demonstrate that problems related to stability and robust control in the harmonic context can be effectively addressed by formulating them as semidefinite optimization problems, invoking the concept of infinite-dimensional Toeplitz Block LMIs (TBLMIs). One of the central challenges tackled in this study pertains to the efficient resolution of these infinite-dimensional TBLMIs. Exploiting the structured nature of such problems, we introduce a consistent truncation method that effectively reduces the problem to a finite-dimensional convex optimization problem. By consistent we mean that the solution to this finite-dimensional problem allows to closely approximate the infinite-dimensional solution with arbitrary precision. Furthermore, we establish a link between the harmonic framework and the time domain setting, emphasizing the advantages over Periodic Differential LMIs (PDLMIs). We illustrate that our proposed framework is not only theoretically sound but also practically applicable to solving H 2 and H$\infty$ harmonic control design problems. To enable this, we extend the definitions of H 2 and H$\infty$ norms into the harmonic space, leveraging the concepts of the harmonic transfer function and the average trace operator for Toeplitz Block operators. Throughout this paper, we support our theoretical contributions with a range of illustrative examples that demonstrate the effectiveness of our approach.
Resilient and constrained consensus against adversarial attacks: A distributed MPC framework
Authors: Authors: Henglai Wei, Kunwu Zhang, Hui Zhang, Yang Shi
Abstract
There has been a growing interest in realizing the resilient consensus of the multi-agent system (MAS) under cyber-attacks, which aims to achieve the consensus of normal agents (i.e., agents without attacks) in a network, depending on the neighboring information. The literature has developed mean-subsequence-reduced (MSR) algorithms for the MAS with F adversarial attacks and has shown that the consensus is achieved for the normal agents when the communication network is at least (2F+1)-robust. However, such a stringent requirement on the communication network needs to be relaxed to enable more practical applications. Our objective is, for the first time, to achieve less stringent conditions on the network, while ensuring the resilient consensus for the general linear MAS subject to control input constraints. In this work, we propose a distributed resilient consensus framework, consisting of a pre-designed consensus protocol and distributed model predictive control (DMPC) optimization, which can help significantly reduce the requirement on the network robustness and effectively handle the general linear constrained MAS under adversarial attacks. By employing a novel distributed adversarial attack detection mechanism based on the history information broadcast by neighbors and a convex set (i.e., resilience set), we can evaluate the reliability of communication links. Moreover, we show that the recursive feasibility of the associated DMPC optimization problem can be guaranteed. The proposed consensus protocol features the following properties: 1) by minimizing a group of control variables, the consensus performance is optimized; 2) the resilient consensus of the general linear constrained MAS subject to F-locally adversarial attacks is achieved when the communication network is (F+1)-robust. Finally, numerical simulation results are presented to verify the theoretical results.
Efficient Learning of Fast Inverse Kinematics with Collision Avoidance
Authors: Authors: Johannes Tenhumberg, Arman Mielke, Berthold Bäuml
Abstract
Fast inverse kinematics (IK) is a central component in robotic motion planning. For complex robots, IK methods are often based on root search and non-linear optimization algorithms. These algorithms can be massively sped up using a neural network to predict a good initial guess, which can then be refined in a few numerical iterations. Besides previous work on learning-based IK, we present a learning approach for the fundamentally more complex problem of IK with collision avoidance. We do this in diverse and previously unseen environments. From a detailed analysis of the IK learning problem, we derive a network and unsupervised learning architecture that removes the need for a sample data generation step. Using the trained network's prediction as an initial guess for a two-stage Jacobian-based solver allows for fast and accurate computation of the collision-free IK. For the humanoid robot, Agile Justin (19 DoF), the collision-free IK is solved in less than 10 milliseconds (on a single CPU core) and with an accuracy of 10^-4 m and 10^-3 rad based on a high-resolution world model generated from the robot's integrated 3D sensor. Our method massively outperforms a random multi-start baseline in a benchmark with the 19 DoF humanoid and challenging 3D environments. It requires ten times less training time than a supervised training method while achieving comparable results.
RIGA: A Regret-Based Interactive Genetic Algorithm
Abstract
In this paper, we propose an interactive genetic algorithm for solving multi-objective combinatorial optimization problems under preference imprecision. More precisely, we consider problems where the decision maker's preferences over solutions can be represented by a parameterized aggregation function (e.g., a weighted sum, an OWA operator, a Choquet integral), and we assume that the parameters are initially not known by the recommendation system. In order to quickly make a good recommendation, we combine elicitation and search in the following way: 1) we use regret-based elicitation techniques to reduce the parameter space in a efficient way, 2) genetic operators are applied on parameter instances (instead of solutions) to better explore the parameter space, and 3) we generate promising solutions (population) using existing solving methods designed for the problem with known preferences. Our algorithm, called RIGA, can be applied to any multi-objective combinatorial optimization problem provided that the aggregation function is linear in its parameters and that a (near-)optimal solution can be efficiently determined for the problem with known preferences. We also study its theoretical performances: RIGA can be implemented in such way that it runs in polynomial time while asking no more than a polynomial number of queries. The method is tested on the multi-objective knapsack and traveling salesman problems. For several performance indicators (computation times, gap to optimality and number of queries), RIGA obtains better results than state-of-the-art algorithms.
Perceptual impact of the loss function on deep-learning image coding performance
Authors: Authors: Shima Mohammadi, Joao Ascenso
Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
Abstract
Nowadays, deep-learning image coding solutions have shown similar or better compression efficiency than conventional solutions based on hand-crafted transforms and spatial prediction techniques. These deep-learning codecs require a large training set of images and a training methodology to obtain a suitable model (set of parameters) for efficient compression. The training is performed with an optimization algorithm which provides a way to minimize the loss function. Therefore, the loss function plays a key role in the overall performance and includes a differentiable quality metric that attempts to mimic human perception. The main objective of this paper is to study the perceptual impact of several image quality metrics that can be used in the loss function of the training process, through a crowdsourcing subjective image quality assessment study. From this study, it is possible to conclude that the choice of the quality metric is critical for the perceptual performance of the deep-learning codec and that can vary depending on the image content.
Distributionally Robust Skeleton Learning of Discrete Bayesian Networks
Abstract
We consider the problem of learning the exact skeleton of general discrete Bayesian networks from potentially corrupted data. Building on distributionally robust optimization and a regression approach, we propose to optimize the most adverse risk over a family of distributions within bounded Wasserstein distance or KL divergence to the empirical distribution. The worst-case risk accounts for the effect of outliers. The proposed approach applies for general categorical random variables without assuming faithfulness, an ordinal relationship or a specific form of conditional distribution. We present efficient algorithms and show the proposed methods are closely related to the standard regularized regression approach. Under mild assumptions, we derive non-asymptotic guarantees for successful structure learning with logarithmic sample complexities for bounded-degree graphs. Numerical study on synthetic and real datasets validates the effectiveness of our method. Code is available at https://github.com/DanielLeee/drslbn.
A Compiler from Array Programs to Vectorized Homomorphic Encryption
Authors: Authors: Rolph Recto, Andrew C. Myers
Subjects: Programming Languages (cs.PL); Cryptography and Security (cs.CR)
Abstract
Homomorphic encryption (HE) is a practical approach to secure computation over encrypted data. However, writing programs with efficient HE implementations remains the purview of experts. A difficult barrier for programmability is that efficiency requires operations to be vectorized in inobvious ways, forcing efficient HE programs to manipulate ciphertexts with complex data layouts and to interleave computations with data movement primitives. We present Viaduct-HE, a compiler generates efficient vectorized HE programs. Viaduct-HE can generate both the operations and complex data layouts required for efficient HE programs. The source language of Viaduct-HE is array-oriented, enabling the compiler to have a simple representation of possible vectorization schedules. With such a representation, the compiler searches the space of possible vectorization schedules and finds those with efficient data layouts. After finding a vectorization schedule, Viaduct-HE further optimizes HE programs through term rewriting. The compiler has extension points to customize the exploration of vectorization schedules, to customize the cost model for HE programs, and to add back ends for new HE libraries. Our evaluation of the prototype Viaduct-HE compiler shows that it produces efficient vectorized HE programs with sophisticated data layouts and optimizations comparable to those designed by experts.
Abstract
Our work aims to estimate the camera motion mounted on the head of a mobile robot or a moving object from RGB-D images in a static scene. The problem of motion estimation is transformed into a nonlinear least squares function. Methods for solving such problems are iterative. Various classic methods gave an iterative solution by linearizing this function. We can also use the metaheuristic optimization method to solve this problem and improve results. In this paper, a new algorithm is developed for visual odometry using a sequence of RGB-D images. This algorithm is based on a genetic algorithm. The proposed iterative genetic algorithm searches using particles to estimate the optimal motion and then compares it to the traditional methods. To evaluate our method, we use the root mean square error to compare it with the based energy method and another metaheuristic method. We prove the efficiency of our innovative algorithm on a large set of images.
Greedy PIG: Adaptive Integrated Gradients
Authors: Authors: Kyriakos Axiotis, Sami Abu-al-haija, Lin Chen, Matthew Fahrbach, Gang Fu
Abstract
Deep learning has become the standard approach for most machine learning tasks. While its impact is undeniable, interpreting the predictions of deep learning models from a human perspective remains a challenge. In contrast to model training, model interpretability is harder to quantify and pose as an explicit optimization problem. Inspired by the AUC softmax information curve (AUC SIC) metric for evaluating feature attribution methods, we propose a unified discrete optimization framework for feature attribution and feature selection based on subset selection. This leads to a natural adaptive generalization of the path integrated gradients (PIG) method for feature attribution, which we call Greedy PIG. We demonstrate the success of Greedy PIG on a wide variety of tasks, including image feature attribution, graph compression/explanation, and post-hoc feature selection on tabular data. Our results show that introducing adaptivity is a powerful and versatile method for making attribution methods more powerful.
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Authors: Authors: Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Text-to-3D with diffusion models have achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization which suffer from slow inference, low diversity and Janus problems, or are feed-forward methods that generate low quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. We adopt a two-stage paradigm, which first generates a sparse set of four structured and consistent views from text in one shot with a fine-tuned 2D text-to-image diffusion model, and then directly regresses the NeRF from the generated images with a novel transformer-based sparse-view reconstructor. Through extensive experiments, we demonstrate that our method can generate high-quality, diverse and Janus-free 3D assets within 20 seconds, which is two order of magnitude faster than previous optimization-based methods that can take 1 to 10 hours. Our project webpage: https://jiahao.ai/instant3d/.
Keyword: adam
There is no result
Keyword: gradient
Automatic differentiation accelerated shape optimization approaches to photonic inverse design on rectilinear simulation grids
Authors: Authors: Sean Hooten, Peng Sun, Liron Gantz, Marco Fiorentino, Raymond G. Beausoleil, Thomas Van Vaerenbergh
Subjects: Computational Engineering, Finance, and Science (cs.CE); Optics (physics.optics)
Abstract
Shape optimization approaches to inverse design offer low-dimensional, physically-guided parameterizations of structures by representing them as combinations of shape primitives. However, on discretized rectilinear simulation grids, computing the gradient of a user objective via the adjoint variables method requires a sum reduction of the forward/adjoint field solutions and the Jacobian of the simulation material distribution with respect to the structural shape parameters. These shape parameters often perturb large or global parts of the simulation grid resulting in many non-zero Jacobian entries, which are typically computed by finite-difference in practice. Consequently, the gradient calculation can be non-trivial. In this work we propose to accelerate the gradient calculation by invoking automatic differentiation (AutoDiff) in instantiations of structural material distributions. In doing so, we develop extensible differentiable mappings from shape parameters to shape primitives and differentiable effective logic operations (denoted AutoDiffGeo). These AutoDiffGeo definitions may introduce some additional discretization error into the field solutions because they relax notions of sub-pixel smoothing along shape boundaries. However, we show that some mappings (e.g. simple cuboids) can achieve zero error with respect to volumetric averaging strategies. We demonstrate AutoDiff enhanced shape optimization using three integrated photonic examples: a multi-etch blazed grating coupler, a non-adiabatic waveguide transition taper, and a polarization-splitting grating coupler. We find accelerations of the gradient calculation by AutoDiff relative to finite-difference often exceed 50x, resulting in total wall time accelerations of 4x or more on the same hardware with little or no compromise to final device performance. Our code is available open source at https://github.com/smhooten/emopt
Authors: Authors: Andrew Campbell, Hang Liu, Leah Woldemariam, Anna Scaglione
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Multiagent Systems (cs.MA); Optimization and Control (math.OC)
Abstract
Recent research indicates that frequent model communication stands as a major bottleneck to the efficiency of decentralized machine learning (ML), particularly for large-scale and over-parameterized neural networks (NNs). In this paper, we introduce MALCOM-PSGD, a new decentralized ML algorithm that strategically integrates gradient compression techniques with model sparsification. MALCOM-PSGD leverages proximal stochastic gradient descent to handle the non-smoothness resulting from the $\ell_1$ regularization in model sparsification. Furthermore, we adapt vector source coding and dithering-based quantization for compressed gradient communication of sparsified models. Our analysis shows that decentralized proximal stochastic gradient descent with compressed communication has a convergence rate of $\mathcal{O}\left(\ln(t)/\sqrt{t}\right)$ assuming a diminishing learning rate and where $t$ denotes the number of iterations. Numerical results verify our theoretical findings and demonstrate that our method reduces communication costs by approximately $75\%$ when compared to the state-of-the-art method.
Abstract
Federated Learning (FL) is a promising technology that enables multiple actors to build a joint model without sharing their raw data. The distributed nature makes FL vulnerable to various poisoning attacks, including model poisoning attacks and data poisoning attacks. Today, many byzantine-resilient FL methods have been introduced to mitigate the model poisoning attack, while the effectiveness when defending against data poisoning attacks still remains unclear. In this paper, we focus on the most representative data poisoning attack - "label flipping attack" and monitor its effectiveness when attacking the existing FL methods. The results show that the existing FL methods perform similarly in Independent and identically distributed (IID) settings but fail to maintain the model robustness in Non-IID settings. To mitigate the weaknesses of existing FL methods in Non-IID scenarios, we introduce the Honest Score Client Selection (HSCS) scheme and the corresponding HSCSFL framework. In the HSCSFL, The server collects a clean dataset for evaluation. Under each iteration, the server collects the gradients from clients and then perform HSCS to select aggregation candidates. The server first evaluates the performance of each class of the global model and generates the corresponding risk vector to indicate which class could be potentially attacked. Similarly, the server evaluates the client's model and records the performance of each class as the accuracy vector. The dot product of each client's accuracy vector and global risk vector is generated as the client's host score; only the top p\% host score clients are included in the following aggregation. Finally, server aggregates the gradients and uses the outcome to update the global model. The comprehensive experimental results show our HSCSFL effectively enhances the FL robustness and defends against the "label flipping attack."
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization
Authors: Authors: Jared Markowitz, Edward W. Staley
Abstract
To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences. Natural policy gradient methods, including Trust Region Policy Optimization (TRPO), seek to produce monotonic improvement through bounded changes in policy outputs. Proximal Policy Optimization (PPO) is a commonly used, first-order algorithm that instead uses loss clipping to take multiple safe optimization steps per batch of data, replacing the bound on the single step of TRPO with regularization on multiple steps. In this work, we find that the performance of PPO, when applied to continuous action spaces, may be consistently improved through a simple change in objective. Instead of the importance sampling objective of PPO, we instead recommend a basic policy gradient, clipped in an equivalent fashion. While both objectives produce biased gradient estimates with respect to the RL objective, they also both display significantly reduced variance compared to the unbiased off-policy policy gradient. Additionally, we show that (1) the clipped-objective policy gradient (COPG) objective is on average "pessimistic" compared to both the PPO objective and (2) this pessimism promotes enhanced exploration. As a result, we empirically observe that COPG produces improved learning compared to PPO in single-task, constrained, and multi-task learning, without adding significant computational cost or complexity. Compared to TRPO, the COPG approach is seen to offer comparable or superior performance, while retaining the simplicity of a first-order method.
A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning
Authors: Authors: Valeriia Cherepanova, Roman Levin, Gowthami Somepalli, Jonas Geiping, C. Bayan Bruss, Andrew Gordon Wilson, Tom Goldstein, Micah Goldblum
Abstract
Academic tabular benchmarks often contain small sets of curated features. In contrast, data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. To prevent overfitting in subsequent downstream modeling, practitioners commonly use automated feature selection methods that identify a reduced subset of informative features. Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance. Motivated by the increasing popularity of tabular deep learning, we construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers, using real datasets and multiple methods for generating extraneous features. We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems such as selecting from corrupted or second-order features.
Optimal Privacy-Aware Dynamic Estimation
Authors: Authors: Chuanghong Weng, Ehsan Nekouei, Karl H. Johansson
Abstract
In this paper, we develop an information-theoretic framework for the optimal privacy-aware estimation of the states of a (linear or nonlinear) system. In our setup, a private process, modeled as a first-order Markov chain, derives the states of the system, and the state estimates are shared with an untrusted party who might attempt to infer the private process based on the state estimates. As the privacy metric, we use the mutual information between the private process and the state estimates. We first show that the privacy-aware estimation is a closed-loop control problem wherein the estimator controls the belief of the adversary about the private process. We also derive the Bellman optimality principle for the optimal privacy-aware estimation problem, which is used to study the structural properties of the optimal estimator. We next develop a policy gradient algorithm, for computing an optimal estimation policy, based on a novel variational formulation of the mutual information. We finally study the performance of the optimal estimator in a building automation application.
Federated Learning with Manifold Regularization and Normalized Update Reaggregation
Authors: Authors: Xuming An, Li Shen, Han Hu, Yong Luo
Abstract
Federated Learning (FL) is an emerging collaborative machine learning framework where multiple clients train the global model without sharing their own datasets. In FL, the model inconsistency caused by the local data heterogeneity across clients results in the near-orthogonality of client updates, which leads to the global update norm reduction and slows down the convergence. Most previous works focus on eliminating the difference of parameters (or gradients) between the local and global models, which may fail to reflect the model inconsistency due to the complex structure of the machine learning model and the Euclidean space's limitation in meaningful geometric representations. In this paper, we propose FedMRUR by adopting the manifold model fusion scheme and a new global optimizer to alleviate the negative impacts. Concretely, FedMRUR adopts a hyperbolic graph manifold regularizer enforcing the representations of the data in the local and global models are close to each other in a low-dimensional subspace. Because the machine learning model has the graph structure, the distance in hyperbolic space can reflect the model bias better than the Euclidean distance. In this way, FedMRUR exploits the manifold structures of the representations to significantly reduce the model inconsistency. FedMRUR also aggregates the client updates norms as the global update norm, which can appropriately enlarge each client's contribution to the global update, thereby mitigating the norm reduction introduced by the near-orthogonality of client updates. Furthermore, we theoretically prove that our algorithm can achieve a linear speedup property for non-convex setting under partial client participation.Experiments demonstrate that FedMRUR can achieve a new state-of-the-art (SOTA) accuracy with less communication.
Robust Adversarial Attacks Detection for Deep Learning based Relative Pose Estimation for Space Rendezvous
Authors: Authors: Ziwei Wang, Nabil Aouf, Jose Pizarro, Christophe Honvault
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Robotics (cs.RO)
Abstract
Research on developing deep learning techniques for autonomous spacecraft relative navigation challenges is continuously growing in recent years. Adopting those techniques offers enhanced performance. However, such approaches also introduce heightened apprehensions regarding the trustability and security of such deep learning methods through their susceptibility to adversarial attacks. In this work, we propose a novel approach for adversarial attack detection for deep neural network-based relative pose estimation schemes based on the explainability concept. We develop for an orbital rendezvous scenario an innovative relative pose estimation technique adopting our proposed Convolutional Neural Network (CNN), which takes an image from the chaser's onboard camera and outputs accurately the target's relative position and rotation. We perturb seamlessly the input images using adversarial attacks that are generated by the Fast Gradient Sign Method (FGSM). The adversarial attack detector is then built based on a Long Short Term Memory (LSTM) network which takes the explainability measure namely SHapley Value from the CNN-based pose estimator and flags the detection of adversarial attacks when acting. Simulation results show that the proposed adversarial attack detector achieves a detection accuracy of 99.21%. Both the deep relative pose estimator and adversarial attack detector are then tested on real data captured from our laboratory-designed setup. The experimental results from our laboratory-designed setup demonstrate that the proposed adversarial attack detector achieves an average detection accuracy of 96.29%.
Gradient-robust hybrid DG discretizations for the compressible Stokes equations
Authors: Authors: Philip L. Lederer, Christian Merdon
Abstract
This paper studies two hybrid discontinuous Galerkin (HDG) discretizations for the velocity-density formulation of the compressible Stokes equations with respect to several desired structural properties, namely provable convergence, the preservation of non-negativity and mass constraints for the density, and gradient-robustness. The later property dramatically enhances the accuracy in well-balanced situations, such as the hydrostatic balance where the pressure gradient balances the gravity force. One of the studied schemes employs an H(div)-conforming velocity ansatz space which ensures all mentioned properties, while a fully discontinuous method is shown to satisfy all properties but the gradient-robustness. Also higher-order schemes for both variants are presented and compared in three numerical benchmark problems. The final example shows the importance also for non-hydrostatic well-balanced states for the compressible Navier-Stokes equations.
An Interpretable Machine Learning Framework to Understand Bikeshare Demand before and during the COVID-19 Pandemic in New York City
Authors: Authors: Majbah Uddin, Ho-Ling Hwang, Md Sami Hasnine
Abstract
In recent years, bikesharing systems have become increasingly popular as affordable and sustainable micromobility solutions. Advanced mathematical models such as machine learning are required to generate good forecasts for bikeshare demand. To this end, this study proposes a machine learning modeling framework to estimate hourly demand in a large-scale bikesharing system. Two Extreme Gradient Boosting models were developed: one using data from before the COVID-19 pandemic (March 2019 to February 2020) and the other using data from during the pandemic (March 2020 to February 2021). Furthermore, a model interpretation framework based on SHapley Additive exPlanations was implemented. Based on the relative importance of the explanatory variables considered in this study, share of female users and hour of day were the two most important explanatory variables in both models. However, the month variable had higher importance in the pandemic model than in the pre-pandemic model.
Interpretable Graph Anomaly Detection using Gradient Attention Maps
Authors: Authors: Yifei Yang, Peng Wang, Xiaofan He, Dongmian Zou
Abstract
Detecting unusual patterns in graph data is a crucial task in data mining. However, existing methods often face challenges in consistently achieving satisfactory performance and lack interpretability, which hinders our understanding of anomaly detection decisions. In this paper, we propose a novel approach to graph anomaly detection that leverages the power of interpretability to enhance performance. Specifically, our method extracts an attention map derived from gradients of graph neural networks, which serves as a basis for scoring anomalies. In addition, we conduct theoretical analysis using synthetic data to validate our method and gain insights into its decision-making process. To demonstrate the effectiveness of our method, we extensively evaluate our approach against state-of-the-art graph anomaly detection techniques. The results consistently demonstrate the superior performance of our method compared to the baselines.
Greedy PIG: Adaptive Integrated Gradients
Authors: Authors: Kyriakos Axiotis, Sami Abu-al-haija, Lin Chen, Matthew Fahrbach, Gang Fu
Abstract
Deep learning has become the standard approach for most machine learning tasks. While its impact is undeniable, interpreting the predictions of deep learning models from a human perspective remains a challenge. In contrast to model training, model interpretability is harder to quantify and pose as an explicit optimization problem. Inspired by the AUC softmax information curve (AUC SIC) metric for evaluating feature attribution methods, we propose a unified discrete optimization framework for feature attribution and feature selection based on subset selection. This leads to a natural adaptive generalization of the path integrated gradients (PIG) method for feature attribution, which we call Greedy PIG. We demonstrate the success of Greedy PIG on a wide variety of tasks, including image feature attribution, graph compression/explanation, and post-hoc feature selection on tabular data. Our results show that introducing adaptivity is a powerful and versatile method for making attribution methods more powerful.
Keyword: super-resolution
Evaluation of Sampling Algorithms for a Pairwise Subjective Assessment Methodology
Abstract
Subjective assessment tests are often employed to evaluate image processing systems, notably image and video compression, super-resolution among others and have been used as an indisputable way to provide evidence of the performance of an algorithm or system. While several methodologies can be used in a subjective quality assessment test, pairwise comparison tests are nowadays attracting a lot of attention due to their accuracy and simplicity. However, the number of comparisons in a pairwise comparison test increases quadratically with the number of stimuli and thus often leads to very long tests, which is impractical for many cases. However, not all the pairs contribute equally to the final score and thus, it is possible to reduce the number of comparisons without degrading the final accuracy. To do so, pairwise sampling methods are often used to select the pairs which provide more information about the quality of each stimuli. In this paper, a reliable and much-needed evaluation procedure is proposed and used for already available methods in the literature, especially considering the case of subjective evaluation of image and video codecs. The results indicate that an appropriate selection of the pairs allows to achieve very reliable scores while requiring the comparison of a much lower number of pairs.
Keyword: sgd
MALCOM-PSGD: Inexact Proximal Stochastic Gradient Descent for Communication-Efficient Decentralized Machine Learning
Does Differential Privacy Prevent Backdoor Attacks in Practice?
Keyword: optimization
Automatic differentiation accelerated shape optimization approaches to photonic inverse design on rectilinear simulation grids
Prompt Engineering a Prompt Engineer
ML-based Real-Time Control at the Edge: An Approach Using hls4ml
Active Admission Control in a P2P Distributed Environment for Capacity Efficient Livestreaming in Mobile Wireless Networks
Generative Explanations for Graph Neural Network: Methods and Evaluations
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency
Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning
Generative Modeling of Residuals for Real-Time Risk-Sensitive Safety with Discrete-Time Control Barrier Functions
Scale-MIA: A Scalable Model Inversion Attack against Secure Federated Learning via Latent Space Reconstruction
Interactive Motion Planning for Autonomous Vehicles via Adaptive Interactive MPC
Diffusion Shape Prior for Wrinkle-Accurate Cloth Registration
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization
One-shot data-driven design of fractional-order PID controller considering closed-loop stability: fictitious reference signal approach
A TBLMI Framework for Harmonic Robust Control
Resilient and constrained consensus against adversarial attacks: A distributed MPC framework
Efficient Learning of Fast Inverse Kinematics with Collision Avoidance
RIGA: A Regret-Based Interactive Genetic Algorithm
Perceptual impact of the loss function on deep-learning image coding performance
Distributionally Robust Skeleton Learning of Discrete Bayesian Networks
A Compiler from Array Programs to Vectorized Homomorphic Encryption
Dense Visual Odometry Using Genetic Algorithm
Greedy PIG: Adaptive Integrated Gradients
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
Keyword: adam
There is no result
Keyword: gradient
Automatic differentiation accelerated shape optimization approaches to photonic inverse design on rectilinear simulation grids
MALCOM-PSGD: Inexact Proximal Stochastic Gradient Descent for Communication-Efficient Decentralized Machine Learning
Honest Score Client Selection Scheme: Preventing Federated Learning Label Flipping Attacks in Non-IID Scenarios
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization
A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning
Optimal Privacy-Aware Dynamic Estimation
Federated Learning with Manifold Regularization and Normalized Update Reaggregation
Robust Adversarial Attacks Detection for Deep Learning based Relative Pose Estimation for Space Rendezvous
Gradient-robust hybrid DG discretizations for the compressible Stokes equations
An Interpretable Machine Learning Framework to Understand Bikeshare Demand before and during the COVID-19 Pandemic in New York City
Interpretable Graph Anomaly Detection using Gradient Attention Maps
Greedy PIG: Adaptive Integrated Gradients
Keyword: super-resolution
Evaluation of Sampling Algorithms for a Pairwise Subjective Assessment Methodology