Abstract
Stochastic gradient descent (SGD) and its many variants are widely used optimization algorithms for training deep neural networks. However, SGD suffers from inherent drawbacks, including vanishing gradients, a lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed to address these shortcomings as an effective alternative to gradient-based methods, and it has been successfully employed for training deep neural networks. However, ADMM-based optimizers have a slow convergence rate. This paper proposes an Anderson Acceleration for Deep Learning ADMM (AA-DLADMM) algorithm to tackle this drawback. The main idea of the AA-DLADMM algorithm is to apply Anderson acceleration to ADMM by treating it as a fixed-point iteration, attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-DLADMM algorithm by conducting extensive experiments on four benchmark datasets in comparison with other state-of-the-art optimizers.
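To make the fixed-point view concrete, the following is a minimal sketch of generic Anderson acceleration for a map x = g(x). It is not the AA-DLADMM algorithm itself: the function name `anderson`, the memory size `m`, and the toy linear map used in the usage note are illustrative assumptions, and in the paper's setting g would stand for one full ADMM sweep.

```python
import numpy as np

def anderson(g, x0, m=3, iters=50, tol=1e-10):
    """Anderson acceleration for the fixed-point problem x = g(x).

    Hypothetical helper for illustration; g is any fixed-point map.
    Returns the final iterate and the number of iterations used.
    """
    x = np.asarray(x0, dtype=float)
    G_hist, F_hist = [], []  # recent values g(x_k) and residuals f_k = g(x_k) - x_k
    for k in range(iters):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        G_hist.append(gx)
        F_hist.append(f)
        # Keep at most m + 1 entries, which yields at most m difference columns.
        G_hist, F_hist = G_hist[-(m + 1):], F_hist[-(m + 1):]
        if len(F_hist) == 1:
            x = gx  # plain fixed-point step until there is some history
        else:
            dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
            dG = np.column_stack([G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)])
            # Least-squares coefficients that best cancel the current residual.
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma
    return x, iters
```

As a usage example, with the contraction `g = lambda x: A @ x + b`, where `A = np.array([[0.5, 0.2], [0.1, 0.4]])` and `b = np.array([1.0, 2.0])`, calling `anderson(g, np.zeros(2))` returns an iterate that agrees with `np.linalg.solve(np.eye(2) - A, b)`.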
Abstract
Law enforcement officials depend heavily on Forensic Video Analytic (FVA) software in their evidence extraction process. However, present-day FVA software is complex, time-consuming, equipment-dependent, and expensive. Developing countries struggle to gain access to this gateway to a secure haven. The term forensic pertains to the application of scientific methods to the investigation of crime through post-processing, whereas surveillance is the close monitoring of real-time feeds. The principal objective of this Final Year Project was to develop an efficient and effective FVA software, addressing these shortcomings through a stringent and systematic review of scholarly research papers, online databases, and legal documentation. The scope spans multiple object detection, multiple object tracking, anomaly detection, activity recognition, tampering detection, general and specific image enhancement, and video synopsis. Methods employed include machine learning techniques, GPU acceleration, and efficient, integrated architecture development for both real-time processing and post-processing. For this, CNN, GMM, multithreading, and OpenCV C++ coding were used. The proposed methodology would substantially speed up the FVA process, especially through the novel video synopsis research arena. This project has resulted in three research outcomes: Moving Object Based Collision Free Video Synopsis, Forensic and Surveillance Analytic Tool Architecture, and Tampering Detection Inter-Frame Forgery. The results include forensic and surveillance panel outcomes with emphasis on video synopsis and the Sri Lankan context. Principal conclusions include the optimization and efficient algorithm integration needed to overcome limitations in processing power and memory and to balance real-time performance against accuracy.
Multidimensional extrapolated global proximal gradient and applications for image processing
Abstract
The proximal gradient method is a generic technique introduced to tackle non-smoothness in optimization problems in which the objective function is expressed as the sum of a differentiable convex part and a non-differentiable regularization term. Such problems in tensor format are of interest in many fields of applied mathematics, such as image and video processing. Our goal in this paper is to address the solution of such problems with a more general form of the regularization term. An adapted iterative proximal gradient method is introduced for this purpose. Because the proposed algorithm converges slowly, we use new tensor extrapolation methods to enhance its convergence. Numerical experiments on color image deblurring are conducted to illustrate the efficiency of our approach.
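As a point of reference, the basic proximal gradient iteration can be sketched for the simplest vector case, a least-squares term plus an l1 regularizer. This is only an illustration under stated assumptions: the paper works with tensors, a more general regularization term, and extrapolation, none of which are reproduced here, and the names `soft_threshold` and `proximal_gradient` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(A, b, lam, iters=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with a fixed step 1/L."""
    # L is the Lipschitz constant of the gradient of the smooth part.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                    # gradient step on the smooth part
        x = soft_threshold(x - grad / L, lam / L)   # proximal step on the l1 part
    return x
```

With `A` equal to the identity, the minimizer is the soft-thresholded data itself, which gives a quick sanity check: `proximal_gradient(np.eye(3), np.array([3.0, -0.5, 1.0]), lam=1.0)` yields `[2, 0, 0]` up to the sign of zero.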
Reliability-Optimized User Admission Control for URLLC Traffic: A Neural Contextual Bandit Approach
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Abstract
Ultra-reliable low-latency communication (URLLC) is the cornerstone for a broad range of emerging services in next-generation wireless networks. URLLC fundamentally relies on the network's ability to proactively determine whether sufficient resources are available to support the URLLC traffic, and thus prevent so-called cell overloads. Nonetheless, achieving accurate quality-of-service (QoS) predictions for URLLC user equipment (UEs) and preventing cell overloads are very challenging tasks. This is due to the dependency of the QoS metrics (latency and reliability) on traffic and channel statistics, users' mobility, and interdependent performance across UEs. In this paper, a new QoS-aware UE admission control approach is developed to proactively estimate QoS for URLLC UEs prior to associating them with a cell and, accordingly, admit only a subset of UEs that do not lead to a cell overload. To this end, an optimization problem is formulated to find an efficient UE admission control policy, cognizant of UEs' QoS requirements and cell-level load dynamics. To solve this problem, a new machine learning based method is proposed that builds on (deep) neural contextual bandits, a suitable framework for dealing with nonlinear bandit problems. Specifically, the UE admission controller is treated as a bandit agent that observes a set of network measurements (context) and makes admission control decisions based on context-dependent QoS (reward) predictions. The simulation results show that the proposed scheme can achieve near-optimal performance and yield substantial gains in terms of cell-level service reliability and efficient resource utilization.
Energy-efficient Decentralized Learning via Graph Sparsification
Authors: Xusheng Zhang, Cho-Chun Chiu, Ting He
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Abstract
This work aims at improving the energy efficiency of decentralized learning by optimizing the mixing matrix, which controls the communication demands during the learning process. Through rigorous analysis based on a state-of-the-art decentralized learning algorithm, the problem is formulated as a bi-level optimization, with the lower level solved by graph sparsification. A solution with guaranteed performance is proposed for the special case of fully-connected base topology and a greedy heuristic is proposed for the general case. Simulations based on real topology and dataset show that the proposed solution can lower the energy consumption at the busiest node by 54%-76% while maintaining the quality of the trained model.
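To illustrate what a mixing matrix is, the sketch below builds one from an undirected communication graph using Metropolis weights, a common textbook construction. This is an assumption made for illustration and not the optimized matrix from the paper; the function name `metropolis_mixing_matrix` is hypothetical. A zero entry means the two nodes never exchange parameters, which is why sparsifying the graph reduces communication.

```python
import numpy as np

def metropolis_mixing_matrix(n, edges):
    """Symmetric, doubly stochastic mixing matrix for an undirected graph.

    n     : number of nodes, labeled 0..n-1
    edges : iterable of (i, j) pairs with i != j, each link listed once
    """
    deg = np.zeros(n, dtype=int)
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = np.zeros((n, n))
    for i, j in edges:
        # Metropolis weight: depends only on the degrees of the two endpoints.
        W[i, j] = W[j, i] = 1.0 / (1 + max(deg[i], deg[j]))
    # Self-weights make every row (and, by symmetry, every column) sum to one.
    W[np.diag_indices(n)] = 1.0 - W.sum(axis=1)
    return W
```

For the three-node path graph `metropolis_mixing_matrix(3, [(0, 1), (1, 2)])`, nodes 0 and 2 share no link, so the corresponding entries of the matrix are zero.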
Continuously bounds-preserving discontinuous Galerkin methods for hyperbolic conservation laws
Abstract
For finite element approximations of transport phenomena, it is often necessary to apply a form of limiting to ensure that the discrete solution remains well-behaved and satisfies physical constraints. However, these limiting procedures are typically performed at discrete nodal locations, which is not sufficient to ensure the robustness of the scheme when the solution must be evaluated at arbitrary locations (e.g., for adaptive mesh refinement, remapping in arbitrary Lagrangian-Eulerian solvers, overset meshes, etc.). In this work, a novel limiting approach for discontinuous Galerkin methods is presented which ensures that the solution is continuously bounds-preserving (i.e., across the entire solution polynomial) for any arbitrary choice of basis, approximation order, and mesh element type. Through a modified formulation for the constraint functionals, the proposed approach requires only the solution of a single spatial scalar minimization problem per element, for which a highly efficient numerical optimization procedure is presented. The efficacy of this approach is shown in numerical experiments by enforcing continuous constraints in high-order unstructured discontinuous Galerkin discretizations of hyperbolic conservation laws, ranging from scalar transport with maximum principle preserving constraints to compressible gas dynamics with positivity-preserving constraints.
Finite Expression Method for Learning Dynamics on Complex Networks
Authors: Zezheng Song, Chunmei Wang, Haizhao Yang
Abstract
Complex network data pervades various real-world domains, including physical, technological, and biological systems. Despite the prevalence of such data, predicting trends and understanding behavioral patterns in complex systems remains challenging due to poorly understood underlying mechanisms. While data-driven methods have made strides in uncovering governing equations from time series data, efforts to extract physical laws from network data are limited and often struggle with incomplete or noisy data. To address these challenges, we introduce a novel approach called the Finite Expression Method (FEX) and its fast algorithm for this learning problem on complex networks. FEX represents dynamics on complex networks using binary trees composed of finite mathematical operators. The nodes within these trees are trained through a combinatorial optimization process guided by reinforcement learning techniques. This unique configuration allows FEX to capture complex dynamics with minimal prior knowledge of the system and a small dictionary of mathematical operators. Our extensive numerical experiments demonstrate that FEX excels in accurately identifying dynamics across diverse network topologies and dynamic behaviors.
To Balance or to Not? Battery Aging-Aware Active Cell Balancing for Electric Vehicles
Abstract
Due to manufacturing variabilities and temperature gradients within an electric vehicle's battery pack, the capacities of its cells decrease differently over time. This reduces the usable capacity of the battery: the charge levels of one or more cells might be at the minimum threshold while most of the other cells still have residual charge. Active cell balancing (i.e., transferring charge among cells) can equalize their charge levels, thereby increasing the battery pack's usable capacity. But performing balancing means additional charge transfer, which can result in energy loss and cell aging, akin to memory aging in storage technologies due to writing. This paper studies when cell balancing should be triggered to minimize aging while maintaining the necessary driving capability. In particular, we propose optimization strategies for cell balancing that minimize its impact on aging. Borrowing terminology from the storage domain, we refer to this as "wear leveling-aware" active balancing.
Estimating the Lateral Motion States of an Underwater Robot by Propeller Wake Sensing Using an Artificial Lateral Line
Abstract
An artificial lateral line (ALL) is a bioinspired flow sensing system of an underwater robot that consists of distributed flow sensors. The ALL has achieved great success in sensing the motion states of bioinspired underwater robots, e.g., robotic fish, that are driven by body undulation and/or tail flapping. However, the ALL has not been systematically tested and studied in the sensing of underwater robots driven by rotating propellers due to the highly dynamic and complex flow field therein. This paper makes a bold hypothesis that the distributed flow measurements sampled from the propeller wake flow, although insufficient to represent the entire flow dynamics, provide sufficient information for estimating the lateral motion states of the leader underwater robot. An experimental testbed is constructed to investigate the feasibility of such a state estimator; it comprises a cylindrical ALL sensory system, a rotating leader propeller, and a water tank with a planar sliding guide. Specifically, a hybrid network that consists of a one-dimensional convolution network (1DCNN) and a bidirectional long short-term memory network (BiLSTM) is designed to extract the spatiotemporal features of the time series of distributed pressure measurements. A multi-output deep learning network is adopted to estimate the lateral motion states of the leader propeller. In addition, the state estimator is optimized using the whale optimization algorithm (WOA) considering the comprehensive estimation performance. Extensive experiments are conducted, the results of which validate the proposed data-driven algorithm in estimating the motion states of the leader underwater robot by propeller wake sensing.
A fast offline/online forward solver for stationary transport equation with multiple inflow boundary conditions and varying coefficients
Abstract
It is of great interest to solve the inverse problem of the stationary radiative transport equation (RTE) in optical tomography. The standard way is to formulate the inverse problem as an optimization problem, but the bottleneck is that one has to solve the forward problem repeatedly, which is time-consuming. Due to the optical properties of biological tissue, in real applications, optically thin and thick regions coexist and are adjacent to each other, and the geometry can be complex. To use coarse meshes and save computational cost, the forward solver has to be asymptotic preserving across the interface (APAL). In this paper, we propose an offline/online solver for the RTE. The cost at the offline stage is comparable to classical methods, while the cost at the online stage is much lower. Two cases are considered. One is to solve the RTE with fixed scattering and absorption cross sections while the boundary conditions vary; the other is when cross sections vary in a small domain and the boundary conditions change many times. The solver can be decomposed into offline/online stages in these two cases. One only needs to calculate the offline stage once and update the online stage when the parameters vary. Our proposed solver is much cheaper when one needs to solve the RTE with multiple right-hand sides or when the cross sections vary in a small domain, and can thus accelerate the solution of inverse RTE problems. We illustrate the online/offline decomposition based on the Tailored Finite Point Method (TFPM), which is APAL on general quadrilateral meshes.
SeqNAS: Neural Architecture Search for Event Sequence Classification
Authors: Igor Udovichenko, Egor Shvetsov, Denis Divitsky, Dmitry Osin, Ilya Trofimov, Anatoly Glushenko, Ivan Sukharev, Dmitry Berestenev, Evgeny Burnaev
Abstract
Neural Architecture Search (NAS) methods are widely used in various industries to obtain high-quality, task-specific solutions with minimal human intervention. Event sequences find widespread use in various industrial applications, including churn prediction, customer segmentation, fraud detection, and fault diagnosis, among others. Such data consist of categorical and real-valued components with irregular timestamps. Despite the usefulness of NAS methods, previous approaches have only been applied to other domains: images, texts, or time series. Our work addresses this limitation by introducing a novel NAS algorithm, SeqNAS, specifically designed for event sequence classification. We develop a simple yet expressive search space that leverages commonly used building blocks for event sequence classification, including multi-head self-attention, convolutions, and recurrent cells. To perform the search, we adopt sequential Bayesian Optimization and utilize previously trained models as an ensemble of teachers to augment knowledge distillation. As a result of our work, we demonstrate that our method surpasses state-of-the-art NAS methods and popular architectures suitable for sequence classification and holds great potential for various industrial applications.
Size Minimization For Multi-Output AND-Functions
Authors: Susanne Armbruster
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC)
Abstract
Recent improvements in adder optimization could be achieved by optimizing the AND-trees occurring within the constructed circuits. The overlap of such trees and its potential for pure size optimization has not been taken into account though. Motivated by this, we examine the fundamental problem of minimizing the size of a circuit for multiple AND-functions on intersecting variable sets. Our formulation generalizes the overlapping AND-trees within adder optimization but is in NP, in contrast to general Boolean circuit optimization which is in $\Sigma_2^p$ (and thus suspected not to be in NP). While restructuring the AND- or XOR-trees simultaneously, we optimize the total number of gates needed for all functions to be computed. We show that this problem is APX-hard already for functions of few variables and present efficient approximation algorithms for the case in which the Boolean functions depend on at most 3 or 4 variables each, achieving guarantees of $\frac{4}{3}$ and $1.9$, respectively. To conclude, we give a polynomial approximation algorithm with guarantee $\frac{2}{3}k$ for AND-functions of up to $k$ variables. To achieve these results, the key technique is to determine how much overlap among the variable sets makes tree construction cheap and how little makes the optimum solution large.
A General and Scalable Method for Optimizing Real-Time Systems
Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng
Abstract
In real-time systems optimization, designers often face a challenging problem posed by non-convex and non-continuous schedulability conditions, which may even lack an analytical form from which to understand their properties. To tackle this problem, we treat the schedulability analysis as a black box that only returns true/false results. We propose a general and scalable framework to optimize real-time systems, named Numerical Optimizer with Real-Time Highlight (NORTH). NORTH is built upon the gradient-based active-set methods from the numerical optimization literature but with new methods to manage active constraints for the non-differentiable schedulability constraints. In addition, we generalize NORTH to NORTH+, which collaboratively optimizes certain types of discrete variables (e.g., priority assignments, categorical variables) with continuous variables based on numerical optimization algorithms. We demonstrate the algorithm performance with two example applications: energy minimization based on dynamic voltage and frequency scaling (DVFS), and optimization of control system performance. In these experiments, NORTH achieved $10^2$ to $10^5$ times speed improvements over state-of-the-art methods while maintaining similar or better solution quality. NORTH+ outperforms NORTH by 30% with similar algorithm scalability. Both NORTH and NORTH+ support black-box schedulability analysis, ensuring broad applicability.
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Abstract
We seek to understand what facilitates sample-efficient learning from historical datasets for sequential decision-making, a problem that is popularly known as offline reinforcement learning (RL). Further, we are interested in algorithms that enjoy sample efficiency while leveraging (value) function approximation. In this paper, we address these fundamental questions by (i) proposing a notion of data diversity that subsumes the previous notions of coverage measures in offline RL and (ii) using this notion to unify three distinct classes of offline RL algorithms based on version spaces (VS), regularized optimization (RO), and posterior sampling (PS). We establish that VS-based, RO-based, and PS-based algorithms, under standard assumptions, achieve comparable sample efficiency, which recovers the state-of-the-art sub-optimality bounds for finite and linear model classes with the standard assumptions. This result is surprising, given that the prior work suggested an unfavorable sample complexity of the RO-based algorithm compared to the VS-based algorithm, whereas posterior sampling is rarely considered in offline RL due to its explorative nature. Notably, our proposed model-free PS-based algorithm for offline RL is novel, with sub-optimality bounds that are frequentist (i.e., worst-case) in nature.
Addressing The Knapsack Challenge Through Cultural Algorithm Optimization
Authors: Mohammad Saleh Vahdatpour
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
The "0-1 knapsack problem" is a classical combinatorial optimization problem that requires selecting a subset of items from a given set. Each item has a value and a weight, and the objective is to find a selection that maximizes the total value while respecting a predefined capacity constraint. In this research paper, we introduce a novel variant of Cultural Algorithms tailored specifically for solving 0-1 knapsack problems. Our proposed algorithm incorporates a belief space to refine the population and introduces two functions for dynamically adjusting the crossover and mutation rates during the evolutionary process. Through extensive experimentation, we provide evidence of the algorithm's efficiency in consistently locating the global optimum, even in knapsack problems characterized by high dimensions and intricate constraints.
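The problem statement above can be made concrete with the standard exact dynamic program, sketched below. This is not the Cultural Algorithm proposed in the paper; it is a textbook baseline that assumes non-negative integer weights, and it can serve to check whether a heuristic has reached the global optimum on small instances. The function name `knapsack_01` is hypothetical.

```python
def knapsack_01(values, weights, capacity):
    """Maximum total value of a 0-1 knapsack with integer weights.

    best[c] holds the best value achievable with total weight at most c
    using the items processed so far.
    """
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]
```

For example, `knapsack_01([60, 100, 120], [10, 20, 30], 50)` returns 220, obtained by taking the second and third items.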
SoRoTop: a hitchhiker's guide to topology optimization MATLAB code for design-dependent pneumatic-driven soft robots
Authors: Prabhat Kumar
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Abstract
Demands for pneumatic-driven soft robots are constantly rising for various applications. However, they are often designed manually due to the lack of systematic methods. Moreover, design-dependent characteristics of pneumatic actuation pose distinctive challenges. This paper provides a compact MATLAB code, named SoRoTop, and its various extensions for designing pneumatic-driven soft robots using topology optimization. The code uses the method of moving asymptotes as the optimizer and builds upon the approach initially presented in Kumar et al. (Struct Multidiscip Optim 61 (4): 1637-1655, 2020). The pneumatic load is modeled using Darcy's law with a conceptualized drainage term. Consistent nodal loads are determined from the resultant pressure field using the conventional finite element approach. The robust formulation is employed, i.e., the eroded and blueprint design descriptions are used. A min-max optimization problem is formulated using the output displacements of the eroded and blueprint designs. A volume constraint is imposed on the blueprint design, while the eroded design is used to apply a conceptualized strain energy constraint. The latter constraint aids in attaining optimized designs that can endure the applied load without compromising their performance. Sensitivities required for optimization are computed using the adjoint-variable method. The code is explained in detail, and various extensions are also presented. It is structured into pre-optimization, MMA optimization, and post-optimization operations, each of which is comprehensively detailed. The paper also illustrates the impact of load sensitivities on the optimized designs. SoRoTop is provided in Appendix A and is available with extensions in the supplementary material and publicly at https://github.com/PrabhatIn/SoRoTop.
Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy
Authors: Xiangtao Kong, Chao Dong, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
While single task image restoration (IR) has achieved significant successes, it remains a challenging issue to train a single model which can tackle multiple IR tasks. In this work, we investigate in-depth the multiple-in-one (MiO) IR problem, which comprises seven popular IR tasks. We point out that MiO IR faces two pivotal challenges: the optimization of diverse objectives and the adaptation to multiple tasks. To tackle these challenges, we present two simple yet effective strategies. The first strategy, referred to as sequential learning, attempts to address how to optimize the diverse objectives, which guides the network to incrementally learn individual IR tasks in a sequential manner rather than mixing them together. The second strategy, i.e., prompt learning, attempts to address how to adapt to the different IR tasks, which assists the network to understand the specific task and improves the generalization ability. By evaluating on 19 test sets, we demonstrate that the sequential and prompt learning strategies can significantly enhance the MiO performance of commonly used CNN and Transformer backbones. Our experiments also reveal that the two strategies can supplement each other to learn better degradation representations and enhance the model robustness. It is expected that our proposed MiO IR formulation and strategies could facilitate the research on how to train IR models with higher generalization capabilities.
Optimizing Order Dispatch Decisions under Delivery Window Constraints
Authors: Khalid Y. Aram
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Abstract
This study focuses on order dispatch decisions within two-echelon supply chains, where order dispatch creates economic shipments to reduce delivery costs. Dispatching orders is often constrained by delivery windows, leading to penalty costs for untimely deliveries. Prolonged dispatch times can increase the lead time of orders and potentially violate these delivery windows. To balance the trade-offs between lead time and economic delivery, this study introduces a simulation-optimization approach for determining optimal ordering and dispatch rules. It emphasizes the intricacies of the order dispatch process and explores how these can be integrated into the simulation-optimization procedure to improve ordering and delivery decisions. The study evaluates various options for implementing dispatch rules, including the number of dispatch queues and prioritized dispatch. The results indicate that a single-queue, quantity-based, first-in-first-out dispatch approach achieves the greatest cost reduction while maintaining a desirable service level.
Predicting the Skies: A Novel Model for Flight-Level Passenger Traffic Forecasting
Authors: Sian Ehsani, Elina Sergeeva, Wendy Murdy, Benjamin Fox
Abstract
Accurate prediction of flight-level passenger traffic is of paramount importance in airline operations, influencing key decisions from pricing to route optimization. This study introduces a novel, multimodal deep learning approach to the challenge of predicting flight-level passenger traffic, yielding substantial accuracy improvements compared to traditional models. Leveraging an extensive dataset from American Airlines, our model ingests historical traffic data, fare closure information, and seasonality attributes specific to each flight. Our proposed neural network integrates the strengths of Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN), exploiting the temporal patterns and spatial relationships within the data to enhance prediction performance. Crucial to the success of our model is a comprehensive data processing strategy. We construct 3D tensors to represent data, apply careful masking strategies to mirror real-world dynamics, and employ data augmentation techniques to enrich the diversity of our training set. The efficacy of our approach is borne out in the results: our model demonstrates an approximate 33% improvement in Mean Squared Error (MSE) compared to traditional benchmarks. This study, therefore, highlights the significant potential of deep learning techniques and meticulous data processing in advancing the field of flight traffic prediction.
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
Abstract
In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results represent the first sample complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally for common RL environments, demonstrating the speed-up of decentralized federations w.r.t. the number of participating agents and resilience against various Byzantine attacks.
Pre-insertion resistors temperature prediction based on improved WOA-SVR
Authors: Honghe Dai, Site Mo, Haoxin Wang, Nan Yin, Songhai Fan, Bixiong Li
Abstract
The pre-insertion resistors (PIR) within high-voltage circuit breakers are critical components that warm up by generating Joule heat when an electric current flows through them. Elevated temperature can lead to temporary closure failure and, in severe cases, rupture of the PIR. To accurately predict the temperature of the PIR, this study combines finite element simulation techniques with Support Vector Regression (SVR) optimized by an Improved Whale Optimization Algorithm (IWOA). The IWOA includes Tent mapping, a convergence factor based on the sigmoid function, and the Ornstein-Uhlenbeck variation strategy. The IWOA-SVR model is compared with the SSA-SVR and WOA-SVR models. The results reveal that the prediction accuracies of the IWOA-SVR model were 90.2% and 81.5% (above 100 °C) in the 3 °C temperature deviation range and 96.3% and 93.4% (above 100 °C) in the 4 °C temperature deviation range, surpassing the performance of the comparative models. This research demonstrates that the proposed method can realize online monitoring of the PIR temperature, which can effectively prevent thermal faults in the PIR and provide a basis for the opening and closing of the circuit breaker within a short period.
Quadrotor Stabilization with Safety Guarantees: A Universal Formula Approach
Authors: Ming Li, Zhiyong Sun, Siep Weiland
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
Abstract
Safe stabilization is a significant challenge for quadrotors, which involves reaching a goal position while avoiding obstacles. Most of the existing solutions for this problem rely on optimization-based methods, demanding substantial onboard computational resources. This paper introduces a novel approach to address this issue and provides a solution that offers fast computational capabilities tailored for onboard execution. Drawing inspiration from Sontag's universal formula, we propose an analytical control strategy that incorporates the conditions of control Lyapunov functions (CLFs) and control barrier functions (CBFs), effectively avoiding the need for solving optimization problems onboard. Moreover, we extend our approach by incorporating the concepts of input-to-state stability (ISS) and input-to-state safety (ISSf), enhancing the universal formula's capacity to effectively manage disturbances. Furthermore, we present a projection-based approach to ensure that the universal formula remains effective even when faced with control input constraints. The basic idea of this approach is to project the control input derived from the universal formula onto the closest point within the control input domain. Through comprehensive simulations and experimental results, we validate the efficacy and highlight the advantages of our methodology.
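For orientation, the classical Sontag universal formula for a control-affine system with a control Lyapunov function V can be sketched as follows, together with a Euclidean projection onto a box-shaped input set. This covers only the CLF part; the paper's combination with control barrier functions and its ISS/ISSf extensions are not reproduced. The names `sontag_control` and `project_to_box` are hypothetical, `a` and `b` denote the Lie derivatives L_f V and L_g V at the current state, and the box-shaped input domain is an assumption made for illustration.

```python
import numpy as np

def sontag_control(a, b):
    """Sontag's universal formula.

    a : scalar L_f V(x)
    b : vector L_g V(x), one entry per control input
    Returns u such that a + b @ u = -sqrt(a**2 + |b|**4) whenever b != 0.
    """
    b = np.atleast_1d(np.asarray(b, dtype=float))
    bb = float(b @ b)
    if bb == 0.0:
        return np.zeros_like(b)  # the formula sets u = 0 where L_g V vanishes
    return -((a + np.sqrt(a * a + bb * bb)) / bb) * b

def project_to_box(u, u_min, u_max):
    """Closest point to u (Euclidean norm) in the box [u_min, u_max]."""
    return np.clip(u, u_min, u_max)
```

As a usage example, take the scalar system with dynamics x' = x + u and V = x^2 / 2, so that a = x^2 and b = x. At x = 2, `sontag_control(4.0, [2.0])` gives a control for which a + b u equals -sqrt(32), so V decreases along the closed loop; projecting that control onto the box [-3, 3] returns -3.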
Automated construction of effective potential via algorithmic implicit bias
Abstract
We introduce a novel approach for decomposing and learning every scale of a given multiscale objective function in $\mathbb{R}^d$, where $d\ge 1$. This approach leverages a recently demonstrated implicit bias of the optimization method of gradient descent by Kong and Tao, which enables the automatic generation of data that nearly follow Gibbs distribution with an effective potential at any desired scale. One application of this automated effective potential modeling is to construct reduced-order models. For instance, a deterministic surrogate Hamiltonian model can be developed to substantially soften the stiffness that bottlenecks the simulation, while maintaining the accuracy of phase portraits at the scale of interest. Similarly, a stochastic surrogate model can be constructed at a desired scale, such that both its equilibrium and out-of-equilibrium behaviors (characterized by auto-correlation function and mean path) align with those of a damped mechanical system with the original multiscale function being its potential. The robustness and efficiency of our proposed approach in multi-dimensional scenarios have been demonstrated through a series of numerical experiments. A by-product of our development is a method for anisotropic noise estimation and calibration. More precisely, Langevin model of stochastic mechanical systems may not have isotropic noise in practice, and we provide a systematic algorithm to quantify its covariance matrix without directly measuring the noise. In this case, the system may not admit closed form expression of its invariant distribution either, but with this tool, we can design friction matrix appropriately to calibrate the system so that its invariant distribution has a closed form expression of Gibbs.
Characterizing Physical Memory Fragmentation
Authors: Mark Mansi, Michael M. Swift
Subjects: Operating Systems (cs.OS); Performance (cs.PF)
Abstract
External fragmentation of physical memory occurs when adjacent differently sized regions of allocated physical memory are freed at different times, causing free memory to be physically discontiguous. It can significantly degrade system performance and efficiency, such as reducing the ability to use huge pages, a critical optimization on modern large-memory systems. For decades system developers have sought to avoid and mitigate fragmentation, but few prior studies quantify and characterize it in production settings. Moreover, prior work often artificially fragments physical memory to create more realistic performance evaluations, but their fragmentation methodologies are ad hoc and unvalidated. Out of 13 papers, we found 11 different methodologies, some of which were subsequently found inadequate. The importance of addressing fragmentation necessitates a validated and principled methodology. Our work fills these gaps in knowledge and methodology. We conduct a study of memory fragmentation in production by observing 248 machines in the Computer Sciences Department at University of Wisconsin - Madison for a week. We identify six key memory usage patterns, and find that Linux's file cache and page reclamation systems are major contributors to fragmentation because they often obliviously break up contiguous memory. Finally, we create andúril, a tool to artificially fragment memory during experimental research evaluations. While andúril ultimately fails as a scientific tool, we discuss its design ideas, merits, and failings in hope that they may inspire future research.
Physics-informed Neural Networks for Encoding Dynamics in Real Physical Systems
Abstract
This dissertation investigates physics-informed neural networks (PINNs) as candidate models for encoding governing equations, and assesses their performance on experimental data from two different systems. The first system is a simple nonlinear pendulum, and the second is 2D heat diffusion across the surface of a metal block. We show that for the pendulum system the PINNs outperformed equivalent uninformed neural networks (NNs) in the ideal data case, with accuracy improvements of 18x and 6x for 10 linearly-spaced and 10 uniformly-distributed random training points respectively. In similar test cases with real data collected from an experiment, PINNs outperformed NNs with 9.3x and 9.1x accuracy improvements for 67 linearly-spaced and uniformly-distributed random points respectively. For the 2D heat diffusion, we show that both PINNs and NNs do not fare very well in reconstructing the heating regime due to difficulties in optimizing the network parameters over a large domain in both time and space. We highlight that data denoising and smoothing, reducing the size of the optimization problem, and using LBFGS as the optimizer are all ways to improve the accuracy of the predicted solution for both PINNs and NNs. Additionally, we address the viability of deploying physics-informed models within physical systems, and we choose FPGAs as the compute substrate for deployment. In light of this, we perform our experiments using a PYNQ-Z1 FPGA and identify issues related to time-coherent sensing and spatial data alignment. We discuss the insights gained from this work and list future work items based on the proposed architecture for the system that our methods work to develop.
GLOCALFAIR: Jointly Improving Global and Local Group Fairness in Federated Learning
Authors: Syed Irfan Ali Meerza, Luyang Liu, Jiaxin Zhang, Jian Liu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Abstract
Federated learning (FL) has emerged as a prospective solution for collaboratively learning a shared model across clients without sacrificing their data privacy. However, the federated learned model tends to be biased against certain demographic groups (e.g., racial and gender groups) due to the inherent FL properties, such as data heterogeneity and party selection. Unlike centralized learning, mitigating bias in FL is particularly challenging as private training datasets and their sensitive attributes are typically not directly accessible. Most prior research in this field only focuses on global fairness while overlooking the local fairness of individual clients. Moreover, existing methods often require sensitive information about the client's local datasets to be shared, which is not desirable. To address these issues, we propose GLOCALFAIR, a client-server co-design fairness framework that can jointly improve global and local group fairness in FL without the need for sensitive statistics about the client's private datasets. Specifically, we utilize constrained optimization to enforce local fairness on the client side and adopt a fairness-aware clustering-based aggregation on the server to further ensure the global model fairness across different sensitive groups while maintaining high utility. Experiments on two image datasets and one tabular dataset with various state-of-the-art fairness baselines show that GLOCALFAIR can achieve enhanced fairness under both global and local data distributions while maintaining a good level of utility and client fairness.
Abstract
All vehicles must follow the rules that govern traffic behavior, regardless of whether the vehicles are human-driven or Connected Autonomous Vehicles (CAVs). Road signs indicate locally active rules, such as speed limits and requirements to yield or stop. Recent research has demonstrated attacks, such as adding stickers or projected colored patches to signs, that cause CAV misinterpretation, resulting in potential safety issues. Humans can see and potentially defend against these attacks. But humans can not detect what they can not observe. We have developed an effective physical-world attack that leverages the sensitivity of filterless image sensors and the properties of Infrared Laser Reflections (ILRs), which are invisible to humans. The attack is designed to affect CAV cameras and perception, undermining traffic sign recognition by inducing misclassification. In this work, we formulate the threat model and requirements for an ILR-based traffic sign perception attack to succeed. We evaluate the effectiveness of the ILR attack with real-world experiments against two major traffic sign recognition architectures on four IR-sensitive cameras. Our black-box optimization methodology allows the attack to achieve up to a 100% attack success rate in indoor, static scenarios and a >80.5% attack success rate in our outdoor, moving vehicle scenarios. We find the latest state-of-the-art certifiable defense is ineffective against ILR attacks as it mis-certifies >33.5% of cases. To address this, we propose a detection strategy based on the physical properties of IR laser reflections which can detect 96% of ILR attacks.
AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks
Authors: Zeinab Ebrahimi, Gustavo Batista, Mohammad Deghat
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
Abstract
Stochastic gradient descent (SGD) and its many variants are widely used optimization algorithms for training deep neural networks. However, SGD suffers from inevitable drawbacks, including vanishing gradients, lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed to address these shortcomings as an effective alternative to the gradient-based methods. It has been successfully employed for training deep neural networks. However, ADMM-based optimizers have a slow convergence rate. This paper proposes an Anderson Acceleration for Deep Learning ADMM (AA-DLADMM) algorithm to tackle this drawback. The main intention of the AA-DLADMM algorithm is to apply Anderson acceleration to ADMM by considering it as a fixed-point iteration and attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-DLADMM algorithm by conducting extensive experiments on four benchmark datasets in comparison with other state-of-the-art optimizers.
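The idea of accelerating a fixed-point iteration, as described in the abstract above, can be sketched with a generic Anderson acceleration routine. This is a textbook-style illustration for an arbitrary map `g`, not the AA-DLADMM algorithm itself; the function name, the window size `m`, and the small linear test map are assumptions made for this example.

```python
import numpy as np

def anderson_fixed_point(g, x0, m=3, tol=1e-10, max_iter=100):
    """Anderson acceleration for x = g(x), keeping the last m residual differences."""
    x = np.asarray(x0, dtype=float)
    g_hist, f_hist = [], []
    for k in range(max_iter):
        gx = g(x)
        f = gx - x                        # fixed-point residual
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist.append(gx)
        f_hist.append(f)
        g_hist, f_hist = g_hist[-(m + 1):], f_hist[-(m + 1):]
        if len(f_hist) == 1:
            x = gx                        # plain fixed-point step to start
        else:
            # Columns hold successive differences of residuals and of g-values.
            dF = np.stack([f_hist[i + 1] - f_hist[i] for i in range(len(f_hist) - 1)], axis=1)
            dG = np.stack([g_hist[i + 1] - g_hist[i] for i in range(len(g_hist) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)  # least-squares mixing weights
            x = gx - dG @ gamma
    return x, max_iter

# Hypothetical contraction g(x) = A x + b used only as a demo.
A = np.array([[0.5, 0.2], [0.1, 0.3]])
b = np.array([1.0, 2.0])
x_fix, iters = anderson_fixed_point(lambda x: A @ x + b, np.zeros(2), m=2)
print(x_fix, iters)
```

In an ADMM setting, `g` would stand for one full sweep of the ADMM updates over all variables. How the paper forms that map and safeguards the accelerated step is not specified in the abstract.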
Exploiting Storage for Computing: Computation Reuse in Collaborative Edge Computing
Authors: Xingqiu He, Chaoqun You, Tony Q. S. Quek
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Abstract
Collaborative Edge Computing (CEC) is a new edge computing paradigm that enables neighboring edge servers to share computational resources with each other. Although CEC can enhance the utilization of computational resources, it still suffers from resource waste. The primary reason is that end-users from the same area are likely to offload similar tasks to edge servers, thereby leading to duplicate computations. To improve system efficiency, the computation results of previously executed tasks can be cached and then reused by subsequent tasks. However, most existing computation reuse algorithms only consider one edge server, which significantly limits the effectiveness of computation reuse. To address this issue, this paper applies computation reuse in CEC networks to exploit the collaboration among edge servers. We formulate an optimization problem that aims to minimize the overall task response time and decompose it into a caching subproblem and a scheduling subproblem. By analyzing the properties of optimal solutions, we show that the optimal caching decisions can be efficiently searched using the bisection method. For the scheduling subproblem, we utilize projected gradient descent and backtracking to find a local minimum. Numerical results show that our algorithm significantly reduces the response time in various situations.
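As a rough illustration of the scheduling step mentioned in the abstract above, the following sketch runs projected gradient descent with a backtracking line search, assuming that "backtracking" refers to step-size backtracking. The objective, the box constraint, and all names here are hypothetical stand-ins; the paper's actual response-time objective and feasible set are not given in the abstract.

```python
import numpy as np

def projected_gradient(f, grad, project, x0, step=1.0, beta=0.5, tol=1e-8, max_iter=500):
    """Projected gradient descent; the step is shrunk until a sufficient-decrease test holds."""
    x = project(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        g = grad(x)
        t = step
        while True:
            x_new = project(x - t * g)
            d = x_new - x
            # Accept when f is below its quadratic upper model at x_new.
            if f(x_new) <= f(x) + g @ d + (0.5 / t) * (d @ d) or t < 1e-12:
                break
            t *= beta
        if np.linalg.norm(d) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical problem: minimize ||x - c||^2 over the box [0, 1]^2.
c = np.array([2.0, -3.0])
x_opt = projected_gradient(
    f=lambda x: float(np.sum((x - c) ** 2)),
    grad=lambda x: 2.0 * (x - c),
    project=lambda x: np.clip(x, 0.0, 1.0),
    x0=np.array([0.5, 0.5]),
)
print(x_opt)
```

For a non-convex objective this kind of iteration can only be expected to reach a stationary point, which matches the abstract's wording about finding a local minimum.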
DDM-Lag: A Diffusion-based Decision-making Model for Autonomous Vehicles with Lagrangian Safety Enhancement
Abstract
Decision-making stands as a pivotal component in the realm of autonomous vehicles (AVs), playing a crucial role in navigating the intricacies of autonomous driving. Amidst the evolving landscape of data-driven methodologies, enhancing decision-making performance in complex scenarios has emerged as a prominent research focus. Despite considerable advancements, current learning-based decision-making approaches exhibit potential for refinement, particularly in aspects of policy articulation and safety assurance. To address these challenges, we introduce DDM-Lag, a Diffusion Decision Model, augmented with Lagrangian-based safety enhancements. In our approach, the autonomous driving decision-making conundrum is conceptualized as a Constrained Markov Decision Process (CMDP). We have crafted an Actor-Critic framework, wherein the diffusion model is employed as the actor, facilitating policy exploration and learning. The integration of safety constraints in the CMDP and the adoption of a Lagrangian relaxation-based policy optimization technique ensure enhanced decision safety. A PID controller is employed for the stable updating of model parameters. The effectiveness of DDM-Lag is evaluated through different driving tasks, showcasing improvements in decision-making safety and overall performance compared to baselines.
Joint Power Allocation and User Scheduling in Integrated Satellite-Terrestrial Cell-Free Massive MIMO IoT Systems
Authors: Trinh Van Chien, Ha An Le, Ta Hai Tung, Hien Quoc Ngo, Symeon Chatzinotas
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Abstract
Both space and ground communications have been proven effective solutions under different perspectives in Internet of Things (IoT) networks. This paper investigates multiple-access scenarios, where plenty of IoT users are cooperatively served by a satellite in space and access points (APs) on the ground. Available users in each coherence interval are split into scheduled and unscheduled subsets to optimize limited radio resources. We compute the uplink ergodic throughput of each scheduled user under imperfect channel state information (CSI) and non-orthogonal pilot signals. As maximum-ratio combining is deployed locally at the ground gateway and the APs, the uplink ergodic throughput is obtained in a closed-form expression. The analytical results explicitly unveil the effects of channel conditions and pilot contamination on each scheduled user. By maximizing the sum throughput, the system can simultaneously determine scheduled users and perform power allocation based on either a model-based approach with alternating optimization or a learning-based approach with a graph neural network. Numerical results manifest that integrated satellite-terrestrial cell-free massive multiple-input multiple-output systems can significantly improve the sum ergodic throughput over coherence intervals. The integrated systems can schedule the vast majority of users; some might be out of service due to the limited power budget.
Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System
Authors: Chollette Olisah, Lyndon Smith, Melvyn Smith, Lawrence Morolake, Osi Ojukwu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Abstract
Given the nonlinearity of the interaction between weather and soil variables, a novel deep neural network regressor (DNNR) was carefully designed with considerations to the depth, number of neurons of the hidden layers, and the hyperparameters with their optimizations. Additionally, a new metric, the average of absolute root squared error (ARSE) was proposed to address the shortcomings of root mean square error (RMSE) and mean absolute error (MAE) while combining their strengths. Using the ARSE metric, the random forest regressor (RFR) and the extreme gradient boosting regressor (XGBR), were compared with DNNR. The RFR and XGBR achieved yield errors of 0.0000294 t/ha, and 0.000792 t/ha, respectively, compared to the DNNR(s) which achieved 0.0146 t/ha and 0.0209 t/ha, respectively. All errors were impressively small. However, with changes to the explanatory variables to ensure generalizability to unforeseen data, DNNR(s) performed best. The unforeseen data, different from unseen data, is coined to represent sudden and unexplainable change to weather and soil variables due to climate change. Further analysis reveals that a strong interaction does exist between weather and soil variables. Using precipitation and silt, which are strong-negatively and strong-positively correlated with yield, respectively, yield was observed to increase when precipitation was reduced and silt increased, and vice-versa.
A foundation for exact binarized morphological neural networks
Abstract
Training and running deep neural networks (NNs) often demands a lot of computation and energy-intensive specialized hardware (e.g., GPUs, TPUs). One way to reduce the computation and power cost is to use binary weight NNs, but these are hard to train because the sign function has a non-smooth gradient. We present a model based on Mathematical Morphology (MM), which can binarize ConvNets without losing performance under certain conditions, but these conditions may not be easy to satisfy in real-world scenarios. To solve this, we propose two new approximation methods and develop a robust theoretical framework for ConvNets binarization using MM. We also propose regularization losses to improve the optimization. We empirically show that our model can learn a complex morphological network, and explore its performance on a classification task.
Metaheuristics for (Variable-Size) Mixed Optimization Problems: A Unified Taxonomy and Survey
Abstract
Many real-world optimization problems are formulated as mixed-variable optimization problems (MVOPs) which involve both continuous and discrete variables. MVOPs including dimensional variables are characterized by a variable-size search space. Depending on the values of dimensional variables, the number and type of the variables of the problem can vary dynamically. MVOPs and variable-size MVOPs (VMVOPs) are difficult to solve and raise a number of scientific challenges in the design of metaheuristics. Standard metaheuristics were first designed to address continuous or discrete optimization problems, and are not able to tackle (V)MVOPs in an efficient way. The development of metaheuristics for solving such problems has attracted the attention of many researchers and is increasingly popular. However, to our knowledge there is no well-established taxonomy and comprehensive survey for handling this important family of optimization problems. This paper presents a unified taxonomy for metaheuristic solutions for solving (V)MVOPs in an attempt to provide a common terminology and classification mechanisms. It provides a general mathematical formulation and concepts of (V)MVOPs, and identifies the various solving methodologies that can be applied in metaheuristics. The advantages, the weaknesses and the limitations of the presented methodologies are discussed. The proposed taxonomy also helps identify some open research issues that need further in-depth investigation.
A Modifiable Architectural Design for Commercial Greenhouses Energy Economic Dispatch Testbed
Authors: Christian Skafte Beck Clausen, Bo Nørregaard Jørgensen, Zheng Grace Ma
Subjects: Software Engineering (cs.SE); Systems and Control (eess.SY)
Abstract
Facing economic challenges due to the diverse objectives of businesses and consumers, commercial greenhouses strive to minimize energy costs while addressing CO2 emissions. This scenario is intensified by rising energy costs and the global imperative to curtail CO2 emissions. To address these dynamic economic challenges, this paper proposes an architectural design for an energy economic dispatch testbed for commercial greenhouses. Utilizing the Attribute-Driven Design method, core architectural components of a software-in-the-loop testbed are proposed which emphasize modularity and careful consideration of the multi-objective optimization problem. This approach extends prior research by implementing a modular multi-objective optimization framework in Java. The results demonstrate the successful integration of the CO2 reduction objective within the modular architecture with minimal effort. The multi-objective optimization output can also be employed to examine cost and CO2 objectives, ultimately serving as a valuable decision-support tool. The novel testbed architecture and a modular approach can tackle the multi-objective optimization problem and enable commercial greenhouses to navigate the intricate landscape of energy cost and CO2 emissions management.
MX: Enhancing RISC-V's Vector ISA for Ultra-Low Overhead, Energy-Efficient Matrix Multiplication
Abstract
Dense Matrix Multiplication (MatMul) is arguably one of the most ubiquitous compute-intensive kernels, spanning linear algebra, DSP, graphics, and machine learning applications. Thus, MatMul optimization is crucial not only in high-performance processors but also in embedded low-power platforms. Several Instruction Set Architectures (ISAs) have recently included matrix extensions to improve MatMul performance and efficiency at the cost of added matrix register files and units. In this paper, we propose Matrix eXtension (MX), a lightweight approach that builds upon the open-source RISC-V Vector (RVV) ISA to boost MatMul energy efficiency. Instead of adding expensive dedicated hardware, MX uses the pre-existing vector register file and functional units to create a hybrid vector/matrix engine at a negligible area cost (< 3%), which comes from a compact near-FPU tile buffer for higher data reuse, and no clock frequency overhead. We implement MX on a compact and highly energy-optimized RVV processor and evaluate it in both a Dual- and 64-Core cluster in a 12-nm technology node. MX boosts the Dual-Core's energy efficiency by 10% for a double-precision 64x64x64 matrix multiplication with the same FPU utilization (~97%) and by 25% on the 64-Core cluster for the same benchmark on 32-bit data, with a 56% performance gain.
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Authors: Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal
Abstract
We present Self-Play Preference Optimization (SPO), an algorithm for reinforcement learning from human feedback. Our approach is minimalist in that it does not require training a reward model nor unstable adversarial training and is therefore rather simple to implement. Our approach is maximalist in that it provably handles non-Markovian, intransitive, and stochastic preferences while being robust to the compounding errors that plague offline approaches to sequential prediction. To achieve the preceding qualities, we build upon the concept of a Minimax Winner (MW), a notion of preference aggregation from the social choice theory literature that frames learning from preferences as a zero-sum game between two policies. By leveraging the symmetry of this game, we prove that rather than using the traditional technique of dueling two policies to compute the MW, we can simply have a single agent play against itself while maintaining strong convergence guarantees. Practically, this corresponds to sampling multiple trajectories from a policy, asking a rater or preference model to compare them, and then using the proportion of wins as the reward for a particular trajectory. We demonstrate that on a suite of continuous control tasks, we are able to learn significantly more efficiently than reward-model based approaches while maintaining robustness to the intransitive and stochastic preferences that frequently occur in practice when aggregating human judgments.
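The practical recipe in the last part of the abstract above (sample several trajectories, compare them pairwise, use the proportion of wins as the reward) can be sketched as follows. The function name, the trajectory representation, and the toy preference function are hypothetical; the policy-optimization step that would consume these rewards is left out.

```python
import itertools

def win_rate_rewards(trajectories, prefer):
    """Reward each trajectory with its average win probability against the others.

    prefer(a, b) is assumed to return the probability (or a 0/1 indicator)
    that trajectory a is preferred over trajectory b. Needs at least two trajectories.
    """
    n = len(trajectories)
    wins = [0.0] * n
    for i, j in itertools.combinations(range(n), 2):
        p = prefer(trajectories[i], trajectories[j])
        wins[i] += p
        wins[j] += 1.0 - p
    return [w / (n - 1) for w in wins]

# Toy preference model: prefer the trajectory with the larger total, ties split evenly.
def toy_prefer(a, b):
    if sum(a) == sum(b):
        return 0.5
    return 1.0 if sum(a) > sum(b) else 0.0

sampled = [[1, 2], [3, 4], [0, 0]]   # stand-ins for trajectories from one policy
rewards = win_rate_rewards(sampled, toy_prefer)
print(rewards)
```

Because every trajectory comes from the same policy, no separate reward model or second adversarial policy appears in this loop, which is the simplification the abstract emphasizes.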
Fun with Flags: Robust Principal Directions via Flag Manifolds
Abstract
Principal component analysis (PCA), along with its extensions to manifolds and outlier contaminated data, has been indispensable in computer vision and machine learning. In this work, we present a unifying formalism for PCA and its variants, and introduce a framework based on the flags of linear subspaces, i.e., a hierarchy of nested linear subspaces of increasing dimension, which not only allows for a common implementation but also yields novel variants, not explored previously. We begin by generalizing traditional PCA methods that either maximize variance or minimize reconstruction error. We expand these interpretations to develop a wide array of new dimensionality reduction algorithms by accounting for outliers and the data manifold. To devise a common computational approach, we recast robust and dual forms of PCA as optimization problems on flag manifolds. We then integrate tangent space approximations of principal geodesic analysis (tangent-PCA) into this flag-based framework, creating novel robust and dual geodesic PCA variations. The remarkable flexibility offered by the 'flagification' introduced here enables even more algorithmic variants identified by specific flag types. Last but not least, we propose an effective convergent solver for these flag-formulations employing the Stiefel manifold. Our empirical results on both real-world and synthetic scenarios demonstrate the superiority of our novel algorithms, especially in terms of robustness to outliers on manifolds.
AGG: Amortized Generative 3D Gaussians for Single Image to 3D
Abstract
Given the growing need for automatic 3D content creation pipelines, various 3D representations have been studied to generate 3D objects from a single image. Due to their superior rendering efficiency, 3D Gaussian splatting-based models have recently excelled in both 3D reconstruction and generation. 3D Gaussian splatting approaches for image to 3D generation are often optimization-based, requiring many computationally expensive score-distillation steps. To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization. Utilizing an intermediate hybrid representation, AGG decomposes the generation of 3D Gaussian locations and other appearance attributes for joint optimization. Moreover, we propose a cascaded pipeline that first generates a coarse representation of the 3D data and later upsamples it with a 3D Gaussian super-resolution module. Our method is evaluated against existing optimization-based 3D Gaussian frameworks and sampling-based pipelines utilizing other 3D representations, where AGG showcases competitive generation abilities both qualitatively and quantitatively while being several orders of magnitude faster. Project page: https://ir1d.github.io/AGG/
Keyword: adam
There is no result
Keyword: gradient
Multidimensional extrapolated global proximal gradient and applications for image processing
Abstract
The proximal gradient method is a generic technique introduced to tackle the non-smoothness in optimization problems, wherein the objective function is expressed as the sum of a differentiable convex part and a non-differentiable regularization term. Such problems with tensor format are of interest in many fields of applied mathematics such as image and video processing. Our goal in this paper is to address the solution of such problems with a more general form of the regularization term. An adapted iterative proximal gradient method is introduced for this purpose. Due to the slowness of the proposed algorithm, we use new tensor extrapolation methods to enhance its convergence. Numerical experiments on color image deblurring are conducted to illustrate the efficiency of our approach.
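For readers unfamiliar with the basic scheme the abstract above builds on, here is a minimal proximal gradient sketch for the simplest case: a least-squares data term plus an l1 regularizer, whose proximal operator is soft-thresholding. The l1 choice, the matrix (rather than tensor) format, and all names are assumptions for illustration; the paper's more general regularization term and its tensor extrapolation step are not reproduced here.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(A, y, lam, n_iter=200):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by proximal gradient steps."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)             # gradient of the differentiable term
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical denoising-style demo with A = identity.
y_demo = np.array([3.0, -0.5, 1.5])
x_hat = proximal_gradient(np.eye(3), y_demo, lam=1.0)
print(x_hat)
```

With an identity operator the minimizer is just the soft-thresholded data, which makes this a convenient sanity check before swapping in a blur operator for a deblurring experiment.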
Adaptive Boosting with Fairness-aware Reweighting Technique for Fair Classification
Abstract
Machine learning methods based on AdaBoost have been widely applied to various classification problems across many mission-critical applications including healthcare, law and finance. However, there is a growing concern about the unfairness and discrimination of data-driven classification models, which is inevitable for classical algorithms including AdaBoost. In order to achieve fair classification, a novel fair AdaBoost (FAB) approach is proposed that is an interpretable fairness-improving variant of AdaBoost. We mainly investigate binary classification problems and focus on the fairness of three different indicators (i.e., accuracy, false positive rate and false negative rate). By utilizing a fairness-aware reweighting technique for base classifiers, the proposed FAB approach can achieve fair classification while maintaining the advantage of AdaBoost with negligible sacrifice of predictive performance. In addition, a hyperparameter is introduced in FAB to show preferences for the fairness-accuracy trade-off. An upper bound for the target loss function that quantifies error rate and unfairness is theoretically derived for FAB, which provides a strict theoretical support for the fairness-improving methods designed for AdaBoost. The effectiveness of the proposed method is demonstrated on three real-world datasets (i.e., Adult, COMPAS and HSLS) with respect to the three fairness indicators. The results are accordant with theoretic analyses, and show that (i) FAB significantly improves classification fairness at a small cost of accuracy compared with AdaBoost; and (ii) FAB outperforms state-of-the-art fair classification methods including equalized odds method, exponentiated gradient method, and disparate mistreatment method in terms of the fairness-accuracy trade-off.
To Balance or to Not? Battery Aging-Aware Active Cell Balancing for Electric Vehicles
Abstract
Due to manufacturing variabilities and temperature gradients within an electric vehicle's battery pack, the capacities of cells in it decrease differently over time. This reduces the usable capacity of the battery - the charge levels of one or more cells might be at the minimum threshold while most of the other cells have residual charge. Active cell balancing (i.e., transferring charge among cells) can equalize their charge levels, thereby increasing the battery pack's usable capacity. But performing balancing means additional charge transfer, which can result in energy loss and cell aging, akin to memory aging in storage technologies due to writing. This paper studies when cell balancing should be optimally triggered to minimize aging while maintaining the necessary driving capability. In particular, we propose optimization strategies for cell balancing while minimizing their impact on aging. By borrowing terminology from the storage domain, we refer to this as "wear leveling-aware" active balancing.
Data-Dependent Stability Analysis of Adversarial Training
Abstract
Stability analysis is an essential aspect of studying the generalization ability of deep learning, as it involves deriving generalization bounds for stochastic gradient descent-based training algorithms. Adversarial training is the most widely used defense against adversarial example attacks. However, previous generalization bounds for adversarial training have not included information regarding the data distribution. In this paper, we fill this gap by providing generalization bounds for stochastic gradient descent-based adversarial training that incorporate data distribution information. We utilize the concepts of on-average stability and high-order approximate Lipschitz conditions to examine how changes in data distribution and adversarial budget can affect robust generalization gaps. Our derived generalization bounds for both convex and non-convex losses are at least as good as the uniform stability-based counterparts which do not include data distribution information. Furthermore, our findings demonstrate how distribution shifts from data poisoning attacks can impact robust generalization.
Understanding Representation Learnability of Nonlinear Self-Supervised Learning
Authors: Ruofeng Yang, Xiangyuan Li, Bo Jiang, Shuai Li
Abstract
Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks. There are only a few theoretical works on data representation learnability, and many of those focus on final data representation, treating the nonlinear neural network as a "black box". However, the accurate learning results of neural networks are crucial for describing the data distribution features learned by SSL models. Our paper is the first to analyze the learning results of the nonlinear SSL model accurately. We consider a toy data distribution that contains two features: the label-related feature and the hidden feature. Unlike previous linear setting work that depends on closed-form solutions, we use the gradient descent algorithm to train a 1-layer nonlinear SSL model with a certain initialization region and prove that the model converges to a local minimum. Furthermore, different from the complex iterative analysis, we propose a new analysis process which uses the exact version of the Inverse Function Theorem to accurately describe the features learned by the local minimum. With this local minimum, we prove that the nonlinear SSL model can capture the label-related feature and hidden feature at the same time. In contrast, the nonlinear supervised learning (SL) model can only learn the label-related feature. We also present the learning processes and results of the nonlinear SSL and SL model via simulation experiments.
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization
Authors: Min-Kook Suh, Seung-Woo Seo
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Abstract
We address the challenge of estimating the learning rate for adaptive gradient methods used in training deep neural networks. While several learning-rate-free approaches have been proposed, they are typically tailored for steepest descent. However, although steepest descent methods offer an intuitive approach to finding minima, many deep learning applications require adaptive gradient methods to achieve faster convergence. In this paper, we interpret adaptive gradient methods as steepest descent applied on parameter-scaled networks, proposing learning-rate-free adaptive gradient methods. Experimental results verify the effectiveness of this approach, demonstrating comparable performance to hand-tuned learning rates across various scenarios. This work extends the applicability of learning-rate-free methods, enhancing training with adaptive gradient methods.
A General and Scalable Method for Optimizing Real-Time Systems
Authors: Sen Wang, Dong Li, Shao-Yu Huang, Xuanliang Deng, Ashrarul H. Sifat, Changhee Jung, Ryan Williams, Haibo Zeng
Abstract
In real-time systems optimization, designers often face a challenging problem posed by the non-convex and non-continuous schedulability conditions, which may even lack an analytical form to understand their properties. To tackle this challenging problem, we treat the schedulability analysis as a black box that only returns true/false results. We propose a general and scalable framework to optimize real-time systems, named Numerical Optimizer with Real-Time Highlight (NORTH). NORTH is built upon the gradient-based active-set methods from the numerical optimization literature but with new methods to manage active constraints for the non-differentiable schedulability constraints. In addition, we also generalize NORTH to NORTH+, to collaboratively optimize certain types of discrete variables (e.g., priority assignments, categorical variables) with continuous variables based on numerical optimization algorithms. We demonstrate the algorithm performance with two example applications: energy minimization based on dynamic voltage and frequency scaling (DVFS), and optimization of control system performance. In these experiments, NORTH achieved $10^2$ to $10^5$ times speed improvements over state-of-the-art methods while maintaining similar or better solution quality. NORTH+ outperforms NORTH by 30% with similar algorithm scalability. Both NORTH and NORTH+ support black-box schedulability analysis, ensuring broad applicability.
Comparison of Microservice Call Rate Predictions for Replication in the Cloud
Authors: Narges Mehran, Arman Haghighi, Pedram Aminharati, Nikolay Nikolov, Ahmet Soylu, Dumitru Roman, Radu Prodan
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
Today, many users deploy their microservice-based applications with various interconnections on a cluster of Cloud machines, subject to stochastic changes due to dynamic user requirements. To address this problem, we compare three machine learning (ML) models for predicting the microservice call rates based on the microservice times and aiming at estimating the scalability requirements. We apply the linear regression (LR), multilayer perceptron (MLP), and gradient boosting regression (GBR) models on the Alibaba microservice traces. The prediction results reveal that the LR model reaches a lower training time than the GBR and MLP models. However, the GBR reduces the mean absolute error and the mean absolute percentage error compared to LR and MLP models. Moreover, the prediction results show that the required number of replicas for each microservice by the gradient boosting model is close to the actual test data without any prediction.
Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Authors: Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef). The LM aids in object localization using global semantic information. Within the RM, we utilize BiRef for the reconstruction process, where hierarchical patches of images provide the source reference and gradient maps serve as the target reference. These components collaborate to generate the final predicted maps. We also introduce auxiliary gradient supervision to enhance focus on regions with finer details. Furthermore, we outline practical training strategies tailored for DIS to improve map quality and training process. To validate the general applicability of our approach, we conduct extensive experiments on four tasks to evince that BiRefNet exhibits remarkable performance, outperforming task-specific cutting-edge methods across all benchmarks.
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
Abstract
In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged in order to eliminate the need for a trusted central entity. Since our results represent the first sample complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally for common RL environments, demonstrating the speed-up of decentralized federations w.r.t. the number of participating agents and resilience against various Byzantine attacks.
Automated construction of effective potential via algorithmic implicit bias
Abstract
We introduce a novel approach for decomposing and learning every scale of a given multiscale objective function in $\mathbb{R}^d$, where $d\ge 1$. This approach leverages a recently demonstrated implicit bias of the optimization method of gradient descent by Kong and Tao, which enables the automatic generation of data that nearly follow Gibbs distribution with an effective potential at any desired scale. One application of this automated effective potential modeling is to construct reduced-order models. For instance, a deterministic surrogate Hamiltonian model can be developed to substantially soften the stiffness that bottlenecks the simulation, while maintaining the accuracy of phase portraits at the scale of interest. Similarly, a stochastic surrogate model can be constructed at a desired scale, such that both its equilibrium and out-of-equilibrium behaviors (characterized by auto-correlation function and mean path) align with those of a damped mechanical system with the original multiscale function being its potential. The robustness and efficiency of our proposed approach in multi-dimensional scenarios have been demonstrated through a series of numerical experiments. A by-product of our development is a method for anisotropic noise estimation and calibration. More precisely, Langevin model of stochastic mechanical systems may not have isotropic noise in practice, and we provide a systematic algorithm to quantify its covariance matrix without directly measuring the noise. In this case, the system may not admit closed form expression of its invariant distribution either, but with this tool, we can design friction matrix appropriately to calibrate the system so that its invariant distribution has a closed form expression of Gibbs.
Multi-Modal Federated Learning for Cancer Staging over Non-IID Datasets with Unbalanced Modalities
Abstract
The use of machine learning (ML) for cancer staging through medical image analysis has gained substantial interest across medical disciplines. When accompanied by the innovative federated learning (FL) framework, ML techniques can further overcome privacy concerns related to patient data exposure. Given the frequent presence of diverse data modalities within patient records, leveraging FL in a multi-modal learning framework holds considerable promise for cancer staging. However, existing works on multi-modal FL often presume that all data-collecting institutions have access to all data modalities. This oversimplified approach neglects institutions that have access to only a portion of data modalities within the system. In this work, we introduce a novel FL architecture designed to accommodate not only the heterogeneity of data samples, but also the inherent heterogeneity/non-uniformity of data modalities across institutions. We shed light on the challenges associated with varying convergence speeds observed across different data modalities within our FL system. Subsequently, we propose a solution to tackle these challenges by devising a distributed gradient blending and proximity-aware client weighting strategy tailored for multi-modal FL. To show the superiority of our method, we conduct experiments using The Cancer Genome Atlas program (TCGA) datalake considering different cancer types and three modalities of data: mRNA sequences, histopathological image data, and clinical information.
AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks
Authors: Zeinab Ebrahimi, Gustavo Batista, Mohammad Deghat
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
Abstract
Stochastic gradient descent (SGD) and its many variants are the most widely used optimization algorithms for training deep neural networks. However, SGD suffers from inevitable drawbacks, including vanishing gradients, lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed to address these shortcomings as an effective alternative to the gradient-based methods. It has been successfully employed for training deep neural networks. However, ADMM-based optimizers have a slow convergence rate. This paper proposes an Anderson Acceleration for Deep Learning ADMM (AA-DLADMM) algorithm to tackle this drawback. The main intention of the AA-DLADMM algorithm is to apply Anderson acceleration to ADMM by considering it as a fixed-point iteration and attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-DLADMM algorithm by conducting extensive experiments on four benchmark datasets in comparison with other state-of-the-art optimizers.
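To illustrate the general idea of Anderson acceleration applied to a fixed-point iteration x = g(x), here is a minimal sketch on a toy linear map. It is not the AA-DLADMM algorithm from the paper: the function name `anderson_fixed_point`, the memory size `m`, and the toy map are all assumptions made for illustration, and the update shown is the common least-squares (Type-II) form.

```python
import numpy as np

def anderson_fixed_point(g, x0, m=3, tol=1e-10, max_iter=100):
    """Anderson-accelerated iteration for x = g(x) (illustrative sketch).

    Keeps the last m+1 evaluations g(x_k) and residuals f_k = g(x_k) - x_k,
    and combines them through a small least-squares problem.
    """
    x = np.asarray(x0, dtype=float)
    gx = g(x)
    g_hist = [gx]        # history of g(x_k)
    f_hist = [gx - x]    # history of residuals f_k
    x = gx               # the first step is a plain fixed-point step
    for k in range(1, max_iter + 1):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist = (g_hist + [gx])[-(m + 1):]
        f_hist = (f_hist + [f])[-(m + 1):]
        # Columns are differences of consecutive residuals / evaluations.
        dF = np.stack([f_hist[i + 1] - f_hist[i] for i in range(len(f_hist) - 1)], axis=1)
        dG = np.stack([g_hist[i + 1] - g_hist[i] for i in range(len(g_hist) - 1)], axis=1)
        # gamma minimizes || f - dF @ gamma ||_2.
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma
    return x, max_iter

# Toy contractive linear map g(x) = A x + b (hypothetical example data).
A = np.array([[0.9, 0.05], [0.0, 0.8]])
b = np.array([1.0, 2.0])
x_star, iters = anderson_fixed_point(lambda x: A @ x + b, np.zeros(2))
```

On this toy map the plain iteration x ← g(x) contracts by only about 0.9 per step, whereas the accelerated variant reuses past residuals to extrapolate toward the fixed point of (I - A) x = b.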
Exploiting Storage for Computing: Computation Reuse in Collaborative Edge Computing
Authors: Xingqiu He, Chaoqun You, Tony Q. S. Quek
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Abstract
Collaborative Edge Computing (CEC) is a new edge computing paradigm that enables neighboring edge servers to share computational resources with each other. Although CEC can enhance the utilization of computational resources, it still suffers from resource waste. The primary reason is that end-users from the same area are likely to offload similar tasks to edge servers, thereby leading to duplicate computations. To improve system efficiency, the computation results of previously executed tasks can be cached and then reused by subsequent tasks. However, most existing computation reuse algorithms only consider one edge server, which significantly limits the effectiveness of computation reuse. To address this issue, this paper applies computation reuse in CEC networks to exploit the collaboration among edge servers. We formulate an optimization problem that aims to minimize the overall task response time and decompose it into a caching subproblem and a scheduling subproblem. By analyzing the properties of optimal solutions, we show that the optimal caching decisions can be efficiently searched using the bisection method. For the scheduling subproblem, we utilize projected gradient descent and backtracking to find a local minimum. Numerical results show that our algorithm significantly reduces the response time in various situations.
Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling
Abstract
Scene text spotting is a challenging task, especially for inverse-like scene text, which has complex layouts, e.g., mirrored, symmetrical, or retro-flexed. In this paper, we propose a unified end-to-end trainable inverse-like antagonistic text spotting framework dubbed IATS, which can effectively spot inverse-like scene texts without sacrificing general ones. Specifically, we propose an innovative reading-order estimation module (REM) that extracts reading-order information from the initial text boundary generated by an initial boundary module (IBM). To optimize and train REM, we propose a joint reading-order estimation loss consisting of a classification loss, an orthogonality loss, and a distribution loss. With the help of IBM, we can divide the initial text boundary into two symmetric control points and iteratively refine the new text boundary using a lightweight boundary refinement module (BRM) for adapting to various shapes and scales. To alleviate the incompatibility between text detection and recognition, we propose a dynamic sampling module (DSM) with a thin-plate spline that can dynamically sample appropriate features for recognition in the detected text region. Without extra supervision, the DSM can proactively learn to sample appropriate features for text recognition through the gradient returned by the recognition module. Extensive experiments on both challenging scene text and inverse-like scene text datasets demonstrate that our method achieves superior performance both on irregular and inverse-like text spotting.
Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System
Authors: Chollette Olisah, Lyndon Smith, Melvyn Smith, Lawrence Morolake, Osi Ojukwu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Abstract
Given the nonlinearity of the interaction between weather and soil variables, a novel deep neural network regressor (DNNR) was carefully designed with considerations to the depth, number of neurons of the hidden layers, and the hyperparameters with their optimizations. Additionally, a new metric, the average of absolute root squared error (ARSE) was proposed to address the shortcomings of root mean square error (RMSE) and mean absolute error (MAE) while combining their strengths. Using the ARSE metric, the random forest regressor (RFR) and the extreme gradient boosting regressor (XGBR), were compared with DNNR. The RFR and XGBR achieved yield errors of 0.0000294 t/ha, and 0.000792 t/ha, respectively, compared to the DNNR(s) which achieved 0.0146 t/ha and 0.0209 t/ha, respectively. All errors were impressively small. However, with changes to the explanatory variables to ensure generalizability to unforeseen data, DNNR(s) performed best. The unforeseen data, different from unseen data, is coined to represent sudden and unexplainable change to weather and soil variables due to climate change. Further analysis reveals that a strong interaction does exist between weather and soil variables. Using precipitation and silt, which are strong-negatively and strong-positively correlated with yield, respectively, yield was observed to increase when precipitation was reduced and silt increased, and vice-versa.
A topological description of loss surfaces based on Betti Numbers
Authors: Maria Sofia Bucarelli, Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli, Fabrizio Silvestri
Abstract
In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure to evaluate loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their loss function and revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we found that certain variations in the loss function or model architecture, such as adding an $\ell_2$ regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases.
A foundation for exact binarized morphological neural networks
Abstract
Training and running deep neural networks (NNs) often demands a lot of computation and energy-intensive specialized hardware (e.g. GPU, TPU...). One way to reduce the computation and power cost is to use binary weight NNs, but these are hard to train because the sign function has a non-smooth gradient. We present a model based on Mathematical Morphology (MM), which can binarize ConvNets without losing performance under certain conditions, but these conditions may not be easy to satisfy in real-world scenarios. To solve this, we propose two new approximation methods and develop a robust theoretical framework for ConvNets binarization using MM. We propose as well regularization losses to improve the optimization. We empirically show that our model can learn a complex morphological network, and explore its performance on a classification task.
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Authors: Ori Shem-Ur, Yaron Oz
Subjects: Machine Learning (cs.LG); Statistical Mechanics (cond-mat.stat-mech); High Energy Physics - Theory (hep-th); Probability (math.PR); Machine Learning (stat.ML)
Abstract
Deep learning models, such as wide neural networks, can be conceptualized as nonlinear dynamical physical systems characterized by a multitude of interacting degrees of freedom. Such systems, in the infinite limit, tend to exhibit simplified dynamics. This paper delves into gradient descent-based learning algorithms that display a linear structure in their parameter dynamics, reminiscent of the neural tangent kernel. We establish that this apparent linearity arises due to weak correlations between the first and higher-order derivatives of the hypothesis function with respect to the parameters, taken around their initial values. This insight suggests that these weak correlations could be the underlying reason for the observed linearization in such systems. As a case in point, we showcase this weak correlations structure within neural networks in the large width limit. Exploiting the relationship between linearity and weak correlations, we derive a bound on deviations from linearity observed during the training trajectory of stochastic gradient descent. To facilitate our proof, we introduce a novel method to characterise the asymptotic behavior of random tensors.
Variance Reduction in Ratio Metrics for Efficient Online Experiments
Authors: Shubham Baweja, Neeti Pokharna, Aleksei Ustimenko, Olivier Jeunen
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Applications (stat.AP)
Abstract
Online controlled experiments, such as A/B-tests, are commonly used by modern tech companies to enable continuous system improvements. Despite their paramount importance, A/B-tests are expensive: by their very definition, a percentage of traffic is assigned an inferior system variant. To ensure statistical significance on top-level metrics, online experiments typically run for several weeks. Even then, a considerable number of experiments will lead to inconclusive results (i.e. false negatives, or type-II error). The main culprit for this inefficiency is the variance of the online metrics. Variance reduction techniques have been proposed in the literature, but their direct applicability to commonly used ratio metrics (e.g. click-through rate or user retention) is limited. In this work, we successfully apply variance reduction techniques to ratio metrics on a large-scale short-video platform: ShareChat. Our empirical results show that we can either improve A/B-test confidence in 77% of cases, or can retain the same level of confidence with 30% fewer data points. Importantly, we show that the common approach of including as many covariates as possible in regression is counter-productive, highlighting that control variates based on Gradient-Boosted Decision Tree predictors are most effective. We discuss the practicalities of implementing these methods at scale and showcase the cost reduction they beget.
Convex SGD: Generalization Without Early Stopping
Authors: Julien Hendrickx, Alex Olshevsky
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST)
Abstract
We consider the generalization error associated with stochastic gradient descent on a smooth convex function over a compact set. We show the first bound on the generalization error that vanishes when the number of iterations $T$ and the dataset size $n$ go to infinity at arbitrary rates; our bound scales as $\tilde{O}(1/\sqrt{T} + 1/\sqrt{n})$ with step-size $\alpha_t = 1/\sqrt{t}$. In particular, strong convexity is not needed for stochastic gradient descent to generalize well.
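The setting described in this abstract (SGD on a smooth convex function over a compact set with step size $\alpha_t = 1/\sqrt{t}$) can be sketched on a one-dimensional toy problem. This is only an illustration under stated assumptions, not the paper's analysis or code: the quadratic loss, the interval `[-radius, radius]`, the function name `projected_sgd`, and the synthetic data are all hypothetical.

```python
import numpy as np

def projected_sgd(samples, radius=1.0, num_iters=5000, seed=0):
    """Projected SGD on the average of 0.5 * (x - z_i)^2 over [-radius, radius].

    Uses the step size alpha_t = 1 / sqrt(t) mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    for t in range(1, num_iters + 1):
        z = samples[rng.integers(len(samples))]   # draw one training sample
        grad = x - z                              # gradient of 0.5 * (x - z)^2
        x -= grad / np.sqrt(t)                    # step with alpha_t = 1/sqrt(t)
        x = float(np.clip(x, -radius, radius))    # project onto the compact set
    return x

# Synthetic dataset (assumed for illustration only).
data = np.random.default_rng(1).normal(loc=0.3, scale=0.2, size=200)
x_hat = projected_sgd(data)
```

The empirical minimizer of this toy loss is the sample mean of `data`, so `x_hat` should land near it; the projection step is what keeps the iterates inside the compact set.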
LoFi User Scheduling for Multiuser MIMO Wireless Systems
Authors: Alexandra Gallyas-Sanhueza, Gian Marti, Victoria Palhares, Reinhard Wiesmayr, Christoph Studer
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Abstract
We propose new low-fidelity (LoFi) user equipment (UE) scheduling algorithms for multiuser multiple-input multiple-output (MIMO) wireless communication systems. The proposed methods rely on an efficient guess-and-check procedure that, given an objective function, performs paired comparisons between random subsets of UEs that should be scheduled in certain time slots. The proposed LoFi scheduling methods are computationally efficient, highly parallelizable, and gradient-free, which enables the use of almost arbitrary, non-differentiable objective functions. System simulations in a millimeter-wave (mmWave) multiuser MIMO scenario demonstrate that the proposed LoFi schedulers outperform a range of state-of-the-art user scheduling algorithms in terms of bit error-rate and/or computational complexity.
Keyword: super-resolution
FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring
Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
We present a joint learning scheme of video super-resolution and deblurring, called VSRDB, to restore clean high-resolution (HR) videos from blurry low-resolution (LR) ones. This joint restoration problem has drawn much less attention compared to single restoration problems. In this paper, we propose a novel flow-guided dynamic filtering (FGDF) and iterative feature refinement with multi-attention (FRMA), which constitutes our VSRDB framework, denoted as FMA-Net. Specifically, our proposed FGDF enables precise estimation of both spatio-temporally-variant degradation and restoration kernels that are aware of motion trajectories through sophisticated motion representation learning. Compared to conventional dynamic filtering, the FGDF enables the FMA-Net to effectively handle large motions in VSRDB. Additionally, the stacked FRMA blocks trained with our novel temporal anchor (TA) loss, which temporally anchors and sharpens features, refine features in a coarse-to-fine manner through iterative updates. Extensive experiments demonstrate the superiority of the proposed FMA-Net over state-of-the-art methods in terms of both quantitative and qualitative quality. Codes and pre-trained models are available at: https://kaist-viclab.github.io/fmanet-site
AGG: Amortized Generative 3D Gaussians for Single Image to 3D
Abstract
Given the growing need for automatic 3D content creation pipelines, various 3D representations have been studied to generate 3D objects from a single image. Due to its superior rendering efficiency, 3D Gaussian splatting-based models have recently excelled in both 3D reconstruction and generation. 3D Gaussian splatting approaches for image to 3D generation are often optimization-based, requiring many computationally expensive score-distillation steps. To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization. Utilizing an intermediate hybrid representation, AGG decomposes the generation of 3D Gaussian locations and other appearance attributes for joint optimization. Moreover, we propose a cascaded pipeline that first generates a coarse representation of the 3D data and later upsamples it with a 3D Gaussian super-resolution module. Our method is evaluated against existing optimization-based 3D Gaussian frameworks and sampling-based pipelines utilizing other 3D representations, where AGG showcases competitive generation abilities both qualitatively and quantitatively while being several orders of magnitude faster. Project page: https://ir1d.github.io/AGG/
Keyword: sgd
AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks
Keyword: optimization
Forensic Video Analytic Software
Multidimensional extrapolated global proximal gradient and applications for image processing
Reliability-Optimized User Admission Control for URLLC Traffic: A Neural Contextual Bandit Approach
Energy-efficient Decentralized Learning via Graph Sparsification
Continuously bounds-preserving discontinuous Galerkin methods for hyperbolic conservation laws
Finite Expression Method for Learning Dynamics on Complex Networks
To Balance or to Not? Battery Aging-Aware Active Cell Balancing for Electric Vehicles
Estimating the Lateral Motion States of an Underwater Robot by Propeller Wake Sensing Using an Artificial Lateral Line
A fast offline/online forward solver for stationary transport equation with multiple inflow boundary conditions and varying coefficients
SeqNAS: Neural Architecture Search for Event Sequence Classification
Size Minimization For Multi-Output AND-Functions
A General and Scalable Method for Optimizing Real-Time Systems
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Addressing The Knapsack Challenge Through Cultural Algorithm Optimization
SoRoTop: a hitchhiker's guide to topology optimization MATLAB code for design-dependent pneumatic-driven soft robots
Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy
Optimizing Order Dispatch Decisions under Delivery Window Constraints
Predicting the Skies: A Novel Model for Flight-Level Passenger Traffic Forecasting
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Pre-insertion resistors temperature prediction based on improved WOA-SVR
Quadrotor Stabilization with Safety Guarantees: A Universal Formula Approach
Automated construction of effective potential via algorithmic implicit bias
Characterizing Physical Memory Fragmentation
Physics-informed Neural Networks for Encoding Dynamics in Real Physical Systems
GLOCALFAIR: Jointly Improving Global and Local Group Fairness in Federated Learning
Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception
AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks
Exploiting Storage for Computing: Computation Reuse in Collaborative Edge Computing
DDM-Lag : A Diffusion-based Decision-making Model for Autonomous Vehicles with Lagrangian Safety Enhancement
Joint Power Allocation and User Scheduling in Integrated Satellite-Terrestrial Cell-Free Massive MIMO IoT Systems
Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System
A foundation for exact binarized morphological neural networks
Metaheuristics for (Variable-Size) Mixed Optimization Problems: A Unified Taxonomy and Survey
A Modifiable Architectural Design for Commercial Greenhouses Energy Economic Dispatch Testbed
MX: Enhancing RISC-V's Vector ISA for Ultra-Low Overhead, Energy-Efficient Matrix Multiplication
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Fun with Flags: Robust Principal Directions via Flag Manifolds
AGG: Amortized Generative 3D Gaussians for Single Image to 3D
Keyword: adam
There is no result
Keyword: gradient
Multidimensional extrapolated global proximal gradient and applications for image processing
Adaptive Boosting with Fairness-aware Reweighting Technique for Fair Classification
To Balance or to Not? Battery Aging-Aware Active Cell Balancing for Electric Vehicles
Data-Dependent Stability Analysis of Adversarial Training
Understanding Representation Learnability of Nonlinear Self-Supervised Learning
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization
A General and Scalable Method for Optimizing Real-Time Systems
Comparison of Microservice Call Rate Predictions for Replication in the Cloud
Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence
Automated construction of effective potential via algorithmic implicit bias
Multi-Modal Federated Learning for Cancer Staging over Non-IID Datasets with Unbalanced Modalities
AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks
Exploiting Storage for Computing: Computation Reuse in Collaborative Edge Computing
Inverse-like Antagonistic Scene Text Spotting via Reading-Order Estimation and Dynamic Sampling
Corn Yield Prediction Model with Deep Neural Networks for Smallholder Farmer Decision Support System
A topological description of loss surfaces based on Betti Numbers
A foundation for exact binarized morphological neural networks
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Variance Reduction in Ratio Metrics for Efficient Online Experiments
Convex SGD: Generalization Without Early Stopping
LoFi User Scheduling for Multiuser MIMO Wireless Systems
Keyword: super-resolution
FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring
AGG: Amortized Generative 3D Gaussians for Single Image to 3D