Abstract
Multi-modal optimization is often encountered in engineering problems, especially when different and alternative solutions are sought. Evolutionary algorithms can efficiently tackle multi-modal optimization thanks to their features such as the concept of population, exploration/exploitation, and being suitable for parallel computation. This paper introduces a multi-modal optimization version of the Big Bang-Big Crunch algorithm based on clustering, namely, k-BBBC. This algorithm guarantees a complete convergence of the entire population, retrieving on average the 99\% of local optima for a specific problem. Additionally, we introduce two post-processing methods to (i) identify the local optima in a set of retrieved solutions (i.e., a population), and (ii) quantify the number of correctly retrieved optima against the expected ones (i.e., success rate). Our results show that k-BBBC performs well even with problems having a large number of optima (tested on 379 optima) and high dimensionality (tested on 32 decision variables). When compared to other multi-modal optimization methods, it outperforms them in terms of accuracy (in both search and objective space) and success rate (number of correctly retrieved optima) -- especially when elitism is applied. Lastly, we validated our proposed post-processing methods by comparing their success rate to the actual one. Results suggest that these methods can be used to evaluate the performance of a multi-modal optimization algorithm by correctly identifying optima and providing an indication of success -- without the need to know where the optima are located in the search space.
Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
Authors: Authors: Jing Wu, Mehrtash Harandi
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Machine unlearning has become a pivotal task to erase the influence of data from a trained model. It adheres to recent data regulation standards and enhances the privacy and security of machine learning applications. Most existing machine unlearning methods perform well, however, they typically necessitate access to the entirety of the remaining data, which might not be feasible in certain scenarios. In this work, we present a new machine unlearning approach Scissorhands, which operates effectively with only a subset of the training data. Initially, Scissorhands identifies the most pertinent parameters in the given model relative to the forgetting data via connection sensitivity. This process involves reinitializing the most influential top-$k$ percent of these parameters, resulting in a trimmed model for erasing the influence of the forgetting data. Subsequently, Scissorhands retrains the trimmed model through a min-max optimization process, seeking parameters that preserve information on the remaining data while discarding information related to the forgetting data. Our experimental results, conducted across five distinct datasets and utilizing both CNN and ViT, demonstrate that Scissorhands, despite utilizing only a limited portion of the training data, showcases competitive performance when compared to existing methods.
Kimera2: Robust and Accurate Metric-Semantic SLAM in the Real World
Authors: Authors: Marcus Abate, Yun Chang, Nathan Hughes, Luca Carlone
Abstract
We present improvements to Kimera, an open-source metric-semantic visual-inertial SLAM library. In particular, we enhance Kimera-VIO, the visual-inertial odometry pipeline powering Kimera, to support better feature tracking, more efficient keyframe selection, and various input modalities (eg monocular, stereo, and RGB-D images, as well as wheel odometry). Additionally, Kimera-RPGO and Kimera-PGMO, Kimera's pose-graph optimization backends, are updated to support modern outlier rejection methods - specifically, Graduated-Non-Convexity - for improved robustness to spurious loop closures. These new features are evaluated extensively on a variety of simulated and real robotic platforms, including drones, quadrupeds, wheeled robots, and simulated self-driving cars. We present comparisons against several state-of-the-art visual-inertial SLAM pipelines and discuss strengths and weaknesses of the new release of Kimera. The newly added features have been released open-source at https://github.com/MIT-SPARK/Kimera.
An ontology alignment method with user intervention using compact differential evolution with adaptive parameter control
Authors: Authors: Zhaoming Lv
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
User interaction is one of the most effective ways to improve the ontology alignment quality. However, this approach faces the challenge of how users can participate effectively in the matching process. To solve this challenge. In this paper, an interactive ontology alignment approach using compact differential evolution algorithm with adaptive parameter control (IOACDE) is proposed. In this method, the ontology alignment process is modeled as an interactive optimization problem and users are allowed to intervene in matching in two ways. One is that the mapping suggestions generated by IOACDE as a complete candidate alignment is evaluated by user during optimization process. The other is that the user ameliorates the alignment results by evaluating single mapping after the automatic matching process. To demonstrate the effectiveness of the proposed algorithm, the neural embedding model and K nearest neighbor (KNN) is employed to simulate user for the ontologies of the real world. The experimental results show that the proposed interactive approach can improve the alignment quality compared to the non-interactive. Compared with the state-of-the-art methods from OAEI, the results show that the proposed algorithm has a better performance under the same error rate.
A Temporal-Spectral Fusion Transformer with Subject-specific Adapter for Enhancing RSVP-BCI Decoding
Authors: Authors: Xujin Li, Wei Wei, Shuang Qiu, Huiguang He
Abstract
The Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient technology for target retrieval using electroencephalography (EEG) signals. The performance improvement of traditional decoding methods relies on a substantial amount of training data from new test subjects, which increases preparation time for BCI systems. Several studies introduce data from existing subjects to reduce the dependence of performance improvement on data from new subjects, but their optimization strategy based on adversarial learning with extensive data increases training time during the preparation procedure. Moreover, most previous methods only focus on the single-view information of EEG signals, but ignore the information from other views which may further improve performance. To enhance decoding performance while reducing preparation time, we propose a Temporal-Spectral fusion transformer with Subject-specific Adapter (TSformer-SA). Specifically, a cross-view interaction module is proposed to facilitate information transfer and extract common representations across two-view features extracted from EEG temporal signals and spectrogram images. Then, an attention-based fusion module fuses the features of two views to obtain comprehensive discriminative features for classification. Furthermore, a multi-view consistency loss is proposed to maximize the feature similarity between two views of the same EEG signal. Finally, we propose a subject-specific adapter to rapidly transfer the knowledge of the model trained on data from existing subjects to decode data from new subjects. Experimental results show that TSformer-SA significantly outperforms comparison methods and achieves outstanding performance with limited training data from new subjects. This facilitates efficient decoding and rapid deployment of BCI systems in practical use.
Ordering-Flexible Multi-Robot Coordination for MovingTarget Convoying Using Long-TermTask Execution
Abstract
In this paper, we propose a cooperative long-term task execution (LTTE) algorithm for protecting a moving target into the interior of an ordering-flexible convex hull by a team of robots resiliently in the changing environments. Particularly, by designing target-approaching and sensing-neighbor collision-free subtasks, and incorporating these subtasks into the constraints rather than the traditional cost function in an online constraint-based optimization framework, the proposed LTTE can systematically guarantee long-term target convoying under changing environments in the n-dimensional Euclidean space. Then, the introduction of slack variables allow for the constraint violation of different subtasks; i.e., the attraction from target-approaching constraints and the repulsion from time-varying collision-avoidance constraints, which results in the desired formation with arbitrary spatial ordering sequences. Rigorous analysis is provided to guarantee asymptotical convergence with challenging nonlinear couplings induced by time-varying collision-free constraints. Finally, 2D experiments using three autonomous mobile robots (AMRs) are conducted to validate the effectiveness of the proposed algorithm, and 3D simulations tackling changing environmental elements, such as different initial positions, some robots suddenly breakdown and static obstacles are presented to demonstrate the multi-dimensional adaptability, robustness and the ability of obstacle avoidance of the proposed method.
RotationDrag: Point-based Image Editing with Rotated Diffusion Features
Authors: Authors: Minxing Luo, Wentao Cheng, Jian Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
A precise and user-friendly manipulation of image content while preserving image fidelity has always been crucial to the field of image editing. Thanks to the power of generative models, recent point-based image editing methods allow users to interactively change the image content with high generalizability by clicking several control points. But the above mentioned editing process is usually based on the assumption that features stay constant in the motion supervision step from initial to target points. In this work, we conduct a comprehensive investigation in the feature space of diffusion models, and find that features change acutely under in-plane rotation. Based on this, we propose a novel approach named RotationDrag, which significantly improves point-based image editing performance when users intend to in-plane rotate the image content. Our method tracks handle points more precisely by utilizing the feature map of the rotated images, thus ensuring precise optimization and high image fidelity. Furthermore, we build a in-plane rotation focused benchmark called RotateBench, the first benchmark to evaluate the performance of point-based image editing method under in-plane rotation scenario on both real images and generated images. A thorough user study demonstrates the superior capability in accomplishing in-plane rotation that users intend to achieve, comparing the DragDiffusion baseline and other existing diffusion-based methods. See the project page https://github.com/Tony-Lowe/RotationDrag for code and experiment results.
Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning
Authors: Authors: Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Abstract
In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples. This understanding leads us to the development of Batch-ICL, an effective, efficient, and order-agnostic inference algorithm for ICL. Differing from the standard N-shot learning approach, Batch-ICL employs $N$ separate 1-shot forward computations and aggregates the resulting meta-gradients. These aggregated meta-gradients are then applied to a zero-shot learning to generate the final prediction. This batch processing approach renders the LLM agnostic to the order of ICL examples. Through extensive experiments and analysis, we demonstrate that Batch-ICL consistently outperforms most permutations of example sequences. In some cases, it even exceeds the performance of the optimal order for standard ICL, all while reducing the computational resources required. Furthermore, we develop a novel variant of Batch-ICL featuring multiple "epochs" of meta-optimization. This variant implicitly explores permutations of ICL examples, further enhancing ICL performance.
Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games
Authors: Authors: Berkay Anahtarci, Can Deha Kariksiz, Naci Saldi
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Optimization and Control (math.OC)
Abstract
In this paper, we introduce the maximum casual entropy Inverse Reinforcement Learning (IRL) problem for discrete-time mean-field games (MFGs) under an infinite-horizon discounted-reward optimality criterion. The state space of a typical agent is finite. Our approach begins with a comprehensive review of the maximum entropy IRL problem concerning deterministic and stochastic Markov decision processes (MDPs) in both finite and infinite-horizon scenarios. Subsequently, we formulate the maximum casual entropy IRL problem for MFGs - a non-convex optimization problem with respect to policies. Leveraging the linear programming formulation of MDPs, we restructure this IRL problem into a convex optimization problem and establish a gradient descent algorithm to compute the optimal solution with a rate of convergence. Finally, we present a new algorithm by formulating the MFG problem as a generalized Nash equilibrium problem (GNEP), which is capable of computing the mean-field equilibrium (MFE) for the forward RL problem. This method is employed to produce data for a numerical example. We note that this novel algorithm is also applicable to general MFE computations.
Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities for reinforcement learning (RL) models, such as planning and reasoning capabilities. However, the problems of LLMs and RL model collaboration still need to be solved. In this study, we employ a teacher-student learning framework to tackle these problems, specifically by offering feedback for LLMs using RL models and providing high-level information for RL models with LLMs in a cooperative multi-agent setting. Within this framework, the LLM acts as a teacher, while the RL model acts as a student. The two agents cooperatively assist each other through a process of recursive help, such as "I help you help I help." The LLM agent supplies abstract information to the RL agent, enabling efficient exploration and policy improvement. In turn, the RL agent offers feedback to the LLM agent, providing valuable, real-time information that helps generate more useful tokens. This bi-directional feedback loop promotes optimization, exploration, and mutual improvement for both agents, enabling them to accomplish increasingly challenging tasks. Remarkably, we propose a practical algorithm to address the problem and conduct empirical experiments to evaluate the effectiveness of our method.
Identifying Policy Gradient Subspaces
Authors: Authors: Jan Schneider, Pierre Schumacher, Simon Guist, Le Chen, Daniel Häufle, Bernhard Schölkopf, Dieter Büchler
Abstract
Policy gradient methods hold great potential for solving complex continuous control tasks. Still, their training efficiency can be improved by exploiting structure within the optimization problem. Recent work indicates that supervised learning can be accelerated by leveraging the fact that gradients lie in a low-dimensional and slowly-changing subspace. In this paper, we conduct a thorough evaluation of this phenomenon for two popular deep policy gradient methods on various simulated benchmark tasks. Our results demonstrate the existence of such gradient subspaces despite the continuously changing data distribution inherent to reinforcement learning. These findings reveal promising directions for future work on more efficient reinforcement learning, e.g., through improving parameter-space exploration or enabling second-order optimization.
Block Majorization Minimization with Extrapolation and Application to $β$-NMF
Authors: Authors: Le Thi Khanh Hien, Valentin Leplat, Nicolas Gillis
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Numerical Analysis (math.NA); Optimization and Control (math.OC)
Abstract
We propose a Block Majorization Minimization method with Extrapolation (BMMe) for solving a class of multi-convex optimization problems. The extrapolation parameters of BMMe are updated using a novel adaptive update rule. By showing that block majorization minimization can be reformulated as a block mirror descent method, with the Bregman divergence adaptively updated at each iteration, we establish subsequential convergence for BMMe. We use this method to design efficient algorithms to tackle nonnegative matrix factorization problems with the $\beta$-divergences ($\beta$-NMF) for $\beta\in [1,2]$. These algorithms, which are multiplicative updates with extrapolation, benefit from our novel results that offer convergence guarantees. We also empirically illustrate the significant acceleration of BMMe for $\beta$-NMF through extensive experiments.
Real-time MPC with Control Barrier Functions for Autonomous Driving using Safety Enhanced Collocation
Authors: Authors: Jean Pierre Allamaa, Panagiotis Patrinos, Toshiyuki Ohtsuka, Tong Duy Son
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Abstract
The autonomous driving industry is continuously dealing with more safety-critical scenarios, and nonlinear model predictive control (NMPC) is a powerful control strategy for handling such situations. However, standard safety constraints are not scalable and require a long NMPC horizon. Moreover, the adoption of NMPC in the automotive industry is limited by the heavy computation of numerical optimization routines. To address those issues, this paper presents a real-time capable NMPC for automated driving in urban environments, using control barrier functions (CBFs). Furthermore, the designed NMPC is based on a novel collocation transcription approach, named RESAFE/COL, that allows to reduce the number of optimization variables while still guaranteeing the continuous time (nonlinear) inequality constraints satisfaction, through regional convex hull approximation. RESAFE/COL is proven to be 5 times faster than multiple shooting and more tractable for embedded hardware without a decrease in the performance, nor accuracy and safety of the numerical solution. We validate our NMPC-CBF with RESAFE/COL approach with highly accurate digital twins of the vehicle and the urban environment and show the safe controller's ability to improve crash avoidance by 91%.
Data-Efficient Interactive Multi-Objective Optimization Using ParEGO
Authors: Authors: Arash Heidari, Sebastian Rojas Gonzalez, Tom Dhaene, Ivo Couckuyt
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
Multi-objective optimization is a widely studied problem in diverse fields, such as engineering and finance, that seeks to identify a set of non-dominated solutions that provide optimal trade-offs among competing objectives. However, the computation of the entire Pareto front can become prohibitively expensive, both in terms of computational resources and time, particularly when dealing with a large number of objectives. In practical applications, decision-makers (DMs) will select a single solution of the Pareto front that aligns with their preferences to be implemented; thus, traditional multi-objective algorithms invest a lot of budget sampling solutions that are not interesting for the DM. In this paper, we propose two novel algorithms that employ Gaussian Processes and advanced discretization methods to efficiently locate the most preferred region of the Pareto front in expensive-to-evaluate problems. Our approach involves interacting with the decision-maker to guide the optimization process towards their preferred trade-offs. Our experimental results demonstrate that our proposed algorithms are effective in finding non-dominated solutions that align with the decision-maker's preferences while maintaining computational efficiency.
LMI-based robust model predictive control for a quarter car with series active variable geometry suspension
Authors: Authors: Zilin Feng, Anastasis Georgiou, Min Yu, Simos A. Evangelou, Imad M Jaimoukha, Daniele Dini
Abstract
This paper proposes a robust model predictive control-based solution for the recently introduced series active variable geometry suspension (SAVGS) to improve the ride comfort and road holding of a quarter car. In order to close the gap between the nonlinear multi-body SAVGS model and its linear equivalent, a new uncertain system characterization is proposed that captures unmodeled dynamics, parameter variation, and external disturbances. Based on the newly proposed linear uncertain model for the quarter car SAVGS system, a constrained optimal control problem (OCP) is presented in the form of a linear matrix inequality (LMI) optimization. More specifically, utilizing semidefinite relaxation techniques a state-feedback robust model predictive control (RMPC) scheme is presented and integrated with the nonlinear multi-body SAVGS model, where state-feedback gain and control perturbation are computed online to optimise performance, while physical and design constraints are preserved. Numerical simulation results with different ISO-defined road events demonstrate the robustness and significant performance improvement in terms of ride comfort and road holding of the proposed approach, as compared to the conventional passive suspension, as well as, to actively controlled SAVGS by a previously developed conventional H-infinity control scheme.
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
Authors: Authors: Gianpietro Consolaro, Zhen Zhang, Harenome Razanajato, Nelson Lossing, Nassim Tchoulak, Adilla Susungi, Artur Cesar Araujo Alves, Renwei Zhang, Denis Barthou, Corinne Ancourt, Cedric Bastoul
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation and Language (cs.CL); Performance (cs.PF)
Abstract
Polyhedral techniques have been widely used for automatic code optimization in low-level compilers and higher-level processes. Loop optimization is central to this technique, and several polyhedral schedulers like Feautrier, Pluto, isl and Tensor Scheduler have been proposed, each of them targeting a different architecture, parallelism model, or application scenario. The need for scenario-specific optimization is growing due to the heterogeneity of architectures. One of the most critical cases is represented by NPUs (Neural Processing Units) used for AI, which may require loop optimization with different objectives. Another factor to be considered is the framework or compiler in which polyhedral optimization takes place. Different scenarios, depending on the target architecture, compilation environment, and application domain, may require different kinds of optimization to best exploit the architecture feature set. We introduce a new configurable polyhedral scheduler, PolyTOPS, that can be adjusted to various scenarios with straightforward, high-level configurations. This scheduler allows the creation of diverse scheduling strategies that can be both scenario-specific (like state-of-the-art schedulers) and kernel-specific, breaking the concept of a one-size-fits-all scheduler approach. PolyTOPS has been used with isl and CLooG as code generators and has been integrated in MindSpore AKG deep learning compiler. Experimental results in different scenarios show good performance: a geomean speedup of 7.66x on MindSpore (for the NPU Ascend architecture) hybrid custom operators over isl scheduling, a geomean speedup up to 1.80x on PolyBench on different multicore architectures over Pluto scheduling. Finally, some comparisons with different state-of-the-art tools are presented in the PolyMage scenario.
Learning Joint Space Reference Manifold for Reliable Physical Assistance
Abstract
This paper presents a study on the use of the Talos humanoid robot for performing assistive sit-to-stand or stand-to-sit tasks. In such tasks, the human exerts a large amount of force (100--200 N) within a very short time (2--8 s), posing significant challenges in terms of human unpredictability and robot stability control. To address these challenges, we propose an approach for finding a spatial reference for the robot, which allows the robot to move according to the force exerted by the human and control its stability during the task. Specifically, we focus on the problem of finding a 1D manifold for the robot, while assuming a simple controller to guide its movement on this manifold. To achieve this, we use a functional representation to parameterize the manifold and solve an optimization problem that takes into account the robot's stability and the unpredictability of human behavior. We demonstrate the effectiveness of our approach through simulations and experiments with the Talos robot, showing robustness and adaptability.
A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
Authors: Authors: Slavisa Tomic, João Pedro Matos-Carvalho, Marko Beko
Abstract
This work addresses weight optimization problem for fully-connected feed-forward neural networks. Unlike existing approaches that are based on back-propagation (BP) and chain rule gradient-based optimization (which implies iterative execution, potentially burdensome and time-consuming in some cases), the proposed approach offers the solution for weight optimization in closed-form by means of least squares (LS) methodology. In the case where the input-to-output mapping is injective, the new approach optimizes the weights in a back-propagating fashion in a single iteration by jointly optimizing a set of weights in each layer for each neuron. In the case where the input-to-output mapping is not injective (e.g., in classification problems), the proposed solution is easily adapted to obtain its final solution in a few iterations. An important advantage over the existing solutions is that these computations (for all neurons in a layer) are independent from each other; thus, they can be carried out in parallel to optimize all weights in a given layer simultaneously. Furthermore, its running time is deterministic in the sense that one can obtain the exact number of computations necessary to optimize the weights in all network layers (per iteration, in the case of non-injective mapping). Our simulation and empirical results show that the proposed scheme, BPLS, works well and is competitive with existing ones in terms of accuracy, but significantly surpasses them in terms of running time. To summarize, the new method is straightforward to implement, is competitive and computationally more efficient than the existing ones, and is well-tailored for parallel implementation.
Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection
Authors: Authors: Muhammad Tayyab Zamir, Muhammad Asif Ayub, Asma Gul, Nasir Ahmad, Kashif Ahmad
Abstract
In recent years, the increasing use of Artificial Intelligence based text generation tools has posed new challenges in document provenance, authentication, and authorship detection. However, advancements in stylometry have provided opportunities for automatic authorship and author change detection in multi-authored documents using style analysis techniques. Style analysis can serve as a primary step toward document provenance and authentication through authorship detection. This paper investigates three key tasks of style analysis: (i) classification of single and multi-authored documents, (ii) single change detection, which involves identifying the point where the author switches, and (iii) multiple author-switching detection in multi-authored documents. We formulate all three tasks as classification problems and propose a merit-based fusion framework that integrates several state-of-the-art natural language processing (NLP) algorithms and weight optimization techniques. We also explore the potential of special characters, which are typically removed during pre-processing in NLP applications, on the performance of the proposed methods for these tasks by conducting extensive experiments on both cleaned and raw datasets. Experimental results demonstrate significant improvements over existing solutions for all three tasks on a benchmark dataset.
Optimally Blending Honeypots into Production Networks: Hardness and Algorithms
Authors: Authors: Md Mahabub Uz Zaman, Liangde Tao, Mark Maldonado, Chang Liu, Ahmed Sunny, Shouhuai Xu, Lin Chen
Subjects: Cryptography and Security (cs.CR); Computational Complexity (cs.CC)
Abstract
Honeypot is an important cyber defense technique that can expose attackers new attacks. However, the effectiveness of honeypots has not been systematically investigated, beyond the rule of thumb that their effectiveness depends on how they are deployed. In this paper, we initiate a systematic study on characterizing the cybersecurity effectiveness of a new paradigm of deploying honeypots: blending honeypot computers (or IP addresses) into production computers. This leads to the following Honeypot Deployment (HD) problem, How should the defender blend honeypot computers into production computers to maximize the utility in forcing attackers to expose their new attacks while minimizing the loss to the defender in terms of the digital assets stored in the compromised production computers? We formalize HD as a combinatorial optimization problem, prove its NP hardness, provide a near optimal algorithm (i.e., polynomial time approximation scheme). We also conduct simulations to show the impact of attacker capabilities.
Keyword: adam
Bicycle Stabilization using mechanism optimization and Digital LQR
Abstract
This study introduces lateral pendulum as an innovative balancer design for bicycle stabilization. This pendulum, operating in the bicycle's vertical plane, enables the bicycle to remain stationary. The paper develops a dynamic model for a bicycle equipped with this lateral pendulum, using Lagrange's method, where the equations are validated with ADAMS software. The stabilization is demonstrated with traditional vertical and novel lateral pendulums, managed through a genetic-pole placement control algorithm. This approach showcases the superiority of the lateral pendulum over traditional methods, including vertical pendulums and steering the handlebar. Additionally, a Digital Linear Quadratic Regulator controller is implemented for practical application, further enhancing system stability.
Keyword: gradient
CrisisKAN: Knowledge-infused and Explainable Multimodal Attention Network for Crisis Event Classification
Authors: Authors: Shubham Gupta, Nandini Saini, Suman Kundu, Debasis Das
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Abstract
Pervasive use of social media has become the emerging source for real-time information (like images, text, or both) to identify various events. Despite the rapid growth of image and text-based event classification, the state-of-the-art (SOTA) models find it challenging to bridge the semantic gap between features of image and text modalities due to inconsistent encoding. Also, the black-box nature of models fails to explain the model's outcomes for building trust in high-stakes situations such as disasters, pandemic. Additionally, the word limit imposed on social media posts can potentially introduce bias towards specific events. To address these issues, we proposed CrisisKAN, a novel Knowledge-infused and Explainable Multimodal Attention Network that entails images and texts in conjunction with external knowledge from Wikipedia to classify crisis events. To enrich the context-specific understanding of textual information, we integrated Wikipedia knowledge using proposed wiki extraction algorithm. Along with this, a guided cross-attention module is implemented to fill the semantic gap in integrating visual and textual data. In order to ensure reliability, we employ a model-specific approach called Gradient-weighted Class Activation Mapping (Grad-CAM) that provides a robust explanation of the predictions of the proposed model. The comprehensive experiments conducted on the CrisisMMD dataset yield in-depth analysis across various crisis-specific tasks and settings. As a result, CrisisKAN outperforms existing SOTA methodologies and provides a novel view in the domain of explainable multimodal event classification.
SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM optimization
Abstract
In this paper, we introduce Segmentation-Driven Deformation Multi-View Stereo (SD-MVS), a method that can effectively tackle challenges in 3D reconstruction of textureless areas. We are the first to adopt the Segment Anything Model (SAM) to distinguish semantic instances in scenes and further leverage these constraints for pixelwise patch deformation on both matching cost and propagation. Concurrently, we propose a unique refinement strategy that combines spherical coordinates and gradient descent on normals and pixelwise search interval on depths, significantly improving the completeness of reconstructed 3D model. Furthermore, we adopt the Expectation-Maximization (EM) algorithm to alternately optimize the aggregate matching cost and hyperparameters, effectively mitigating the problem of parameters being excessively dependent on empirical tuning. Evaluations on the ETH3D high-resolution multi-view stereo benchmark and the Tanks and Temples dataset demonstrate that our method can achieve state-of-the-art results with less time consumption.
Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning
Authors: Authors: Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Abstract
In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples. This understanding leads us to the development of Batch-ICL, an effective, efficient, and order-agnostic inference algorithm for ICL. Differing from the standard N-shot learning approach, Batch-ICL employs $N$ separate 1-shot forward computations and aggregates the resulting meta-gradients. These aggregated meta-gradients are then applied to a zero-shot learning to generate the final prediction. This batch processing approach renders the LLM agnostic to the order of ICL examples. Through extensive experiments and analysis, we demonstrate that Batch-ICL consistently outperforms most permutations of example sequences. In some cases, it even exceeds the performance of the optimal order for standard ICL, all while reducing the computational resources required. Furthermore, we develop a novel variant of Batch-ICL featuring multiple "epochs" of meta-optimization. This variant implicitly explores permutations of ICL examples, further enhancing ICL performance.
AI-enabled Priority and Auction-Based Spectrum Management for 6G
Authors: Authors: Mina Khadem, Farshad Zeinali, Nader Mokari, Hamid Saeedi
Subjects: Systems and Control (eess.SY); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Abstract
In this paper, we present a quality of service (QoS)-aware priority-based spectrum management scheme to guarantee the minimum required bit rate of vertical sector players (VSPs) in the 5G and beyond generation, including the 6th generation (6G). VSPs are considered as spectrum leasers to optimize the overall spectrum efficiency of the network from the perspective of the mobile network operator (MNO) as the spectrum licensee and auctioneer. We exploit a modified Vickrey-Clarke-Groves (VCG) auction mechanism to allocate the spectrum to them where the QoS and the truthfulness of bidders are considered as two important parameters for prioritization of VSPs. The simulation is done with the help of deep deterministic policy gradient (DDPG) as a deep reinforcement learning (DRL)-based algorithm. Simulation results demonstrate that deploying the DDPG algorithm results in significant advantages. In particular, the efficiency of the proposed spectrum management scheme is about %85 compared to the %35 efficiency in traditional auction methods.
Robust fully discrete error bounds for the Kuznetsov equation in the inviscid limit
Abstract
The Kuznetsov equation is a classical wave model of acoustics that incorporates quadratic gradient nonlinearities. When its strong damping vanishes, it undergoes a singular behavior change, switching from a parabolic-like to a hyperbolic quasilinear evolution. In this work, we establish for the first time the optimal error bounds for its finite element approximation as well as a semi-implicit fully discrete approximation that are robust with respect to the vanishing damping parameter. The core of the new arguments lies in devising energy estimates directly for the error equation where one can more easily exploit the polynomial structure of the nonlinearities and compensate inverse estimates with smallness conditions on the error. Numerical experiments are included to illustrate the theoretical results.
Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games
Authors: Authors: Berkay Anahtarci, Can Deha Kariksiz, Naci Saldi
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Optimization and Control (math.OC)
Abstract
In this paper, we introduce the maximum casual entropy Inverse Reinforcement Learning (IRL) problem for discrete-time mean-field games (MFGs) under an infinite-horizon discounted-reward optimality criterion. The state space of a typical agent is finite. Our approach begins with a comprehensive review of the maximum entropy IRL problem concerning deterministic and stochastic Markov decision processes (MDPs) in both finite and infinite-horizon scenarios. Subsequently, we formulate the maximum casual entropy IRL problem for MFGs - a non-convex optimization problem with respect to policies. Leveraging the linear programming formulation of MDPs, we restructure this IRL problem into a convex optimization problem and establish a gradient descent algorithm to compute the optimal solution with a rate of convergence. Finally, we present a new algorithm by formulating the MFG problem as a generalized Nash equilibrium problem (GNEP), which is capable of computing the mean-field equilibrium (MFE) for the forward RL problem. This method is employed to produce data for a numerical example. We note that this novel algorithm is also applicable to general MFE computations.
Scalar Representation of 2D Steady Vector Fields
Authors: Authors: Holger Theisel, Michael Motejat, Janos Zimmermann, Christian Rössl
Abstract
We introduce a representation of a 2D steady vector field ${{\mathbf v}}$ by two scalar fields $a$, $b$, such that the isolines of $a$ correspond to stream lines of ${{\mathbf v}}$, and $b$ increases with constant speed under integration of ${{\mathbf v}}$. This way, we get a direct encoding of stream lines, i.e., a numerical integration of ${{\mathbf v}}$ can be replaced by a local isoline extraction of $a$. To guarantee a solution in every case, gradient-preserving cuts are introduced such that the scalar fields are allowed to be discontinuous in the values but continuous in the gradient. Along with a piecewise linear discretization and a proper placement of the cuts, the fields $a$ and $b$ can be computed. We show several evaluations on non-trivial vector fields.
Identifying Policy Gradient Subspaces
Authors: Authors: Jan Schneider, Pierre Schumacher, Simon Guist, Le Chen, Daniel Häufle, Bernhard Schölkopf, Dieter Büchler
Abstract
Policy gradient methods hold great potential for solving complex continuous control tasks. Still, their training efficiency can be improved by exploiting structure within the optimization problem. Recent work indicates that supervised learning can be accelerated by leveraging the fact that gradients lie in a low-dimensional and slowly-changing subspace. In this paper, we conduct a thorough evaluation of this phenomenon for two popular deep policy gradient methods on various simulated benchmark tasks. Our results demonstrate the existence of such gradient subspaces despite the continuously changing data distribution inherent to reinforcement learning. These findings reveal promising directions for future work on more efficient reinforcement learning, e.g., through improving parameter-space exploration or enabling second-order optimization.
A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
Authors: Authors: Slavisa Tomic, João Pedro Matos-Carvalho, Marko Beko
Abstract
This work addresses weight optimization problem for fully-connected feed-forward neural networks. Unlike existing approaches that are based on back-propagation (BP) and chain rule gradient-based optimization (which implies iterative execution, potentially burdensome and time-consuming in some cases), the proposed approach offers the solution for weight optimization in closed-form by means of least squares (LS) methodology. In the case where the input-to-output mapping is injective, the new approach optimizes the weights in a back-propagating fashion in a single iteration by jointly optimizing a set of weights in each layer for each neuron. In the case where the input-to-output mapping is not injective (e.g., in classification problems), the proposed solution is easily adapted to obtain its final solution in a few iterations. An important advantage over the existing solutions is that these computations (for all neurons in a layer) are independent from each other; thus, they can be carried out in parallel to optimize all weights in a given layer simultaneously. Furthermore, its running time is deterministic in the sense that one can obtain the exact number of computations necessary to optimize the weights in all network layers (per iteration, in the case of non-injective mapping). Our simulation and empirical results show that the proposed scheme, BPLS, works well and is competitive with existing ones in terms of accuracy, but significantly surpasses them in terms of running time. To summarize, the new method is straightforward to implement, is competitive and computationally more efficient than the existing ones, and is well-tailored for parallel implementation.
Keyword: super-resolution
DFU: scale-robust diffusion model for zero-shot super-resolution image generation
Authors: Authors: Alex Havrilla, Kevin Rojas, Wenjing Liao, Molei Tao
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Diffusion generative models have achieved remarkable success in generating images with a fixed resolution. However, existing models have limited ability to generalize to different resolutions when training data at those resolutions are not available. Leveraging techniques from operator learning, we present a novel deep-learning architecture, Dual-FNO UNet (DFU), which approximates the score operator by combining both spatial and spectral information at multiple resolutions. Comparisons of DFU to baselines demonstrate its scalability: 1) simultaneously training on multiple resolutions improves FID over training at any single fixed resolution; 2) DFU generalizes beyond its training resolutions, allowing for coherent, high-fidelity generation at higher-resolutions with the same model, i.e. zero-shot super-resolution image-generation; 3) we propose a fine-tuning strategy to further enhance the zero-shot super-resolution image-generation capability of our model, leading to a FID of 11.3 at 1.66 times the maximum training resolution on FFHQ, which no other method can come close to achieving.
TriNeRFLet: A Wavelet Based Multiscale Triplane NeRF Representation
Authors: Authors: Rajaei Khatib, Raja Giryes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
In recent years, the neural radiance field (NeRF) model has gained popularity due to its ability to recover complex 3D scenes. Following its success, many approaches proposed different NeRF representations in order to further improve both runtime and performance. One such example is Triplane, in which NeRF is represented using three 2D feature planes. This enables easily using existing 2D neural networks in this framework, e.g., to generate the three planes. Despite its advantage, the triplane representation lagged behind in its 3D recovery quality compared to NeRF solutions. In this work, we propose TriNeRFLet, a 2D wavelet-based multiscale triplane representation for NeRF, which closes the 3D recovery performance gap and is competitive with current state-of-the-art methods. Building upon the triplane framework, we also propose a novel super-resolution (SR) technique that combines a diffusion model with TriNeRFLet for improving NeRF resolution.
Frequency-Time Diffusion with Neural Cellular Automata
Authors: Authors: John Kalkhof, Arlene Kühn, Yannik Frisch, Anirban Mukhopadhyay
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Denoising Diffusion Models (DDMs) have become the leading generative technique for synthesizing high-quality images but are often constrained by their UNet-based architectures that impose certain limitations. In particular, the considerable size of often hundreds of millions of parameters makes them impractical when hardware resources are limited. However, even with powerful hardware, processing images in the gigapixel range is difficult. This is especially true in fields such as microscopy or satellite imaging, where such challenges arise from the limitation to a predefined generative size and the inefficient scaling to larger images. We present two variations of Neural Cellular Automata (NCA)-based DDM methods to address these challenges and jumpstart NCA-based DDMs: Diff-NCA and FourierDiff-NCA. Diff-NCA performs diffusion by using only local features of the underlying distribution, making it suitable for applications where local features are critical. To communicate global knowledge in image space, naive NCA setups require timesteps that increase with the image scale. We solve this bottleneck of current NCA architectures by introducing FourierDiff-NCA, which advances Diff-NCA by adding a Fourier-based diffusion process and combines the frequency-organized Fourier space with the image space. By initiating diffusion in the Fourier domain and finalizing it in the image space, FourierDiff-NCA accelerates global communication. We validate our techniques by using Diff-NCA (208k parameters) to generate high-resolution digital pathology scans at 576x576 resolution and FourierDiff-NCA (887k parameters) to synthesize CelebA images at 64x64, outperforming VNCA and five times bigger UNet-based DDMs. In addition, we demonstrate FourierDiff-NCA's capabilities in super-resolution, OOD image synthesis, and inpainting without additional training.
Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention
Abstract
Recently, Vision Transformer has achieved great success in recovering missing details in low-resolution sequences, i.e., the video super-resolution (VSR) task.Despite its superiority in VSR accuracy, the heavy computational burden as well as the large memory footprint hinder the deployment of Transformer-based VSR models on constrained devices.In this paper, we address the above issue by proposing a novel feature-level masked processing framework: VSR with Masked Intra and inter frame Attention (MIA-VSR).The core of MIA-VSR is leveraging feature-level temporal continuity between adjacent frames to reduce redundant computations and make more rational use of previously enhanced SR features. Concretely, we propose an intra-frame and inter-frame attention block which takes the respective roles of past features and input features into consideration and only exploits previously enhanced features to provide supplementary information. In addition, an adaptive block-wise mask prediction module is developed to skip unimportant computations according to feature similarity between adjacent frames. We conduct detailed ablation studies to validate our contributions and compare the proposed method with recent state-of-the-art VSR approaches. The experimental results demonstrate that MIA-VSR improves the memory and computation efficiency over state-of-the-art methods, without trading off PSNR accuracy. The code is available at https://github.com/LabShuHangGU/MIA-VSR.
Keyword: sgd
There is no result
Keyword: optimization
Multi-Modal Optimization with k-Cluster Big Bang-Big Crunch Algorithm
Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
Kimera2: Robust and Accurate Metric-Semantic SLAM in the Real World
An ontology alignment method with user intervention using compact differential evolution with adaptive parameter control
A Temporal-Spectral Fusion Transformer with Subject-specific Adapter for Enhancing RSVP-BCI Decoding
Ordering-Flexible Multi-Robot Coordination for MovingTarget Convoying Using Long-TermTask Execution
RotationDrag: Point-based Image Editing with Rotated Diffusion Features
Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning
Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games
Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study
Identifying Policy Gradient Subspaces
Block Majorization Minimization with Extrapolation and Application to $β$-NMF
Real-time MPC with Control Barrier Functions for Autonomous Driving using Safety Enhanced Collocation
Data-Efficient Interactive Multi-Objective Optimization Using ParEGO
LMI-based robust model predictive control for a quarter car with series active variable geometry suspension
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
Learning Joint Space Reference Manifold for Reliable Physical Assistance
A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection
Optimally Blending Honeypots into Production Networks: Hardness and Algorithms
Keyword: adam
Bicycle Stabilization using mechanism optimization and Digital LQR
Keyword: gradient
CrisisKAN: Knowledge-infused and Explainable Multimodal Attention Network for Crisis Event Classification
SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM optimization
Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning
AI-enabled Priority and Auction-Based Spectrum Management for 6G
Robust fully discrete error bounds for the Kuznetsov equation in the inviscid limit
Maximum Causal Entropy Inverse Reinforcement Learning for Mean-Field Games
Scalar Representation of 2D Steady Vector Fields
Identifying Policy Gradient Subspaces
A Closed-form Solution for Weight Optimization in Fully-connected Feed-forward Neural Networks
Keyword: super-resolution
DFU: scale-robust diffusion model for zero-shot super-resolution image generation
TriNeRFLet: A Wavelet Based Multiscale Triplane NeRF Representation
Frequency-Time Diffusion with Neural Cellular Automata
Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention