Abstract
Real time accurate solutions of large scale complex dynamical systems are in critical need for control, optimization, uncertainty quantification, and decision-making in practical engineering and science applications. This paper contributes in this direction a model constrained tangent manifold learning (mcTangent) approach. At the heart of mcTangent is the synergy of several desirable strategies: i) a tangent manifold learning to take advantage of the neural network speed and the time accurate nature of the method of lines; ii) a model constrained approach to encode the neural network tangent with the underlying governing equations; iii) sequential learning strategies to promote long time stability and accuracy; and iv) data randomization approach to implicitly enforce the smoothness of the neural network tangent and its likeliness to the truth tangent up second order derivatives in order to further enhance the stability and accuracy of mcTangent solutions. Both semi heuristic and rigorous arguments are provided to analyze and justify the proposed approach. Several numerical results for transport equation, viscous Burgers equation, and Navier Stokes equation are presented to study and demonstrate the capability of the proposed mcTangent learning approach.
Implicit High-Order Meshing using Boundary and Interface Fitting
Authors: Authors: Jorge-Luis Barrera, Tzanio Kolev, Ketan Mittal, Vladimir Tomov
Subjects: Computational Geometry (cs.CG); Computational Engineering, Finance, and Science (cs.CE)
Abstract
We propose a method for implicit high-order meshing that aligns easy-to-generate meshes with the boundaries and interfaces of the domain of interest. Our focus is particularly on the case when the target surface is prescribed as the zero isocontour of a smooth discrete function. Common examples of this scenario include using level set functions to represent material interfaces in multimaterial configurations, and evolving geometries in shape and topology optimization. The proposed method formulates the mesh optimization problem as a variational minimization of the sum of a chosen mesh-quality metric using the Target-Matrix Optimization Paradigm (TMOP) and a penalty term that weakly forces the selected faces of the mesh to align with the target surface. The distinct features of the method are use of a source mesh to represent the level set function with sufficient accuracy, and adaptive strategies for setting the penalization weight and selecting the faces of the mesh to be fit to the target isocontour of the level set field. We demonstrate that the proposed method is robust for generating boundary- and interface-fitted meshes for curvilinear domains using different element types in 2D and 3D.
Robust Reinforcement Learning using Offline Data
Authors: Authors: Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh
Abstract
The goal of robust reinforcement learning (RL) is to learn a policy that is robust against the uncertainty in model parameters. Parameter uncertainty commonly occurs in many real-world RL applications due to simulator modeling errors, changes in the real-world system dynamics over time, and adversarial disturbances. Robust RL is typically formulated as a max-min problem, where the objective is to learn the policy that maximizes the value against the worst possible models that lie in an uncertainty set. In this work, we propose a robust RL algorithm called Robust Fitted Q-Iteration (RFQI), which uses only an offline dataset to learn the optimal robust policy. Robust RL with offline data is significantly more challenging than its non-robust counterpart because of the minimization over all models present in the robust Bellman operator. This poses challenges in offline data collection, optimization over the models, and unbiased estimation. In this work, we propose a systematic approach to overcome these challenges, resulting in our RFQI algorithm. We prove that RFQI learns a near-optimal robust policy under standard assumptions and demonstrate its superior performance on standard benchmark problems.
Capturing Dependencies within Machine Learning via a Formal Process Model
Authors: Authors: Fabian Ritz, Thomy Phan, Andreas Sedlmeier, Philipp Altmann, Jan Wieghardt, Reiner Schmid, Horst Sauer, Cornel Klein, Claudia Linnhoff-Popien, Thomas Gabor
Abstract
The development of Machine Learning (ML) models is more than just a special case of software development (SD): ML models acquire properties and fulfill requirements even without direct human interaction in a seemingly uncontrollable manner. Nonetheless, the underlying processes can be described in a formal way. We define a comprehensive SD process model for ML that encompasses most tasks and artifacts described in the literature in a consistent way. In addition to the production of the necessary artifacts, we also focus on generating and validating fitting descriptions in the form of specifications. We stress the importance of further evolving the ML model throughout its life-cycle even after initial training and testing. Thus, we provide various interaction points with standard SD processes in which ML often is an encapsulated task. Further, our SD process model allows to formulate ML as a (meta-) optimization problem. If automated rigorously, it can be used to realize self-adaptive autonomous systems. Finally, our SD process model features a description of time that allows to reason about the progress within ML development processes. This might lead to further applications of formal methods within the field of ML.
Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation
Authors: Authors: Peng Ye, Baopu Li, Tao Chen, Jiayuan Fan, Zhen Mei, Chen Lin, Chongyan Zuo, Qinghua Chi, Wanli Ouyan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
Semantic segmentation is a popular research topic in computer vision, and many efforts have been made on it with impressive results. In this paper, we intend to search an optimal network structure that can run in real-time for this problem. Towards this goal, we jointly search the depth, channel, dilation rate and feature spatial resolution, which results in a search space consisting of about 2.78*10^324 possible choices. To handle such a large search space, we leverage differential architecture search methods. However, the architecture parameters searched using existing differential methods need to be discretized, which causes the discretization gap between the architecture parameters found by the differential methods and their discretized version as the final solution for the architecture search. Hence, we relieve the problem of discretization gap from the innovative perspective of solution space regularization. Specifically, a novel Solution Space Regularization (SSR) loss is first proposed to effectively encourage the supernet to converge to its discrete one. Then, a new Hierarchical and Progressive Solution Space Shrinking method is presented to further achieve high efficiency of searching. In addition, we theoretically show that the optimization of SSR loss is equivalent to the L_0-norm regularization, which accounts for the improved search-evaluation gap. Comprehensive experiments show that the proposed search scheme can efficiently find an optimal network structure that yields an extremely fast speed (175 FPS) of segmentation with a small model size (1 M) while maintaining comparable accuracy.
Adaptive Learning Rates for Faster Stochastic Gradient Methods
Authors: Authors: Samuel Horváth, Konstantin Mishchenko, Peter Richtárik
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Abstract
In this work, we propose new adaptive step size strategies that improve several stochastic gradient methods. Our first method (StoPS) is based on the classical Polyak step size (Polyak, 1987) and is an extension of the recent development of this method for the stochastic optimization-SPS (Loizou et al., 2021), and our second method, denoted GraDS, rescales step size by "diversity of stochastic gradients". We provide a theoretical analysis of these methods for strongly convex smooth functions and show they enjoy deterministic-like rates despite stochastic gradients. Furthermore, we demonstrate the theoretical superiority of our adaptive methods on quadratic objectives. Unfortunately, both StoPS and GraDS depend on unknown quantities, which are only practical for the overparametrized models. To remedy this, we drop this undesired dependence and redefine StoPS and GraDS to StoP and GraD, respectively. We show that these new methods converge linearly to the neighbourhood of the optimal solution under the same assumptions. Finally, we corroborate our theoretical claims by experimental validation, which reveals that GraD is particularly useful for deep learning optimization.
Diversifying Message Aggregation in Multi-Agent Communication via Normalized Tensor Nuclear Norm Regularization
Authors: Authors: Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang
Abstract
Aggregating messages is a key component for the communication of multi-agent reinforcement learning (Comm-MARL). Recently, it has witnessed the prevalence of graph attention networks (GAT) in Comm-MARL, where agents can be represented as nodes and messages can be aggregated via the weighted passing. While successful, GAT can lead to homogeneity in the strategies of message aggregation, and the "core" agent may excessively influence other agents' behaviors, which can severely limit the multi-agent coordination. To address this challenge, we first study the adjacency tensor of the communication graph and show that the homogeneity of message aggregation could be measured by the normalized tensor rank. Since the rank optimization problem is known to be NP-hard, we define a new nuclear norm to replace the rank, a convex surrogate of normalized tensor rank. Leveraging the norm, we further propose a plug-and-play regularizer on the adjacency tensor, named Normalized Tensor Nuclear Norm Regularization (NTNNR), to enrich the diversity of message aggregation actively. We extensively evaluate GAT with the proposed regularizer in both cooperative and mixed cooperative-competitive scenarios. The results demonstrate that aggregating messages using NTNNR-enhanced GAT can improve the efficiency of the training and achieve higher asymptotic performance than existing message aggregation methods. When NTNNR is applied to existing graph-attention Comm-MARL methods, we also observe significant performance improvements on the StarCraft II micromanagement benchmarks.
Differentiable Inference of Temporal Logic Formulas
Abstract
We demonstrate the first Recurrent Neural Network architecture for learning Signal Temporal Logic formulas, and present the first systematic comparison of formula inference methods. Legacy systems embed much expert knowledge which is not explicitly formalized. There is great interest in learning formal specifications that characterize the ideal behavior of such systems -- that is, formulas in temporal logic that are satisfied by the system's output signals. Such specifications can be used to better understand the system's behavior and improve design of its next iteration. Previous inference methods either assumed certain formula templates, or did a heuristic enumeration of all possible templates. This work proposes a neural network architecture that infers the formula structure via gradient descent, eliminating the need for imposing any specific templates. It combines learning of formula structure and parameters in one optimization. Through systematic comparison, we demonstrate that this method achieves similar or better mis-classification rates (MCR) than enumerative and lattice methods. We also observe that different formulas can achieve similar MCR, empirically demonstrating the under-determinism of the problem of temporal logic inference.
Keyword: adam
There is no result
Keyword: gradient
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
Authors: Authors: Hayat Ullah, Arslan Munir
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
Vision-based human activity recognition has emerged as one of the essential research areas in video analytics domain. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams. These deep learning algorithms have shown impressive performance for the human activity recognition task. However, these newly introduced methods either exclusively focus on model performance or the effectiveness of these models in terms of computational efficiency and robustness, resulting in a biased tradeoff in their proposals to deal with challenging human activity recognition problem. To overcome the limitations of contemporary deep learning models for human activity recognition, this paper presents a computationally efficient yet generic spatial-temporal cascaded framework that exploits the deep discriminative spatial and temporal features for human activity recognition. For efficient representation of human actions, we have proposed an efficient dual attentional convolutional neural network (CNN) architecture that leverages a unified channel-spatial attention mechanism to extract human-centric salient features in video frames. The dual channel-spatial attention layers together with the convolutional layers learn to be more attentive in the spatial receptive fields having objects over the number of feature maps. The extracted discriminative salient features are then forwarded to stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions using both forward and backward pass gradient learning. Extensive experiments are conducted, where the obtained results show that the proposed framework attains an improvement in execution time up to 167 times in terms of frames per second as compared to most of the contemporary action recognition methods.
Deep Reinforcement Learning for Dynamic Recommendation with Model-agnostic Counterfactual Policy Synthesis
Abstract
Recent advances in recommender systems have proved the potential of Reinforcement Learning (RL) to handle the dynamic evolution processes between users and recommender systems. However, learning to train an optimal RL agent is generally impractical with commonly sparse user feedback data in the context of recommender systems. To circumvent the lack of interaction of current RL-based recommender systems, we propose to learn a general Model-agnostic Counterfactual Synthesis Policy for counterfactual user interaction data augmentation. The counterfactual synthesis policy aims to synthesise counterfactual states while preserving significant information in the original state relevant to the user's interests, building upon two different training approaches we designed: learning with expert demonstrations and joint training. As a result, the synthesis of each counterfactual data is based on the current recommendation agent interaction with the environment to adapt to users' dynamic interests. We integrate the proposed policy Deep Deterministic Policy Gradient (DDPG), Soft Actor Critic (SAC) and Twin Delayed DDPG in an adaptive pipeline with a recommendation agent that can generate counterfactual data to improve the performance of recommendation. The empirical results on both online simulation and offline datasets demonstrate the effectiveness and generalisation of our counterfactual synthesis policy and verify that it improves the performance of RL recommendation agents.
Abstract
This paper introduces a quadrotor's autonomous take-off and landing system on a moving platform. The designed system addresses three challenging problems: fast pose estimation, restricted external localization, and effective obstacle avoidance. Specifically, first, we design a landing recognition and positioning system based on the AruCo marker to help the quadrotor quickly calculate the relative pose; second, we leverage a gradient-based local motion planner to generate collision-free reference trajectories rapidly for the quadrotor; third, we build an autonomous state machine that enables the quadrotor to complete its take-off, tracking and landing tasks in full autonomy; finally, we conduct experiments in simulated, real-world indoor and outdoor environments to verify the system's effectiveness and demonstrate its potential.
Adaptive Learning Rates for Faster Stochastic Gradient Methods
Authors: Authors: Samuel Horváth, Konstantin Mishchenko, Peter Richtárik
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Abstract
In this work, we propose new adaptive step size strategies that improve several stochastic gradient methods. Our first method (StoPS) is based on the classical Polyak step size (Polyak, 1987) and is an extension of the recent development of this method for the stochastic optimization-SPS (Loizou et al., 2021), and our second method, denoted GraDS, rescales step size by "diversity of stochastic gradients". We provide a theoretical analysis of these methods for strongly convex smooth functions and show they enjoy deterministic-like rates despite stochastic gradients. Furthermore, we demonstrate the theoretical superiority of our adaptive methods on quadratic objectives. Unfortunately, both StoPS and GraDS depend on unknown quantities, which are only practical for the overparametrized models. To remedy this, we drop this undesired dependence and redefine StoPS and GraDS to StoP and GraD, respectively. We show that these new methods converge linearly to the neighbourhood of the optimal solution under the same assumptions. Finally, we corroborate our theoretical claims by experimental validation, which reveals that GraD is particularly useful for deep learning optimization.
Fast Offline Policy Optimization for Large Scale Recommendation
Authors: Authors: Otmane Sakhi, David Rohde, Alexandre Gilotte
Abstract
Personalised interactive systems such as recommender systems require selecting relevant items dependent on context. Production systems need to identify the items rapidly from very large catalogues which can be efficiently solved using maximum inner product search technology. Offline optimisation of maximum inner product search can be achieved by a relaxation of the discrete problem resulting in policy learning or reinforce style learning algorithms. Unfortunately this relaxation step requires computing a sum over the entire catalogue making the complexity of the evaluation of the gradient (and hence each stochastic gradient descent iterations) linear in the catalogue size. This calculation is untenable in many real world examples such as large catalogue recommender systems severely limiting the usefulness of this method in practice. In this paper we show how it is possible to produce an excellent approximation of these policy learning algorithms that scale logarithmically with the catalogue size. Our contribution is based upon combining three novel ideas: a new Monte Carlo estimate of the gradient of a policy, the self normalised importance sampling estimator and the use of fast maximum inner product search at training time. Extensive experiments show our algorithm is an order of magnitude faster than naive approaches yet produces equally good policies.
Differentiable Inference of Temporal Logic Formulas
Abstract
We demonstrate the first Recurrent Neural Network architecture for learning Signal Temporal Logic formulas, and present the first systematic comparison of formula inference methods. Legacy systems embed much expert knowledge which is not explicitly formalized. There is great interest in learning formal specifications that characterize the ideal behavior of such systems -- that is, formulas in temporal logic that are satisfied by the system's output signals. Such specifications can be used to better understand the system's behavior and improve design of its next iteration. Previous inference methods either assumed certain formula templates, or did a heuristic enumeration of all possible templates. This work proposes a neural network architecture that infers the formula structure via gradient descent, eliminating the need for imposing any specific templates. It combines learning of formula structure and parameters in one optimization. Through systematic comparison, we demonstrate that this method achieves similar or better mis-classification rates (MCR) than enumerative and lattice methods. We also observe that different formulas can achieve similar MCR, empirically demonstrating the under-determinism of the problem of temporal logic inference.
Keyword: super-resolution
Learning Degradation Representations for Image Deblurring
Authors: Authors: Dasong Li, Yi Zhang, Ka Chun Cheung, Xiaogang Wang, Hongwei Qin, Hongsheng Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
In various learning-based image restoration tasks, such as image denoising and image super-resolution, the degradation representations were widely used to model the degradation process and handle complicated degradation patterns. However, they are less explored in learning-based image deblurring as blur kernel estimation cannot perform well in real-world challenging cases. We argue that it is particularly necessary for image deblurring to model degradation representations since blurry patterns typically show much larger variations than noisy patterns or high-frequency textures.In this paper, we propose a framework to learn spatially adaptive degradation representations of blurry images. A novel joint image reblurring and deblurring learning process is presented to improve the expressiveness of degradation representations. To make learned degradation representations effective in reblurring and deblurring, we propose a Multi-Scale Degradation Injection Network (MSDI-Net) to integrate them into the neural networks. With the integration, MSDI-Net can handle various and complicated blurry patterns adaptively. Experiments on the GoPro and RealBlur datasets demonstrate that our proposed deblurring framework with the learned degradation representations outperforms state-of-the-art methods with appealing improvements. The code is released at https://github.com/dasongli1/Learning_degradation.
Keyword: sgd
There is no result
Keyword: optimization
A Model-Constrained Tangent Manifold Learning Approach for Dynamical Systems
Implicit High-Order Meshing using Boundary and Interface Fitting
Robust Reinforcement Learning using Offline Data
Capturing Dependencies within Machine Learning via a Formal Process Model
Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation
Adaptive Learning Rates for Faster Stochastic Gradient Methods
Diversifying Message Aggregation in Multi-Agent Communication via Normalized Tensor Nuclear Norm Regularization
Differentiable Inference of Temporal Logic Formulas
Keyword: adam
There is no result
Keyword: gradient
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
Deep Reinforcement Learning for Dynamic Recommendation with Model-agnostic Counterfactual Policy Synthesis
Quadrotor Autonomous Landing on Moving Platform
Adaptive Learning Rates for Faster Stochastic Gradient Methods
Fast Offline Policy Optimization for Large Scale Recommendation
Differentiable Inference of Temporal Logic Formulas
Keyword: super-resolution
Learning Degradation Representations for Image Deblurring