New submissions for Thu, 15 Dec 22

Keyword: sgd

There is no result

Keyword: optimization

Multi-Target Decision Making under Conditions of Severe Uncertainty

Authors: Christoph Jansen, Georg Schollmeyer, Thomas Augustin
Subjects: Artificial Intelligence (cs.AI); Theoretical Economics (econ.TH); Methodology (stat.ME)
Arxiv link: https://arxiv.org/abs/2212.06832
Pdf link: https://arxiv.org/pdf/2212.06832
Abstract The quality of consequences in a decision making problem under (severe) uncertainty must often be compared among different targets (goals, objectives) simultaneously. In addition, the evaluations of a consequence's performance under the various targets often differ in their scale of measurement, classically being either purely ordinal or perfectly cardinal. In this paper, we transfer recent developments from abstract decision theory with incomplete preferential and probabilistic information to this multi-target setting and show how -- by exploiting the (potentially) partial cardinal and partial probabilistic information -- more informative orders for comparing decisions can be given than the Pareto order. We discuss some interesting properties of the proposed orders between decision options and show how they can be concretely computed by linear optimization. We conclude the paper by demonstrating our framework in an artificial (but quite real-world) example in the context of comparing algorithms under different performance measures.
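For orientation, the Pareto order that the abstract uses as its baseline can be sketched as follows. This is only an illustration, not the authors' proposed orders: the function name `pareto_dominates` and the sample scores are hypothetical, and larger values are assumed to be better under every target.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if option a is at least as good as b on every target
    and strictly better on at least one (larger is assumed better)."""
    if len(a) != len(b):
        raise ValueError("options must be scored on the same targets")
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


# Hypothetical scores of three algorithms under two performance measures.
scores = {"A": (0.9, 0.7), "B": (0.8, 0.7), "C": (0.6, 0.95)}

# Options that no other option dominates form the Pareto-optimal set.
pareto_optimal = sorted(
    name
    for name, s in scores.items()
    if not any(pareto_dominates(t, s) for other, t in scores.items() if other != name)
)
```

In this toy data A dominates B, while A and C remain incomparable; incomparable pairs of this kind are what a more informative order than the Pareto order would try to resolve.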
Are metaheuristics worth it? A computational comparison between nature-inspired and deterministic techniques on black-box optimization problems
Authors: Jakub Kudela
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2212.06875
Pdf link: https://arxiv.org/pdf/2212.06875
Abstract In the field of derivative-free optimization, both of its main branches, the deterministic and nature-inspired techniques, have experienced substantial advancement in recent years. In this paper, we provide an extensive computational comparison of selected methods from each of these branches. The chosen representatives were either standard and well-utilized methods, or the best-performing methods from recent numerical comparisons. The computational comparison was performed on five different benchmark sets and the results were analyzed in terms of performance, time complexity, and convergence properties of the selected methods. The results showed that, when dealing with situations where the objective function evaluations are relatively cheap, the nature-inspired methods have a significantly better performance than their deterministic counterparts. However, in situations where the function evaluations are costly or otherwise prohibited, the deterministic methods might provide more consistent and overall better results.
Verifying term graph optimizations using Isabelle/HOL
Authors: Brae J. Webb, Ian J. Hayes, Mark Utting
Subjects: Programming Languages (cs.PL); Logic in Computer Science (cs.LO)
Arxiv link: https://arxiv.org/abs/2212.06956
Pdf link: https://arxiv.org/pdf/2212.06956
Abstract Our objective is to formally verify the correctness of the hundreds of expression optimization rules used within the GraalVM compiler. When defining the semantics of a programming language, expressions naturally form abstract syntax trees, or, terms. However, in order to facilitate sharing of common subexpressions, modern compilers represent expressions as term graphs. Defining the semantics of term graphs is more complicated than defining the semantics of their equivalent term representations. More significantly, defining optimizations directly on term graphs and proving semantics preservation is considerably more complicated than on the equivalent term representations. On terms, optimizations can be expressed as conditional term rewriting rules, and proofs that the rewrites are semantics preserving are relatively straightforward. In this paper, we explore an approach to using term rewrites to verify term graph transformations of optimizations within the GraalVM compiler. This approach significantly reduces the overall verification effort and allows for simpler encoding of optimization rules.
Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning
Authors: Linrui Zhang, Zichen Yan, Li Shen, Shoujie Li, Xueqian Wang, Dacheng Tao
Subjects: Machine Learning (cs.LG); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2212.06998
Pdf link: https://arxiv.org/pdf/2212.06998
Abstract Learning a risk-aware policy is essential but rather challenging in unstructured robotic tasks. Safe reinforcement learning methods open up new possibilities to tackle this problem. However, the conservative policy updates make it intractable to achieve sufficient exploration and desirable performance in complex, sample-expensive environments. In this paper, we propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent. Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control. Concretely, the baseline agent is responsible for maximizing rewards under standard RL settings. Thus, it is compatible with off-the-shelf training techniques of unconstrained optimization, exploration and exploitation. On the other hand, the safe agent mimics the baseline agent for policy improvement and learns to fulfill safety constraints via off-policy RL tuning. In contrast to training from scratch, safe policy correction requires significantly fewer interactions to obtain a near-optimal policy. The dual policies can be optimized synchronously via a shared replay buffer, or leveraging the pre-trained model or the non-learning-based controller as a fixed baseline agent. Experimental results show that our approach can learn feasible skills without prior knowledge as well as deriving risk-averse counterparts from pre-trained unsafe policies. The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks with respect to both safety constraint satisfaction and sample efficiency.
Efficient Sensor Scheduling Strategy Based on Spatio-temporal Scope Information Model
Authors: Yang Liu, Chen Dong, Xiaoqi Qin, Xiaodong Xu
Subjects: Information Theory (cs.IT)
Arxiv link: https://arxiv.org/abs/2212.07008
Pdf link: https://arxiv.org/pdf/2212.07008
Abstract In this paper, based on the spatio-temporal correlation of sensor nodes in the Internet of Things (IoT), a Spatio-temporal Scope information model (SSIM) is proposed to quantify the scope valuable information of sensor data, which decays with space and time, to guide the system for efficient decision making in the sensed region. A simple sensor monitoring system containing three sensor nodes is considered, and two optimal scheduling decision mechanisms, single-step optimal and long-term optimal decision mechanisms, are proposed for the optimization problem. For the single-step mechanism, the scheduling results are analyzed theoretically, and approximate numerical bounds on the node layout between some of the scheduling results are obtained, consistent with the simulation results. For the long-term mechanism, the scheduling results with different node layouts are obtained using the Q-learning algorithm. The performance of the two mechanisms is verified by conducting experiments using the relative humidity dataset, and the differences in performance of the two mechanisms are discussed; in addition, the limitations of the model are summarized.
Rate-Splitting Multiple Access for Uplink Massive MIMO With Electromagnetic Exposure Constraints
Authors: Hanyu Jiang, Li You, Ahmed Elzanaty, Jue Wang, Wenjin Wang, Xiqi Gao, Mohamed-Slim Alouini
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2212.07028
Pdf link: https://arxiv.org/pdf/2212.07028
Abstract Over the past few years, the prevalence of wireless devices has become one of the essential sources of electromagnetic (EM) radiation to the public. Faced with the swift development of wireless communications, people are skeptical about the risks of long-term exposure to EM radiation. As EM exposure is required to be restricted at user terminals, it is inefficient to blindly decrease the transmit power, which leads to limited spectral efficiency and energy efficiency (EE). Recently, rate-splitting multiple access (RSMA) has been proposed as an effective way to provide higher wireless transmission performance, which is a promising technology for future wireless communications. To this end, we propose using RSMA to increase the EE of massive MIMO uplink while limiting the EM exposure of users. In particular, we investigate the optimization of the transmit covariance matrices and decoding order using statistical channel state information (CSI). The problem is formulated as a non-convex mixed-integer program, which is in general difficult to handle. We first propose a modified water-filling scheme to obtain the transmit covariance matrices with fixed decoding order. Then, a greedy approach is proposed to obtain the decoding permutation. Numerical results verify the effectiveness of the proposed EM exposure-aware EE maximization scheme for uplink RSMA.
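As background for the "modified water-filling scheme" named in the abstract, the textbook water-filling allocation over parallel scalar channels can be sketched as below. This is not the paper's scheme, which works on transmit covariance matrices under EM exposure constraints; the function name `water_filling`, the channel gains, and the power budget are all hypothetical, and the objective assumed is the sum of log(1 + g_i * p_i) under a total power constraint.

```python
from typing import List, Sequence


def water_filling(gains: Sequence[float], total_power: float) -> List[float]:
    """Classic water-filling: p_i = max(0, level - 1/g_i) with sum(p_i) = total_power.

    gains must be positive and total_power must be positive.
    """
    if total_power <= 0 or any(g <= 0 for g in gains):
        raise ValueError("gains and total_power must be positive")
    # Inverse gains act as the 'floor heights' that the water level must exceed.
    floors = sorted(1.0 / g for g in gains)
    level = 0.0
    # Try serving the k best channels, dropping the worst one until all served
    # channels sit strictly below the water level.
    for k in range(len(floors), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            break
    return [max(0.0, level - 1.0 / g) for g in gains]


# Hypothetical gains: the weakest channel should receive no power.
powers = water_filling([2.0, 1.0, 0.1], total_power=1.0)
```

With these sample numbers the water level settles at 1.25, so the two stronger channels share the budget and the weakest one is switched off.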
A Predictive Operation Controller for an Electro-Thermal Microgrid Utilizing Variable Flow Temperatures
Authors: Max Rose, Christian A. Hans, Johannes Schiffer
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2212.07078
Pdf link: https://arxiv.org/pdf/2212.07078
Abstract We propose an optimal operation control strategy for an electro-thermal microgrid. Compared to existing work, our approach increases flexibility by operating the thermal network with variable flow temperatures and in that way explicitly exploits its inherent storage capacities. To this end, the microgrid is represented by a multi-layer network composed of an electrical and a thermal layer. We show that the system behavior can be represented by a discrete-time state model derived from DC power flow approximations and 1d incompressible Euler equations. Both layers are interconnected via heat pumps. By combining this model with desired operating objectives and constraints, we obtain a constrained convex optimization problem. This is used to derive a model predictive control scheme for the optimal operation of electro-thermal microgrids. The performance of the proposed operation control algorithm is demonstrated in a numerical case study.
Artificial intelligence-driven digital twin of a modern house demonstrated in virtual reality
Authors: Elias Mohammed Elfarri, Adil Rasheed, Omer San
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2212.07102
Pdf link: https://arxiv.org/pdf/2212.07102
Abstract A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision-making. Unfortunately, the term remains vague and says little about its capability. Recently, the concept of capability level has been introduced to address this issue. Based on its capability, the concept states that a digital twin can be categorized on a scale from zero to five, referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively. The current work introduces the concept in the context of the built environment. It demonstrates the concept by using a modern house as a use case. The house is equipped with an array of sensors that collect time-series data regarding the internal state of the house. Together with physics-based and data-driven models, these data are used to develop digital twins at different capability levels demonstrated in virtual reality. The work, in addition to presenting a blueprint for developing digital twins, also provides future research directions to enhance the technology.
Efficient Non-isomorphic Graph Enumeration Algorithms for Subclasses of Perfect Graphs
Authors: Jun Kawahara, Toshiki Saitoh, Hirokazu Takeda, Ryo Yoshinaka, Yui Yoshioka
Subjects: Data Structures and Algorithms (cs.DS)
Arxiv link: https://arxiv.org/abs/2212.07119
Pdf link: https://arxiv.org/pdf/2212.07119
Abstract Intersection graphs are well-studied in the area of graph algorithms. Some intersection graph classes are known to have algorithms enumerating all unlabeled graphs by reverse search. Since these algorithms output graphs one by one and the numbers of graphs in these classes are vast, they work only for a small number of vertices. Binary decision diagrams (BDDs) are compact data structures for various types of data and useful for solving optimization and enumeration problems. This study proposes enumeration algorithms for five intersection graph classes, which admit $\mathrm{O}(n)$-bit string representations for their member graphs. Our algorithm for each class enumerates all unlabeled graphs with $n$ vertices over BDDs representing the binary strings in time polynomial in $n$. Moreover, our algorithms are extended to enumerate those with constraints on the maximum (bi)clique size and/or the number of edges.
Approximating Optimal Estimation of Time Offset Synchronization with Temperature Variations
Authors: Maurizio Mongelli, Stefano Scanzio
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2212.07138
Pdf link: https://arxiv.org/pdf/2212.07138
Abstract The paper addresses the problem of time offset synchronization in the presence of temperature variations, which lead to a non-Gaussian environment. In this context, regular Kalman filtering proves to be suboptimal. A functional optimization approach is developed in order to approximate optimal estimation of the clock offset between master and slave. A numerical approximation is provided to this end, based on regular neural network training. Other heuristics are provided as well, based on spline regression. An extensive performance evaluation highlights the benefits of the proposed techniques, which can be easily generalized to several clock synchronization protocols and operating environments.
Robust Multitarget Tracking in Interference Environments: A Message-Passing Approach
Authors: Xianglong Bai, Hua Lan, Zengfu Wang, Quan Pan, Yuhang Hao, Can Li
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2212.07182
Pdf link: https://arxiv.org/pdf/2212.07182
Abstract Multitarget tracking in the interference environments suffers from the nonuniform, unknown and time-varying clutter, resulting in dramatic performance deterioration. We address this challenge by proposing a robust multitarget tracking algorithm, which estimates the states of clutter and targets simultaneously by the message-passing (MP) approach. We define the non-homogeneous clutter with a finite mixture model containing a uniform component and multiple nonuniform components. The measured signal strength is utilized to estimate the mean signal-to-noise ratio (SNR) of targets and the mean clutter-to-noise ratio (CNR) of clutter, which are then used as additional feature information of targets and clutter to improve the performance of discrimination of targets from clutter. We also present a hybrid data association which can reason over correspondence between targets, clutter, and measurements. Then, a unified MP algorithm is used to infer the marginal posterior probability distributions of targets, clutter, and data association by splitting the joint probability distribution into a mean-field approximate part and a belief propagation part. As a result, a closed-loop iterative optimization of the posterior probability distribution can be obtained, which can effectively deal with the coupling between target tracking, clutter estimation and data association. Simulation results demonstrate the performance superiority and robustness of the proposed multitarget tracking algorithm compared with the probability hypothesis density (PHD) filter and the cardinalized PHD (CPHD) filter.
Multi-objective low-thrust spacecraft trajectory design using reachability analysis
Authors: Nikolaus Vertovec, Sina Ober-Blöbaum, Kostas Margellos
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2212.07209
Pdf link: https://arxiv.org/pdf/2212.07209
Abstract One of the fundamental problems in spacecraft trajectory design is finding the optimal transfer trajectory that minimizes the propellant consumption and transfer time simultaneously. We formulate this as a multi-objective optimal control (MOC) problem that involves optimizing over the initial or final state, subject to state constraints. Drawing on recent developments in reachability analysis subject to state constraints, we show that the proposed MOC problem can be stated as an optimization problem subject to a constraint that involves the sub-level set of the viscosity solution of a quasi-variational inequality. We then generalize this approach to account for more general optimal control problems in Bolza form. We relate these problems to the Pareto front of the developed multi-objective programs. The proposed approach is demonstrated on two low-thrust orbital transfer problems around a rotating asteroid.
RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging
Authors: Heng Li, Zhaopeng Cui, Shuaicheng Liu, Ping Tan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2212.07211
Pdf link: https://arxiv.org/pdf/2212.07211
Abstract This paper proposes a deep recurrent Rotation Averaging Graph Optimizer (RAGO) for Multiple Rotation Averaging (MRA). Conventional optimization-based methods usually fail to produce accurate results due to corrupted and noisy relative measurements. Recent learning-based approaches regard MRA as a regression problem, while these methods are sensitive to initialization due to the gauge freedom problem. To handle these problems, we propose a learnable iterative graph optimizer minimizing a gauge-invariant cost function with an edge rectification strategy to mitigate the effect of inaccurate measurements. Our graph optimizer iteratively refines the global camera rotations by minimizing each node's single rotation objective function. Besides, our approach iteratively rectifies relative rotations to make them more consistent with the current camera orientations and observed relative rotations. Furthermore, we employ a gated recurrent unit to improve the result by tracing the temporal information of the cost graph. Our framework is a real-time learning-to-optimize rotation averaging graph optimizer with a tiny size deployed for real-world applications. RAGO outperforms previous traditional and deep methods on real-world and synthetic datasets. The code is available at https://github.com/sfu-gruvi-3dv/RAGO
FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation
Authors: Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Arxiv link: https://arxiv.org/abs/2212.07224
Pdf link: https://arxiv.org/pdf/2212.07224
Abstract The statistical heterogeneity of the non-independent and identically distributed (non-IID) data in local clients significantly limits the performance of federated learning. Previous attempts like FedProx, SCAFFOLD, MOON, FedNova and FedDyn resort to an optimization perspective, which requires an auxiliary term or re-weights local updates to calibrate the learning bias or the objective inconsistency. However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions. We thus introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models to the cross devices. We provide theoretical analysis of the possible benefit from FedSkip and conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and competing communication efficiency. Source code is available at: https://github.com/MediaBrain-SJTU/FedSkip.
Randomized Joint Diagonalization of Symmetric Matrices
Authors: Haoze He, Daniel Kressner
Subjects: Numerical Analysis (math.NA)
Arxiv link: https://arxiv.org/abs/2212.07248
Pdf link: https://arxiv.org/pdf/2212.07248
Abstract Given a family of nearly commuting symmetric matrices, we consider the task of computing an orthogonal matrix that nearly diagonalizes every matrix in the family. In this paper, we propose and analyze randomized joint diagonalization (RJD) for performing this task. RJD applies a standard eigenvalue solver to random linear combinations of the matrices. Unlike existing optimization-based methods, RJD is simple to implement and leverages existing high-quality linear algebra software packages. Our main novel contribution is to prove robust recovery: Given a family that is $\epsilon$-close to a commuting family, RJD jointly diagonalizes this family, with high probability, up to an error of norm O($\epsilon$). No other existing method is known to enjoy such a universal robust recovery guarantee. We also discuss how the algorithm can be further improved by deflation techniques and demonstrate its state-of-the-art performance by numerical experiments with synthetic and real-world data.
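The core step described in the abstract, applying a standard eigenvalue solver to a random linear combination of the matrices, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the choice of standard normal weights, and the synthetic exactly commuting test family are all assumptions made here.

```python
import numpy as np


def randomized_joint_diagonalization(matrices, rng=None):
    """Return an orthogonal matrix whose columns are eigenvectors of one
    random linear combination of the given symmetric matrices."""
    rng = np.random.default_rng() if rng is None else rng
    weights = rng.standard_normal(len(matrices))  # assumed weight distribution
    combo = sum(w * m for w, m in zip(weights, matrices))
    combo = (combo + combo.T) / 2  # guard against round-off asymmetry
    _, q = np.linalg.eigh(combo)   # standard symmetric eigenvalue solver
    return q


def off_diagonal_norm(q, m):
    """Frobenius norm of the off-diagonal part of q.T @ m @ q."""
    d = q.T @ m @ q
    return np.linalg.norm(d - np.diag(np.diag(d)))


# Synthetic family that commutes exactly: all members share the eigenvectors q_true.
rng = np.random.default_rng(0)
q_true, _ = np.linalg.qr(rng.standard_normal((5, 5)))
family = [q_true @ np.diag(rng.standard_normal(5)) @ q_true.T for _ in range(3)]

q_est = randomized_joint_diagonalization(family, rng)
residuals = [off_diagonal_norm(q_est, m) for m in family]
```

For an exactly commuting family the residuals should be at round-off level; for a nearly commuting family they would be expected to grow with the distance to a commuting family, which is the regime the abstract's recovery guarantee addresses.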
LGRASS: Linear Graph Spectral Sparsification for Final Task of the 3rd ACM-China International Parallel Computing Challenge
Authors: Yuxuan Chen, Jiyan Qiu, Zidong Han, Chenhan Bai
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Arxiv link: https://arxiv.org/abs/2212.07297
Pdf link: https://arxiv.org/pdf/2212.07297
Abstract This paper presents our solution for the optimization task of the 3rd ACM-China IPCC. Through complexity analysis, we identified three time-consuming subroutines of the original algorithm: marking edges, computing the pseudoinverse, and sorting edges. These subroutines become the main performance bottleneck owing to their super-linear time complexity. To address this, we propose LGRASS, a linear graph spectral sparsification algorithm that runs in strictly linear time. LGRASS takes advantage of spanning tree properties and efficient algorithms to optimize the bottleneck subroutines. Furthermore, we crafted a parallel processing scheme for LGRASS to make full use of multi-processor hardware. Experiments show that our proposed method completes the task in dozens of milliseconds on the official test cases and keeps its linearity as the graph size scales up on random test cases.
RIS-aided User Tracking in Near-Field MIMO Systems: Joint Precoding Design and RIS Optimization
Authors: Silvia Palmucci, Anna Guerra, Andrea Abrardo, Davide Dardari
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2212.07333
Pdf link: https://arxiv.org/pdf/2212.07333
Abstract In this paper we propose a novel framework that aims to jointly design the reflection coefficients of multiple reconfigurable intelligent surfaces (RISs) and the precoding strategy of a single base station (BS) to optimize the tracking of the position and the velocity of a single multi-antenna user equipment (UE). Differently from the literature, and to keep the overall complexity affordable, we assume that RIS optimization is performed less frequently than localization and precoding adaptation. The optimal RIS and precoder strategy is compared with the classical beam focusing strategy and that which maximizes the communication rate. It is shown that if the RISs are optimized for communication, the solution is suboptimal when used for tracking purposes. Numerical results show that it is possible to achieve the 6G positioning requirements in a typical indoor environment with only one BS and a few RISs operating at millimeter waves.
Learning useful representations for shifting tasks and distributions
Authors: Jianyu Zhang, Léon Bottou
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2212.07346
Pdf link: https://arxiv.org/pdf/2212.07346
Abstract Does the dominant approach to learning representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are "richer" than those obtained with a single optimization episode. This is supported by a collection of empirical results obtained with an apparently naïve ensembling technique: concatenating the representations obtained with multiple training episodes using the same data, model, algorithm, and hyper-parameters, but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained from scratch. This proves that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.
Studying the workload of a fully decentralized Web3 system: IPFS
Authors: Pedro Ákos Costa, João Leitão, Yannis Psaras
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Arxiv link: https://arxiv.org/abs/2212.07375
Pdf link: https://arxiv.org/pdf/2212.07375
Abstract Web3 aims at creating a decentralized platform that is competitive with modern cloud infrastructures that support today's Internet. However, Web3 is still limited, supporting only applications in the domains of content creation and sharing, decentralized financing, and decentralized communication. This is mainly due to the technologies supporting Web3: blockchain, IPFS, and libp2p, which, although they provide a good collection of tools to develop Web3 applications, are still limited in terms of design and performance. This motivates the need to better understand these technologies so as to enable novel optimizations that can push Web3 to its full potential. Unfortunately, understanding the current behavior of a fully decentralized large-scale distributed system is a difficult task, as there is no centralized authority that has full knowledge of the system operation. To this end, in this paper we characterize the workload of IPFS, a key enabler of Web3. To achieve this, we have collected traces from accesses performed by users to one of the most popular IPFS gateways located in North America for a period of two weeks. Through a fine-grained analysis of these traces, we gathered the number of requests to the system and found the providers of the requested content. With this data, we characterize both the popularity of requested and provided content, as well as their geo-location (by matching IP addresses with the MaxMind database). Our results show that most of the requests in IPFS target only a few distinct content items, which are provided by a large portion of peers in the system. Furthermore, our analysis also shows that most requests are served by the two largest groups of providers in the system, located in North America and Europe. With these insights, we conclude that the current IPFS architecture is sub-optimal and propose a research agenda for the future.
Keyword: adam

There is no result

Keyword: gradient

Losses over Labels: Weakly Supervised Learning via Direct Loss Construction
Authors: Dylan Sam, J. Zico Kolter
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2212.06921
Pdf link: https://arxiv.org/pdf/2212.06921
Abstract Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning. In this setting, users design heuristics that provide noisy labels for subsets of the data. These weak labels are combined (typically via a graphical model) to form pseudolabels, which are then used to train a downstream model. In this work, we question a foundational premise of the typical weakly supervised learning pipeline: given that the heuristic provides all "label" information, why do we need to generate pseudolabels at all? Instead, we propose to directly transform the heuristics themselves into corresponding loss functions that penalize differences between our model and the heuristic. By constructing losses directly from the heuristics, we can incorporate more information than is used in the standard weakly supervised pipeline, such as how the heuristics make their decisions, which explicitly informs feature selection during training. We call our method Losses over Labels (LoL) as it creates losses directly from heuristics without going through the intermediate step of a label. We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks and further demonstrate that incorporating gradient information leads to better performance on almost every task.
Mechanics of geodesics in Information geometry
Authors: Sumanto Chanda, Tatsuaki Wada
Subjects: Information Theory (cs.IT)
Arxiv link: https://arxiv.org/abs/2212.06959
Pdf link: https://arxiv.org/pdf/2212.06959
Abstract In this article we attempt to formulate Riemannian and Randers-Finsler metrics in information geometry and study their mechanical properties. Starting from the gradient flow equations, we show how to formulate Riemannian metrics, and demonstrate their duality under canonical transformation. Then we show how to formulate a Randers-Finsler metric from deformed gradient equations. The theories described are finally applied to the Gaussian model and tested to verify consistency.
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving
Authors: Angad Singh, Omar Makhlouf, Maximilian Igl, Joao Messias, Arnaud Doucet, Shimon Whiteson
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2212.06968
Pdf link: https://arxiv.org/pdf/2212.06968
Abstract Multi-object state estimation is a fundamental problem for robotic applications where a robot must interact with other moving objects. Typically, other objects' relevant state features are not directly observable, and must instead be inferred from observations. Particle filtering can perform such inference given approximate transition and observation models. However, these models are often unknown a priori, yielding a difficult parameter estimation problem since observations jointly carry transition and observation noise. In this work, we consider learning maximum-likelihood parameters using particle methods. Recent methods addressing this problem typically differentiate through time in a particle filter, which requires workarounds for the non-differentiable resampling step that yield biased or high-variance gradient estimates. By contrast, we exploit Fisher's identity to obtain a particle-based approximation of the score function (the gradient of the log likelihood) that yields a low-variance estimate while only requiring stepwise differentiation through the transition and observation models. We apply our method to real data collected from autonomous vehicles (AVs) and show that it learns better models than existing techniques and is more stable in training, yielding an effective smoother for tracking the trajectories of vehicles around an AV.
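For reference, Fisher's identity as mentioned in the abstract can be written out for a generic state space model. The notation is assumed here for illustration and may differ from the paper's: latent states $x_{0:T}$, observations $y_{1:T}$, parameters $\theta$, initial density $\mu_\theta$, transition density $f_\theta$, and observation density $g_\theta$.

```latex
\nabla_\theta \log p_\theta(y_{1:T})
  = \mathbb{E}_{p_\theta(x_{0:T} \mid y_{1:T})}
    \left[ \nabla_\theta \log p_\theta(x_{0:T}, y_{1:T}) \right],
\qquad
\log p_\theta(x_{0:T}, y_{1:T})
  = \log \mu_\theta(x_0)
  + \sum_{t=1}^{T} \left[ \log f_\theta(x_t \mid x_{t-1})
  + \log g_\theta(y_t \mid x_t) \right].
```

Because the joint log density is a sum over time steps, its gradient only involves per-step derivatives of the transition and observation models, which is consistent with the "stepwise differentiation" the abstract describes; the posterior expectation is the quantity a particle method would approximate.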
Simplification of Forest Classifiers and Regressors
Authors: Atsuyoshi Nakamura, Kento Sakurada
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2212.07103
Pdf link: https://arxiv.org/pdf/2212.07103
Abstract We study the problem of sharing as many branching conditions of a given forest classifier or regressor as possible while preserving classification performance. As a constraint to prevent accuracy degradation, we first require that the decision paths of all the given feature vectors must not change. For a branching condition stating that the value of a certain feature is at most a given threshold, the set of values satisfying such a constraint can be represented as an interval. Thus, the problem reduces to finding the minimum set intersecting all the constraint-satisfying intervals for each set of branching conditions on the same feature. We propose an algorithm for the original problem that builds on an algorithm solving this reduced problem efficiently. The constraint is later relaxed to promote further sharing of branching conditions, either by allowing the decision paths of a certain ratio of the given feature vectors to change or by allowing a certain number of non-intersected constraint-satisfying intervals. We also extend our algorithm to both relaxations. The effectiveness of our method is demonstrated through comprehensive experiments using 21 datasets (13 classification and 8 regression datasets from the UCI Machine Learning Repository) and 4 classifiers/regressors (random forest, extremely randomized trees, AdaBoost and gradient boosting).
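The reduced problem in the abstract above (a minimum set of points intersecting every interval) has a textbook greedy solution. The sketch below illustrates that standard greedy and is not necessarily the authors' algorithm. The function name `min_piercing_points` and the assumption that intervals are closed `(lo, hi)` pairs are hypothetical choices made for this example.

```python
from typing import List, Tuple


def min_piercing_points(intervals: List[Tuple[float, float]]) -> List[float]:
    """Return a minimum-size list of points hitting every closed interval.

    Assumption (not from the paper): each interval is a closed pair (lo, hi)
    with lo <= hi. The classic greedy sorts by right endpoint and picks a new
    point only when the current interval is not yet hit.
    """
    points: List[float] = []
    for lo, hi in sorted(intervals, key=lambda iv: iv[1]):
        if lo > hi:
            raise ValueError(f"invalid interval: ({lo}, {hi})")
        # The last chosen point is the right endpoint of an earlier interval,
        # so it is <= hi; the interval is hit exactly when it is also >= lo.
        if not points or points[-1] < lo:
            points.append(hi)
    return points


if __name__ == "__main__":
    # Hypothetical threshold intervals for four branching conditions
    # on the same feature.
    print(min_piercing_points([(1, 4), (2, 6), (5, 7), (8, 9)]))
```

In this reading, each returned point would serve as one shared threshold for all branching conditions whose interval contains it, so fewer points means more sharing.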
Collision-free Source Seeking Control Methods for Unicycle Robots
Authors: Tinghua Li, Bayu Jayawardhana
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2212.07203
Pdf link: https://arxiv.org/pdf/2212.07203
Abstract In this work, we propose a collision-free source seeking control framework for unicycle robots traversing an unknown cluttered environment. In this framework, obstacle avoidance is guided by control barrier functions (CBFs) embedded in quadratic programming, and the source seeking control relies solely on on-board sensors that measure the signal strength of the source. To tackle the mixed relative degree of the CBF, we propose three different CBFs, namely zeroing control barrier functions (ZCBF), exponential control barrier functions (ECBF), and reciprocal control barrier functions (RCBF), that can be directly integrated with our recent gradient-ascent source-seeking control law. We provide a rigorous analysis of the three methods and show the efficacy of the approaches in simulations using Matlab, as well as in a realistic dynamic environment with moving obstacles in Gazebo/ROS.
Keyword: super-resolution

Mitigating Artifacts in Real-World Video Super-Resolution Models
Authors: Liangbin Xie, Xintao Wang, Shuwei Shi, Jinjin Gu, Chao Dong, Ying Shan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2212.07339
Pdf link: https://arxiv.org/pdf/2212.07339
Abstract The recurrent structure is a prevalent framework for the task of video super-resolution, which models the temporal dependency between frames via hidden states. When applied to real-world scenarios with unknown and complex degradations, hidden states tend to contain unpleasant artifacts and propagate them to restored frames. In this circumstance, our analyses show that such artifacts can be largely alleviated when the hidden state is replaced with a cleaner counterpart. Based on these observations, we propose a Hidden State Attention (HSA) module to mitigate artifacts in real-world video super-resolution. Specifically, we first adopt various cheap filters to produce a hidden state pool. For example, Gaussian blur filters smooth artifacts while sharpening filters enhance details. To aggregate a new hidden state that contains fewer artifacts from the hidden state pool, we devise a Selective Cross Attention (SCA) module, in which the attention between input features and each hidden state is calculated. Equipped with HSA, our proposed method, namely FastRealVSR, is able to achieve a 2x speedup while obtaining better performance than Real-BasicVSR. Code will be available at https://github.com/TencentARC/FastRealVSR
Bi-Noising Diffusion: Towards Conditional Diffusion Models with Generative Restoration Priors
Authors: Kangfu Mei, Nithin Gopalakrishnan Nair, Vishal M. Patel
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2212.07352
Pdf link: https://arxiv.org/pdf/2212.07352
Abstract Conditional diffusion probabilistic models can model the distribution of natural images and can generate diverse and realistic samples based on given conditions. However, oftentimes their results can be unrealistic with observable color shifts and textures. We believe that this issue results from the divergence between the probabilistic distribution learned by the model and the distribution of natural images. The delicate conditions gradually enlarge the divergence during each sampling timestep. To address this issue, we introduce a new method that brings the predicted samples to the training data manifold using a pretrained unconditional diffusion model. The unconditional model acts as a regularizer and reduces the divergence introduced by the conditional model at each sampling step. We perform comprehensive experiments to demonstrate the effectiveness of our approach on super-resolution, colorization, turbulence removal, and image-deraining tasks. The improvements obtained by our method suggest that the priors can be incorporated as a general plugin for improving conditional diffusion models.
