New submissions for Thu, 17 Nov 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

There is no result

Keyword: expected calibration error

There is no result

Keyword: overconfident

There is no result

Keyword: overconfidence

There is no result

Keyword: confidence

Bandit Algorithms for Prophet Inequality and Pandora's Box

Authors: Khashayar Gatmiry, Thomas Kesselheim, Sahil Singla, Yifan Wang
Subjects: Data Structures and Algorithms (cs.DS); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2211.08586
Pdf link: https://arxiv.org/pdf/2211.08586
Abstract The Prophet Inequality and Pandora's Box problems are fundamental stochastic problems with applications in Mechanism Design, Online Algorithms, Stochastic Optimization, Optimal Stopping, and Operations Research. A usual assumption in these works is that the probability distributions of the $n$ underlying random variables are given as input to the algorithm. Since in practice these distributions need to be learned, we initiate the study of such stochastic problems in the Multi-Armed Bandits model. In the Multi-Armed Bandits model we interact with $n$ unknown distributions over $T$ rounds: in round $t$ we play a policy $x^{(t)}$ and receive partial (bandit) feedback on the performance of $x^{(t)}$. The goal is to minimize the regret, which is the difference over $T$ rounds in the total value of the optimal algorithm that knows the distributions vs. the total value of our algorithm that learns the distributions from the partial feedback. Our main results give near-optimal $\tilde{O}(\mathsf{poly}(n)\sqrt{T})$ total regret algorithms for both Prophet Inequality and Pandora's Box. Our proofs proceed by maintaining confidence intervals on the unknown indices of the optimal policy. The exploration-exploitation tradeoff prevents us from directly refining these confidence intervals, so the main technique is to design a regret upper bound that is learnable while playing low-regret Bandit policies.
Coronavirus statistics causes emotional bias: a social media text mining perspective
Authors: Linjiang Guo, Zijian Feng, Yuxue Chi, Mingzhu Wang, Yijun Liu
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2211.08644
Pdf link: https://arxiv.org/pdf/2211.08644
Abstract While COVID-19 has impacted humans for a long time, people search the web for pandemic-related information, causing anxiety. From a theoretical perspective, previous studies have confirmed that the number of COVID-19 cases can cause negative emotions, but how statistics of different dimensions, such as the number of imported cases, the number of local cases, and the number of government-designated lockdown zones, stimulate people's emotions requires detailed understanding. In order to obtain the views of people on COVID-19, this paper first proposes a deep learning model which classifies texts related to the pandemic from text data with place labels. Next, it conducts a sentiment analysis based on multi-task learning. Finally, it carries out a fixed-effect panel regression with outputs of the sentiment analysis. The performance of the algorithm shows a promising result. The empirical study demonstrates that while the number of local cases is positively associated with risk perception, the number of imported cases is negatively associated with confidence levels, which explains why citizens tend to ascribe the protracted pandemic to foreign factors. Besides, this study finds that cities previously hit by the pandemic recover slowly from the suffering, while local governments' spending on healthcare can improve the situation. Our study illustrates the reasons for risk perception and confidence based on different sources of statistical information due to cognitive bias. It complements the knowledge related to epidemic information. It also contributes to a framework that combines sentiment analysis using advanced deep learning technology with the empirical regression method.
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
Authors: Siyuan Lu, Chenchen Zhou, Keli Xie, Shiyi Liu, Jun Lin, Zhongfeng Wang
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2211.08842
Pdf link: https://arxiv.org/pdf/2211.08842
Abstract As an application of Natural Language Processing (NLP) techniques, financial sentiment analysis (FSA) has become an invaluable tool for investors. Its speed and accuracy can significantly impact the returns of trading strategies. With the development of deep learning and Transformer-based pre-trained models like BERT, the accuracy of FSA has been much improved, but these time-consuming big models will also slow down the computation. To boost the processing speed of the FSA system and ensure high precision, we first propose an efficient and lightweight BERT (ELBERT) along with a novel confidence-window-based (CWB) early exit mechanism. Based on ELBERT, an innovative method to accelerate text processing on the GPU platform is developed, solving the difficult problem of making the early exit mechanism work more effectively with a large input batch size. Afterward, a fast and high-accuracy FSA system is built. Experimental results show that the proposed CWB early exit mechanism achieves significantly higher accuracy than existing early exit methods on BERT under the same computation cost. Besides, our FSA system can boost the processing speed to over 1000 texts per second with sufficient accuracy by using this acceleration method, which is nearly twice as fast as FastBERT. Hence, this system can enable modern trading systems to quickly and accurately process financial text data.
Dynamical Linear Bandits
Authors: Marco Mussi, Alberto Maria Metelli, Marcello Restelli
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2211.08997
Pdf link: https://arxiv.org/pdf/2211.08997
Abstract In many real-world sequential decision-making problems, an action does not immediately reflect on the feedback and spreads its effects over a long time frame. For instance, in online advertising, investing in a platform produces an increase of awareness, but the actual reward, i.e., a conversion, might occur far in the future. Furthermore, whether a conversion takes place depends on: how fast the awareness grows, its vanishing effects, and the synergy or interference with other advertising platforms. Previous work has investigated the Multi-Armed Bandit framework with the possibility of delayed and aggregated feedback, without a particular structure on how an action propagates in the future, disregarding possible dynamical effects. In this paper, we introduce a novel setting, the Dynamical Linear Bandits (DLB), an extension of the linear bandits characterized by a hidden state. When an action is performed, the learner observes a noisy reward whose mean is a linear function of the hidden state and of the action. Then, the hidden state evolves according to linear dynamics, also affected by the performed action. We start by introducing the setting, discussing the notion of optimal policy, and deriving an expected regret lower bound. Then, we provide an any-time optimistic regret minimization algorithm, Dynamical Linear Upper Confidence Bound (DynLin-UCB), that suffers an expected regret of order $O(c\, d \sqrt{T})$, where $c$ is a constant dependent on the properties of the linear dynamical evolution, and $d$ is the dimension of the action vector. Finally, we conduct a numerical validation on a synthetic environment and on real-world data to show the effectiveness of DynLin-UCB in comparison with several baselines.
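The setting described in this abstract (a reward whose mean is linear in a hidden state and in the action, with the hidden state then evolving linearly) can be illustrated with a minimal simulation sketch. The symbols `A`, `B`, `omega`, and `theta`, the zero initial state, and the Gaussian noise are assumptions made for illustration; they are not taken from the abstract, and this is not the paper's DynLin-UCB algorithm.

```python
import numpy as np

def simulate_dlb(A, B, omega, theta, actions, noise_std=0.0, seed=0):
    """Roll out rewards for a fixed action sequence.

    Assumed (hypothetical) model:
      reward_t  = omega . x_t + theta . u_t + noise
      x_{t+1}   = A x_t + B u_t
    where x_t is the hidden state and u_t the action at round t.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[0])  # assumed zero initial hidden state
    rewards = []
    for u in actions:
        mean_reward = omega @ x + theta @ u
        rewards.append(mean_reward + noise_std * rng.normal())
        x = A @ x + B @ u  # the action also drives the hidden state
    return np.array(rewards)

# One-dimensional example: a single unit action at round 0, then no action.
A = np.array([[0.5]])
B = np.array([[1.0]])
omega = np.array([1.0])
theta = np.array([0.0])
actions = [np.array([1.0]), np.array([0.0]), np.array([0.0]), np.array([0.0])]
rewards = simulate_dlb(A, B, omega, theta, actions)
```

With these illustrative values the action pays nothing in the round it is played, and its effect shows up in later rounds with geometric decay, which is the delayed, spread-out feedback the abstract motivates with the advertising example.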
Keyword: scaling

Power-law Scaling to Assist with Key Challenges in Artificial Intelligence
Authors: Yuval Meir, Shira Sardi, Shiri Hodassman, Karin Kisos, Itamar Ben-Noam, Amir Goldental, Ido Kanter
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2211.08430
Pdf link: https://arxiv.org/pdf/2211.08430
Abstract Power-law scaling, a central concept in critical phenomena, is found to be useful in deep learning, where optimized test errors on handwritten digit examples converge as a power-law to zero with database size. For rapid decision making with one training epoch, where each example is presented only once to the trained network, the power-law exponent increased with the number of hidden layers. For the largest dataset, the obtained test error was estimated to be in the proximity of state-of-the-art algorithms for large epoch numbers. Power-law scaling assists with key challenges found in current artificial intelligence applications and facilitates an a priori dataset size estimation to achieve a desired test accuracy. It establishes a benchmark for measuring training complexity and a quantitative hierarchy of machine learning tasks and algorithms.
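The a priori dataset size estimation mentioned in this abstract can be sketched as follows, assuming the test error follows error ≈ a · N^(−rho) in the dataset size N. The function names, the log-log least-squares fit, and the synthetic numbers are illustrative assumptions, not the authors' procedure or data.

```python
import numpy as np

def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-rho) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
    return float(np.exp(intercept)), float(-slope)

def required_size(a, rho, target_error):
    """Invert a * N**(-rho) = target_error for the dataset size N."""
    return (a / target_error) ** (1.0 / rho)

# Synthetic, noise-free illustration (not results from the paper).
sizes = np.array([1_000, 2_000, 5_000, 10_000, 20_000], dtype=float)
errors = 2.0 * sizes ** -0.5

a, rho = fit_power_law(sizes, errors)
n_needed = required_size(a, rho, target_error=0.01)
```

On real measurements the fit would be noisy, and extrapolating far beyond the measured range should be treated with caution.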
On the Compositional Generalization Gap of In-Context Learning
Authors: Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni, Aaron Courville
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2211.08473
Pdf link: https://arxiv.org/pdf/2211.08473
Abstract Pretrained large generative language models have shown great performance on many tasks, but exhibit low compositional generalization abilities. Scaling such models has been shown to improve their performance on various NLP tasks even just by conditioning them on a few examples to solve the task without any fine-tuning (also known as in-context learning). In this work, we look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning. In the ID settings, the demonstrations are from the same split (test or train) that the model is being evaluated on, and in the OOD settings, they are from the other split. We look at how the relative generalization gap of in-context learning evolves as models are scaled up. We evaluate four model families, OPT, BLOOM, CodeGen and Codex, on three semantic parsing datasets, CFQ, SCAN and GeoQuery, with different numbers of exemplars, and observe a trend of decreasing relative generalization gap as models are scaled up.
ECCO: Equivalent Circuit Controlled Optimization
Authors: Aayushya Agarwal, Carmel Fiscko, Soummya Kar, Larry Pileggi, Bruno Sinopoli
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2211.08478
Pdf link: https://arxiv.org/pdf/2211.08478
Abstract We propose an adaptive optimization algorithm for solving unconstrained scaled gradient flow problems that achieves fast convergence by controlling the optimization trajectory shape and the discretization step sizes. Under a broad class of scaling functions, we establish convergence of the proposed approach to critical points of smooth objective functions, while demonstrating its flexibility and robustness with respect to hyperparameter tuning. First, we prove convergence of component-wise scaled gradient flow to a critical point under regularity conditions. We show that this controlled gradient flow dynamics is equivalent to the transient response of an electrical circuit, allowing for circuit theory concepts to solve the problem. Based on this equivalence, we develop two optimization trajectory control schemes based on minimizing the charge stored in the circuit: one based on the true Hessian and one based on an approximate Hessian. While the control schemes are derived from circuit concepts, no circuit knowledge is needed to implement the algorithms. To find the value of the critical point, we propose a time step search routine for forward Euler discretization that controls the local truncation error, a method adapted from circuit simulation ideas. In simulation we find that the trajectory control outperforms uncontrolled gradient flow, and the error-aware discretization outperforms line search with the Armijo condition. Our algorithms are evaluated on convex and non-convex test functions, including neural networks, with convergence speeds comparable to or exceeding Adam.
The Future of Hackathon Research and Practice
Authors: Jeanette Falk, Alexander Nolte, Daniela Huppenkothen, Marion Weinzierl, Kiev Gama, Daniel Spikol, Erik Tollerud, Neil Chue Hong, Ines Knäpper, Linda Bailey Hayden
Subjects: Human-Computer Interaction (cs.HC); Software Engineering (cs.SE)
Arxiv link: https://arxiv.org/abs/2211.08963
Pdf link: https://arxiv.org/pdf/2211.08963
Abstract Hackathons are time-bounded collaborative events which have become a global phenomenon adopted by both researchers and practitioners in a plethora of contexts. Hackathon events are generally used to accelerate the development of, for example, scientific results and collaborations, communities, and innovative prototypes addressing urgent challenges. As hackathons have been adopted into many different contexts, the events have also been adapted in numerous ways corresponding to the unique needs and situations of organizers, participants and other stakeholders. While these interdisciplinary adaptations in general afford many advantages, such as tailoring the format to specific needs, they also entail certain challenges, specifically: 1) limited exchange of best practices, 2) limited exchange of research findings, and 3) larger overarching questions that require interdisciplinary collaboration are not discovered and remain unaddressed. We call for interdisciplinary collaborations to address these challenges. As a first initiative towards this, we performed an interdisciplinary collaborative analysis in the context of a workshop at the Lorentz Center, Leiden in December 2021. In this paper, we present the results of this analysis in terms of six important areas which we envision to contribute to maturing hackathon research and practice: 1) hackathons for different purposes, 2) socio-technical event design, 3) scaling up, 4) making hackathons equitable, 5) studying hackathons, and 6) hackathon goals and how to reach them. We present these areas in terms of the state of the art and research proposals and conclude the paper by suggesting next steps needed for advancing hackathon research and practice.
Teaching Algorithmic Reasoning via In-context Learning
Authors: Hattie Zhou, Azade Nova, Hugo Larochelle, Aaron Courville, Behnam Neyshabur, Hanie Sedghi
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2211.09066
Pdf link: https://arxiv.org/pdf/2211.09066
Abstract Large language models (LLMs) have shown increasing in-context learning capabilities through scaling up model and data size. Despite this progress, LLMs are still unable to solve algorithmic reasoning problems. While providing a rationale with the final answer has led to further improvements in multi-step reasoning problems, Anil et al. 2022 showed that even simple algorithmic reasoning tasks such as parity are far from solved. In this work, we identify and study four key stages for successfully teaching algorithmic reasoning to LLMs: (1) formulating algorithms as skills, (2) teaching multiple skills simultaneously (skill accumulation), (3) teaching how to combine skills (skill composition) and (4) teaching how to use skills as tools. We show that it is possible to teach algorithmic reasoning to LLMs via in-context learning, which we refer to as algorithmic prompting. We evaluate our approach on a variety of arithmetic and quantitative reasoning tasks, and demonstrate significant boosts in performance over existing prompting techniques. In particular, for long parity, addition, multiplication and subtraction, we achieve an error reduction of approximately 10x, 9x, 5x and 2x respectively compared to the best available baselines.
Egocentric Hand-object Interaction Detection
Authors: Yao Lu, Yanan Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2211.09067
Pdf link: https://arxiv.org/pdf/2211.09067
Abstract In this paper, we propose a method to jointly determine the status of hand-object interaction. This is crucial for egocentric human activity understanding and interaction. From a computer vision perspective, we believe that determining whether a hand is interacting with an object depends on whether there is an interactive hand pose and whether the hand is touching the object. Thus, we extract the hand pose and hand-object masks to jointly determine the interaction status. In order to solve the problem of hand pose estimation due to in-hand object occlusion, we use a multi-cam system to capture hand pose data from multiple perspectives. We evaluate and compare our method with the most recent work from Shan et al. on selected images from the EPIC-KITCHENS dataset and achieve 89% accuracy on HOI (hand-object interaction) detection, which is comparable to Shan's (92%). However, for real-time performance, our method can run at over 30 FPS, which is much more efficient than Shan's (1 to 2 FPS). A demo can be found at https://www.youtube.com/watch?v=XVj3zBuynmQ
Keyword: calibration

Omnidirectional robot modeling and simulation
Authors: Sandro Costa Magalhães, António Paulo Moreira, Paulo Costa
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2211.08532
Pdf link: https://arxiv.org/pdf/2211.08532
Abstract A robot simulation system is a basic need for any robotics application. With it, developers building teams of robots can test their algorithms and make initial calibrations without risk of damage to the real robots, assuring safety. However, building these simulation environments is usually time-consuming work, and when considering robot fleets, the simulation proves to be computationally expensive. An omnidirectional robot from the 5DPO robotics soccer team served to test this approach. The modeling issue was divided into two steps: modeling the motor's non-linear features and modeling the general behavior of the robot. A proper fitting of the robot was reached, considering the robot's velocity response.
Semantic keypoint extraction for scanned animals using multi-depth-camera systems
Authors: Raphael Falque, Teresa Vidal-Calleja, Alen Alempijevic
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2211.08634
Pdf link: https://arxiv.org/pdf/2211.08634
Abstract Keypoint annotation in point clouds is an important task for 3D reconstruction, object tracking and alignment, in particular in deformable or moving scenes. In the context of agriculture robotics, it is a critical task for livestock automation to work toward condition assessment or behaviour recognition. In this work, we propose a novel approach for semantic keypoint annotation in point clouds, by reformulating the keypoint extraction as a regression problem of the distance between the keypoints and the rest of the point cloud. We use the distance on the point cloud manifold mapped into a radial basis function (RBF), which is then learned using an encoder-decoder architecture. Special consideration is given to the data augmentation specific to multi-depth-camera systems by considering noise over the extrinsic calibration and camera frame dropout. Additionally, we investigate computationally efficient non-rigid deformation methods that can be applied to animal point clouds. Our method is tested on data collected in the field, on moving beef cattle, with a calibrated system of multiple hardware-synchronised RGB-D cameras.
Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes
Authors: Lunjia Hu, Charlotte Peale
Subjects: Machine Learning (cs.LG); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2211.09101
Pdf link: https://arxiv.org/pdf/2211.09101
Abstract In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the learned model by comparing it with the best hypothesis in the class (the agnostic setting). Taking a step beyond these classic setups that involve only a single hypothesis class, we introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning: given two binary hypothesis classes $S$ and $B$, we assume that the data is labeled according to a hypothesis in the source class $S$ and require the learned model to achieve an accuracy comparable to the best hypothesis in the benchmark class $B$. Even when both $S$ and $B$ have infinite VC dimensions, comparative learning can still have a small sample complexity. We show that the sample complexity of comparative learning is characterized by the mutual VC dimension $\mathsf{VC}(S,B)$ which we define to be the maximum size of a subset shattered by both $S$ and $B$. We also show a similar result in the online setting, where we give a regret characterization in terms of the mutual Littlestone dimension $\mathsf{Ldim}(S,B)$. These results also hold for partial hypotheses. We additionally show that the insights necessary to characterize the sample complexity of comparative learning can be applied to characterize the sample complexity of realizable multiaccuracy and multicalibration using the mutual fat-shattering dimension, an analogue of the mutual VC dimension for real-valued hypotheses. This not only solves an open problem proposed by Hu, Peale, Reingold (2022), but also leads to independently interesting results extending classic ones about regression, boosting, and covering number to our two-hypothesis-class setting.
Holistic Evaluation of Language Models
Authors: Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2211.09110
Pdf link: https://arxiv.org/pdf/2211.09110
Abstract Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest for LMs. Then we select a broad subset based on coverage and feasibility, noting what's missing or underrepresented (e.g. question answering for neglected English dialects, metrics for trustworthiness). Second, we adopt a multi-metric approach: We measure 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency) for each of 16 core scenarios when possible (87.5% of the time). This ensures metrics beyond accuracy don't fall to the wayside, and that trade-offs are clearly exposed. We also perform 7 targeted evaluations, based on 26 targeted scenarios, to analyze specific aspects (e.g. reasoning, disinformation). Third, we conduct a large-scale evaluation of 30 prominent language models (spanning open, limited-access, and closed models) on all 42 scenarios, 21 of which were not previously used in mainstream LM evaluation. Prior to HELM, models on average were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: now all 30 models have been densely benchmarked on the same core scenarios and metrics under standardized conditions. Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly for further analysis, as well as a general modular toolkit. We intend for HELM to be a living benchmark for the community, continuously updated with new scenarios, metrics, and models.

