New submissions for Tue, 26 Jul 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

There is no result

Keyword: expected calibration error

There is no result

Keyword: overconfident

There is no result

Keyword: overconfidence

There is no result

Keyword: confidence

Anomaly Detection for Fraud in Cryptocurrency Time Series

Authors: Eran Kaufman, Andrey Iaremenko
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2207.11466
Pdf link: https://arxiv.org/pdf/2207.11466
Abstract Since the inception of Bitcoin in 2009, the market of cryptocurrencies has grown beyond initial expectations as daily trades exceed $10 billion. As industries become automated, the need for an automated fraud detector becomes very apparent. Detecting anomalies in real time prevents potential accidents and economic losses. Anomaly detection in multivariate time series data poses a particular challenge because it requires simultaneous consideration of temporal dependencies and relationships between variables. Identifying an anomaly in real time is not an easy task, specifically because anomalies differ in the exact anomalistic behavior they exhibit. Some points may present pointwise global or local anomalistic behavior, while others may be anomalistic due to their frequency or seasonal behavior or due to a change in the trend. In this paper we work on real time series of Ethereum trades from specific accounts and survey a large variety of algorithms, both traditional and new. We categorize them according to their strategy and the anomalistic behavior they search for, and show that, when bundled into different groups, they can serve as a good real-time detector with an alarm time of no longer than a few seconds and with very high confidence.
Self-Support Few-Shot Semantic Segmentation
Authors: Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.11549
Pdf link: https://arxiv.org/pdf/2207.11549
Abstract Existing few-shot segmentation methods have achieved great progress based on the support-query matching framework. But they still suffer heavily from the limited coverage of intra-class variations in the few-shot supports provided. Motivated by the simple Gestalt principle that pixels belonging to the same object are more similar than those belonging to different objects of the same class, we propose a novel self-support matching strategy to alleviate this problem, which uses query prototypes to match query features, where the query prototypes are collected from high-confidence query predictions. This strategy can effectively capture the consistent underlying characteristics of the query objects, and thus fittingly match query features. We also propose an adaptive self-support background prototype generation module and self-support loss to further facilitate the self-support matching procedure. Our self-support network substantially improves the prototype quality, benefits more from stronger backbones and more supports, and achieves SOTA on multiple datasets. Code is available at https://github.com/fanq15/SSP.
Isabelle/HOL/GST: A Formal Proof Environment for Generalized Set Theories
Authors: Ciarán Dunne, J. B. Wells
Subjects: Logic in Computer Science (cs.LO); Logic (math.LO)
Arxiv link: https://arxiv.org/abs/2207.12039
Pdf link: https://arxiv.org/pdf/2207.12039
Abstract A generalized set theory (GST) is like a standard set theory but also can have non-set structured objects that can contain other structured objects including sets. This paper presents Isabelle/HOL support for GSTs, which are treated as type classes that combine features that specify kinds of mathematical objects, e.g., sets, ordinal numbers, functions, etc. GSTs can have an exception feature that eases representing partial functions and undefinedness. When assembling a GST, extra axioms are generated following a user-modifiable policy to fill specification gaps. Specialized type-like predicates called soft types are used extensively. Although a GST can be used without a model, for confidence in its consistency we build a model for each GST from components that specify each feature's contribution to each tier of a von-Neumann-style cumulative hierarchy defined via ordinal recursion, and we then connect the model to a separate type which the GST occupies.
Online Reinforcement Learning for Periodic MDP
Authors: Ayush Aniket, Arpan Chattopadhyay
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.12045
Pdf link: https://arxiv.org/pdf/2207.12045
Abstract We study learning in a periodic Markov Decision Process (MDP), a special type of non-stationary MDP where both the state transition probabilities and reward functions vary periodically, under the average reward maximization setting. We formulate the problem as a stationary MDP by augmenting the state space with the period index, and propose a periodic upper confidence bound reinforcement learning-2 (PUCRL2) algorithm. We show that the regret of PUCRL2 grows linearly with the period and sub-linearly with the horizon length. Numerical results demonstrate the efficacy of PUCRL2.
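The state augmentation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' PUCRL2 code; the function name `augment_periodic_mdp`, the nested-list layout `P[t][s][a][s_next]`, and the flat index `t * n_states + s` are assumptions chosen for illustration. It only shows how a periodic kernel becomes a stationary one over (state, phase) pairs.

```python
def augment_periodic_mdp(P, R, period):
    """Build a stationary MDP over (state, phase) pairs from a periodic MDP.

    P[t][s][a][s_next] is the transition probability in phase t,
    R[t][s][a] is the reward in phase t (hypothetical layout).
    The augmented state (s, t) is stored at flat index t * n_states + s.
    """
    n_states = len(P[0])
    n_actions = len(P[0][0])
    n_aug = n_states * period
    P_aug = [[[0.0] * n_aug for _ in range(n_actions)] for _ in range(n_aug)]
    R_aug = [[0.0] * n_actions for _ in range(n_aug)]
    for t in range(period):
        t_next = (t + 1) % period  # the phase advances deterministically
        for s in range(n_states):
            for a in range(n_actions):
                R_aug[t * n_states + s][a] = R[t][s][a]
                for s_next in range(n_states):
                    P_aug[t * n_states + s][a][t_next * n_states + s_next] = (
                        P[t][s][a][s_next]
                    )
    return P_aug, R_aug


# Toy example: 2 states, 1 action, period 2.
# Phase 0 swaps the two states; phase 1 keeps the state unchanged.
P = [
    [[[0.0, 1.0]], [[1.0, 0.0]]],
    [[[1.0, 0.0]], [[0.0, 1.0]]],
]
R = [
    [[1.0], [0.0]],
    [[0.0], [2.0]],
]
P_aug, R_aug = augment_periodic_mdp(P, R, period=2)
```

The augmented kernel no longer depends on time, so a learner for stationary MDPs can in principle be run on it, at the cost of a state space that is `period` times larger.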
A Confident Deep Learning loss function for one-step Conformal Prediction approximation
Authors: Julia A. Meister, Khuong An Nguyen, Zhiyuan Luo
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.12377
Pdf link: https://arxiv.org/pdf/2207.12377
Abstract Deep Learning predictions with measurable confidence are increasingly desirable for real-world problems, especially in high-risk settings. The Conformal Prediction (CP) framework is a versatile solution that automatically guarantees a maximum error rate. However, CP suffers from computational inefficiencies that limit its application to large-scale datasets. In this paper, we propose a novel conformal loss function that approximates the traditionally two-step CP approach in a single step. By evaluating and penalising deviations from the stringent expected CP output distribution, a Deep Learning model may learn the direct relationship between input data and conformal p-values. Our approach achieves significant training time reductions up to 86% compared to Aggregated Conformal Prediction (ACP), an accepted CP approximation variant. In terms of approximate validity and predictive efficiency, we carry out a comprehensive empirical evaluation to show our novel loss function's competitiveness with ACP on the well-established MNIST dataset.
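For readers unfamiliar with conformal p-values, the following is a minimal sketch of the standard inductive (split) conformal p-value, not the loss function proposed in the paper. The helper names `conformal_p_value` and `prediction_set`, and the toy nonconformity scores, are assumptions made for illustration.

```python
def conformal_p_value(calibration_scores, test_score):
    """Fraction of nonconformity scores at least as large as the test score,
    counting the test example itself (hence the +1 terms)."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, test_scores_by_label, significance):
    """Keep every candidate label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_label.items()
        if conformal_p_value(calibration_scores, score) > significance
    }


# Toy nonconformity scores (higher means the example looks stranger).
calibration = [0.1, 0.2, 0.3, 0.4]
p_typical = conformal_p_value(calibration, 0.25)
p_strange = conformal_p_value(calibration, 0.5)
labels = prediction_set(calibration, {"a": 0.25, "b": 0.5}, significance=0.2)
```

The two-step structure is visible here: a model first produces nonconformity scores, and the p-values are then computed against a held-out calibration set. The abstract describes learning that mapping directly in one step.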
Keyword: scaling

Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules
Authors: Jong Youl Choi, Pei Zhang, Kshitij Mehta, Andrew Blanchard, Massimiliano Lupo Pasini
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Chemical Physics (physics.chem-ph); Computational Physics (physics.comp-ph)
Arxiv link: https://arxiv.org/abs/2207.11333
Pdf link: https://arxiv.org/pdf/2207.11333
Abstract Graph Convolutional Neural Network (GCNN) is a popular class of deep learning (DL) models in material science to predict material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to reduce the computational cost for GCNN training effectively. However, efficient utilization of high performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present our experimental results with HydraGNN showing i) reduction of data loading time up to 4.2 times compared with a conventional method and ii) linear scaling performance for training up to 1,024 GPUs on both Summit and Perlmutter.
The Interplay of Spectral Efficiency, User Density, and Energy in Random Access Protocols with Retransmissions
Authors: Derya Malak
Subjects: Information Theory (cs.IT)
Arxiv link: https://arxiv.org/abs/2207.11756
Pdf link: https://arxiv.org/pdf/2207.11756
Abstract The fifth-generation of wireless communication networks is required to support a range of use cases such as enhanced mobile broadband (eMBB), ultra-reliable, low-latency communications (URLLC), massive machine-type communications (mMTCs), with heterogeneous data rate, delay, and power requirements. The 4G LTE air interface uses extra overhead to enable scheduled access, which is not justified for small payload sizes. We employ a random access communication model with retransmissions for multiple users with small payloads at the low spectral efficiency regime. The radio resources are split non-orthogonally in the time and frequency dimensions. Retransmissions are combined via Hybrid Automatic Repeat reQuest (HARQ) methods, namely Chase Combining and Incremental Redundancy with a finite buffer size constraint $C_{\sf buf}$. We determine the best scaling for the spectral efficiency (SE) versus signal-to-noise ratio (SNR) per bit and for the user density versus SNR per bit, for the sum-optimal regime and when the interference is treated as noise, using a Shannon capacity approximation. Numerical results show that the scaling results are applicable over a range of $\eta$, $T$, $C_{\sf buf}$, $J$, at low received SNR values. The proposed analytical framework provides insights for resource allocation in general random access systems and specific 5G use cases for massive URLLC uplink access.
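As background for the SE versus SNR-per-bit trade-off, here is a sketch of the textbook single-user AWGN relation, which is an assumption made for illustration and not the paper's multi-user retransmission model. From SE = log2(1 + SNR) and SNR = SE * Eb/N0, the minimum SNR per bit at spectral efficiency SE is (2^SE - 1) / SE. The function names are hypothetical.

```python
import math


def min_snr_per_bit(spectral_efficiency):
    """Minimum Eb/N0 (linear scale) needed to reach a given spectral
    efficiency in bits/s/Hz on a single-user AWGN channel."""
    return (2.0 ** spectral_efficiency - 1.0) / spectral_efficiency


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


# In the low spectral efficiency regime the requirement approaches ln(2),
# about -1.59 dB, and it grows as the spectral efficiency increases.
for se in (1e-6, 0.5, 1.0, 2.0):
    print(f"SE = {se:g} bits/s/Hz -> min Eb/N0 = {to_db(min_snr_per_bit(se)):.2f} dB")
```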
Enhancing Image Rescaling using Dual Latent Variables in Invertible Neural Network
Authors: Min Zhang, Zhihong Pan, Xin Zhou, C.-C. Jay Kuo
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2207.11844
Pdf link: https://arxiv.org/pdf/2207.11844
Abstract Normalizing flow models have been used successfully for generative image super-resolution (SR) by approximating complex distribution of natural images to simple tractable distribution in latent space through Invertible Neural Networks (INN). These models can generate multiple realistic SR images from one low-resolution (LR) input using randomly sampled points in the latent space, simulating the ill-posed nature of image upscaling where multiple high-resolution (HR) images correspond to the same LR. Lately, the invertible process in INN has also been used successfully by bidirectional image rescaling models like IRN and HCFlow for joint optimization of downscaling and inverse upscaling, resulting in significant improvements in upscaled image quality. While they are optimized for image downscaling too, the ill-posed nature of image downscaling, where one HR image could be downsized to multiple LR images depending on different interpolation kernels and resampling methods, is not considered. A new downscaling latent variable, in addition to the original one representing uncertainties in image upscaling, is introduced to model variations in the image downscaling process. This dual latent variable enhancement is applicable to different image rescaling models and it is shown in extensive experiments that it can improve image upscaling accuracy consistently without sacrificing image quality in downscaled LR images. It is also shown to be effective in enhancing other INN-based models for image restoration applications like image hiding.
Next-generation HPC models for future rotorcraft applications
Authors: Nicoletta Sanguini, Tommaso Benacchio, Daniele Malacrida, Federico Cipolletta, Francesco Rondina, Antonio Sciarappa, Luigi Capone
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA)
Arxiv link: https://arxiv.org/abs/2207.12269
Pdf link: https://arxiv.org/pdf/2207.12269
Abstract Rotorcraft technologies pose great scientific and industrial challenges for numerical computing. As available computational resources approach the exascale, finer scales and therefore more accurate simulations of engineering test cases become accessible. However, shifting legacy workflows and optimizing parallel efficiency and scalability of existing software on new hardware is often demanding. This paper reports preliminary results in CFD and structural dynamics simulations using the T106A Low Pressure Turbine (LPT) blade geometry on Leonardo S.p.A.'s davinci-1 high-performance computing (HPC) facility. Time to solution and scalability are assessed for commercial packages Ansys Fluent, STAR-CCM+, and ABAQUS, and the open-source scientific computing framework PyFR. In direct numerical simulations of compressible fluid flow, normalized time to solution values obtained using PyFR are found to be up to 8 times smaller than those obtained using Fluent and STAR-CCM+. The findings extend to the incompressible case. All models offer weak and strong scaling in tests performed on up to 48 compute nodes, each with 4 Nvidia A100 GPUs. In linear elasticity simulations with ABAQUS, both the iterative solver and the direct solver provide speedup in preliminary scaling tests, with the iterative solver outperforming the direct solver in terms of time-to-solution and memory usage. The results provide a first indication of the potential of HPC architectures in scaling engineering applications towards certification by simulation, and the first step for the Company towards the use of cutting-edge HPC toolkits in the field of Rotorcraft technologies.
Stable Parallel Training of Wasserstein Conditional Generative Adversarial Neural Networks
Authors: Massimiliano Lupo Pasini, Junqi Yin
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2207.12315
Pdf link: https://arxiv.org/pdf/2207.12315
Abstract We propose a stable, parallel approach to train Wasserstein Conditional Generative Adversarial Neural Networks (W-CGANs) under the constraint of a fixed computational budget. Differently from previous distributed GANs training techniques, our approach avoids inter-process communications, reduces the risk of mode collapse and enhances scalability by using multiple generators, each one of them concurrently trained on a single data label. The use of the Wasserstein metric also reduces the risk of cycling by stabilizing the training of each generator. We illustrate the approach on the CIFAR10, CIFAR100, and ImageNet1k datasets, three standard benchmark image datasets, maintaining the original resolution of the images for each dataset. Performance is assessed in terms of scalability and final accuracy within a limited fixed computational time and computational resources. To measure accuracy, we use the inception score, the Frechet inception distance, and image quality. An improvement in inception score and Frechet inception distance is shown in comparison to previous results obtained by performing the parallel approach on deep convolutional conditional generative adversarial neural networks (DC-CGANs) as well as an improvement of image quality of the new images created by the GANs approach. Weak scaling is attained on both datasets using up to 2,000 NVIDIA V100 GPUs on the OLCF supercomputer Summit.
Keyword: calibration

Epersist: A Self Balancing Robot Using PID Controller And Deep Reinforcement Learning
Authors: Ghanta Sai Krishna, Dyavat Sumith, Garika Akshay
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2207.11431
Pdf link: https://arxiv.org/pdf/2207.11431
Abstract A two-wheeled self-balancing robot is an example of an inverted pendulum and is an inherently non-linear, unstable system. The fundamental concept of the proposed framework "Epersist" is to overcome the challenge of counterbalancing an initially unstable system by delivering robust control mechanisms, Proportional Integral Derivative (PID), and Reinforcement Learning (RL). Moreover, the micro-controller NodeMCUESP32 and inertial sensor in the Epersist employ fewer computational procedures to give accurate instruction regarding the spin of wheels to the motor driver, which helps control the wheels and balance the robot. This framework also consists of the mathematical model of the PID controller and a novel self-trained advantage actor-critic algorithm as the RL agent. After several experiments, control variable calibrations are made as the benchmark values to attain the angle of static equilibrium. This "Epersist" framework proposes PID and RL-assisted functional prototypes and simulations for better utility.
Keypoint-less Camera Calibration for Sports Field Registration in Soccer
Authors: Jonas Theiner, Ralph Ewerth
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.11709
Pdf link: https://arxiv.org/pdf/2207.11709
Abstract Sports field registration in broadcast videos is typically interpreted as the task of homography estimation, which provides a mapping between a planar field and the corresponding visible area of the image. In contrast to previous approaches, we consider the task as a camera calibration problem. First, we introduce a differentiable objective function that is able to learn the camera pose and focal length from segment correspondences (e.g., lines, point clouds), based on pixel-level annotations for segments of a known calibration object, i.e., the sports field. The calibration module iteratively minimizes the segment reprojection error induced by the estimated camera parameters. Second, we propose a novel approach for 3D sports field registration from broadcast soccer images. The calibration module does not require any training data and compared to the typical solution, which subsequently refines an initial estimation, our solution does it in one step. The proposed method is evaluated for sports field registration on two datasets and achieves superior results compared to two state-of-the-art approaches.
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem
Authors: Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.11850
Pdf link: https://arxiv.org/pdf/2207.11850
Abstract Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents. Numerous efforts have been dedicated to strengthening the image dependency by creating delicate models or introducing extra visual annotations. However, these methods cannot sufficiently explore how the visual cues explicitly affect the learned answer representation, which is vital for language reliance alleviation. Moreover, they generally emphasize the class-level discrimination of the learned answer representation, which overlooks the more fine-grained instance-level patterns and demands further optimization. In this paper, we propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration, which can better investigate the fine-grained visual effects and mitigate the language prior problem by learning the instance-level characteristics. Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents, based on which the collaborative learning of intra-instance invariance and inter-instance discrimination is implemented by two well-designed discriminators. Besides, we implement the information bottleneck modulator on latent space for further bias alleviation and representation calibration. We apply our visual perturbation-aware framework to three orthodox baselines and the experimental results on two diagnostic VQA-CP benchmark datasets evidently demonstrate its effectiveness. In addition, we also justify its robustness on the balanced VQA benchmark.
Representational Ethical Model Calibration
Authors: Robert Carruthers, Isabel Straw, James K Ruffle, Daniel Herron, Amy Nelson, Danilo Bzdok, Delmiro Fernandez-Reyes, Geraint Rees, Parashkev Nachev
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)
Arxiv link: https://arxiv.org/abs/2207.12043
Pdf link: https://arxiv.org/pdf/2207.12043
Abstract Equity is widely held to be fundamental to the ethics of healthcare. In the context of clinical decision-making, it rests on the comparative fidelity of the intelligence -- evidence-based or intuitive -- guiding the management of each individual patient. Though brought to recent attention by the individuating power of contemporary machine learning, such epistemic equity arises in the context of any decision guidance, whether traditional or innovative. Yet no general framework for its quantification, let alone assurance, currently exists. Here we formulate epistemic equity in terms of model fidelity evaluated over learnt multi-dimensional representations of identity crafted to maximise the captured diversity of the population, introducing a comprehensive framework for Representational Ethical Model Calibration. We demonstrate use of the framework on large-scale multimodal data from UK Biobank to derive diverse representations of the population, quantify model performance, and institute responsive remediation. We offer our approach as a principled solution to quantifying and assuring epistemic equity in healthcare, with applications across the research, clinical, and regulatory domains.
Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection
Authors: Hongzuo Xu, Yijie Wang, Songlei Jian, Qing Liao, Yongjun Wang, Guansong Pang
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2207.12201
Pdf link: https://arxiv.org/pdf/2207.12201
Abstract Unsupervised time series anomaly detection is instrumental in monitoring and alarming potential faults of target systems in various domains. Current state-of-the-art time series anomaly detectors mainly focus on devising advanced neural network structures and new reconstruction/prediction learning objectives to learn data normality (normal patterns and behaviors) as accurately as possible. However, these one-class learning methods can be deceived by unknown anomalies in the training data (i.e., anomaly contamination). Further, their normality learning also lacks knowledge about the anomalies of interest. Consequently, they often learn a biased, inaccurate normality boundary. This paper proposes a novel one-class learning approach, named calibrated one-class classification, to tackle this problem. Our one-class classifier is calibrated in two ways: (1) by adaptively penalizing uncertain predictions, which helps eliminate the impact of anomaly contamination while accentuating the predictions that the one-class model is confident in, and (2) by discriminating the normal samples from native anomaly examples that are generated to simulate genuine time series abnormal behaviors on the basis of original data. These two calibrations result in contamination-tolerant, anomaly-informed one-class learning, yielding a significantly improved normality modeling. Extensive experiments on six real-world datasets show that our model substantially outperforms twelve state-of-the-art competitors and obtains 6% - 31% F1 score improvement. The source code is available at https://github.com/xuhongzuo/couta.
Task-Relevant Failure Detection for Trajectory Predictors in Autonomous Vehicles
Authors: Alec Farid, Sushant Veer, Boris Ivanovic, Karen Leung, Marco Pavone
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2207.12380
Pdf link: https://arxiv.org/pdf/2207.12380
Abstract In modern autonomy stacks, prediction modules are paramount to planning motions in the presence of other mobile agents. However, failures in prediction modules can mislead the downstream planner into making unsafe decisions. Indeed, the high uncertainty inherent to the task of trajectory forecasting ensures that such mispredictions occur frequently. Motivated by the need to improve safety of autonomous vehicles without compromising on their performance, we develop a probabilistic run-time monitor that detects when a "harmful" prediction failure occurs, i.e., a task-relevant failure detector. We achieve this by propagating trajectory prediction errors to the planning cost to reason about their impact on the AV. Furthermore, our detector comes equipped with performance measures on the false-positive and the false-negative rate and allows for data-free calibration. In our experiments we compared our detector with various others and found that our detector has the highest area under the receiver operating characteristic curve.
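The evaluation metric named in the abstract, the area under the ROC curve, can be sketched as follows. This is the generic pairwise definition, not the authors' evaluation code: the probability that a randomly chosen failure case receives a higher detector score than a randomly chosen non-failure case, with ties counted as one half. The function name `auroc` and the toy scores are assumptions for illustration.

```python
def auroc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Runs in O(len(positive_scores) * len(negative_scores)) time, which is
    fine for a small illustration but slow for large evaluation sets.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Toy detector scores: higher should mean "more likely a harmful failure".
perfect = auroc([0.9, 0.8], [0.1, 0.2])
partial = auroc([0.9, 0.3], [0.5, 0.1])
```

A value of 1.0 means the detector ranks every failure above every non-failure, while 0.5 corresponds to random ranking.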

ericbeyer / L-arxiv-interest-tracker

New submissions for Tue, 26 Jul 22 #580
