Abstract
A well-known failure mode of neural networks is high-confidence erroneous predictions, especially for data that somehow differs from the training distribution. Such unsafe behaviour limits their applicability. To counter that, we show that models offering accurate confidence levels can be defined by adding constraints to their internal representations. That is, we encode class labels as fixed unique binary vectors, or class codes, and use those to enforce class-dependent activation patterns throughout the model. The resulting predictors are dubbed Total Activation Classifiers (TAC), and TAC is used as an additional component to a base classifier to indicate how reliable a prediction is. Given a data instance, TAC slices intermediate representations into disjoint sets and reduces such slices into scalars, yielding activation profiles. During training, activation profiles are pushed towards the code assigned to a given training instance. At testing time, one can predict the class corresponding to the code that best matches the activation profile of an example. Empirically, we observe that the resemblance between activation patterns and their corresponding codes results in an inexpensive unsupervised approach for inducing discriminative confidence scores. Namely, we show that TAC is at least as good as state-of-the-art confidence scores extracted from existing models, while strictly improving the model's value in the rejection setting. TAC was also observed to work well on multiple types of architectures and data modalities.
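The slicing-and-matching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `activation_profile` and `predict_with_confidence`, the use of a sum as the slice reduction, the Euclidean distance as the matching rule, and the toy codes are all assumptions made for illustration, and only a single layer's features are shown.

```python
import numpy as np

def activation_profile(features, code_len):
    """Slice a flat feature vector into `code_len` disjoint chunks and
    reduce each chunk to one scalar (a sum here, which is an assumption)."""
    chunks = np.array_split(features, code_len)
    return np.array([chunk.sum() for chunk in chunks])

def predict_with_confidence(profile, codes):
    """Pick the class whose code is closest to the profile.
    The negative Euclidean distance serves as a hypothetical confidence
    score: values closer to zero mean a better match."""
    dists = np.linalg.norm(codes - profile, axis=1)
    best = int(np.argmin(dists))
    return best, -float(dists[best])

# Toy binary class codes, one row per class (made-up values).
codes = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 1, 0, 0]], dtype=float)

# Toy intermediate representation from one layer (made-up values).
features = np.array([0.6, 0.4, 0.1, 0.0, 0.5, 0.5, 0.0, 0.1])

profile = activation_profile(features, code_len=4)
label, score = predict_with_confidence(profile, codes)
print(label, score)
```

A base classifier's prediction could then be rejected whenever `score` falls below a chosen cutoff; how that cutoff is set is outside what the abstract specifies.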
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms
Authors: Xutong Liu, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C.S. Lui, Wei Chen
Abstract
In this paper, we study combinatorial semi-bandits (CMAB) and focus on reducing the dependency on the batch-size $K$ in the regret bound, where $K$ is the total number of arms that can be pulled or triggered in each round. First, for the setting of CMAB with probabilistically triggered arms (CMAB-T), we discover a novel (directional) triggering probability and variance modulated (TPVM) condition that can replace the previously-used smoothness condition for various applications, such as cascading bandits, online network exploration and online influence maximization. Under this new condition, we propose a BCUCB-T algorithm with variance-aware confidence intervals and conduct a regret analysis which reduces the $O(K)$ factor to $O(\log K)$ or $O(\log^2 K)$ in the regret bound, significantly improving the regret bounds for the above applications. Second, for the setting of non-triggering CMAB with independent arms, we propose a SESCB algorithm which leverages the non-triggering version of the TPVM condition and completely removes the dependency on $K$ in the leading regret. As a valuable by-product, the regret analysis used in this paper can improve several existing results by a factor of $O(\log K)$. Finally, experimental evaluations show that our algorithms outperform benchmark algorithms in different applications.
Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research
Abstract
This survey presents a comprehensive review of current literature on Explainable Artificial Intelligence (XAI) methods for cyber security applications. Due to the rapid development of Internet-connected systems and Artificial Intelligence in recent years, Artificial Intelligence, including Machine Learning (ML) and Deep Learning (DL), has been widely utilized in cyber security fields such as intrusion detection, malware detection, and spam filtering. However, although Artificial Intelligence-based approaches for the detection of and defense against cyber attacks and threats are more advanced and efficient than conventional signature-based and rule-based cyber security strategies, most ML-based and DL-based techniques are deployed in a black-box manner, meaning that security experts and customers are unable to explain how such procedures reach particular conclusions. The lack of transparency and interpretability in existing Artificial Intelligence techniques would decrease human users' confidence in the models utilized for the defense against cyber attacks, especially as cyber attacks become increasingly diverse and complicated. Therefore, it is essential to apply XAI in the establishment of cyber security models to create more explainable models while maintaining high accuracy and allowing human users to comprehend, trust, and manage the next generation of cyber defense mechanisms. Although there are papers reviewing Artificial Intelligence applications in cyber security and a vast literature on applying XAI in many fields including healthcare, financial services, and criminal justice, surprisingly there are currently no survey articles that concentrate on XAI applications in cyber security.
Keyword: scaling
Simulating BFT Protocol Implementations at Scale
Authors: Christian Berger, Sadok Ben Toumia, Hans P. Reiser
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
The novel blockchain generation of Byzantine fault-tolerant (BFT) state machine replication (SMR) protocols focuses on scalability and performance to meet requirements of distributed ledger technology (DLT), e.g., decentralization and geographic dispersion. Validating scalability and performance of BFT protocol implementations requires careful evaluation. While experiments with real protocol deployments usually offer the best realism, they are costly and time-consuming. In this paper, we explore simulation of unmodified BFT protocol implementations as a method for cheap and rapid protocol evaluation: we can accurately forecast the performance of a BFT protocol while experimentally scaling its environment, i.e., by varying the number of nodes or geographic dispersion. Our approach is resource-friendly and preserves application-realism, since existing BFT frameworks can be simply plugged into the simulation engine without requiring code modifications or re-implementation.
Reducing Impacts of System Heterogeneity in Federated Learning using Weight Update Magnitudes
Abstract
The widespread adoption of handheld devices has fueled rapid growth in new applications. Several of these new applications employ machine learning models to train on user data that is typically private and sensitive. Federated Learning enables machine learning models to train locally on each handheld device while only synchronizing their neuron updates with a server. While this enables user privacy, technology scaling and software advancements have resulted in handheld devices with varying performance capabilities. As a result, the training time of federated learning tasks is dictated by a few low-performance straggler devices, which become a bottleneck for the entire training process. In this work, we aim to mitigate the performance bottleneck of federated learning by dynamically forming sub-models for stragglers based on their performance and accuracy feedback. To this end, we offer Invariant Dropout, a dynamic technique that forms a sub-model based on the neuron update threshold. Invariant Dropout uses neuron updates from the non-straggler clients to develop a tailored sub-model for each straggler during each training iteration. All corresponding weights whose magnitude is less than the threshold are dropped for the iteration. We evaluate Invariant Dropout using five real-world mobile clients. Our evaluations show that Invariant Dropout obtains a maximum accuracy gain of 1.4 percentage points over state-of-the-art Ordered Dropout while mitigating performance bottlenecks of stragglers.
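The threshold-based sub-model selection described in this abstract might be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the function name `build_submodel`, the use of the mean absolute update across non-straggler clients as the magnitude, the flat weight vector, and all numeric values are hypothetical.

```python
import numpy as np

def build_submodel(weights, non_straggler_updates, threshold):
    """Zero out weights whose update magnitude is below `threshold`.

    `non_straggler_updates` holds one row of per-weight updates per
    non-straggler client. Averaging absolute updates across clients is
    an assumed aggregation rule, not one stated in the abstract.
    """
    magnitudes = np.mean(np.abs(non_straggler_updates), axis=0)
    mask = magnitudes >= threshold          # True = keep for this iteration
    return weights * mask, mask

# Toy values for a four-weight layer and two non-straggler clients.
weights = np.array([0.5, -1.2, 0.3, 0.8])
updates = np.array([[0.10,  0.001, -0.20, 0.002],
                    [0.12, -0.003, -0.18, 0.001]])

sub_weights, mask = build_submodel(weights, updates, threshold=0.01)
print(mask.tolist())   # which weights the straggler would train this round
print(sub_weights)
```

In this sketch the mask is recomputed every iteration, matching the abstract's statement that weights are dropped "for the iteration" rather than permanently.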
Constructing relaxation systems for lattice Boltzmann methods
Authors: Stephan Simonis, Martin Frank, Mathias J. Krause
Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP)
Abstract
We present the first top-down ansatz for constructing lattice Boltzmann methods (LBM) in d dimensions. In particular, we construct a relaxation system (RS) for a given scalar, linear, d-dimensional advection-diffusion equation. Subsequently, the RS is linked to a d-dimensional discrete velocity Boltzmann model (DVBM) on the zeroth and first energy shell. Algebraic characterizations of the equilibrium, the moment space, and the collision operator are carried out. Further, a closed equation form of the RS expresses the added relaxation terms as prefactored higher order derivatives of the conserved quantity. Here, a generalized (2d+1)x(2d+1) RS is linked to a DdQ(2d+1) DVBM which, upon complete discretization, yields an LBM with second order accuracy in space and time. A rigorous convergence result for arbitrary scaling of the RS, the DVBM and conclusively also for the final LBM is proven. The top-down constructed LBM is numerically tested on multiple GPUs with smooth and non-smooth initial data in d=3 dimensions for several grid-normalized non-dimensional numbers.
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
There is no result
Keyword: expected calibration error
There is no result
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
Constraining Representations Yields Models That Know What They Don't Know
Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms
Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research
Keyword: scaling
Simulating BFT Protocol Implementations at Scale
Reducing Impacts of System Heterogeneity in Federated Learning using Weight Update Magnitudes
Constructing relaxation systems for lattice Boltzmann methods
Keyword: calibration
There is no result