New submissions for Fri, 27 May 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

There is no result

Keyword: expected calibration error

There is no result

Keyword: overconfident

There is no result

Keyword: overconfidence

There is no result

Keyword: confidence

How explainable are adversarially-robust CNNs?

Authors: Mehdi Nourelahi, Lars Kotthoff, Peijie Chen, Anh Nguyen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2205.13042
Pdf link: https://arxiv.org/pdf/2205.13042
Abstract Three important criteria of existing convolutional neural networks (CNNs) are (1) test-set accuracy; (2) out-of-distribution accuracy; and (3) explainability. While these criteria have been studied independently, their relationship is unknown. For example, do CNNs that have a stronger out-of-distribution performance have also stronger explainability? Furthermore, most prior feature-importance studies only evaluate methods on 2-3 common vanilla ImageNet-trained CNNs, leaving it unknown how these methods generalize to CNNs of other architectures and training algorithms. Here, we perform the first, large-scale evaluation of the relations of the three criteria using 9 feature-importance methods and 12 ImageNet-trained CNNs that are of 3 training algorithms and 5 CNN architectures. We find several important insights and recommendations for ML practitioners. First, adversarially robust CNNs have a higher explainability score on gradient-based attribution methods (but not CAM-based or perturbation-based methods). Second, AdvProp models, despite being highly accurate more than both vanilla and robust models alone, are not superior in explainability. Third, among 9 feature attribution methods tested, GradCAM and RISE are consistently the best methods. Fourth, Insertion and Deletion are biased towards vanilla and robust models respectively, due to their strong correlation with the confidence score distributions of a CNN. Fifth, we did not find a single CNN to be the best in all three criteria, which interestingly suggests that CNNs are harder to interpret as they become more accurate.
Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments
Authors: Liyu Chen, Haipeng Luo
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2205.13044
Pdf link: https://arxiv.org/pdf/2205.13044
Abstract We initiate the study of dynamic regret minimization for goal-oriented reinforcement learning modeled by a non-stationary stochastic shortest path problem with changing cost and transition functions. We start by establishing a lower bound $\Omega((B_{\star} SAT_{\star}(\Delta_c + B_{\star}^2\Delta_P))^{1/3}K^{2/3})$, where $B_{\star}$ is the maximum expected cost of the optimal policy of any episode starting from any state, $T_{\star}$ is the maximum hitting time of the optimal policy of any episode starting from the initial state, $SA$ is the number of state-action pairs, $\Delta_c$ and $\Delta_P$ are the amount of changes of the cost and transition functions respectively, and $K$ is the number of episodes. The different roles of $\Delta_c$ and $\Delta_P$ in this lower bound inspire us to design algorithms that estimate costs and transitions separately. Specifically, assuming the knowledge of $\Delta_c$ and $\Delta_P$, we develop a simple but sub-optimal algorithm and another more involved minimax optimal algorithm (up to logarithmic terms). These algorithms combine the ideas of finite-horizon approximation [Chen et al., 2022a], special Bernstein-style bonuses of the MVP algorithm [Zhang et al., 2020], adaptive confidence widening [Wei and Luo, 2021], as well as some new techniques such as properly penalizing long-horizon policies. Finally, when $\Delta_c$ and $\Delta_P$ are unknown, we develop a variant of the MASTER algorithm [Wei and Luo, 2021] and integrate the aforementioned ideas into it to achieve $\widetilde{O}(\min\{B_{\star} S\sqrt{ALK}, (B_{\star}^2S^2AT_{\star}(\Delta_c+B_{\star}\Delta_P))^{1/3}K^{2/3}\})$ regret, where $L$ is the unknown number of changes of the environment.
Penalizing Proposals using Classifiers for Semi-Supervised Object Detection
Authors: Somnath Hazra, Pallab Dasgupta
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2205.13219
Pdf link: https://arxiv.org/pdf/2205.13219
Abstract Obtaining gold standard annotated data for object detection is often costly, involving human-level effort. Semi-supervised object detection algorithms solve the problem with a small amount of gold-standard labels and a large unlabelled dataset used to generate silver-standard labels. But training on the silver standard labels does not produce good results, because they are machine-generated annotations. In this work, we design a modified loss function to train on large silver standard annotated sets generated by a weak annotator. We include a confidence metric associated with the annotation as an additional term in the loss function, signifying the quality of the annotation. We test the effectiveness of our approach on various test sets and use numerous variations to compare the results with some of the current approaches to object detection. In comparison with the baseline where no confidence metric is used, we achieved a 4% gain in mAP with 25% labeled data and 10% gain in mAP with 50% labeled data by using the proposed confidence metric.
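Reader's note: the abstract does not say what form the confidence term takes. As a rough illustration of the general idea only, the sketch below uses a confidence-weighted cross-entropy, which is a stand-in technique and not the authors' loss. The function name, the weighting scheme, and the toy numbers are all assumptions.

```python
import numpy as np

def confidence_weighted_cross_entropy(probs, labels, confidences, eps=1e-12):
    """Cross-entropy in which each (silver) annotation is weighted by a
    confidence score in [0, 1]. Hypothetical stand-in, not the paper's loss."""
    probs = np.asarray(probs, dtype=float)              # (n, num_classes) predicted probabilities
    labels = np.asarray(labels, dtype=int)              # (n,) machine-generated class labels
    confidences = np.asarray(confidences, dtype=float)  # (n,) annotation quality scores
    # Negative log-likelihood of the annotated class for each example.
    per_example = -np.log(probs[np.arange(len(labels)), labels] + eps)
    # Low-confidence annotations contribute less to the total loss.
    return float(np.sum(confidences * per_example) / (np.sum(confidences) + eps))

# Toy usage: the second annotation disagrees with the model's prediction.
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 0]
loss_trusting_all = confidence_weighted_cross_entropy(probs, labels, [1.0, 1.0])
loss_discounting = confidence_weighted_cross_entropy(probs, labels, [1.0, 0.0])
```

With this weighting, setting the second annotation's confidence to zero removes its contribution entirely, so a doubtful silver label cannot dominate the gradient.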
Deep Active Learning with Noise Stability
Authors: Xingjian Li, Pengkun Yang, Tianyang Wang, Xueying Zhan, Min Xu, Dejing Dou, Chengzhong Xu
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2205.13340
Pdf link: https://arxiv.org/pdf/2205.13340
Abstract Uncertainty estimation for unlabeled data is crucial to active learning. With a deep neural network employed as the backbone model, the data selection process is highly challenging due to the potential over-confidence of the model inference. Existing methods resort to special learning fashions (e.g. adversarial) or auxiliary models to address this challenge. This tends to result in complex and inefficient pipelines, which would render the methods impractical. In this work, we propose a novel algorithm that leverages noise stability to estimate data uncertainty in a Single-Training Multi-Inference fashion. The key idea is to measure the output derivation from the original observation when the model parameters are randomly perturbed by noise. We provide theoretical analyses by leveraging the small Gaussian noise theory and demonstrate that our method favors a subset with large and diverse gradients. Despite its simplicity, our method outperforms the state-of-the-art active learning baselines in various tasks, including computer vision, natural language processing, and structural data analysis.
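Reader's note: the key idea stated in the abstract (perturb the trained parameters with random noise, then measure how far the outputs move) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear softmax model, the relative noise scaling, the L2 distance, and every name below are hypothetical choices made for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict(W, X):
    # Toy "network": a single linear layer followed by softmax (assumption).
    return softmax(X @ W)

def noise_stability_scores(W, X, n_draws=8, noise_scale=0.05, seed=0):
    """Average output deviation per sample when the weights are perturbed.
    The model is trained once and only inference is repeated."""
    rng = np.random.default_rng(seed)
    base = predict(W, X)
    scores = np.zeros(X.shape[0])
    for _ in range(n_draws):
        noise = rng.standard_normal(W.shape)
        # Rescale the noise so its norm is a fixed fraction of the weight norm (assumption).
        noise *= noise_scale * np.linalg.norm(W) / np.linalg.norm(noise)
        scores += np.linalg.norm(predict(W + noise, X) - base, axis=1)
    return scores / n_draws

def select_for_labeling(W, X, budget):
    # Treat samples whose outputs move the most as the most uncertain.
    scores = noise_stability_scores(W, X)
    return np.argsort(scores)[::-1][:budget]

# Toy usage with random weights and an unlabeled pool of 10 samples.
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))
X_pool = rng.standard_normal((10, 4))
picked = select_for_labeling(W, X_pool, budget=3)
```

Only forward passes are repeated here, which matches the "Single-Training Multi-Inference" description in spirit; how the paper scales the noise and aggregates deviations is not given in the abstract.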
Keyword: scaling

VizInspect Pro -- Automated Optical Inspection (AOI) solution
Authors: Faraz Waseem, Sanjit Menon, Haotian Xu, Debashis Mondal
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2205.13095
Pdf link: https://arxiv.org/pdf/2205.13095
Abstract Traditional vision based Automated Optical Inspection (referred to as AOI in paper) systems present multiple challenges in factory settings including inability to scale across multiple product lines, requirement of vendor programming expertise, little tolerance to variations and lack of cloud connectivity for aggregated insights. The lack of flexibility in these systems presents a unique opportunity for a deep learning based AOI system specifically for factory automation. The proposed solution, VizInspect pro is a generic computer vision based AOI solution built on top of Leo - An edge AI platform. Innovative features that overcome challenges of traditional vision systems include deep learning based image analysis which combines the power of self-learning with high speed and accuracy, an intuitive user interface to configure inspection profiles in minutes without ML or vision expertise and the ability to solve complex inspection challenges while being tolerant to deviations and unpredictable defects. This solution has been validated by multiple external enterprise customers with confirmed value propositions. In this paper we show you how this solution and platform solved problems around model development, deployment, scaling multiple inferences and visualizations.
Keyword: calibration

SigMaNet: One Laplacian to Rule Them All
Authors: Stefano Fiorini, Stefano Coniglio, Michele Ciavotta, Enza Messina
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2205.13459
Pdf link: https://arxiv.org/pdf/2205.13459
Abstract This paper introduces SigMaNet, a generalized Graph Convolutional Network (GCN) capable of handling both undirected and directed graphs with weights not restricted in sign and magnitude. The cornerstone of SigMaNet is the introduction of a generalized Laplacian matrix: the Sign-Magnetic Laplacian ($L^\sigma$). The adoption of such a matrix allows us to bridge a gap in the current literature by extending the theory of spectral GCNs to directed graphs with both positive and negative weights. $L^{\sigma}$ exhibits several desirable properties not enjoyed by the traditional Laplacian matrices on which several state-of-the-art architectures are based. In particular, $L^\sigma$ is completely parameter-free, which is not the case of Laplacian operators such as the Magnetic Laplacian $L^{(q)}$, where the calibration of the parameter q is an essential yet problematic component of the operator. $L^\sigma$ simplifies the approach, while also allowing for a natural interpretation of the signs of the edges in terms of their directions. The versatility of the proposed approach is amply demonstrated experimentally; the proposed network SigMaNet turns out to be competitive in all the tasks we considered, regardless of the graph structure.

ericbeyer / L-arxiv-interest-tracker

New submissions for Fri, 27 May 22 #522
