Abstract
Commonly used AI networks are very self-confident in their predictions, even when the evidence for a certain decision is dubious. The investigation of a deep learning model output is pivotal for understanding its decision processes and assessing its capabilities and limitations. By analyzing the distributions of raw network output vectors, it can be observed that each class has its own decision boundary and, thus, the same raw output value has different support for different classes. Inspired by this fact, we have developed a new method for out-of-distribution detection. The method offers an explanatory step beyond simple thresholding of the softmax output towards understanding and interpretation of the model learning process and its output. Instead of assigning the class label of the highest logit to each new sample presented to the network, it takes the distributions over all classes into consideration. A probability score interpreter (PSI) is created based on the joint logit values in relation to their respective correct vs wrong class distributions. The PSI suggests whether the sample is likely to belong to a specific class, whether the network is unsure, or whether the sample is likely an outlier or unknown type for the network. The simple PSI has the benefit of being applicable to already trained networks. The distributions for correct vs wrong class for each output node are established by simply running the training examples through the trained network. We demonstrate our OOD detection method on a challenging transmission electron microscopy virus image dataset. We simulate a real-world application in which images of virus types unknown to a trained virus classifier, yet acquired with the same procedures and instruments, constitute the OOD samples.
Keyword: expected calibration error
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Authors: Hamed Farkhari, Joseanne Viana, Pedro Sebastiao, Luis Miguel Campos, Luis Bernardo, Rui Dinis, Sarang Kahvazadeh
Abstract
Only increasing accuracy without considering uncertainty may negatively impact Deep Neural Network (DNN) decision-making and decrease its reliability. This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems that simultaneously increase the accuracy and reliability of DNN outputs applied in a 5G UAV security dataset. These techniques use DNN outputs as input parameters and process them in different ways. Two methods use a well-known Machine Learning (ML) algorithm as a complement, and the other three use only confidence values that the DNN estimates. We compare seven different metrics, such as the Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Mean Confidence (MC), Mean Accuracy (MA), Normalized Negative Log Likelihood (NLL), Brier Score Loss (BSL), and Reliability Score (RS) and the tradeoffs between them to evaluate the proposed hybrid algorithms. First, we show that the eXtreme Gradient Boosting (XGB) classifier might not be reliable for binary classification under the conditions this work presents. Second, we demonstrate that at least one of the potential methods can achieve better results than the classification in the DNN softmax layer. Finally, we show that the prospective methods may improve accuracy and reliability with better uncertainty calibration based on the assumption that the RS determines the difference between MC and MA metrics, and this difference should be zero to increase reliability. For example, Method 3 presents the best RS of 0.65 even when compared to the XGB classifier, which achieves RS of 7.22.
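As an illustration of two of the metrics named in this abstract, the following is a minimal sketch of the standard binned definitions of ECE and MCE. It is not the paper's code: the function name `calibration_errors`, the choice of ten equal-width bins, and the half-open binning convention are assumptions made here for illustration.

```python
from typing import Sequence, Tuple


def calibration_errors(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default; the abstract does not state a bin count
) -> Tuple[float, float]:
    """Return (ECE, MCE) using equal-width confidence bins on [0, 1]."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # sum of confidences per bin
    acc_sum = [0.0] * n_bins   # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); a confidence of 1.0 goes to the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        acc_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    n = len(confidences)
    ece = 0.0
    mce = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        gap = abs(acc_sum[b] / count[b] - conf_sum[b] / count[b])
        ece += (count[b] / n) * gap  # weighted average of per-bin gaps
        mce = max(mce, gap)          # worst per-bin gap
    return ece, mce
```

For a toy input such as `calibration_errors([0.9, 0.9, 0.6, 0.6], [True, False, True, True])`, both occupied bins have a gap of 0.4 between mean confidence and accuracy, so ECE and MCE are both 0.4.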
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Authors: Dongfang Li, Baotian Hu, Qingcai Chen
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Abstract
Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions and decide whether the model is trustworthy. Similarly, the explanations can tell us when the model may or may not know. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, and the tokens accordingly do not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in the in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
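The temperature scaling mentioned at the end of this abstract can be sketched generically as follows. This is not the CME method and not the authors' code: it only shows the usual idea of dividing logits by a scalar T chosen on held-out data. The helper names and the grid search over T (instead of a gradient-based optimizer) are assumptions made to keep the example dependency-free.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows: Sequence[Sequence[float]], labels: Sequence[int], temperature: float) -> float:
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y]) for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(
    logit_rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Pick the temperature with the lowest validation NLL from a grid (hypothetical helper)."""
    if candidates is None:
        candidates = [0.5 + 0.1 * i for i in range(46)]  # assumed grid: 0.5 to 5.0
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))


# Toy validation set: the model always gives class 0 a large margin but is right only 3 of 4 times.
val_logits = [[4.0, 0.0]] * 4
val_labels = [0, 0, 0, 1]
best_t = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits[0], best_t)
```

On this overconfident toy set the selected temperature is greater than 1, which softens the probabilities without changing the predicted class.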
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Authors: Hamed Farkhari, Joseanne Viana, Pedro Sebastiao, Luis Miguel Campos, Luis Bernardo, Rui Dinis, Sarang Kahvazadeh
Abstract
Only increasing accuracy without considering uncertainty may negatively impact Deep Neural Network (DNN) decision-making and decrease its reliability. This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems that simultaneously increase the accuracy and reliability of DNN outputs applied in a 5G UAV security dataset. These techniques use DNN outputs as input parameters and process them in different ways. Two methods use a well-known Machine Learning (ML) algorithm as a complement, and the other three use only confidence values that the DNN estimates. We compare seven different metrics, such as the Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Mean Confidence (MC), Mean Accuracy (MA), Normalized Negative Log Likelihood (NLL), Brier Score Loss (BSL), and Reliability Score (RS) and the tradeoffs between them to evaluate the proposed hybrid algorithms. First, we show that the eXtreme Gradient Boosting (XGB) classifier might not be reliable for binary classification under the conditions this work presents. Second, we demonstrate that at least one of the potential methods can achieve better results than the classification in the DNN softmax layer. Finally, we show that the prospective methods may improve accuracy and reliability with better uncertainty calibration based on the assumption that the RS determines the difference between MC and MA metrics, and this difference should be zero to increase reliability. For example, Method 3 presents the best RS of 0.65 even when compared to the XGB classifier, which achieves RS of 7.22.
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Authors: Dongfang Li, Baotian Hu, Qingcai Chen
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Abstract
Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions and decide whether the model is trustworthy. Similarly, the explanations can tell us when the model may or may not know. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, and the tokens accordingly do not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in the in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control
Authors: Kyle A. Kilian, Christopher J. Ventura, Mark M. Bailey
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Abstract
Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human assistance. Objectives can be imperfectly specified or executed in an unexpected or potentially harmful way. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior.
XAI-BayesHAR: A novel Framework for Human Activity Recognition with Integrated Uncertainty and Shapely Values
Abstract
Human activity recognition (HAR) using IMU sensors, namely accelerometer and gyroscope, has several applications in smart homes, healthcare and human-machine interface systems. In practice, the IMU-based HAR system is expected to encounter variations in measurement due to sensor degradation, alien environment or sensor noise and will be subjected to unknown activities. In view of practical deployment of the solution, analysis of statistical confidence over the activity class score is an important metric. In this paper, we therefore propose XAI-BayesHAR, an integrated Bayesian framework that improves the overall activity classification accuracy of IMU-based HAR solutions by recursively tracking the feature embedding vector and its associated uncertainty via a Kalman filter. Additionally, XAI-BayesHAR acts as an out of data distribution (OOD) detector using the predictive uncertainty, which helps to evaluate and detect alien input data distributions. Furthermore, the Shapley value-based performance of the proposed framework is also evaluated to understand the importance of the feature embedding vector and is accordingly used for model compression.
Camera Alignment and Weighted Contrastive Learning for Domain Adaptation in Video Person ReID
Authors: Djebril Mekhazni, Maximilien Dufau, Christian Desrosiers, Marco Pedersoli, Eric Granger
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Systems for person re-identification (ReID) can achieve a high accuracy when trained on large fully-labeled image datasets. However, the domain shift typically associated with diverse operational capture conditions (e.g., camera viewpoints and lighting) may translate to a significant decline in performance. This paper focuses on unsupervised domain adaptation (UDA) for video-based ReID - a relevant scenario that is less explored in the literature. In this scenario, the ReID model must adapt to a complex target domain defined by a network of diverse video cameras based on tracklet information. State-of-the-art methods cluster unlabeled target data, yet domain shifts across target cameras (sub-domains) can lead to poor initialization of clustering methods that propagates noise across epochs, thus preventing the ReID model from accurately associating samples of the same identity. In this paper, a UDA method is introduced for video person ReID that leverages knowledge on video tracklets, and on the distribution of frames captured over target cameras to improve the performance of CNN backbones trained using pseudo-labels. Our method relies on an adversarial approach, where a camera-discriminator network is introduced to extract discriminant camera-independent representations, facilitating the subsequent clustering. In addition, a weighted contrastive loss is proposed to leverage the confidence of clusters, and mitigate the risk of incorrect identity associations. Experimental results obtained on three challenging video-based person ReID datasets - PRID2011, iLIDS-VID, and MARS - indicate that our proposed method can outperform related state-of-the-art methods. Our code is available at: https://github.com/dmekhazni/CAWCL-ReID
Keyword: scaling
Intriguing Properties of Compression on Multilingual Models
Authors: Kelechi Ogueji, Orevaoghene Ahia, Gbemileke Onilude, Sebastian Gehrmann, Sara Hooker, Julia Kreutzer
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Abstract
Multilingual models are often particularly dependent on scaling to generalize to a growing number of languages. Compression techniques are widely relied upon to reconcile the growth in model size with real world resource constraints, but compression can have a disparate effect on model performance for low-resource languages. It is thus crucial to understand the trade-offs between scale, multilingualism, and compression. In this work, we propose an experimental framework to characterize the impact of sparsifying multilingual pre-trained language models during fine-tuning. Applying this framework to mBERT named entity recognition models across 40 languages, we find that compression confers several intriguing and previously unknown generalization properties. In contrast to prior findings, we find that compression may improve model robustness over dense models. We additionally observe that under certain sparsification regimes compression may aid, rather than disproportionately impact, the performance of low-resource languages.
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Authors: Dongfang Li, Baotian Hu, Qingcai Chen
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Abstract
Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions and decide whether the model is trustworthy. Similarly, the explanations can tell us when the model may or may not know. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, and the tokens accordingly do not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in the in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
Fast Key Points Detection and Matching for Tree-Structured Images
Authors: Hao Wang, Xiwen Chen, Abolfazl Razi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
This paper offers a new authentication algorithm based on image matching of nano-resolution visual identifiers with tree-shaped patterns. The algorithm includes image-to-tree conversion by greedy extraction of the fractal pattern skeleton along with a custom-built graph matching algorithm that is robust against imaging artifacts such as scaling, rotation, scratches, and illumination change. The proposed algorithm is applicable to a variety of tree-structured image matching, but our focus is on dendrites, recently-developed visual identifiers. Dendrites are entropy rich and unclonable with existing 2D and 3D printers due to their natural randomness, nano-resolution granularity, and 3D facets, making them an appropriate choice for security applications such as supply chain trace and tracking. The proposed algorithm improves upon graph matching with standard image descriptors. For instance, image inconsistency due to the camera sensor noise may cause unexpected feature extraction leading to inaccurate tree conversion and authentication failure. Also, previous tree extraction algorithms are prohibitively slow, hindering their scalability to large systems. In this paper, we fix the current issues of [1] and accelerate the key points extraction by up to 10 times by implementing a new skeleton extraction method, a new key points searching algorithm, as well as an optimized key point matching algorithm. Using the minimum enclosing circle and center points makes the algorithm robust to the choice of pattern shape. In contrast to [1], our algorithm handles general graphs with loop connections and is therefore applicable to a wider range of applications such as transportation map analysis, fingerprints, and retina vessel imaging.
Moving Frame Net: SE(3)-Equivariant Network for Volumes
Authors: Mateus Sangalli (CMM), Samy Blusseau (CMM), Santiago Velasco-Forero (CMM), Jesus Angulo (CMM)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Abstract
Equivariance of neural networks to transformations helps to improve their performance and reduce generalization error in computer vision tasks, as they apply to datasets presenting symmetries (e.g. scalings, rotations, translations). The method of moving frames is classical for deriving operators invariant to the action of a Lie group in a manifold. Recently, a rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. In this paper we significantly improve that approach by reducing the computation of moving frames to only one, at the input stage, instead of repeated computations at each layer. The equivariance of the resulting architecture is proved theoretically and we build a rotation and translation equivariant neural network to process volumes, i.e. signals on the 3D space. Our trained model outperforms the benchmarks in the medical volume classification of most of the tested datasets from MedMNIST3D.
Keyword: calibration
A degree 4 sum-of-squares lower bound for the clique number of the Paley graph
Authors: Dmitriy Kunisky, Xifan Yu
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Number Theory (math.NT); Optimization and Control (math.OC)
Abstract
We prove that the degree 4 sum-of-squares (SOS) relaxation of the clique number of the Paley graph on a prime number $p$ of vertices has value at least $\Omega(p^{1/3})$. This is in contrast to the widely believed conjecture that the actual clique number of the Paley graph is $O(\mathrm{polylog}(p))$. Our result may be viewed as a derandomization of that of Deshpande and Montanari (2015), who showed the same lower bound (up to $\mathrm{polylog}(p)$ terms) with high probability for the Erdős-Rényi random graph on $p$ vertices, whose clique number is with high probability $O(\log(p))$. We also show that our lower bound is optimal for the Feige-Krauthgamer construction of pseudomoments, derandomizing an argument of Kelner. Finally, we present numerical experiments indicating that the value of the degree 4 SOS relaxation of the Paley graph may scale as $O(p^{1/2 - \epsilon})$ for some $\epsilon > 0$, and give a matrix norm calculation indicating that the pseudocalibration proof strategy for SOS lower bounds for random graphs will not immediately transfer to the Paley graph. Taken together, our results suggest that degree 4 SOS may break the "$\sqrt{p}$ barrier" for upper bounds on the clique number of Paley graphs, but prove that it can at best improve the exponent from $1/2$ to $1/3$.
ESKNet-An enhanced adaptive selection kernel convolution for breast tumors segmentation
Abstract
Breast cancer is one of the common cancers that endanger the health of women globally. Accurate target lesion segmentation is essential for early clinical intervention and postoperative follow-up. Recently, many convolutional neural networks (CNNs) have been proposed to segment breast tumors from ultrasound images. However, the complex ultrasound pattern and the variable tumor shape and size bring challenges to the accurate segmentation of the breast lesion. Motivated by the selective kernel convolution, we introduce an enhanced selective kernel convolution for breast tumor segmentation, which integrates multiple feature map region representations and adaptively recalibrates the weights of these feature map regions from the channel and spatial dimensions. This region recalibration strategy enables the network to focus more on high-contributing region features and mitigate the perturbation of less useful regions. Finally, the enhanced selective kernel convolution is integrated into U-net with deep supervision constraints to adaptively capture the robust representation of breast tumors. Extensive experiments with twelve state-of-the-art deep learning segmentation methods on three public breast ultrasound datasets demonstrate that our method has a more competitive segmentation performance in breast ultrasound images.
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Authors: Hamed Farkhari, Joseanne Viana, Pedro Sebastiao, Luis Miguel Campos, Luis Bernardo, Rui Dinis, Sarang Kahvazadeh
Abstract
Only increasing accuracy without considering uncertainty may negatively impact Deep Neural Network (DNN) decision-making and decrease its reliability. This paper proposes five combined preprocessing and post-processing methods for time-series binary classification problems that simultaneously increase the accuracy and reliability of DNN outputs applied in a 5G UAV security dataset. These techniques use DNN outputs as input parameters and process them in different ways. Two methods use a well-known Machine Learning (ML) algorithm as a complement, and the other three use only confidence values that the DNN estimates. We compare seven different metrics, such as the Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Mean Confidence (MC), Mean Accuracy (MA), Normalized Negative Log Likelihood (NLL), Brier Score Loss (BSL), and Reliability Score (RS) and the tradeoffs between them to evaluate the proposed hybrid algorithms. First, we show that the eXtreme Gradient Boosting (XGB) classifier might not be reliable for binary classification under the conditions this work presents. Second, we demonstrate that at least one of the potential methods can achieve better results than the classification in the DNN softmax layer. Finally, we show that the prospective methods may improve accuracy and reliability with better uncertainty calibration based on the assumption that the RS determines the difference between MC and MA metrics, and this difference should be zero to increase reliability. For example, Method 3 presents the best RS of 0.65 even when compared to the XGB classifier, which achieves RS of 7.22.
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Authors: Dongfang Li, Baotian Hu, Qingcai Chen
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Abstract
Calibration strengthens the trustworthiness of black-box models by producing more accurate confidence estimates on given examples. However, little is known about whether model explanations can help confidence calibration. Intuitively, humans look at important feature attributions and decide whether the model is trustworthy. Similarly, the explanations can tell us when the model may or may not know. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. The idea is that when the model is not highly confident, it is difficult to identify strong indications of any class, and the tokens accordingly do not have high attribution scores for any class, and vice versa. We conduct extensive experiments on six datasets with two popular pre-trained language models in the in-domain and out-of-domain settings. The results show that CME improves calibration performance in all settings. The expected calibration errors are further reduced when combined with temperature scaling. Our findings highlight that model explanations can help calibrate posterior estimates.
Learning body models: from humans to humanoids
Authors: Matej Hoffmann
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
Abstract
Humans and animals excel in combining information from multiple sensory modalities, controlling their complex bodies, adapting to growth, failures, or using tools. These capabilities are also highly desirable in robots. They are displayed by machines to some extent. Yet, the artificial creatures are lagging behind. The key foundation is an internal representation of the body that the agent - human, animal, or robot - has developed. The mechanisms of operation of body models in the brain are largely unknown and even less is known about how they are constructed from experience after birth. In collaboration with developmental psychologists, we conducted targeted experiments to understand how infants acquire first "sensorimotor body knowledge". These experiments inform our work in which we construct embodied computational models on humanoid robots that address the mechanisms behind learning, adaptation, and operation of multimodal body representations. At the same time, we assess which of the features of the "body in the brain" should be transferred to robots to give rise to more adaptive and resilient, self-calibrating machines. We extend traditional robot kinematic calibration focusing on self-contained approaches where no external metrology is needed: self-contact and self-observation. A problem formulation that allows several ways of closing the kinematic chain to be combined simultaneously is presented, along with a calibration toolbox and experimental validation on several robot platforms. Finally, next to models of the body itself, we study peripersonal space - the space immediately surrounding the body. Again, embodied computational models are developed and subsequently, the possibility of turning these biologically inspired representations into safe human-robot collaboration is studied.
A Driving Risk Surrogate and Its Application in Car-Following Scenario at Expressway
Authors: Renfei Wu, Linheng Li, Haotian Shi, Yikang Rui, Dong Ngoduy, Bin Ran
Abstract
Traffic safety is important in reducing death and building a harmonious society. In addition to studies of accident incidences, the perception of driving risk is significant in guiding the implementation of appropriate driving countermeasures. Risk assessment can be conducted in real-time for traffic safety due to the rapid development of communication technology and computing capabilities. This paper addresses the problems of difficult calibration and inconsistent thresholds in the existing risk assessment methods. It proposes a risk assessment model based on the potential field to quantify the driving risk of vehicles. Firstly, virtual energy is proposed as an attribute considering vehicle sizes and velocity. Secondly, the driving risk surrogate (DRS) is proposed based on potential field theory to describe the risk degree of vehicles. Risk factors are quantified by establishing submodels, including an interactive vehicle risk surrogate, a restrictions risk surrogate, and a speed risk surrogate. To unify the risk threshold, acceleration for implementation guidance is derived from the risk field strength. Finally, a naturalistic driving dataset in Nanjing, China, is selected, and 3063 pairs of following naturalistic trajectories are screened out. Based on that, the proposed model and the other models used for comparison are calibrated through the improved particle optimization algorithm. Simulations prove that the proposed model performs better than other algorithms in risk perception and response, car-following trajectory, and velocity estimation. In addition, the proposed model exhibits better car-following ability than existing car-following models.
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
Interpreting deep learning output for out-of-distribution detection
Keyword: expected calibration error
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control
XAI-BayesHAR: A novel Framework for Human Activity Recognition with Integrated Uncertainty and Shapely Values
Camera Alignment and Weighted Contrastive Learning for Domain Adaptation in Video Person ReID
Keyword: scaling
Intriguing Properties of Compression on Multilingual Models
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Fast Key Points Detection and Matching for Tree-Structured Images
Moving Frame Net: SE(3)-Equivariant Network for Volumes
Keyword: calibration
A degree 4 sum-of-squares lower bound for the clique number of the Paley graph
ESKNet-An enhanced adaptive selection kernel convolution for breast tumors segmentation
Accurate and Reliable Methods for 5G UAV Jamming Identification With Calibrated Uncertainty
Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
Learning body models: from humans to humanoids
A Driving Risk Surrogate and Its Application in Car-Following Scenario at Expressway