Abstract
Neural networks achieve impressive performance on data drawn from the same distribution as the training set, but can produce overconfident, incorrect results on data they have never seen. It is therefore essential to detect whether inputs are out-of-distribution (OOD) in order to guarantee the safety of neural networks deployed in the real world. In this paper, we propose a simple and effective post-hoc technique, WeShort, to reduce the overconfidence of neural networks on OOD data. Our method is inspired by an observation about the internal residual structure: OOD and in-distribution (ID) data separate in the shortcut layer. Our method is compatible with different OOD detection scores and generalizes well to different network architectures. We demonstrate our method on various OOD datasets to show its competitive performance and provide reasonable hypotheses to explain why it works. On the ImageNet benchmark, WeShort achieves state-of-the-art performance among post-hoc methods in false positive rate (FPR95) and area under the receiver operating characteristic curve (AUROC).
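The two metrics named in this abstract can be illustrated with a short sketch. This is not code from the paper: the function names, the toy scores, and the convention that a higher score means "more in-distribution" are assumptions made for illustration, using the common reading of FPR95 as the false positive rate on OOD inputs at the threshold where 95% of ID inputs are accepted.

```python
import math
from typing import Sequence


def fpr_at_95_tpr(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Fraction of OOD inputs accepted as ID when 95% of ID inputs are accepted.

    Assumption: a higher score means "more in-distribution".
    """
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of top-ranked ID scores that covers at least 95% of them.
    k = max(1, math.ceil(0.95 * len(ranked)))
    threshold = ranked[k - 1]
    accepted_ood = sum(1 for s in ood_scores if s >= threshold)
    return accepted_ood / len(ood_scores)


def auroc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Probability that a random ID input outscores a random OOD input.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples, which is fine for a small illustration.
    """
    wins = 0.0
    for i in id_scores:
        for o in ood_scores:
            if i > o:
                wins += 1.0
            elif i == o:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Hypothetical detector scores, chosen only to exercise the functions.
toy_id = [0.9, 0.8, 0.7, 0.6]
toy_ood = [0.65, 0.3, 0.2, 0.1]
print(fpr_at_95_tpr(toy_id, toy_ood), auroc(toy_id, toy_ood))
```

A lower FPR95 and a higher AUROC both indicate better separation between ID and OOD scores.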
Keyword: overconfidence
WeShort: Out-of-distribution Detection With Weak Shortcut structure
Abstract
Neural networks achieve impressive performance on data drawn from the same distribution as the training set, but can produce overconfident, incorrect results on data they have never seen. It is therefore essential to detect whether inputs are out-of-distribution (OOD) in order to guarantee the safety of neural networks deployed in the real world. In this paper, we propose a simple and effective post-hoc technique, WeShort, to reduce the overconfidence of neural networks on OOD data. Our method is inspired by an observation about the internal residual structure: OOD and in-distribution (ID) data separate in the shortcut layer. Our method is compatible with different OOD detection scores and generalizes well to different network architectures. We demonstrate our method on various OOD datasets to show its competitive performance and provide reasonable hypotheses to explain why it works. On the ImageNet benchmark, WeShort achieves state-of-the-art performance among post-hoc methods in false positive rate (FPR95) and area under the receiver operating characteristic curve (AUROC).
Keyword: confidence
WeShort: Out-of-distribution Detection With Weak Shortcut structure
Abstract
Neural networks achieve impressive performance on data drawn from the same distribution as the training set, but can produce overconfident, incorrect results on data they have never seen. It is therefore essential to detect whether inputs are out-of-distribution (OOD) in order to guarantee the safety of neural networks deployed in the real world. In this paper, we propose a simple and effective post-hoc technique, WeShort, to reduce the overconfidence of neural networks on OOD data. Our method is inspired by an observation about the internal residual structure: OOD and in-distribution (ID) data separate in the shortcut layer. Our method is compatible with different OOD detection scores and generalizes well to different network architectures. We demonstrate our method on various OOD datasets to show its competitive performance and provide reasonable hypotheses to explain why it works. On the ImageNet benchmark, WeShort achieves state-of-the-art performance among post-hoc methods in false positive rate (FPR95) and area under the receiver operating characteristic curve (AUROC).
Discovering Domain Disentanglement for Generalized Multi-source Domain Adaptation
Authors: Zixin Wang, Yadan Luo, Peng-Fei Zhang, Sen Wang, Zi Huang
Abstract
A typical multi-source domain adaptation (MSDA) approach aims to transfer knowledge learned from a set of labeled source domains to an unlabeled target domain. Nevertheless, prior works strictly assume that each source domain shares the same set of classes with the target domain, which can hardly be guaranteed because the target label space is not observable. In this paper, we consider a more versatile setting of MSDA, namely Generalized Multi-source Domain Adaptation, wherein the source domains partially overlap and the target domain is allowed to contain novel categories that are not present in any source domain. This new setting is more elusive than any existing domain adaptation protocol due to the coexistence of domain and category shifts across the source and target domains. To address this issue, we propose a variational domain disentanglement (VDD) framework, which decomposes the domain representations and semantic features for each instance by encouraging dimension-wise independence. To identify the target samples of unknown classes, we leverage online pseudo labeling, which assigns pseudo-labels to unlabeled target data based on confidence scores. Quantitative and qualitative experiments conducted on two benchmark datasets demonstrate the validity of the proposed framework.
Keyword: scaling
Language Models (Mostly) Know What They Know
Authors: Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield Dodds, Nova DasSarma, Eli Tran-Johnson, Scott Johnston, Sheer El-Showk, Andy Jones, Nelson Elhage, Tristan Hume, Anna Chen, Yuntao Bai, Sam Bowman, Stanislav Fort, Deep Ganguli, Danny Hernandez, Josh Jacobson, Jackson Kernion, Shauna Kravec, Liane Lovitt, Kamal Ndousse, Catherine Olsson, Sam Ringer, Dario Amodei, Tom Brown, Jack Clark, Nicholas Joseph, Ben Mann, Sam McCandlish, Chris Olah, Jared Kaplan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when they are provided in the right format. Thus we can approach self-evaluation on open-ended sampling tasks by asking models to first propose answers, and then to evaluate the probability "P(True)" that their answers are correct. We find encouraging performance, calibration, and scaling for P(True) on a diverse array of tasks. Performance at self-evaluation further improves when we allow models to consider many of their own samples before predicting the validity of one specific possibility. Next, we investigate whether models can be trained to predict "P(IK)", the probability that "I know" the answer to a question, without reference to any particular proposed answer. Models perform well at predicting P(IK) and partially generalize across tasks, though they struggle with calibration of P(IK) on new tasks. The predicted P(IK) probabilities also increase appropriately in the presence of relevant source materials in the context, and in the presence of hints towards the solution of mathematical word problems. We hope these observations lay the groundwork for training more honest models, and for investigating how honesty generalizes to cases where models are trained on objectives other than the imitation of human writing.
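To make "well-calibrated" concrete, here is a minimal sketch of expected calibration error (ECE), one common way to summarize calibration. It is not taken from the paper, and nothing here claims the authors use this exact measure: the function name, the equal-width binning, and the default of 10 bins are assumptions for illustration. A model is well calibrated when, among answers given with confidence near p, about a fraction p are correct.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    Confidences are assumed to lie in [0, 1]; bins have equal width.
    """
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical example: 80% stated confidence but only half the answers right.
print(expected_calibration_error([0.8] * 4, [True, True, False, False]))
```

An ECE near zero means stated confidence tracks observed accuracy; the toy example above is overconfident by roughly 0.3.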
CompoundE: Knowledge Graph Embedding with Translation, Rotation and Scaling Compound Operations
Authors: Xiou Ge, Yun-Cheng Wang, Bin Wang, C.-C. Jay Kuo
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Abstract
Translation, rotation, and scaling are three commonly used geometric manipulation operations in image processing. Some of them have also been used successfully in developing effective knowledge graph embedding (KGE) models such as TransE and RotatE. Inspired by the synergy, we propose a new KGE model by leveraging all three operations in this work. Since translation, rotation, and scaling operations are cascaded to form a compound one, the new model is named CompoundE. By casting CompoundE in the framework of group theory, we show that quite a few scoring-function-based KGE models are special cases of CompoundE. CompoundE extends the simple distance-based relation to relation-dependent compound operations on head and/or tail entities. To demonstrate the effectiveness of CompoundE, we conduct experiments on three popular KG completion datasets. Experimental results show that CompoundE consistently achieves state-of-the-art performance.
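The idea of cascading the three operations can be sketched in two dimensions. This is an illustrative toy, not the CompoundE implementation: the function names, the order of operations (scale, then rotate, then translate), the choice to transform only the head entity, and the Euclidean distance score are all assumptions made here.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def compound_transform(vec: Vec2, angle: float, scale: Vec2, shift: Vec2) -> Vec2:
    """Apply scaling, then rotation, then translation to a 2D vector.

    The order of operations is an assumption made for this sketch.
    """
    x, y = vec[0] * scale[0], vec[1] * scale[1]   # scaling
    c, s = math.cos(angle), math.sin(angle)
    x, y = c * x - s * y, s * x + c * y            # rotation
    return (x + shift[0], y + shift[1])            # translation


def distance_score(head: Vec2, tail: Vec2, angle: float, scale: Vec2, shift: Vec2) -> float:
    """Distance between the transformed head and the tail (lower fits better)."""
    hx, hy = compound_transform(head, angle, scale, shift)
    return math.hypot(hx - tail[0], hy - tail[1])


# Hypothetical relation parameters: scale by 2, rotate 90 degrees, shift by (1, 1).
print(distance_score((1.0, 0.0), (1.0, 3.0), math.pi / 2, (2.0, 2.0), (1.0, 1.0)))
```

With unit scale and zero rotation the transform reduces to a pure translation, and with unit scale and zero shift it reduces to a pure rotation, which gives some intuition for how simpler translation-only or rotation-only scoring can arise as special cases of a compound operation.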
PAC Reinforcement Learning for Predictive State Representations
Authors: Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee
Abstract
In this paper, we study online Reinforcement Learning (RL) in partially observable dynamical systems. We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models such as Partially Observable Markov Decision Processes (POMDPs). A PSR represents the states using a set of predictions of future observations and is defined entirely using observable quantities. We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially with respect to all the relevant parameters of the systems. Our algorithm naturally works with function approximation to extend to systems with potentially large state and observation spaces. We show that, given a realizable model class, the sample complexity of learning the near-optimal policy scales only polynomially with respect to the statistical complexity of the model class, without any explicit polynomial dependence on the size of the state and observation spaces. Notably, ours is the first work to show polynomial sample complexities to compete with the globally optimal policy in PSRs. Finally, we demonstrate how our general theorem can be directly used to derive sample complexity bounds for special models including $m$-step weakly revealing and $m$-step decodable tabular POMDPs, POMDPs with low-rank latent transition, and POMDPs with linear emission and latent transition.
Keyword: calibration
Language Models (Mostly) Know What They Know
Authors: Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield Dodds, Nova DasSarma, Eli Tran-Johnson, Scott Johnston, Sheer El-Showk, Andy Jones, Nelson Elhage, Tristan Hume, Anna Chen, Yuntao Bai, Sam Bowman, Stanislav Fort, Deep Ganguli, Danny Hernandez, Josh Jacobson, Jackson Kernion, Shauna Kravec, Liane Lovitt, Kamal Ndousse, Catherine Olsson, Sam Ringer, Dario Amodei, Tom Brown, Jack Clark, Nicholas Joseph, Ben Mann, Sam McCandlish, Chris Olah, Jared Kaplan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
We study whether language models can evaluate the validity of their own claims and predict which questions they will be able to answer correctly. We first show that larger models are well-calibrated on diverse multiple choice and true/false questions when they are provided in the right format. Thus we can approach self-evaluation on open-ended sampling tasks by asking models to first propose answers, and then to evaluate the probability "P(True)" that their answers are correct. We find encouraging performance, calibration, and scaling for P(True) on a diverse array of tasks. Performance at self-evaluation further improves when we allow models to consider many of their own samples before predicting the validity of one specific possibility. Next, we investigate whether models can be trained to predict "P(IK)", the probability that "I know" the answer to a question, without reference to any particular proposed answer. Models perform well at predicting P(IK) and partially generalize across tasks, though they struggle with calibration of P(IK) on new tasks. The predicted P(IK) probabilities also increase appropriately in the presence of relevant source materials in the context, and in the presence of hints towards the solution of mathematical word problems. We hope these observations lay the groundwork for training more honest models, and for investigating how honesty generalizes to cases where models are trained on objectives other than the imitation of human writing.
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
There is no result
Keyword: expected calibration error
There is no result
Keyword: overconfident
WeShort: Out-of-distribution Detection With Weak Shortcut structure
Keyword: overconfidence
WeShort: Out-of-distribution Detection With Weak Shortcut structure
Keyword: confidence
WeShort: Out-of-distribution Detection With Weak Shortcut structure
Discovering Domain Disentanglement for Generalized Multi-source Domain Adaptation
Keyword: scaling
Language Models (Mostly) Know What They Know
CompoundE: Knowledge Graph Embedding with Translation, Rotation and Scaling Compound Operations
PAC Reinforcement Learning for Predictive State Representations
Keyword: calibration
Language Models (Mostly) Know What They Know