Abstract
Many high-performing works on out-of-distribution (OOD) detection use real or synthetically generated outlier data to regularise model confidence; however, they often require retraining of the base network or specialised model architectures. Our work demonstrates that Noisy Inliers Make Great Outliers (NIMGO) in the challenging field of OOD object detection. We hypothesise that synthetic outliers need only be minimally perturbed variants of the in-distribution (ID) data in order to train a discriminator to identify OOD samples -- without expensive retraining of the base network. To test our hypothesis, we generate a synthetic outlier set by applying an additive-noise perturbation to ID samples at the image or bounding-box level. An auxiliary feature monitoring multilayer perceptron (MLP) is then trained to detect OOD feature representations using the perturbed ID samples as a proxy. During testing, we demonstrate that the auxiliary MLP distinguishes ID samples from OOD samples at a state-of-the-art level, reducing the false positive rate by more than 20\% (absolute) over the previous state-of-the-art on the OpenImages dataset. Extensive additional ablations provide empirical evidence in support of our hypothesis.
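The perturbation step described in this abstract can be pictured with a minimal sketch. It assumes zero-mean Gaussian noise, float images in [0, 1], and pixel-coordinate boxes; the function names, the noise level `sigma`, and the random stand-in image are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def perturb_image(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel of a float image in [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def perturb_box(image, box, sigma, rng):
    """Add noise only inside box = (x1, y1, x2, y2); pixels outside are unchanged."""
    x1, y1, x2, y2 = box
    out = image.copy()
    out[y1:y2, x1:x2] = perturb_image(image[y1:y2, x1:x2], sigma, rng)
    return out

rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(64, 64, 3))  # stand-in for an ID image
box = (16, 8, 48, 40)                            # stand-in for an ID bounding box

outlier_image_level = perturb_image(image, sigma=0.1, rng=rng)
outlier_box_level = perturb_box(image, box, sigma=0.1, rng=rng)
```

In a pipeline of this kind, detector features of the clean samples would be labelled ID and features of the perturbed samples would serve as the proxy OOD class when training the auxiliary MLP; that training step is not shown here.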
Keyword: scaling
Weight-variant Latent Causal Models
Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi
Abstract
Causal representation learning exposes latent high-level causal variables behind low-level observations, which has enormous potential for a set of downstream tasks of interest. Despite this, identifying the true latent causal representation from observed data is a great challenge. In this work we focus on identifying latent causal variables. To this end, we analyse three intrinsic properties in latent space: transitivity, permutation and scaling. We show that transitivity severely hinders the identifiability of latent causal variables, while permutation and scaling guide the direction of identifying latent causal variables. To break the transitivity, we assume the underlying latent causal relations to be linear Gaussian models, in which the weights, mean and variance of Gaussian noise are modulated by an additionally observed variable. Under these assumptions we theoretically show that the latent causal variables are identifiable up to trivial permutation and scaling. Building on this theoretical result, we propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns latent causal variables, together with the mapping from the latent causal variables to the observed ones. Experimental results on synthetic and real data demonstrate the identifiability result and the ability of the proposed method to learn latent causal variables.
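To make the modelling assumption concrete, the sketch below samples a two-variable linear Gaussian chain z1 -> z2 whose weight, noise means and noise standard deviations all depend on an observed variable u. The specific modulation functions are hypothetical; the abstract only states that such modulation exists. The fitted slope is included to show that the causal weight changes visibly with u.

```python
import numpy as np

def sample_latents(u, n, rng):
    """Sample a chain z1 -> z2 whose parameters are modulated by u."""
    # Hypothetical modulation functions, chosen only for illustration.
    weight = 1.0 + 0.5 * u
    mean1, mean2 = 0.1 * u, -0.2 * u
    std1, std2 = 1.0 + 0.1 * u, 0.5 + 0.1 * u

    z1 = rng.normal(mean1, std1, size=n)
    z2 = weight * z1 + rng.normal(mean2, std2, size=n)
    return np.stack([z1, z2], axis=1), weight

def fitted_slope(z):
    """Least-squares slope of z2 regressed on z1."""
    cov = np.cov(z[:, 0], z[:, 1])
    return cov[0, 1] / cov[0, 0]

rng = np.random.default_rng(0)
slopes = {}
for u in (0.0, 1.0, 2.0):
    z, weight = sample_latents(u, n=20000, rng=rng)
    slopes[u] = (weight, fitted_slope(z))  # (true weight, estimated weight)
```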
Identifying Latent Causal Content for Multi-Source Domain Adaptation
Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Kun Zhang, Javen Qinfeng Shi
Abstract
Multi-source domain adaptation (MSDA) learns to predict the labels in target domain data, under the setting where all data from multiple source domains are labelled and the data from the target domain are unlabelled. To handle this problem, most methods focus on learning invariant representations across domains. However, their success relies heavily on the assumption that the label distribution remains unchanged across domains. To relax this assumption, we propose a new one, latent covariate shift, where the marginal distribution of the latent content variable changes across domains, and the conditional distribution of the label given the latent content remains invariant across domains. We introduce a latent style variable to complement the latent content variable, forming a latent causal graph as the data and label generating process. We show that although the latent style variable is unidentifiable due to the transitivity property in the latent space, the latent content variable can be identified up to simple scaling under some mild conditions. This motivates us to propose a novel method for MSDA, which learns the invariant label distribution conditional on the latent content variable, instead of learning invariant representations. Empirical evaluation on simulated and real data demonstrates the effectiveness of the proposed method, compared with many state-of-the-art methods based on invariant representation.
Efficient and Interpretable Neural Models for Entity Tracking
Abstract
What would it take for a natural language model to understand a novel, such as The Lord of the Rings? Among other things, such a model must be able to: (a) identify and record new characters (entities) and their attributes as they are introduced in the text, and (b) identify subsequent references to the characters previously introduced and update their attributes. This problem of entity tracking is essential for language understanding, and thus useful for a wide array of downstream applications in NLP such as question answering and summarization. In this thesis, we focus on two key problems in relation to facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has garnered interest recently, but computational constraints are a significant bottleneck in scaling up current methods. In this thesis, we argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for the integration of entity tracking into language models as it will allow for: (i) wider application given the current ubiquitous use of pretrained language models in NLP applications, and (ii) easier adoption since it is much easier to swap in a new pretrained language model than to integrate a separate standalone entity tracking model.
Physics-based adaptivity of a spectral method for the Vlasov-Poisson equations based on the asymmetrically-weighted Hermite expansion in velocity space
Authors: Cecilia Pagliantini, Gian Luca Delzanno, Stefano Markidis
Abstract
We propose a spectral method for the 1D-1V Vlasov-Poisson system where the discretization in velocity space is based on asymmetrically-weighted Hermite functions, dynamically adapted via a scaling $\alpha$ and shifting $u$ of the velocity variable. Specifically, at each time instant an adaptivity criterion selects new values of $\alpha$ and $u$ based on the numerical solution of the discrete Vlasov-Poisson system obtained at that time step. Once the new values of the Hermite parameters $\alpha$ and $u$ are fixed, the Hermite expansion is updated and the discrete system is further evolved for the next time step. The procedure is applied iteratively over the desired temporal interval. The key aspects of the adaptive algorithm are: the map between approximation spaces associated with different values of the Hermite parameters that preserves total mass, momentum and energy; and the adaptivity criterion to update $\alpha$ and $u$ based on physics considerations relating the Hermite parameters to the average velocity and temperature of each plasma species. For the discretization of the spatial coordinate, we rely on Fourier functions and use the implicit midpoint rule for time stepping. The resulting numerical method intrinsically possesses the property of fluid-kinetic coupling, where the low-order terms of the expansion are akin to the fluid moments of a macroscopic description of the plasma, while kinetic physics is retained by adding more spectral terms. Moreover, the scheme conserves total mass, momentum and energy in the discrete setting, for periodic boundary conditions. A set of numerical experiments confirms that the adaptive method outperforms the non-adaptive one in terms of accuracy and stability of the numerical solution.
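One plausible reading of the physics-based criterion is to set $u$ to the average velocity and $\alpha$ to a multiple of the thermal velocity, both computed from velocity moments of the distribution function. The sketch below does this on a uniform velocity grid. The convention $\alpha = \sqrt{2}\,v_{th}$ (with unit mass and $k_B = 1$), the function name, and the Maxwellian test case are assumptions for illustration; the paper's actual criterion and its conservative map between approximation spaces are not reproduced.

```python
import numpy as np

def hermite_parameters(v, f):
    """Estimate the shift u and scaling alpha from a sampled distribution f(v)."""
    dv = v[1] - v[0]                                        # uniform grid assumed
    density = np.sum(f) * dv
    u = np.sum(v * f) * dv / density                        # average velocity
    temperature = np.sum((v - u) ** 2 * f) * dv / density   # unit mass, k_B = 1
    alpha = np.sqrt(2.0 * temperature)                      # assumed: alpha = sqrt(2) * v_th
    return u, alpha

# Drifting Maxwellian used as a check: the moments should recover its parameters.
v = np.linspace(-12.0, 12.0, 2401)
drift, v_thermal = 1.5, 0.8
f = np.exp(-0.5 * ((v - drift) / v_thermal) ** 2)

u, alpha = hermite_parameters(v, f)
```

For this test case the recovered values match the drift and $\sqrt{2}$ times the thermal velocity of the Maxwellian, which is the situation in which a single Hermite mode would represent the distribution under the assumed convention.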
Improving Datacenter Utilization through Containerized Service-Based Architecture
Authors: Aos Mulahuwaish, Shane Korbel, Basheer Qolomany
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
The modern datacenter's computing capabilities have far outstripped the applications running within and have become a hidden cost of doing business due to how software is architected and deployed. Resources are over-allocated to monolithic applications that sit idle for large parts of the day. If applications were architected and deployed differently, shared services could be used for multiple applications as needed. When combined with powerful orchestration software, containerized microservices can both deploy and dynamically scale applications from very small to very large within moments, scaling the application not only across a single datacenter but across all datacenters where the application(s) are deployed. In this paper, we analyze data from application(s) deployed both as a single monolithic codebase and as a containerized application using microservice-based architecture to calculate the performance and computing resource waste of each way of architecting and deploying them. A modern approach is offered as a path from a monolithic codebase to a more efficient, reliable, scalable, and less costly deployment model.
Keyword: calibration
Differentiable Programming for Earth System Modeling
Authors: Maximilian Gelbrecht, Alistair White, Sebastian Bathiany, Niklas Boers
Subjects: Machine Learning (cs.LG); Atmospheric and Oceanic Physics (physics.ao-ph)
Abstract
Earth System Models (ESMs) are the primary tools for investigating future Earth system states at time scales from decades to centuries, especially in response to anthropogenic greenhouse gas release. State-of-the-art ESMs can reproduce the observational global mean temperature anomalies of the last 150 years. Nevertheless, ESMs need further improvements, most importantly regarding (i) the large spread in their estimates of climate sensitivity, i.e., the temperature response to increases in atmospheric greenhouse gases, (ii) the modeled spatial patterns of key variables such as temperature and precipitation, (iii) their representation of extreme weather events, and (iv) their representation of multistable Earth system components and their ability to predict associated abrupt transitions. Here, we argue that making ESMs automatically differentiable has huge potential to advance ESMs, especially with respect to these key shortcomings. First, automatic differentiability would allow objective calibration of ESMs, i.e., the selection of optimal values with respect to a cost function for a large number of free parameters, which are currently tuned mostly manually. Second, recent advances in Machine Learning (ML) and in the amount, accuracy, and resolution of observational data promise to be helpful with at least some of the above aspects because ML may be used to incorporate additional information from observations into ESMs. Automatic differentiability is an essential ingredient in the construction of such hybrid models, combining process-based ESMs with ML components. We document recent work showcasing the potential of automatic differentiation for a new generation of substantially improved, data-informed ESMs.
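The idea of objective calibration through automatic differentiation can be illustrated on a toy problem. The sketch below implements forward-mode automatic differentiation with dual numbers in plain Python and uses the resulting gradient to recover one free parameter of a zero-dimensional relaxation model from synthetic observations. The model, the parameter name `lam`, the forcing value, and the learning rate are all illustrative assumptions and have nothing to do with any real ESM.

```python
class Dual:
    """A value together with its derivative with respect to one parameter."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, other):
        o = self._lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        o = self._lift(other)
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)
    __rmul__ = __mul__


def simulate(lam, forcing=3.7, dt=0.1, steps=50):
    """Toy model: explicit Euler steps of dT/dt = forcing - lam * T."""
    temp, trajectory = 0.0, []
    for _ in range(steps):
        temp = temp + dt * (forcing - lam * temp)
        trajectory.append(temp)
    return trajectory


def cost(lam, observations):
    """Sum of squared differences between simulation and observations."""
    total = 0.0
    for sim, obs in zip(simulate(lam), observations):
        err = sim - obs
        total = total + err * err
    return total


observations = simulate(1.2)   # synthetic "observations" from a known parameter
lam = 0.5                      # deliberately wrong initial guess
for _ in range(1000):
    # Seeding der=1.0 makes cost(...).der the exact derivative d(cost)/d(lam).
    gradient = cost(Dual(lam, 1.0), observations).der
    lam -= 1e-4 * gradient     # plain gradient descent
```

The same model code runs with ordinary floats and with `Dual` inputs, which is the property that makes a model differentiable without hand-written derivative code. Reverse-mode differentiation would be the usual choice when many parameters are calibrated at once.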
Uncertainty-Induced Transferability Representation for Source-Free Unsupervised Domain Adaptation
Abstract
Source-free unsupervised domain adaptation (SFUDA) aims to learn a target domain model using unlabeled target data and the knowledge of a well-trained source domain model. Most previous SFUDA works focus on inferring semantics of target data based on the source knowledge. Without measuring the transferability of the source knowledge, these methods insufficiently exploit the source knowledge, and fail to identify the reliability of the inferred target semantics. However, existing transferability measurements require either source data or target labels, which are infeasible in SFUDA. To this end, firstly, we propose a novel Uncertainty-induced Transferability Representation (UTR), which leverages uncertainty as the tool to analyse the channel-wise transferability of the source encoder in the absence of the source data and target labels. The domain-level UTR unravels how transferable the encoder channels are to the target domain and the instance-level UTR characterizes the reliability of the inferred target semantics. Secondly, based on the UTR, we propose a novel Calibrated Adaption Framework (CAF) for SFUDA, including i) the source knowledge calibration module that guides the target model to learn the transferable source knowledge and discard the non-transferable one, and ii) the target semantics calibration module that calibrates the unreliable semantics. With the help of the calibrated source knowledge and the target semantics, the model adapts to the target domain safely and ultimately better. We verified the effectiveness of our method using experimental results and demonstrated that the proposed method achieves state-of-the-art performance on three SFUDA benchmarks. Code is available at https://github.com/SPIresearch/UTR.
Experimental Performance Evaluation of Cell-free Massive MIMO Systems Using COTS RRU with OTA Reciprocity Calibration and Phase Synchronization
Authors: Yang Cao, Pan Wang, Kang Zheng, Xianghu Liang, Dongjie Liu, Mengting Lou, Jing Jin, Qixing Wang, Dongming Wang, Yongming Huang, Xiaohu You, Jiangzhou Wang
Abstract
Downlink coherent multiuser transmission is an essential technique for cell-free massive multiple-input multiple-output (MIMO) systems, and the availability of channel state information (CSI) at the transmitter is a basic requirement. To avoid CSI feedback in a time-division duplex system, the uplink channel parameters should be calibrated to obtain the downlink CSI, because of the radio frequency circuit mismatch of the transceiver. In this paper, a design of a reference signal for over-the-air reciprocity calibration is proposed. The frequency domain generated reference signals can make full use of the flexible frame structure of the fifth generation (5G) new radio, which can be completely transparent to commercial off-the-shelf (COTS) remote radio units (RRUs) and commercial user equipment. To further obtain the calibration of multiple RRUs, an interleaved RRU grouping with a genetic algorithm is proposed, and an averaged Argos calibration algorithm is also presented. We develop a cell-free massive MIMO prototype system with COTS RRUs, demonstrate the statistical characteristics of the calibration error and the effectiveness of the calibration algorithm, and evaluate the impact of the calibration delay on the different cooperative transmission schemes.
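The principle behind reference-antenna reciprocity calibration of the Argos type can be shown in a noise-free sketch. Each antenna is modelled with an unknown complex transmit gain and receive gain; a pilot exchange with a reference antenna yields per-antenna coefficients that map uplink estimates to the downlink channel up to one scalar common to all antennas. The gain model, the variable names, and the single-user setup are illustrative assumptions, and the averaged variant, RRU grouping, and reference signal design from the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant = 8  # antenna 0 acts as the calibration reference

def random_complex(size):
    return rng.normal(size=size) + 1j * rng.normal(size=size)

tx = random_complex(n_ant)    # unknown transmit-chain gains
rx = random_complex(n_ant)    # unknown receive-chain gains
air = random_complex(n_ant)   # reciprocal over-the-air channel to the reference

# Over-the-air pilot exchange with the reference antenna (noise-free here).
ref_to_ant = rx * air * tx[0]   # reference transmits, antenna i receives
ant_to_ref = rx[0] * air * tx   # antenna i transmits, reference receives

# Calibration coefficient: (tx_i / rx_i) * (rx_0 / tx_0); the reference gets 1.
calib = np.ones(n_ant, dtype=complex)
calib[1:] = ant_to_ref[1:] / ref_to_ant[1:]

# A single-antenna user with its own hardware gains and a reciprocal channel.
prop = random_complex(n_ant)
user_tx, user_rx = random_complex(2)
uplink = rx * prop * user_tx     # what the array estimates from uplink pilots
downlink = user_rx * prop * tx   # what the array actually needs for precoding

estimate = calib * uplink        # calibrated downlink estimate
ratio = downlink / estimate      # the same complex scalar for every antenna
```

Because the remaining factor is identical across antennas, it does not affect the relative phases needed for coherent downlink precoding, which is why relative calibration against a reference is sufficient.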
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
There is no result
Keyword: expected calibration error
There is no result
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
Noisy Inliers Make Great Outliers: Out-of-Distribution Object Detection with Noisy Synthetic Outliers
Keyword: scaling
Weight-variant Latent Causal Models
Identifying Latent Causal Content for Multi-Source Domain Adaptation
Efficient and Interpretable Neural Models for Entity Tracking
Physics-based adaptivity of a spectral method for the Vlasov-Poisson equations based on the asymmetrically-weighted Hermite expansion in velocity space
Improving Datacenter Utilization through Containerized Service-Based Architecture
Keyword: calibration
Differentiable Programming for Earth System Modeling
Uncertainty-Induced Transferability Representation for Source-Free Unsupervised Domain Adaptation
Experimental Performance Evaluation of Cell-free Massive MIMO Systems Using COTS RRU with OTA Reciprocity Calibration and Phase Synchronization