New submissions for Tue, 5 Jul 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

There is no result

Keyword: expected calibration error

There is no result

Keyword: overconfident

There is no result

Keyword: overconfidence

There is no result

Keyword: confidence

ImLoveNet: Misaligned Image-supported Registration Network for Low-overlap Point Cloud Pairs

Authors: Honghua Chen, Zeyong Wei, Yabin Xu, Mingqiang Wei, Jun Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.00826
Pdf link: https://arxiv.org/pdf/2207.00826
Abstract Low-overlap regions between paired point clouds make the captured features very low-confidence, leading cutting edge models to point cloud registration with poor quality. Beyond the traditional wisdom, we raise an intriguing question: Is it possible to exploit an intermediate yet misaligned image between two low-overlap point clouds to enhance the performance of cutting-edge registration models? To answer it, we propose a misaligned image supported registration network for low-overlap point cloud pairs, dubbed ImLoveNet. ImLoveNet first learns triple deep features across different modalities and then exports these features to a two-stage classifier, for progressively obtaining the high-confidence overlap region between the two point clouds. Therefore, soft correspondences are well established on the predicted overlap region, resulting in accurate rigid transformations for registration. ImLoveNet is simple to implement yet effective, since 1) the misaligned image provides clearer overlap information for the two low-overlap point clouds to better locate overlap parts; 2) it contains certain geometry knowledge to extract better deep features; and 3) it does not require the extrinsic parameters of the imaging device with respect to the reference frame of the 3D point cloud. Extensive qualitative and quantitative evaluations on different kinds of benchmarks demonstrate the effectiveness and superiority of our ImLoveNet over state-of-the-art approaches.
Firenze: Model Evaluation Using Weak Signals
Authors: Bhavna Soman, Ali Torkamani, Michael J. Morais, Jeffrey Bickford, Baris Coskun
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.00827
Pdf link: https://arxiv.org/pdf/2207.00827
Abstract Data labels in the security field are frequently noisy, limited, or biased towards a subset of the population. As a result, commonplace evaluation methods such as accuracy, precision and recall metrics, or analysis of performance curves computed from labeled datasets do not provide sufficient confidence in the real-world performance of a machine learning (ML) model. This has slowed the adoption of machine learning in the field. In the industry today, we rely on domain expertise and lengthy manual evaluation to build this confidence before shipping a new model for security applications. In this paper, we introduce Firenze, a novel framework for comparative evaluation of ML models' performance using domain expertise, encoded into scalable functions called markers. We show that markers computed and combined over select subsets of samples called regions of interest can provide a robust estimate of their real-world performances. Critically, we use statistical hypothesis testing to ensure that observed differences-and therefore conclusions emerging from our framework-are more prominent than that observable from the noise alone. Using simulations and two real-world datasets for malware and domain-name-service reputation detection, we illustrate our approach's effectiveness, limitations, and insights. Taken together, we propose Firenze as a resource for fast, interpretable, and collaborative model development and evaluation by mixed teams of researchers, domain experts, and business owners.
Learning Switching Criteria for Sim2Real Transfer of Robotic Fabric Manipulation Policies
Authors: Satvik Sharma, Ellen Novoseller, Vainavi Viswanath, Zaynah Javed, Rishi Parikh, Ryan Hoque, Ashwin Balakrishna, Daniel S. Brown, Ken Goldberg
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2207.00911
Pdf link: https://arxiv.org/pdf/2207.00911
Abstract Simulation-to-reality transfer has emerged as a popular and highly successful method to train robotic control policies for a wide variety of tasks. However, it is often challenging to determine when policies trained in simulation are ready to be transferred to the physical world. Deploying policies that have been trained with very little simulation data can result in unreliable and dangerous behaviors on physical hardware. On the other hand, excessive training in simulation can cause policies to overfit to the visual appearance and dynamics of the simulator. In this work, we study strategies to automatically determine when policies trained in simulation can be reliably transferred to a physical robot. We specifically study these ideas in the context of robotic fabric manipulation, in which successful sim2real transfer is especially challenging due to the difficulties of precisely modeling the dynamics and visual appearance of fabric. Results in a fabric smoothing task suggest that our switching criteria correlate well with performance in real. In particular, our confidence-based switching criteria achieve average final fabric coverage of 87.2-93.7% within 55-60% of the total training budget. See https://tinyurl.com/lsc-case for code and supplemental materials.
Pregrasp Object Material Classification by a Novel Gripper Design with Integrated Spectroscopy
Authors: Nathaniel Hanson, Tarik Kelestemur, Deniz Erdogmus, Taskin Padir
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2207.00942
Pdf link: https://arxiv.org/pdf/2207.00942
Abstract Robots benefit from being able to classify objects they interact with or manipulate based on their material properties. This capability ensures fine manipulation of complex objects through proper grasp pose and force selection. Prior work has focused on haptic or visual processing to determine material type at grasp time. In this work, we introduce a novel parallel robot gripper design and a method for collecting spectral readings and visual images from within the gripper finger. We train a nonlinear Support Vector Machine (SVM) that can classify the material of the object about to be grasped through recursive estimation, with increasing confidence as the distance from the finger tips to the object decreases. In order to validate the hardware design and classification method, we collect samples from 16 real and fake fruit varieties (composed of polystyrene/plastic) resulting in a dataset containing spectral curves, scene images, and high-resolution texture images as the objects are grasped, lifted, and released. Our modeling method demonstrates an accuracy of 96.4% in classifying objects in a 32 class decision problem. This represents a performance improvement by 29.4% over the state of the art computer vision algorithms at distinguishing between visually similar materials. In contrast to prior work, our recursive estimation model accounts for increasing spectral signal strength and allows for decisions to be made as the gripper approaches an object. We conclude that spectroscopy is a promising sensing modality for enabling robots to not only classify grasped objects but also understand their underlying material composition.
Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds
Authors: Wu Zheng, Li Jiang, Fanbin Lu, Yangyang Ye, Chi-Wing Fu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.01030
Pdf link: https://arxiv.org/pdf/2207.01030
Abstract To boost a detector for single-frame 3D object detection, we present a new approach to train it to simulate features and responses following a detector trained on multi-frame point clouds. Our approach needs multi-frame point clouds only when training the single-frame detector, and once trained, it can detect objects with only single-frame point clouds as inputs during the inference. We design a novel Simulated Multi-Frame Single-Stage object Detector (SMF-SSD) framework to realize the approach: multi-view dense object fusion to densify ground-truth objects to generate a multi-frame point cloud; self-attention voxel distillation to facilitate one-to-many knowledge transfer from multi- to single-frame voxels; multi-scale BEV feature distillation to transfer knowledge in low-level spatial and high-level semantic BEV features; and adaptive response distillation to activate single-frame responses of high confidence and accurate localization. Experimental results on the Waymo test set show that our SMF-SSD consistently outperforms all state-of-the-art single-frame 3D object detectors for all object classes of difficulty levels 1 and 2 in terms of both mAP and mAPH.
RAF: Recursive Adversarial Attacks on Face Recognition Using Extremely Limited Queries
Authors: Keshav Kasichainula, Hadi Mansourifar, Weidong Shi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.01149
Pdf link: https://arxiv.org/pdf/2207.01149
Abstract Recent successful adversarial attacks on face recognition show that, despite the remarkable progress of face recognition models, they are still far behind the human intelligence for perception and recognition. It reveals the vulnerability of deep convolutional neural networks (CNNs) as state-of-the-art building block for face recognition models against adversarial examples, which can cause certain consequences for secure systems. Gradient-based adversarial attacks are widely studied before and proved to be successful against face recognition models. However, finding the optimized perturbation per each face needs to submitting the significant number of queries to the target model. In this paper, we propose recursive adversarial attack on face recognition using automatic face warping which needs extremely limited number of queries to fool the target model. Instead of a random face warping procedure, the warping functions are applied on specific detected regions of face like eyebrows, nose, lips, etc. We evaluate the robustness of proposed method in the decision-based black-box attack setting, where the attackers have no access to the model parameters and gradients, but hard-label predictions and confidence scores are provided by the target model.
Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios
Authors: Xueying Zhan, Zeyu Dai, Qingzhong Wang, Qing Li, Haoyi Xiong, Dejing Dou, Antoni B. Chan
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.01190
Pdf link: https://arxiv.org/pdf/2207.01190
Abstract Pool-based Active Learning (AL) has achieved great success in minimizing labeling cost by sequentially selecting informative unlabeled samples from a large unlabeled data pool and querying their labels from oracle/annotators. However, existing AL sampling strategies might not work well in out-of-distribution (OOD) data scenarios, where the unlabeled data pool contains some data samples that do not belong to the classes of the target task. Achieving good AL performance under OOD data scenarios is a challenging task due to the natural conflict between AL sampling strategies and OOD sample detection. AL selects data that are hard to be classified by the current basic classifier (e.g., samples whose predicted class probabilities have high entropy), while OOD samples tend to have more uniform predicted class probabilities (i.e., high entropy) than in-distribution (ID) data. In this paper, we propose a sampling scheme, Monte-Carlo Pareto Optimization for Active Learning (POAL), which selects optimal subsets of unlabeled samples with fixed batch size from the unlabeled data pool. We cast the AL sampling task as a multi-objective optimization problem, and thus we utilize Pareto optimization based on two conflicting objectives: (1) the normal AL data sampling scheme (e.g., maximum entropy), and (2) the confidence of not being an OOD sample. Experimental results show its effectiveness on both classical Machine Learning (ML) and Deep Learning (DL) tasks.
S$^{5}$Mars: Self-Supervised and Semi-Supervised Learning for Mars Segmentation
Authors: Jiahang Zhang, Lilang Lin, Zejia Fan, Wenjing Wang, Jiaying Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.01200
Pdf link: https://arxiv.org/pdf/2207.01200
Abstract Deep learning has become a powerful tool for Mars exploration. Mars terrain segmentation is an important Martian vision task, which is the base of rover autonomous planning and safe driving. However, existing deep-learning-based terrain segmentation methods face two problems: one is the lack of sufficient detailed and high-confidence annotations, and the other is the over-reliance of models on annotated training data. In this paper, we address these two problems from the perspective of joint data and method design. We first present a new Mars terrain segmentation dataset which contains 6K high-resolution images and is sparsely annotated based on confidence, ensuring the high quality of labels. Then to learn from this sparse data, we propose a representation-learning-based framework for Mars terrain segmentation, including a self-supervised learning stage (for pre-training) and a semi-supervised learning stage (for fine-tuning). Specifically, for self-supervised learning, we design a multi-task mechanism based on the masked image modeling (MIM) concept to emphasize the texture information of images. For semi-supervised learning, since our dataset is sparsely annotated, we encourage the model to excavate the information of unlabeled area in each image by generating and utilizing pseudo-labels online. We name our dataset and method Self-Supervised and Semi-Supervised Segmentation for Mars (S$^{5}$Mars). Experimental results show that our method can outperform state-of-the-art approaches and improve terrain segmentation performance by a large margin.
Safe Reinforcement Learning via Confidence-Based Filters
Authors: Sebastian Curi, Armin Lederer, Sandra Hirche, Andreas Krause
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2207.01337
Pdf link: https://arxiv.org/pdf/2207.01337
Abstract Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
Beyond mAP: Re-evaluating and Improving Performance in Instance Segmentation with Semantic Sorting and Contrastive Flow
Authors: Rohit Jena, Lukas Zhornyak, Nehal Doiphode, Vivek Buch, James Gee, Jianbo Shi
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.01614
Pdf link: https://arxiv.org/pdf/2207.01614
Abstract Top-down instance segmentation methods improve mAP by hedging bets on low-confidence predictions to match a ground truth. Moreover, the query-key paradigm of top-down methods leads to the instance merging problem. An excessive number of duplicate predictions leads to the (over)counting error, and the independence of category and localization branches leads to the naming error. The de-facto mAP metric doesn't capture these errors, as we show that a trivial dithering scheme can simultaneously increase mAP with hedging errors. To this end, we propose two graph-based metrics that quantifies the amount of hedging both inter-and intra-class. We conjecture the source of the hedging problem is due to feature merging and propose a) Contrastive Flow Field to encode contextual differences between instances as a supervisory signal, and b) Semantic Sorting and NMS step to suppress duplicates and incorrectly categorized prediction. Ablations show that our method encodes contextual information better than baselines, and experiments on COCO our method simultaneously reduces merging and hedging errors compared to state-of-the-art instance segmentation methods.
Keyword: scaling

Differentiable Collision Detection for a Set of Convex Primitives
Authors: Kevin Tracy, Taylor A. Howell, Zachary Manchester
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2207.00669
Pdf link: https://arxiv.org/pdf/2207.00669
Abstract Collision detection between objects is critical for simulation, control, and learning for robotic systems. However, existing collision detection routines are inherently non-differentiable, limiting their usefulness in optimization-based algorithms. In this work, we propose a fully differentiable collision-detection framework that reasons about distances between a set of composable and highly expressive convex primitive shapes. This is achieved by formulating the collision detection problem as a convex optimization problem that seeks to find the minimum uniform scaling to be applied to each object before there is an intersection. The optimization problem is fully differentiable and is able to return both the collision detection status as well as the contact points on each object.
Exponential Convergence of Sinkhorn Under Regularization Scheduling
Authors: Jingbang Chen, Yang P. Liu, Richard Peng, Arvind Ramaswami
Subjects: Data Structures and Algorithms (cs.DS)
Arxiv link: https://arxiv.org/abs/2207.00736
Pdf link: https://arxiv.org/pdf/2207.00736
Abstract The optimal transport (OT) problem or earth mover's distance has wide applications to machine learning and statistical problems. In 2013, Cuturi [Cut13] introduced the Sinkhorn algorithm for matrix scaling as a method to compute solutions to regularized optimal transport problems. In this paper, we introduce the exponentially converging Sinkhorn algorithm ExpSinkhorn, by modifying the Sinkhorn algorithm to adaptively double the regularization parameter $\eta$ periodically. The resulting method has the benefits of automatically finding the regularization parameter $\eta$ for a desired accuracy $\varepsilon$, and has iteration complexity depending on $\log(1/\varepsilon)$, as opposed to $\varepsilon^{-O(1)}$ as in previous analyses [Cut13] [ANWR17]. We also show empirically that our algorithm is efficient and robust.
Approximate Vanishing Ideal Computations at Scale
Authors: Elias Wirth, Hiroshi Kera, Sebastian Pokutta
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2207.01236
Pdf link: https://arxiv.org/pdf/2207.01236
Abstract The approximate vanishing ideal of a set of points $X = {\mathbf{x}_1, \ldots, \mathbf{x}_m}\subseteq [0,1]^n$ is the set of polynomials that approximately evaluate to $0$ over all points $\mathbf{x} \in X$ and admits an efficient representation by a finite set of polynomials called generators. Algorithms that construct this set of generators are extensively studied but ultimately find little practical application because their computational complexities are thought to be superlinear in the number of samples $m$. In this paper, we focus on scaling up the Oracle Approximate Vanishing Ideal algorithm (OAVI), one of the most powerful of these methods. We prove that the computational complexity of OAVI is not superlinear but linear in the number of samples $m$ and polynomial in the number of features $n$, making OAVI an attractive preprocessing technique for large-scale machine learning. To further accelerate OAVI's training time, we propose two changes: First, as the name suggests, OAVI makes repeated oracle calls to convex solvers throughout its execution. By replacing the Pairwise Conditional Gradients algorithm, one of the standard solvers used in OAVI, with the faster Blended Pairwise Conditional Gradients algorithm, we illustrate how OAVI directly benefits from advancements in the study of convex solvers. Second, we propose Inverse Hessian Boosting (IHB): IHB exploits the fact that OAVI repeatedly solves quadratic convex optimization problems that differ only by very little and whose solutions can be written in closed form using inverse Hessian information. By efficiently updating the inverse of the Hessian matrix, the convex optimization problems can be solved almost instantly, accelerating OAVI's training time by up to multiple orders of magnitude. We complement our theoretical analysis with extensive numerical experiments on data sets whose sample numbers are in the millions.
Selectively increasing the diversity of GAN-generated samples
Authors: Jan Dubiński, Kamil Deja, Sandro Wenzel, Przemysław Rokita, Tomasz Trzciński
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.01561
Pdf link: https://arxiv.org/pdf/2207.01561
Abstract Generative Adversarial Networks (GANs) are powerful models able to synthesize data samples closely resembling the distribution of real data, yet the diversity of those generated samples is limited due to the so-called mode collapse phenomenon observed in GANs. Especially prone to mode collapse are conditional GANs, which tend to ignore the input noise vector and focus on the conditional information. Recent methods proposed to mitigate this limitation increase the diversity of generated samples, yet they reduce the performance of the models when similarity of samples is required. To address this shortcoming, we propose a novel method to selectively increase the diversity of GAN-generated samples. By adding a simple, yet effective regularization to the training loss function we encourage the generator to discover new data modes for inputs related to diverse outputs while generating consistent samples for the remaining ones. More precisely, we maximise the ratio of distances between generated images and input latent vectors scaling the effect according to the diversity of samples for a given conditional input. We show the superiority of our method in a synthetic benchmark as well as a real-life scenario of simulating data from the Zero Degree Calorimeter of ALICE experiment in LHC, CERN.
Keyword: calibration

Eliciting and Learning with Soft Labels from Every Annotator
Authors: Katherine M. Collins, Umang Bhatt, Adrian Weller
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2207.00810
Pdf link: https://arxiv.org/pdf/2207.00810
Abstract The labels used to train machine learning (ML) models are of paramount importance. Typically for ML classification tasks, datasets contain hard labels, yet learning using soft labels has been shown to yield benefits for model generalization, robustness, and calibration. Earlier work found success in forming soft labels from multiple annotators' hard labels; however, this approach may not converge to the best labels and necessitates many annotators, which can be expensive and inefficient. We focus on efficiently eliciting soft labels from individual annotators. We collect and release a dataset of soft labels for CIFAR-10 via a crowdsourcing study ($N=242$). We demonstrate that learning with our labels achieves comparable model performance to prior approaches while requiring far fewer annotators. Our elicitation methodology therefore shows promise towards enabling practitioners to enjoy the benefits of improved model performance and reliability with fewer annotators, and serves as a guide for future dataset curators on the benefits of leveraging richer information, such as categorical uncertainty, from individual annotators.
Towards Robust Video Object Segmentation with Adaptive Object Calibration
Authors: Xiaohao Xu, Jinglu Wang, Xiang Ming, Yan Lu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.00887
Pdf link: https://arxiv.org/pdf/2207.00887
Abstract In the booming video era, video segmentation attracts increasing research attention in the multimedia community. Semi-supervised video object segmentation (VOS) aims at segmenting objects in all target frames of a video, given annotated object masks of reference frames. Most existing methods build pixel-wise reference-target correlations and then perform pixel-wise tracking to obtain target masks. Due to neglecting object-level cues, pixel-level approaches make the tracking vulnerable to perturbations, and even indiscriminate among similar objects. Towards robust VOS, the key insight is to calibrate the representation and mask of each specific object to be expressive and discriminative. Accordingly, we propose a new deep network, which can adaptively construct object representations and calibrate object masks to achieve stronger robustness. First, we construct the object representations by applying an adaptive object proxy (AOP) aggregation method, where the proxies represent arbitrary-shaped segments at multi-levels for reference. Then, prototype masks are initially generated from the reference-target correlations based on AOP. Afterwards, such proto-masks are further calibrated through network modulation, conditioning on the object proxy representations. We consolidate this conditional mask calibration process in a progressive manner, where the object representations and proto-masks evolve to be discriminative iteratively. Extensive experiments are conducted on the standard VOS benchmarks, YouTube-VOS-18/19 and DAVIS-17. Our model achieves the state-of-the-art performance among existing published works, and also exhibits superior robustness against perturbations. Our project repo is at https://github.com/JerryX1110/Robust-Video-Object-Segmentation
Lasers to Events: Automatic Extrinsic Calibration of Lidars and Event Cameras
Authors: Kevin Ta, David Bruggemann, Tim Brödermann, Christos Sakaridis, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2207.01009
Pdf link: https://arxiv.org/pdf/2207.01009
Abstract Despite significant academic and corporate efforts, autonomous driving under adverse visual conditions still proves challenging. As neuromorphic technology has matured, its application to robotics and autonomous vehicle systems has become an area of active research. Low-light and latency-demanding situations can benefit. To enable event cameras to operate alongside staple sensors like lidar in perception tasks, we propose a direct, temporally-decoupled calibration method between event cameras and lidars. The high dynamic range and low-light operation of event cameras are exploited to directly register lidar laser returns, allowing information-based correlation methods to optimize for the 6-DoF extrinsic calibration between the two sensors. This paper presents the first direct calibration method between event cameras and lidars, removing dependencies on frame-based camera intermediaries and/or highly-accurate hand measurements. Code will be made publicly available.
Look beyond labels: Incorporating functional summary information in Bayesian neural networks
Authors: Vishnu Raj, Tianyu Cui, Markus Heinonen, Pekka Marttinen
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2207.01234
Pdf link: https://arxiv.org/pdf/2207.01234
Abstract Bayesian deep learning offers a principled approach to train neural networks that accounts for both aleatoric and epistemic uncertainty. In variational inference, priors are often specified over the weight parameters, but they do not capture the true prior knowledge in large and complex neural network architectures. We present a simple approach to incorporate summary information about the predicted probability (such as sigmoid or softmax score) outputs in Bayesian neural networks (BNNs). The available summary information is incorporated as augmented data and modeled with a Dirichlet process, and we derive the corresponding \emph{Summary Evidence Lower BOund}. We show how the method can inform the model about task difficulty or class imbalance. Extensive empirical experiments show that, with negligible computational overhead, the proposed method yields a BNN with a better calibration of uncertainty.
Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection
Authors: Fabian Küppers, Jonas Schneider, Anselm Haselhoff
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2207.01242
Pdf link: https://arxiv.org/pdf/2207.01242
Abstract Reliable spatial uncertainty evaluation of object detection models is of special interest and has been subject of recent work. In this work, we review the existing definitions for uncertainty calibration of probabilistic regression tasks. We inspect the calibration properties of common detection networks and extend state-of-the-art recalibration methods. Our methods use a Gaussian process (GP) recalibration scheme that yields parametric distributions as output (e.g. Gaussian or Cauchy). The usage of GP recalibration allows for a local (conditional) uncertainty calibration by capturing dependencies between neighboring samples. The use of parametric distributions such as as Gaussian allows for a simplified adaption of calibration in subsequent processes, e.g., for Kalman filtering in the scope of object tracking. In addition, we use the GP recalibration scheme to perform covariance estimation which allows for post-hoc introduction of local correlations between the output quantities, e.g., position, width, or height in object detection. To measure the joint calibration of multivariate and possibly correlated data, we introduce the quantile calibration error which is based on the Mahalanobis distance between the predicted distribution and the ground truth to determine whether the ground truth is within a predicted quantile. Our experiments show that common detection models overestimate the spatial uncertainty in comparison to the observed error. We show that the simple Isotonic Regression recalibration method is sufficient to achieve a good uncertainty quantification in terms of calibrated quantiles. In contrast, if normal distributions are required for subsequent processes, our GP-Normal recalibration method yields the best results. Finally, we show that our covariance estimation method is able to achieve best calibration results for joint multivariate calibration.
VECtor: A Versatile Event-Centric Benchmark for Multi-Sensor SLAM
Authors: Ling Gao, Yuxuan Liang, Jiaqi Yang, Shaoxun Wu, Chenyu Wang, Jiaben Chen, Laurent Kneip
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2207.01404
Pdf link: https://arxiv.org/pdf/2207.01404
Abstract Event cameras have recently gained in popularity as they hold strong potential to complement regular cameras in situations of high dynamics or challenging illumination. An important problem that may benefit from the addition of an event camera is given by Simultaneous Localization And Mapping (SLAM). However, in order to ensure progress on event-inclusive multi-sensor SLAM, novel benchmark sequences are needed. Our contribution is the first complete set of benchmark datasets captured with a multi-sensor setup containing an event-based stereo camera, a regular stereo camera, multiple depth sensors, and an inertial measurement unit. The setup is fully hardware-synchronized and underwent accurate extrinsic calibration. All sequences come with ground truth data captured by highly accurate external reference devices such as a motion capture system. Individual sequences include both small and large-scale environments, and cover the specific challenges targeted by dynamic vision sensors.
Recommendation Systems with Distribution-Free Reliability Guarantees
Authors: Anastasios N. Angelopoulos, Karl Krauth, Stephen Bates, Yixin Wang, Michael I. Jordan
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2207.01609
Pdf link: https://arxiv.org/pdf/2207.01609
Abstract When building recommendation systems, we seek to output a helpful set of items to the user. Under the hood, a ranking model predicts which of two candidate items is better, and we must distill these pairwise comparisons into the user-facing output. However, a learned ranking model is never perfect, so taking its predictions at face value gives no guarantee that the user-facing output is reliable. Building from a pre-trained ranking model, we show how to return a set of items that is rigorously guaranteed to contain mostly good items. Our procedure endows any ranking model with rigorous finite-sample control of the false discovery rate (FDR), regardless of the (unknown) data distribution. Moreover, our calibration algorithm enables the easy and principled integration of multiple objectives in recommender systems. As an example, we show how to optimize for recommendation diversity subject to a user-specified level of FDR control, circumventing the need to specify ad hoc weights of a diversity loss against an accuracy loss. Throughout, we focus on the problem of learning to rank a set of possible recommendations, evaluating our methods on the Yahoo! Learning to Rank and MSMarco datasets.

ericbeyer / L-arxiv-interest-tracker

New submissions for Tue, 5 Jul 22 #559

Keyword: out of distribution detection

Keyword: out-of-distribution detection

Keyword: expected calibration error

Keyword: overconfident

Keyword: overconfidence

Keyword: confidence

ImLoveNet: Misaligned Image-supported Registration Network for Low-overlap Point Cloud Pairs

Firenze: Model Evaluation Using Weak Signals

Learning Switching Criteria for Sim2Real Transfer of Robotic Fabric Manipulation Policies

Pregrasp Object Material Classification by a Novel Gripper Design with Integrated Spectroscopy

Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds

RAF: Recursive Adversarial Attacks on Face Recognition Using Extremely Limited Queries

Pareto Optimization for Active Learning under Out-of-Distribution Data Scenarios

S$^{5}$Mars: Self-Supervised and Semi-Supervised Learning for Mars Segmentation

Safe Reinforcement Learning via Confidence-Based Filters

Beyond mAP: Re-evaluating and Improving Performance in Instance Segmentation with Semantic Sorting and Contrastive Flow

Keyword: scaling

Differentiable Collision Detection for a Set of Convex Primitives

Exponential Convergence of Sinkhorn Under Regularization Scheduling

Approximate Vanishing Ideal Computations at Scale

Selectively increasing the diversity of GAN-generated samples

Keyword: calibration

Eliciting and Learning with Soft Labels from Every Annotator

Towards Robust Video Object Segmentation with Adaptive Object Calibration

Lasers to Events: Automatic Extrinsic Calibration of Lidars and Event Cameras

Look beyond labels: Incorporating functional summary information in Bayesian neural networks

Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection

VECtor: A Versatile Event-Centric Benchmark for Multi-Sensor SLAM

Recommendation Systems with Distribution-Free Reliability Guarantees