New submissions for Fri, 21 Oct 22

Keyword: out of distribution detection

There is no result

Keyword: out-of-distribution detection

There is no result

Keyword: expected calibration error

There is no result

Keyword: overconfident

There is no result

Keyword: overconfidence

There is no result

Keyword: confidence

MMRNet: Improving Reliability for Multimodal Computer Vision for Bin Picking via Multimodal Redundancy

Authors: Yuhao Chen, Hayden Gunraj, E. Zhixuan Zeng, Maximilian Gilles, Alexander Wong
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2210.10842
Pdf link: https://arxiv.org/pdf/2210.10842
Abstract Recently, there has been tremendous interest in industry 4.0 infrastructure to address labor shortages in global supply chains. Deploying artificial intelligence-enabled robotic bin picking systems in real world has become particularly important for reducing labor demands and costs while increasing efficiency. To this end, artificial intelligence-enabled robotic bin picking systems may be used to automate bin picking, but may also cause expensive damage during an abnormal event such as a sensor failure. As such, reliability becomes a critical factor for translating artificial intelligence research to real world applications and products. In this paper, we propose a reliable vision system with MultiModal Redundancy (MMRNet) for tackling object detection and segmentation for robotic bin picking using data from different modalities. This is the first system that introduces the concept of multimodal redundancy to combat sensor failure issues during deployment. In particular, we realize the multimodal redundancy framework with a gate fusion module and dynamic ensemble learning. Finally, we present a new label-free multimodal consistency score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty. Through experiments, we demonstrate that in an event of missing modality, our system provides a much more reliable performance compared to baseline models. We also demonstrate that our MC score is a more powerful reliability indicator for outputs during inference time where model generated confidence score are often over-confident.
Non-Iterative Scribble-Supervised Learning with Pacing Pseudo-Masks for Medical Image Segmentation
Authors: Zefan Yang, Di Lin, Dong Ni, Yi Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.10956
Pdf link: https://arxiv.org/pdf/2210.10956
Abstract Scribble-supervised medical image segmentation tackles the limitation of sparse masks. Conventional approaches alternate between: labeling pseudo-masks and optimizing network parameters. However, such iterative two-stage paradigm is unwieldy and could be trapped in poor local optima since the networks undesirably regress to the erroneous pseudo-masks. To address these issues, we propose a non-iterative method where a stream of varying (pacing) pseudo-masks teach a network via consistency training, named PacingPseudo. Our motivation lies first in a non-iterative process. Interestingly, it can be achieved gracefully by a siamese architecture, wherein a stream of pseudo-masks naturally assimilate a stream of predicted masks during training. Second, we make the consistency training effective with two necessary designs: (i) entropy regularization to obtain high-confidence pseudo-masks for effective teaching; and (ii) distorted augmentations to create discrepancy between the pseudo-mask and predicted-mask streams for consistency regularization. Third, we devise a new memory bank mechanism that provides an extra source of ensemble features to complement scarce labeled pixels. The efficacy of the proposed PacingPseudo is validated on three public medical image datasets, including the segmentation tasks of abdominal multi-organs, cardiac structures, and myocardium. Extensive experiments demonstrate our PacingPseudo improves the baseline by large margins and consistently outcompetes several previous methods. In some cases, our PacingPseudo achieves comparable performance with its fully-supervised counterparts, showing the feasibility of our method for the challenging scribble-supervised segmentation applications. The code and scribble annotations will be publicly available.
Towards Mitigating the Problem of Insufficient and Ambiguous Supervision in Online Crowdsourcing Annotation
Authors: Qian-Wei Wang, Bowen Zhao, Mingyan Zhu, Tianxiang Li, Zimo Liu, Shu-Tao Xia
Subjects: Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2210.11194
Pdf link: https://arxiv.org/pdf/2210.11194
Abstract In real-world crowdsourcing annotation systems, due to differences in user knowledge and cultural backgrounds, as well as the high cost of acquiring annotation information, the supervision information we obtain might be insufficient and ambiguous. To mitigate the negative impacts, in this paper, we investigate a more general and broadly applicable learning problem, i.e. \emph{semi-supervised partial label learning}, and propose a novel method based on pseudo-labeling and contrastive learning. Following the key inventing principle, our method facilitate the partial label disambiguation process with unlabeled data and at the same time assign reliable pseudo-labels to weakly supervised examples. Specifically, our method learns from the ambiguous labeling information via partial cross-entropy loss. Meanwhile, high-accuracy pseudo-labels are generated for both partial and unlabeled examples through confidence-based thresholding and contrastive learning is performed in a hybrid unsupervised and supervised manner for more discriminative representations, while its supervision increases curriculumly. The two main components systematically work as a whole and reciprocate each other. In experiments, our method consistently outperforms all comparing methods by a significant margin and set up the first state-of-the-art performance for semi-supervised partial label learning on image benchmarks.
Safe Policy Improvement in Constrained Markov Decision Processes
Authors: Luigi Berducci, Radu Grosu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Formal Languages and Automata Theory (cs.FL); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2210.11259
Pdf link: https://arxiv.org/pdf/2210.11259
Abstract The automatic synthesis of a policy through reinforcement learning (RL) from a given set of formal requirements depends on the construction of a reward signal and consists of the iterative application of many policy-improvement steps. The synthesis algorithm has to balance target, safety, and comfort requirements in a single objective and to guarantee that the policy improvement does not increase the number of safety-requirements violations, especially for safety-critical applications. In this work, we present a solution to the synthesis problem by solving its two main challenges: reward-shaping from a set of formal requirements and safe policy update. For the former, we propose an automatic reward-shaping procedure, defining a scalar reward signal compliant with the task specification. For the latter, we introduce an algorithm ensuring that the policy is improved in a safe fashion with high-confidence guarantees. We also discuss the adoption of a model-based RL algorithm to efficiently use the collected data and train a model-free agent on the predicted trajectories, where the safety violation does not have the same impact as in the real world. Finally, we demonstrate in standard control benchmarks that the resulting learning procedure is effective and robust even under heavy perturbations of the hyperparameters.
Play It Back: Iterative Attention for Audio Recognition
Authors: Alexandros Stergiou, Dima Damen
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Arxiv link: https://arxiv.org/abs/2210.11328
Pdf link: https://arxiv.org/pdf/2210.11328
Abstract A key function of auditory cognition is the association of characteristic sounds with their corresponding semantics over time. Humans attempting to discriminate between fine-grained audio categories, often replay the same discriminative sounds to increase their prediction confidence. We propose an end-to-end attention-based architecture that through selective repetition attends over the most discriminative sounds across the audio sequence. Our model initially uses the full audio sequence and iteratively refines the temporal segments replayed based on slot attention. At each playback, the selected segments are replayed using a smaller hop length which represents higher resolution features within these segments. We show that our method can consistently achieve state-of-the-art performance across three audio-classification benchmarks: AudioSet, VGG-Sound, and EPIC-KITCHENS-100.
G2NetPL: Generic Game-Theoretic Network for Partial-Label Image Classification
Authors: Rabab Abdelfattah, Xin Zhang, Mostafa M. Fouda, Xiaofeng Wang, Song Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2210.11469
Pdf link: https://arxiv.org/pdf/2210.11469
Abstract Multi-label image classification aims to predict all possible labels in an image. It is usually formulated as a partial-label learning problem, since it could be expensive in practice to annotate all the labels in every training image. Existing works on partial-label learning focus on the case where each training image is labeled with only a subset of its positive/negative labels. To effectively address partial-label classification, this paper proposes an end-to-end Generic Game-theoretic Network (G2NetPL) for partial-label learning, which can be applied to most partial-label settings, including a very challenging, but annotation-efficient case where only a subset of the training images are labeled, each with only one positive label, while the rest of the training images remain unlabeled. In G2NetPL, each unobserved label is associated with a soft pseudo label, which, together with the network, formulates a two-player non-zero-sum non-cooperative game. The objective of the network is to minimize the loss function with given pseudo labels, while the pseudo labels will seek convergence to 1 (positive) or 0 (negative) with a penalty of deviating from the predicted labels determined by the network. In addition, we introduce a confidence-aware scheduler into the loss of the network to adaptively perform easy-to-hard learning for different labels. Extensive experiments demonstrate that our proposed G2NetPL outperforms many state-of-the-art multi-label classification methods under various partial-label settings on three different datasets.
Keyword: scaling

Explainable Multi-Agent Recommendation System for Energy-Efficient Decision Support in Smart Homes
Authors: Alona Zharova, Annika Boer, Julia Knoblauch, Kai Ingo Schewina, Jana Vihs
Subjects: Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2210.11218
Pdf link: https://arxiv.org/pdf/2210.11218
Abstract Understandable and persuasive recommendations support the electricity consumers' behavioral change to tackle the energy efficiency problem. Generating load shifting recommendations for household appliances as explainable increases the transparency and trustworthiness of the system. This paper proposes an explainable multi-agent recommendation system for load shifting for household appliances. First, we provide agents with enhanced predictive capacity by including weather data, applying state-of-the-art models, and tuning the hyperparameters. Second, we suggest an Explainability Agent providing transparent recommendations. We also provide an overview of the predictive and explainability performance. Third, we discuss the impact and scaling potential of the suggested approach.
Transcending Scaling Laws with 0.1% Extra Compute
Authors: Yi Tay, Jason Wei, Hyung Won Chung, Vinh Q. Tran, David R. So, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.11399
Pdf link: https://arxiv.org/pdf/2210.11399
Abstract Scaling language models improves performance but comes with significant computational costs. This paper proposes UL2R, a method that substantially improves existing language models and their scaling curves with a relatively tiny amount of extra compute. The key idea is to continue training a state-of-the-art large language model (e.g., PaLM) on a few more steps with UL2's mixture-of-denoiser objective. We show that, with almost negligible extra computational costs and no new sources of data, we are able to substantially improve the scaling properties of large language models on downstream metrics. In this paper, we continue training PaLM with UL2R, introducing a new set of models at 8B, 62B, and 540B scale which we call U-PaLM. Impressively, at 540B scale, we show an approximately 2x computational savings rate where U-PaLM achieves the same performance as the final PaLM 540B model at around half its computational budget (i.e., saving $\sim$4.4 million TPUv4 hours). We further show that this improved scaling curve leads to 'emergent abilities' on challenging BIG-Bench tasks -- for instance, U-PaLM does much better than PaLM on some tasks or demonstrates better quality at much smaller scale (62B as opposed to 540B). Overall, we show that U-PaLM outperforms PaLM on many few-shot setups, i.e., English NLP tasks (e.g., commonsense reasoning, question answering), reasoning tasks with chain-of-thought (e.g., GSM8K), multilingual tasks (MGSM, TydiQA), MMLU and challenging BIG-Bench tasks. Finally, we provide qualitative examples showing the new capabilities of U-PaLM for single and multi-span infilling.
Scaling Instruction-Finetuned Language Models
Authors: Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, Jason Wei
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2210.11416
Pdf link: https://arxiv.org/pdf/2210.11416
Abstract Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation). For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PALM 540B by a large margin (+9.4% on average). Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
Keyword: calibration

Multi-hypothesis 3D human pose estimation metrics favor miscalibrated distributions
Authors: Paweł A. Pierzchlewicz, R. James Cotton, Mohammad Bashiri, Fabian H. Sinz
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2210.11179
Pdf link: https://arxiv.org/pdf/2210.11179
Abstract Due to depth ambiguities and occlusions, lifting 2D poses to 3D is a highly ill-posed problem. Well-calibrated distributions of possible poses can make these ambiguities explicit and preserve the resulting uncertainty for downstream tasks. This study shows that previous attempts, which account for these ambiguities via multiple hypotheses generation, produce miscalibrated distributions. We identify that miscalibration can be attributed to the use of sample-based metrics such as minMPJPE. In a series of simulations, we show that minimizing minMPJPE, as commonly done, should converge to the correct mean prediction. However, it fails to correctly capture the uncertainty, thus resulting in a miscalibrated distribution. To mitigate this problem, we propose an accurate and well-calibrated model called Conditional Graph Normalizing Flow (cGNFs). Our model is structured such that a single cGNF can estimate both conditional and marginal densities within the same model - effectively solving a zero-shot density estimation problem. We evaluate cGNF on the Human~3.6M dataset and show that cGNF provides a well-calibrated distribution estimate while being close to state-of-the-art in terms of overall minMPJPE. Furthermore, cGNF outperforms previous methods on occluded joints while it remains well-calibrated.
Towards Evology: a Market Ecology Agent-Based Model of US Equity Mutual Funds
Authors: Aymeric Vie, Maarten Scholl, Alissa M. Kleinnijenhuis, Doyne J. Farmer
Subjects: Multiagent Systems (cs.MA); Computational Engineering, Finance, and Science (cs.CE)
Arxiv link: https://arxiv.org/abs/2210.11344
Pdf link: https://arxiv.org/pdf/2210.11344
Abstract The profitability of various investment styles in investment funds depends on macroeconomic conditions. Market ecology, which views financial markets as ecosystems of diverse, interacting and evolving trading strategies, has shown that endogenous interactions between strategies determine market behaviour and styles' performance. We present Evology: a heterogeneous, empirically calibrated multi-agent market ecology agent-based model to quantify endogenous interactions between US equity mutual funds, particularly Value and Growth investment styles. We outline the model design, validation and calibration approach and its potential for optimising investment strategies using machine learning algorithms.

ericbeyer / L-arxiv-interest-tracker

New submissions for Fri, 21 Oct 22 #668

Keyword: out of distribution detection

Keyword: out-of-distribution detection

Keyword: expected calibration error

Keyword: overconfident

Keyword: overconfidence

Keyword: confidence

MMRNet: Improving Reliability for Multimodal Computer Vision for Bin Picking via Multimodal Redundancy

Non-Iterative Scribble-Supervised Learning with Pacing Pseudo-Masks for Medical Image Segmentation

Towards Mitigating the Problem of Insufficient and Ambiguous Supervision in Online Crowdsourcing Annotation

Safe Policy Improvement in Constrained Markov Decision Processes

Play It Back: Iterative Attention for Audio Recognition

G2NetPL: Generic Game-Theoretic Network for Partial-Label Image Classification

Keyword: scaling

Explainable Multi-Agent Recommendation System for Energy-Efficient Decision Support in Smart Homes

Transcending Scaling Laws with 0.1% Extra Compute

Scaling Instruction-Finetuned Language Models

Keyword: calibration

Multi-hypothesis 3D human pose estimation metrics favor miscalibrated distributions

Towards Evology: a Market Ecology Agent-Based Model of US Equity Mutual Funds