Abstract
Acne detection is crucial for interpretative diagnosis and precise treatment of skin disease. The arbitrary boundary and small size of acne lesions lead to a significant number of poor-quality proposals in two-stage detection. In this paper, we propose a novel head structure for Region Proposal Network to improve the proposals' quality in two ways. At first, a Spatial Aware Double Head(SADH) structure is proposed to disentangle the representation learning for classification and localization from two different spatial perspectives. The proposed SADH ensures a steeper classification confidence gradient and suppresses the proposals having low intersection-over-union(IoU) with the matched ground truth. Then, we propose a Normalized Wasserstein Distance prediction branch to improve the correlation between the proposals' classification scores and IoUs. In addition, to facilitate further research on acne detection, we construct a new dataset named AcneSCU, with high-resolution imageries, precise annotations, and fine-grained lesion categories. Extensive experiments are conducted on both AcneSCU and the public dataset ACNE04, and the results demonstrate the proposed method could improve the proposals' quality, consistently outperforming state-of-the-art approaches. Code and the collected dataset are available in https://github.com/pingguokiller/acnedetection.
Using Quantile Forecasts for Dynamic Equivalents of Active Distribution Grids under Uncertainty
Authors: Johanna Vorwerk, Thierry Zufferey, Petros Aristidou, Gabriela Hug
Abstract
While distribution networks (DNs) turn from consumers to active and responsive intelligent DNs, the question of how to represent them in large-scale transmission network (TN) studies is still under investigation. The standard approach that uses aggregated models for the inverter-interfaced generation and conventional load models introduces significant errors to the dynamic modeling that can lead to instabilities. This paper presents a new approach based on quantile forecasting to represent the uncertainty originating in DNs at the TN level. First, we aquire a required rich dataset employing Monte Carlo simulations of a DN. Then, we use machine learning (ML) algorithms to not only predict the most probable response but also intervals of potential responses with predefined confidence. These quantile methods represent the variance in DN responses at the TN level. The results indicate excellent performance for most ML techniques. The tuned quantile equivalents predict accurate bands for the current at the TN/DN-interface, and tests with unseen TN conditions indicate robustness. A final assessment that compares the MC trajectories against the predicted intervals highlights the potential of the proposed method.
Keyword: scaling
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
Abstract
Transformers have quickly shined in the computer vision world since the emergence of Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) seems to be challenged by increasingly effective transformer-based models. Very recently, a couple of advanced convolutional models strike back with large kernels motivated by the local but large attention mechanism, showing appealing performance and efficiency. While one of them, i.e. RepLKNet, impressively manages to scale the kernel size to 31x31 with improved performance, the performance starts to saturate as the kernel size continues growing, compared to the scaling trend of advanced ViTs such as Swin Transformer. In this paper, we explore the possibility of training extreme convolutions larger than 31x31 and test whether the performance gap can be eliminated by strategically enlarging convolutions. This study ends up with a recipe for applying extremely large kernels from the perspective of sparsity, which can smoothly scale up kernels to 61x61 with better performance. Built on this recipe, we propose Sparse Large Kernel Network (SLaK), a pure CNN architecture equipped with 51x51 kernels that can perform on par with or better than state-of-the-art hierarchical Transformers and modern ConvNet architectures like ConvNeXt and RepLKNet, on ImageNet classification as well as typical downstream tasks. Our code is available here https://github.com/VITA-Group/SLaK.
Keyword: calibration
SST-Calib: Simultaneous Spatial-Temporal Parameter Calibration between LIDAR and Camera
Abstract
With information from multiple input modalities, sensor fusion-based algorithms usually out-perform their single-modality counterparts in robotics. Camera and LIDAR, with complementary semantic and depth information, are the typical choices for detection tasks in complicated driving environments. For most camera-LIDAR fusion algorithms, however, the calibration of the sensor suite will greatly impact the performance. More specifically, the detection algorithm usually requires an accurate geometric relationship among multiple sensors as the input, and it is often assumed that the contents from these sensors are captured at the same time. Preparing such sensor suites involves carefully designed calibration rigs and accurate synchronization mechanisms, and the preparation process is usually done offline. In this work, a segmentation-based framework is proposed to jointly estimate the geometrical and temporal parameters in the calibration of a camera-LIDAR suite. A semantic segmentation mask is first applied to both sensor modalities, and the calibration parameters are optimized through pixel-wise bidirectional loss. We specifically incorporated the velocity information from optical flow for temporal parameters. Since supervision is only performed at the segmentation level, no calibration label is needed within the framework. The proposed algorithm is tested on the KITTI dataset, and the result shows an accurate real-time calibration of both geometric and temporal parameters.
Continuous Target-free Extrinsic Calibration of a Multi-Sensor System from a Sequence of Static Viewpoints
Authors: Philipp Glira, Christoph Weidinger, Johann Weichselbaum
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Mobile robotic applications need precise information about the geometric position of the individual sensors on the platform. This information is given by the extrinsic calibration parameters which define how the sensor is rotated and translated with respect to a fixed reference coordinate system. Erroneous calibration parameters have a negative impact on typical robotic estimation tasks, e.g. SLAM. In this work we propose a new method for a continuous estimation of the calibration parameters during operation of the robot. The parameter estimation is based on the matching of point clouds which are acquired by the sensors from multiple static viewpoints. Consequently, our method does not need any special calibration targets and is applicable to any sensor whose measurements can be converted to point clouds. We demonstrate the suitability of our method by calibrating a multi-sensor system composed by 2 lidar sensors, 3 cameras, and an imaging radar sensor.
Predicting Opinion Dynamics via Sociologically-Informed Neural Networks
Authors: Maya Okawa, Tomoharu Iwata
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG); Machine Learning (stat.ML)
Abstract
Opinion formation and propagation are crucial phenomena in social networks and have been extensively studied across several disciplines. Traditionally, theoretical models of opinion dynamics have been proposed to describe the interactions between individuals (i.e., social interaction) and their impact on the evolution of collective opinions. Although these models can incorporate sociological and psychological knowledge on the mechanisms of social interaction, they demand extensive calibration with real data to make reliable predictions, requiring much time and effort. Recently, the widespread use of social media platforms provides new paradigms to learn deep learning models from a large volume of social media data. However, these methods ignore any scientific knowledge about the mechanism of social interaction. In this work, we present the first hybrid method called Sociologically-Informed Neural Network (SINN), which integrates theoretical models and social media data by transporting the concepts of physics-informed neural networks (PINNs) from natural science (i.e., physics) into social science (i.e., sociology and social psychology). In particular, we recast theoretical models as ordinary differential equations (ODEs). Then we train a neural network that simultaneously approximates the data and conforms to the ODEs that represent the social scientific knowledge. In addition, we extend PINNs by integrating matrix factorization and a language model to incorporate rich side information (e.g., user profiles) and structural knowledge (e.g., cluster structure of the social interaction network). Moreover, we develop an end-to-end training procedure for SINN, which involves Gumbel-Softmax approximation to include stochastic mechanisms of social interaction. Extensive experiments on real-world and synthetic datasets show SINN outperforms six baseline methods in predicting opinion dynamics.
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
There is no result
Keyword: expected calibration error
There is no result
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
Learning High-quality Proposals for Acne Detection
Using Quantile Forecasts for Dynamic Equivalents of Active Distribution Grids under Uncertainty
Keyword: scaling
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
Keyword: calibration
SST-Calib: Simultaneous Spatial-Temporal Parameter Calibration between LIDAR and Camera
Continuous Target-free Extrinsic Calibration of a Multi-Sensor System from a Sequence of Static Viewpoints
Predicting Opinion Dynamics via Sociologically-Informed Neural Networks