New submissions for Wed, 20 Apr 22

Keyword: SLAM

There is no result

Keyword: Visual inertial

There is no result

Keyword: livox

There is no result

Keyword: loam

There is no result

Keyword: Visual inertial odometry

There is no result

Keyword: lidar

Shape-Aware Monocular 3D Object Detection

Authors: Wei Chen, Jie Zhao, Wan-Lei Zhao, Song-Yuan Wu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.08717
Pdf link: https://arxiv.org/pdf/2204.08717
Abstract The detection of 3D objects through a single perspective camera is a challenging issue. The anchor-free and keypoint-based models receive increasing attention recently due to their effectiveness and simplicity. However, most of these methods are vulnerable to occluded and truncated objects. In this paper, a single-stage monocular 3D object detection model is proposed. An instance-segmentation head is integrated into the model training, which allows the model to be aware of the visible shape of a target object. The detection largely avoids interference from irrelevant regions surrounding the target objects. In addition, we also reveal that the popular IoU-based evaluation metrics, which were originally designed for evaluating stereo or LiDAR-based detection methods, are insensitive to the improvement of monocular 3D object detection algorithms. A novel evaluation metric, namely average depth similarity (ADS) is proposed for the monocular 3D object detection models. Our method outperforms the baseline on both the popular and the proposed evaluation metrics while maintaining real-time efficiency.
Proposal-free Lidar Panoptic Segmentation with Pillar-level Affinity
Authors: Qi Chen, Sourabh Vora
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.08744
Pdf link: https://arxiv.org/pdf/2204.08744
Abstract We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pillar-based bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance or not. We further propose a local clustering algorithm to propagate instance ids by merging semantic segmentation and affinity predictions. Our experiments on nuScenes dataset show that our approach outperforms previous proposal-free methods and is comparable to proposal-based methods which requires extra annotation from object detection.
Sensor Data Fusion in Top-View Grid Maps using Evidential Reasoning with Advanced Conflict Resolution
Authors: Sven Richter, Frank Bieder, Sascha Wirges, Christoph Stiller
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.08780
Pdf link: https://arxiv.org/pdf/2204.08780
Abstract We present a new method to combine evidential top-view grid maps estimated based on heterogeneous sensor sources. Dempster's combination rule that is usually applied in this context provides undesired results with highly conflicting inputs. Therefore, we use more advanced evidential reasoning techniques and improve the conflict resolution by modeling the reliability of the evidence sources. We propose a data-driven reliability estimation to optimize the fusion quality using the Kitti-360 dataset. We apply the proposed method to the fusion of LiDAR and stereo camera data and evaluate the results qualitatively and quantitatively. The results demonstrate that our proposed method robustly combines measurements from heterogeneous sensors and successfully resolves sensor conflicts.
Keyword: loop detection

There is no result

Keyword: autonomous driving

Dynamic Point Cloud Denoising via Gradient Fields
Authors: Qianjiang Hu, Wei Hu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.08755
Pdf link: https://arxiv.org/pdf/2204.08755
Abstract 3D dynamic point clouds provide a discrete representation of real-world objects or scenes in motion, which have been widely applied in immersive telepresence, autonomous driving, surveillance, etc. However, point clouds acquired from sensors are usually perturbed by noise, which affects downstream tasks such as surface reconstruction and analysis. Although many efforts have been made for static point cloud denoising, dynamic point cloud denoising remains under-explored. In this paper, we propose a novel gradient-field-based dynamic point cloud denoising method, exploiting the temporal correspondence via the estimation of gradient fields -- a fundamental problem in dynamic point cloud processing and analysis. The gradient field is the gradient of the log-probability function of the noisy point cloud, based on which we perform gradient ascent so as to converge each point to the underlying clean surface. We estimate the gradient of each surface patch and exploit the temporal correspondence, where the temporally corresponding patches are searched leveraging on rigid motion in classical mechanics. In particular, we treat each patch as a rigid object, which moves in the gradient field of an adjacent frame via force until reaching a balanced state, i.e., when the sum of gradients over the patch reaches 0. Since the gradient would be smaller when the point is closer to the underlying surface, the balanced patch would fit the underlying surface well, thus leading to the temporal correspondence. Finally, the position of each point in the patch is updated along the direction of the gradient averaged from corresponding patches in adjacent frames. Experimental results demonstrate that the proposed model outperforms state-of-the-art methods under both synthetic noise and simulated real-world noise.
An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions
Authors: M. Jehanzeb Mirza, Marc Masana, Horst Possegger, Horst Bischof
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.08817
Pdf link: https://arxiv.org/pdf/2204.08817
Abstract Although deep neural networks enable impressive visual perception performance for autonomous driving, their robustness to varying weather conditions still requires attention. When adapting these models for changed environments, such as different weather conditions, they are prone to forgetting previously learned information. This catastrophic forgetting is typically addressed via incremental learning approaches which usually re-train the model by either keeping a memory bank of training samples or keeping a copy of the entire model or model parameters for each scenario. While these approaches show impressive results, they can be prone to scalability issues and their applicability for autonomous driving in all weather conditions has not been shown. In this paper we propose DISC -- Domain Incremental through Statistical Correction -- a simple online zero-forgetting approach which can incrementally learn new tasks (i.e weather conditions) without requiring re-training or expensive memory banks. The only information we store for each task are the statistical parameters as we categorize each domain by the change in first and second order statistics. Thus, as each task arrives, we simply 'plug and play' the statistical vectors for the corresponding task into the model and it immediately starts to perform well on that task. We show the efficacy of our approach by testing it for object detection in a challenging domain-incremental autonomous driving scenario where we encounter different adverse weather conditions, such as heavy rain, fog, and snow.
Keyword: mapping

Investigating Temporal Convolutional Neural Networks for Satellite Image Time Series Classification
Authors: James Brock, Zahraa S. Abdallah
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2204.08461
Pdf link: https://arxiv.org/pdf/2204.08461
Abstract Satellite Image Time Series (SITS) of the Earth's surface provide detailed land cover maps, with their quality in the spatial and temporal dimensions consistently improving. These image time series are integral for developing systems that aim to produce accurate, up-to-date land cover maps of the Earth's surface. Applications are wide-ranging, with notable examples including ecosystem mapping, vegetation process monitoring and anthropogenic land-use change tracking. Recently proposed methods for SITS classification have demonstrated respectable merit, but these methods tend to lack native mechanisms that exploit the temporal dimension of the data; commonly resulting in extensive data pre-processing prohibitively long training times. To overcome these shortcomings, this paper seeks to study and enhance the newly proposed method for SITS classification from literature; namely Temporal CNNs. Comprehensive experiments are carried out on two benchmark SITS datasets with the results demonstrating that Temporal CNNs display a superior or competitive performance to the benchmark algorithms for both datasets. Investigations into the Temporal CNNs architecture also highlighted the non-trivial task of optimising the model for a new dataset.
Beyond Being Real: A Sensorimotor Control Perspective on Interactions in Virtual Reality
Authors: Parastoo Abtahi, Sidney Q. Hough, James A. Landay, Sean Follmer
Subjects: Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2204.08566
Pdf link: https://arxiv.org/pdf/2204.08566
Abstract We can create Virtual Reality (VR) interactions that have no equivalent in the real world by remapping spacetime or altering users' body representation, such as stretching the user's virtual arm for manipulation of distant objects or scaling up the user's avatar to enable rapid locomotion. Prior research has leveraged such approaches, what we call beyond-real techniques, to make interactions in VR more practical, efficient, ergonomic, and accessible. We present a survey categorizing prior movement-based VR interaction literature as reality-based, illusory, or beyond-real interactions. We survey relevant conferences (CHI, IEEE VR, VRST, UIST, and DIS) while focusing on selection, manipulation, locomotion, and navigation in VR. For beyond-real interactions, we describe the transformations that have been used by prior works to create novel remappings. We discuss open research questions through the lens of the human sensorimotor control system and highlight challenges that need to be addressed for effective utilization of beyond-real interactions in future VR applications, including plausibility, control, long-term adaptation, and individual differences.
A Thin Format Vision-Based Tactile Sensor with A Micro Lens Array (MLA)
Authors: Xia Chen, Guanlan Zhang, Michael Yu Wang, Hongyu Yu
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2204.08691
Pdf link: https://arxiv.org/pdf/2204.08691
Abstract Vision-based tactile sensors have been widely studied in the robotics field for high spatial resolution and compatibility with machine learning algorithms. However, the currently employed sensor's imaging system is bulky limiting its further application. Here we present a micro lens array (MLA) based vison system to achieve a low thickness format of the sensor package with high tactile sensing performance. Multiple micromachined micro lens units cover the whole elastic touching layer and provide a stitched clear tactile image, enabling high spatial resolution with a thin thickness of 5 mm. The thermal reflow and soft lithography method ensure the uniform spherical profile and smooth surface of micro lens. Both optical and mechanical characterization demonstrated the sensor's stable imaging and excellent tactile sensing, enabling precise 3D tactile information, such as displacement mapping and force distribution with an ultra compact-thin structure.
A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization
Authors: Ruchit Agrawal, Daniel Wolff, Simon Dixon
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Arxiv link: https://arxiv.org/abs/2204.08822
Pdf link: https://arxiv.org/pdf/2204.08822
Abstract Performance-score synchronization is an integral task in signal processing, which entails generating an accurate mapping between an audio recording of a performance and the corresponding musical score. Traditional synchronization methods compute alignment using knowledge-driven and stochastic approaches, and are typically unable to generalize well to different domains and modalities. We present a novel data-driven method for structure-aware performance-score synchronization. We propose a convolutional-attentional architecture trained with a custom loss based on time-series divergence. We conduct experiments for the audio-to-MIDI and audio-to-image alignment tasks pertained to different score modalities. We validate the effectiveness of our method via ablation studies and comparisons with state-of-the-art alignment approaches. We demonstrate that our approach outperforms previous synchronization methods for a variety of test settings across score modalities and acoustic conditions. Our method is also robust to structural differences between the performance and score sequences, which is a common limitation of standard alignment approaches.
Rendering Nighttime Image Via Cascaded Color and Brightness Compensation
Authors: Zhihao Li, Si Yi, Zhan Ma
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2204.08970
Pdf link: https://arxiv.org/pdf/2204.08970
Abstract Image signal processing (ISP) is crucial for camera imaging, and neural networks (NN) solutions are extensively deployed for daytime scenes. The lack of sufficient nighttime image dataset and insights on nighttime illumination characteristics poses a great challenge for high-quality rendering using existing NN ISPs. To tackle it, we first built a high-resolution nighttime RAW-RGB (NR2R) dataset with white balance and tone mapping annotated by expert professionals. Meanwhile, to best capture the characteristics of nighttime illumination light sources, we develop the CBUnet, a two-stage NN ISP to cascade the compensation of color and brightness attributes. Experiments show that our method has better visual quality compared to traditional ISP pipeline, and is ranked at the second place in the NTIRE 2022 Night Photography Rendering Challenge for two tracks by respective People's and Professional Photographer's choices. The code and relevant materials are avaiable on our website: https://njuvision.github.io/CBUnet.
Deep learning based closed-loop optimization of geothermal reservoir production
Authors: Nanzhe Wang, Haibin Chang, Xiangzhao Kong, Martin O. Saar, Dongxiao Zhang
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2204.08987
Pdf link: https://arxiv.org/pdf/2204.08987
Abstract To maximize the economic benefits of geothermal energy production, it is essential to optimize geothermal reservoir management strategies, in which geologic uncertainty should be considered. In this work, we propose a closed-loop optimization framework, based on deep learning surrogates, for the well control optimization of geothermal reservoirs. In this framework, we construct a hybrid convolution-recurrent neural network surrogate, which combines the convolution neural network (CNN) and long short-term memory (LSTM) recurrent network. The convolution structure can extract spatial information of geologic parameter fields and the recurrent structure can approximate sequence-to-sequence mapping. The trained model can predict time-varying production responses (rate, temperature, etc.) for cases with different permeability fields and well control sequences. In the closed-loop optimization framework, production optimization based on the differential evolution (DE) algorithm, and data assimilation based on the iterative ensemble smoother (IES), are performed alternately to achieve real-time well control optimization and geologic parameter estimation as the production proceeds. In addition, the averaged objective function over the ensemble of geologic parameter estimations is adopted to consider geologic uncertainty in the optimization process. Several geothermal reservoir development cases are designed to test the performance of the proposed production optimization framework. The results show that the proposed framework can achieve efficient and effective real-time optimization and data assimilation in the geothermal reservoir production process.
Keyword: localization

There is no result

zhuhu00 / Paper-Daily-Notice

New submissions for Wed, 20 Apr 22 #145

Keyword: SLAM

Keyword: Visual inertial

Keyword: livox

Keyword: loam

Keyword: Visual inertial odometry

Keyword: lidar

Shape-Aware Monocular 3D Object Detection

Proposal-free Lidar Panoptic Segmentation with Pillar-level Affinity

Sensor Data Fusion in Top-View Grid Maps using Evidential Reasoning with Advanced Conflict Resolution

Keyword: loop detection

Keyword: autonomous driving

Dynamic Point Cloud Denoising via Gradient Fields

An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions

Keyword: mapping

Investigating Temporal Convolutional Neural Networks for Satellite Image Time Series Classification

Beyond Being Real: A Sensorimotor Control Perspective on Interactions in Virtual Reality

A Thin Format Vision-Based Tactile Sensor with A Micro Lens Array (MLA)

A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization

Rendering Nighttime Image Via Cascaded Color and Brightness Compensation

Deep learning based closed-loop optimization of geothermal reservoir production

Keyword: localization