Abstract
Accurate pose estimation is a fundamental ability that all mobile robots must possess in order to traverse a given environment robustly. As with humans, this ability depends on the robot's understanding of a given scene. For Autonomous Vehicles (AVs), detailed 3D maps created beforehand are widely used to augment perception and to estimate pose from current sensor measurements. This approach, however, is less suited to rural communities that are sparsely connected and cover large areas. To deal with the challenge of localizing a vehicle in a rural setting, this paper presents a dataset of rural road scenes, along with an approach for fast segmentation of roads using LIDAR point clouds. The segmented point cloud, in concert with road network information from Open Street Maps (OSM), is used for pose estimation. We propose two measurement models, which are compared with state-of-the-art methods for localization on OSM for both tracking and global localization. The results show that the proposed algorithm is able to estimate pose within a 2 sq. km area with a mean accuracy of 6.5 meters.
OpenStreetMap-based LiDAR Global Localization in Urban Environment without a Prior LiDAR Map
Authors: Younghun Cho, Giseop Kim, Sangmin Lee, Jee-Hwan Ryu
Abstract
Using publicly accessible maps, we propose a novel vehicle localization method that can be applied without prior light detection and ranging (LiDAR) maps. Our method generates OSM descriptors by calculating the distances to buildings from a location in OpenStreetMap at regular angular intervals, and LiDAR descriptors by calculating the shortest distances to building points from the current location at regular angular intervals. Comparing the OSM descriptors and LiDAR descriptors yields a highly accurate vehicle localization result. Compared to methods that use prior LiDAR maps, our method presents two main advantages: (1) vehicle localization is not limited to places with previously acquired LiDAR maps, and (2) our method is comparable to LiDAR map-based methods and, in particular, outperforms the other methods with respect to the top-one candidate on KITTI dataset sequence 00.
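The descriptor idea in the abstract above can be illustrated with a minimal sketch. It assumes building points are already extracted and given as 2D coordinates in the sensor frame; the function names, `num_bins`, `max_range`, and the circular-shift comparison are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def lidar_descriptor(building_points, num_bins=360, max_range=100.0):
    """Shortest distance to a building point in each angular bin.

    building_points: (N, 2) array of x, y coordinates in the sensor frame.
    num_bins and max_range are illustrative values, not taken from the paper.
    """
    pts = np.asarray(building_points, dtype=float).reshape(-1, 2)
    # Bins with no building point keep the maximum range.
    descriptor = np.full(num_bins, max_range)
    if pts.shape[0] == 0:
        return descriptor
    angles = np.arctan2(pts[:, 1], pts[:, 0]) % (2.0 * np.pi)
    ranges = np.hypot(pts[:, 0], pts[:, 1])
    bins = np.floor(angles / (2.0 * np.pi) * num_bins).astype(int) % num_bins
    # Unbuffered per-bin minimum, so repeated bin indices are handled correctly.
    np.minimum.at(descriptor, bins, ranges)
    return descriptor

def descriptor_distance(a, b):
    """Smallest mean absolute difference over all circular shifts of b.

    Trying every shift is an assumed way to cope with unknown heading.
    """
    return min(np.mean(np.abs(a - np.roll(b, s))) for s in range(len(b)))
```

An OSM-side descriptor could be built the same way from distances to building footprints at a candidate map location, and the candidate with the smallest `descriptor_distance` would then be the best match under these assumptions.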
Keyword: loop detection
There is no result
Keyword: autonomous driving
Sim-to-Real Domain Adaptation for Lane Detection and Classification in Autonomous Driving
Authors: Chuqing Hu, Sinclair Hudson, Martin Ethier, Mohammad Al-Sharman, Derek Rayside, William Melek
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
While supervised detection and classification frameworks in autonomous driving require large labelled datasets to converge, Unsupervised Domain Adaptation (UDA) approaches, facilitated by synthetic data generated from photo-real simulated environments, are considered low-cost and less time-consuming solutions. In this paper, we propose UDA schemes using adversarial discriminative and generative methods for lane detection and classification applications in autonomous driving. We also present the Simulanes dataset generator, which creates a naturalistic synthetic dataset using CARLA's vast traffic scenarios and weather conditions. The proposed UDA frameworks take the synthesized dataset with labels as the source domain, whereas the target domain is the unlabelled real-world data. Using adversarial generative and feature discriminators, the learnt models are tuned to predict the lane location and class in the target domain. The proposed techniques are evaluated using both real-world and our synthetic datasets. The results show that the proposed methods outperform other baseline schemes in terms of detection and classification accuracy and consistency. The ablation study reveals that the size of the simulation dataset plays an important role in the classification performance of the proposed methods. Our UDA frameworks are available at https://github.com/anita-hu/sim2real-lane-detection and our dataset generator is released at https://github.com/anita-hu/simulanes
Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks
Authors: Qianjiang Hu, Daizong Liu, Wei Hu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Abstract
3D dynamic point clouds provide a discrete representation of real-world objects or scenes in motion, which have been widely applied in immersive telepresence, autonomous driving, surveillance, etc. However, point clouds acquired from sensors are usually perturbed by noise, which affects downstream tasks such as surface reconstruction and analysis. Although many efforts have been made for static point cloud denoising, few works address dynamic point cloud denoising. In this paper, we propose a novel gradient-based dynamic point cloud denoising method, exploiting the temporal correspondence for the estimation of gradient fields -- also a fundamental problem in dynamic point cloud processing and analysis. The gradient field is the gradient of the log-probability function of the noisy point cloud, based on which we perform gradient ascent so as to converge each point to the underlying clean surface. We estimate the gradient of each surface patch by exploiting the temporal correspondence, where the temporally corresponding patches are searched leveraging rigid motion in classical mechanics. In particular, we treat each patch as a rigid object, which moves in the gradient field of an adjacent frame via force until reaching a balanced state, i.e., when the sum of gradients over the patch reaches 0. Since the gradient would be smaller when the point is closer to the underlying surface, the balanced patch would fit the underlying surface well, thus leading to the temporal correspondence. Finally, the position of each point in the patch is updated along the direction of the gradient averaged from corresponding patches in adjacent frames. Experimental results demonstrate that the proposed model outperforms state-of-the-art methods.
Keyword: mapping
Graph Neural Network-Based Scheduling for Multi-UAV-Enabled Communications in D2D Networks
Abstract
In this paper, we jointly design the power control and position dispatch for multi-unmanned aerial vehicle (UAV)-enabled communication in device-to-device (D2D) networks. Our objective is to maximize the total transmission rate of downlink users (DUs) while satisfying the quality of service (QoS) of all D2D users. We comprehensively consider the interference among D2D communications and downlink transmissions. The original problem is strongly non-convex, so traditional optimization methods incur high computational complexity and, to make matters worse, their results are not necessarily globally optimal. We propose a novel graph neural network (GNN)-based approach that maps the considered system into a specific graph structure and achieves the optimal solution with low complexity. In particular, we first construct a GNN-based model for the proposed network, in which the transmission links and interference links are formulated as vertices and edges, respectively. Then, taking the channel state information and the coordinates of ground users as inputs, and the locations of UAVs and the transmission power of all transmitters as outputs, we obtain the mapping from inputs to outputs by training the parameters of the GNN. Simulation results verify that the way to maximize the total transmission rate of DUs can be extracted effectively through training on samples. Moreover, they show that the proposed GNN-based method performs better than traditional means.
Collision-free Path Planning in the Latent Space through cGANs
Abstract
We show a new method for collision-free path planning using cGANs, which maps the latent space to only the collision-free areas of the robot joint space. Our method simply provides this collision-free latent space, after which any planner, using any optimization conditions, can be used to generate the most suitable paths on the fly. We successfully verified this method with a simulated two-link robot arm.
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval
Authors: Licheng Yu, Jun Chen, Animesh Sinha, Mengjiao MJ Wang, Hugo Chen, Tamara L. Berg, Ning Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM); Social and Information Networks (cs.SI)
Abstract
We introduce CommerceMM, a multimodal model capable of providing a diverse and granular understanding of commerce topics associated with a given piece of content (image, text, image+text), and of generalizing to a wide range of tasks, including Multimodal Categorization, Image-Text Retrieval, Query-to-Product Retrieval, Image-to-Product Retrieval, etc. We follow the pre-training + fine-tuning training regime and present 5 effective pre-training tasks on image-text pairs. To embrace more common and diverse commerce data with text-to-multimodal, image-to-multimodal, and multimodal-to-multimodal mapping, we propose another 9 novel cross-modal and cross-pair retrieval tasks, called Omni-Retrieval pre-training. The pre-training is conducted in an efficient manner with only two forward/backward updates for the combined 14 tasks. Extensive experiments and analysis show the effectiveness of each task. When combining all pre-training tasks, our model achieves state-of-the-art performance on 7 commerce-related downstream tasks after fine-tuning. Additionally, we propose a novel approach of modality randomization to dynamically adjust our model under different efficiency constraints.
Keyword: localization
Road Segmentation based Localization using Open Street Maps for Rural Roads
Abstract
Accurate pose estimation is a fundamental ability that all mobile robots must possess in order to traverse a given environment robustly. As with humans, this ability depends on the robot's understanding of a given scene. For Autonomous Vehicles (AVs), detailed 3D maps created beforehand are widely used to augment perception and to estimate pose from current sensor measurements. This approach, however, is less suited to rural communities that are sparsely connected and cover large areas. To deal with the challenge of localizing a vehicle in a rural setting, this paper presents a dataset of rural road scenes, along with an approach for fast segmentation of roads using LIDAR point clouds. The segmented point cloud, in concert with road network information from Open Street Maps (OSM), is used for pose estimation. We propose two measurement models, which are compared with state-of-the-art methods for localization on OSM for both tracking and global localization. The results show that the proposed algorithm is able to estimate pose within a 2 sq. km area with a mean accuracy of 6.5 meters.
OpenStreetMap-based LiDAR Global Localization in Urban Environment without a Prior LiDAR Map
Authors: Younghun Cho, Giseop Kim, Sangmin Lee, Jee-Hwan Ryu
Abstract
Using publicly accessible maps, we propose a novel vehicle localization method that can be applied without prior light detection and ranging (LiDAR) maps. Our method generates OSM descriptors by calculating the distances to buildings from a location in OpenStreetMap at regular angular intervals, and LiDAR descriptors by calculating the shortest distances to building points from the current location at regular angular intervals. Comparing the OSM descriptors and LiDAR descriptors yields a highly accurate vehicle localization result. Compared to methods that use prior LiDAR maps, our method presents two main advantages: (1) vehicle localization is not limited to places with previously acquired LiDAR maps, and (2) our method is comparable to LiDAR map-based methods and, in particular, outperforms the other methods with respect to the top-one candidate on KITTI dataset sequence 00.
Keyword: SLAM
There is no result
Keyword: Visual inertial
There is no result
Keyword: livox
There is no result
Keyword: loam
There is no result
Keyword: Visual inertial odometry
There is no result
Keyword: lidar
Road Segmentation based Localization using Open Street Maps for Rural Roads
OpenStreetMap-based LiDAR Global Localization in Urban Environment without a Prior LiDAR Map
Keyword: loop detection
There is no result
Keyword: autonomous driving
Sim-to-Real Domain Adaptation for Lane Detection and Classification in Autonomous Driving
Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks
Keyword: mapping
Graph Neural Network-Based Scheduling for Multi-UAV-Enabled Communications in D2D Networks
Collision-free Path Planning in the Latent Space through cGANs
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval
Keyword: localization
Road Segmentation based Localization using Open Street Maps for Rural Roads
OpenStreetMap-based LiDAR Global Localization in Urban Environment without a Prior LiDAR Map