New submissions for Tue, 1 Feb 22

Keyword: SLAM

There is no result

Keyword: Visual inertial

There is no result

Keyword: livox

There is no result

Keyword: loam

There is no result

Keyword: Visual inertial odometry

There is no result

Keyword: lidar

Reconstruction of Power Lines from Point Clouds

Authors: Alexander Gribov, Khalid Duri
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG)
Arxiv link: https://arxiv.org/abs/2201.12499
Pdf link: https://arxiv.org/pdf/2201.12499
Abstract This paper proposes a novel solution for constructing line features modeling each catenary curve present within a series of points representing multiple catenary curves. The solution can be applied to extract power lines from lidar point clouds, which can then be used in downstream applications like creating digital twin geospatial models and evaluating the encroachment of vegetation. This paper offers an example of how the results obtained by the proposed solution could be used to assess vegetation growth near transmission power lines based on freely available lidar data for the City of Utrecht, Netherlands [1].
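For readers unfamiliar with the curve this abstract refers to: a catenary is the shape taken by a cable hanging under its own weight. The sketch below only evaluates the standard catenary equation; it is not the authors' reconstruction method, and the function and parameter names (`catenary_height`, `sag`, `x0`, `y0`, `a`) are hypothetical choices for illustration.

```python
import math


def catenary_height(x, x0, y0, a):
    """Height at horizontal position x of a catenary whose lowest point is
    (x0, y0) and whose shape parameter is a > 0 (larger a = flatter cable)."""
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    return y0 + a * (math.cosh((x - x0) / a) - 1.0)


def sag(span, a):
    """Vertical drop from two level supports, `span` apart, to the lowest point."""
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    return a * (math.cosh(span / (2.0 * a)) - 1.0)
```

For a shallow cable the sag is close to the parabolic approximation span² / (8a), which is one reason a parabola is sometimes used as a stand-in for a catenary.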
Design of Outdoor Autonomous Moble Robot
Authors: I-Hsi Kao, Jian-An Su, Jau-Woei Perng
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2201.12605
Pdf link: https://arxiv.org/pdf/2201.12605
Abstract This study presents the design of a six-wheeled outdoor autonomous mobile robot. The main design goal of our robot is to increase its adaptability and flexibility when moving outdoors. This six-wheeled robot platform was equipped with some sensors, such as a global positioning system (GPS), high definition (HD) webcam, light detection and ranging (LiDAR), and rotary encoders. A personal mobile computer and 86Duino ONE microcontroller were used as the algorithm computing platform. In terms of control, the lateral offset and head angle offset of the robot were calculated using a differential GPS or a camera to detect structured and unstructured road boundaries. The lateral offset and head angle offset were fed to a fuzzy controller. The control input was designed by Q-learning of the differential speed between the left and right wheels. This made the robot track a reference route so that it could stay in its own lane. 2D LiDAR was also used to measure the relative distance from the front obstacle. The robot would immediately stop to avoid a collision when the distance between the robot and obstacle was less than a specific safety distance. A custom-designed rocker arm gave the robot the ability to climb a low step. Body balance could be maintained by controlling the angle of the rocker arm when the robot changed its pose. The autonomous mobile robot has been used for delivery service on our campus road by integrating the above system functionality.
TPC: Transformation-Specific Smoothing for Point Cloud Models
Authors: Wenda Chu, Linyi Li, Bo Li
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2201.12733
Pdf link: https://arxiv.org/pdf/2201.12733
Abstract Point cloud models with neural network architectures have achieved great success and have been widely used in safety-critical applications, such as Lidar-based recognition systems in autonomous vehicles. However, such models are shown vulnerable against adversarial attacks which aim to apply stealthy semantic transformations such as rotation and tapering to mislead model predictions. In this paper, we propose a transformation-specific smoothing framework TPC, which provides tight and scalable robustness guarantees for point cloud models against semantic transformation attacks. We first categorize common 3D transformations into three categories: additive (e.g., shearing), composable (e.g., rotation), and indirectly composable (e.g., tapering), and we present generic robustness certification strategies for all categories respectively. We then specify unique certification protocols for a range of specific semantic transformations and their compositions. Extensive experiments on several common 3D transformations show that TPC significantly outperforms the state of the art. For example, our framework boosts the certified accuracy against twisting transformation along z-axis (within 20$^\circ$) from 20.3$\%$ to 83.8$\%$.
Keyword: loop detection

There is no result

Keyword: autonomous driving

Achieving Efficient Distributed Machine Learning Using a Novel Non-Linear Class of Aggregation Functions
Authors: Haizhou Du, Ryan Yang, Yijian Chen, Qiao Xiang, Andre Wibisono, Wei Huang
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Arxiv link: https://arxiv.org/abs/2201.12488
Pdf link: https://arxiv.org/pdf/2201.12488
Abstract Distributed machine learning (DML) over time-varying networks can be an enabler for emerging decentralized ML applications such as autonomous driving and drone fleeting. However, the commonly used weighted arithmetic mean model aggregation function in existing DML systems can result in high model loss, low model accuracy, and slow convergence speed over time-varying networks. To address this issue, in this paper, we propose a novel non-linear class of model aggregation functions to achieve efficient DML over time-varying networks. Instead of taking a linear aggregation of neighboring models as most existing studies do, our mechanism uses a nonlinear aggregation, a weighted power-p mean (WPM) where p is a positive odd integer, as the aggregation function of local models from neighbors. The subsequent optimizing steps are taken using mirror descent defined by a Bregman divergence that maintains convergence to optimality. In this paper, we analyze properties of the WPM and rigorously prove convergence properties of our aggregation mechanism. Additionally, through extensive experiments, we show that when p > 1, our design significantly improves the convergence speed of the model and the scalability of DML under time-varying networks compared with arithmetic mean aggregation functions, with little additional computation overhead.
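As a rough illustration of the aggregation described in this abstract, the sketch below computes the textbook weighted power mean, (Σ wᵢ xᵢᵖ / Σ wᵢ)^(1/p), for a positive odd integer p. It is an assumption that the paper uses exactly this normalization and applies it parameter by parameter; the names `weighted_power_mean` and `aggregate_models` are hypothetical, and the mirror descent step is not shown.

```python
import math


def weighted_power_mean(values, weights, p):
    """Weighted power-p mean for a positive odd integer p.

    With odd p, x**p keeps the sign of x, so the signed p-th root is well
    defined for negative parameters too. p = 1 is the weighted arithmetic mean.
    """
    if p <= 0 or p % 2 == 0:
        raise ValueError("p must be a positive odd integer")
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    s = sum(w * (v ** p) for v, w in zip(values, weights)) / total
    # Signed root: take the root of |s| and restore the sign of s.
    return math.copysign(abs(s) ** (1.0 / p), s)


def aggregate_models(models, weights, p):
    """Aggregate neighbor models (lists of parameters) element by element."""
    return [weighted_power_mean(column, weights, p) for column in zip(*models)]
```

When all neighbors hold the same parameter value the mean returns that value for any p, and for p > 1 the result is pulled toward the larger-magnitude entries relative to the arithmetic mean.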
ApolloRL: a Reinforcement Learning Platform for Autonomous Driving
Authors: Fei Gao, Peng Geng, Jiaqi Guo, Yuan Liu, Dingfeng Guo, Yabo Su, Jie Zhou, Xiao Wei, Jin Li, Xu Liu
Subjects: Robotics (cs.RO); Machine Learning (cs.LG); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2201.12609
Pdf link: https://arxiv.org/pdf/2201.12609
Abstract We introduce ApolloRL, an open platform for research in reinforcement learning for autonomous driving. The platform provides a complete closed-loop pipeline with training, simulation, and evaluation components. It comes with 300 hours of real-world data in driving scenarios and popular baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) agents. We elaborate in this paper on the architecture and the environment defined in the platform. In addition, we discuss the performance of the baseline agents in the ApolloRL environment.
MVP-Net: Multiple View Pointwise Semantic Segmentation of Large-Scale Point Clouds
Authors: Chuanyu Luo, Xiaohan Li, Nuo Cheng, Han Li, Shengguang Lei, Pu Li
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2201.12769
Pdf link: https://arxiv.org/pdf/2201.12769
Abstract Semantic segmentation of 3D point cloud is an essential task for autonomous driving environment perception. The pipeline of most pointwise point cloud semantic segmentation methods includes points sampling, neighbor searching, feature aggregation, and classification. Neighbor searching method like K-nearest neighbors algorithm, KNN, has been widely applied. However, the complexity of KNN is always a bottleneck of efficiency. In this paper, we propose an end-to-end neural architecture, Multiple View Pointwise Net, MVP-Net, to efficiently and directly infer large-scale outdoor point cloud without KNN or any complex pre/postprocessing. Instead, assumption-based sorting and multi-rotation of point cloud methods are introduced to point feature aggregation and receptive field expanding. Numerical experiments show that the proposed MVP-Net is 11 times faster than the most efficient pointwise semantic segmentation method RandLA-Net and achieves the same accuracy on the large-scale benchmark SemanticKITTI dataset.
A Safety-Critical Decision Making and Control Framework Combining Machine Learning and Rule-based Algorithms
Authors: Andrei Aksjonov, Ville Kyrki
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2201.12819
Pdf link: https://arxiv.org/pdf/2201.12819
Abstract While artificial-intelligence-based methods suffer from lack of transparency, rule-based methods dominate in safety-critical systems. Yet, the latter cannot compete with the first ones in robustness to multiple requirements, for instance, simultaneously addressing safety, comfort, and efficiency. Hence, to benefit from both methods they must be joined in a single system. This paper proposes a decision making and control framework, which profits from advantages of both the rule- and machine-learning-based techniques while compensating for their disadvantages. The proposed method embodies two controllers operating in parallel, called Safety and Learned. A rule-based switching logic selects one of the actions transmitted from both controllers. The Safety controller is prioritized every time, when the Learned one does not meet the safety constraint, and also directly participates in the safe Learned controller training. Decision making and control in autonomous driving is chosen as the system case study, where an autonomous vehicle learns a multi-task policy to safely cross an unprotected intersection. Multiple requirements (i.e., safety, efficiency, and comfort) are set for vehicle operation. A numerical simulation is performed for the proposed framework validation, where its ability to satisfy the requirements and robustness to changing environment is successfully demonstrated.
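The switching rule in this abstract (use the Learned controller's action unless it violates the safety constraint, in which case the Safety controller takes priority) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_action`, `gap_constraint`, and the 10 m threshold are all hypothetical.

```python
def select_action(state, learned_action, safety_action, is_safe):
    """Rule-based switch between two controllers running in parallel.

    Returns the learned action when it satisfies the safety constraint,
    otherwise falls back to the action from the safety controller.
    """
    if is_safe(state, learned_action):
        return learned_action
    return safety_action


def gap_constraint(gap_m, acceleration):
    """Hypothetical constraint: never accelerate when the gap is under 10 m."""
    return gap_m >= 10.0 or acceleration <= 0.0


# Learned controller wants to accelerate, safety controller wants to brake.
far = select_action(25.0, 1.5, -3.0, gap_constraint)   # learned action kept
near = select_action(6.0, 1.5, -3.0, gap_constraint)   # safety action used
```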
Few-Shot Backdoor Attacks on Visual Object Tracking
Authors: Yiming Li, Haoxiang Zhong, Xingjun Ma, Yong Jiang, Shu-Tao Xia
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2201.13178
Pdf link: https://arxiv.org/pdf/2201.13178
Abstract Visual object tracking (VOT) has been widely adopted in mission-critical applications, such as autonomous driving and intelligent surveillance systems. In current practice, third-party resources such as datasets, backbone networks, and training platforms are frequently used to train high-performance VOT models. Whilst these resources bring certain convenience, they also introduce new security threats into VOT models. In this paper, we reveal such a threat where an adversary can easily implant hidden backdoors into VOT models by tampering with the training process. Specifically, we propose a simple yet effective few-shot backdoor attack (FSBA) that optimizes two losses alternately: 1) a feature loss defined in the hidden feature space, and 2) the standard tracking loss. We show that, once the backdoor is embedded into the target model by our FSBA, it can trick the model to lose track of specific objects even when the trigger only appears in one or a few frames. We examine our attack in both digital and physical-world settings and show that it can significantly degrade the performance of state-of-the-art VOT trackers. We also show that our attack is resistant to potential defenses, highlighting the vulnerability of VOT models to potential backdoor attacks.
A Safe Control Architecture Based on a Model Predictive Control Supervisor for Autonomous Driving
Authors: Maryam Nezami, Georg Maennel, Hossam Seddik Abbas, Georg Schildbach
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2201.13298
Pdf link: https://arxiv.org/pdf/2201.13298
Abstract This paper presents a novel, safe control architecture (SCA) for controlling an important class of systems: safety-critical systems. Ensuring the safety of control decisions has always been a challenge in automatic control. The proposed SCA aims to address this challenge by using a Model Predictive Controller (MPC) that acts as a supervisor for the operating controller, in the sense that the MPC constantly checks the safety of the control inputs generated by the operating controller and intervenes if the control input is predicted to lead to a hazardous situation in the foreseeable future invariably. Then an appropriate backup scheme can be activated, e.g., a degraded control mechanism, the transfer of the system to a safe state, or a warning signal issued to a human supervisor. For a proof of concept, the proposed SCA is applied to an autonomous driving scenario, where it is illustrated and compared in different obstacle avoidance scenarios. A major challenge of the SCA lies in the mismatch between the MPC prediction model and the real system, for which possible remedies are explored.
Keyword: mapping

Metric Hypertransformers are Universal Adapted Maps
Authors: Beatrice Acciaio, Anastasis Kratsios, Gudmund Pammer
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Metric Geometry (math.MG); Probability (math.PR); Computational Finance (q-fin.CP)
Arxiv link: https://arxiv.org/abs/2201.13094
Pdf link: https://arxiv.org/pdf/2201.13094
Abstract We introduce a universal class of geometric deep learning models, called metric hypertransformers (MHTs), capable of approximating any adapted map $F:\mathscr{X}^{\mathbb{Z}}\rightarrow \mathscr{Y}^{\mathbb{Z}}$ with approximable complexity, where $\mathscr{X}\subseteq \mathbb{R}^d$ and $\mathscr{Y}$ is any suitable metric space, and $\mathscr{X}^{\mathbb{Z}}$ (resp. $\mathscr{Y}^{\mathbb{Z}}$) capture all discrete-time paths on $\mathscr{X}$ (resp. $\mathscr{Y}$). Suitable spaces $\mathscr{Y}$ include various (adapted) Wasserstein spaces, all Fr\'{e}chet spaces admitting a Schauder basis, and a variety of Riemannian manifolds arising from information geometry. Even in the static case, where $f:\mathscr{X}\rightarrow \mathscr{Y}$ is a H\"{o}lder map, our results provide the first (quantitative) universal approximation theorem compatible with any such $\mathscr{X}$ and $\mathscr{Y}$. Our universal approximation theorems are quantitative, and they depend on the regularity of $F$, the choice of activation function, the metric entropy and diameter of $\mathscr{X}$, and on the regularity of the compact set of paths whereon the approximation is performed. Our guiding examples originate from mathematical finance. Notably, the MHT models introduced here are able to approximate a broad range of stochastic processes' kernels, including solutions to SDEs, many processes with arbitrarily long memory, and functions mapping sequential data to sequences of forward rate curves.
Fuzzy Segmentations of a String
Authors: Armen Kostanyan, Arevik Harmandayan
Subjects: Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2201.13427
Pdf link: https://arxiv.org/pdf/2201.13427
Abstract This article discusses a particular case of the data clustering problem, where it is necessary to find groups of adjacent text segments of the appropriate length that match a fuzzy pattern represented as a sequence of fuzzy properties. To solve this problem, a heuristic algorithm for finding a sufficiently large number of solutions is proposed. The key idea of the proposed algorithm is the use of the prefix structure to track the process of mapping text segments to fuzzy properties. An important special case of the text segmentation problem is the fuzzy string matching problem, when adjacent text segments have unit length and, accordingly, the fuzzy pattern is a sequence of fuzzy properties of text characters. It is proven that the heuristic segmentation algorithm in this case finds all text segments that match the fuzzy pattern. Finally, we consider the problem of a best segmentation of the entire text based on a fuzzy pattern, which is solved using the dynamic programming method. Keywords: fuzzy clustering, fuzzy string matching, approximate string matching
Keyword: localization

Automatic Segmentation of Left Ventricle in Cardiac Magnetic Resonance Images
Authors: Garvit Chhabra, J. H. Gagan, J. R. Harish Kumar
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Arxiv link: https://arxiv.org/abs/2201.12805
Pdf link: https://arxiv.org/pdf/2201.12805
Abstract Segmentation of the left ventricle in cardiac magnetic resonance imaging (MRI) scans enables cardiologists to calculate the volume of the left ventricle and subsequently its ejection fraction. The ejection fraction is a measurement that expresses the percentage of blood leaving the heart with each contraction. Cardiologists often use ejection fraction to determine one's cardiac function. We propose a multiscale template matching technique for detection and an elliptical active disc for automated segmentation of the left ventricle in MR images. The elliptical active disc optimizes the local energy function with respect to the five free parameters that define the disc. Gradient descent is used to minimize the energy function, along with Green's theorem to reduce the computational expense. We report validations on 320 scans containing 5,273 annotated slices which are publicly available through the Multi-Centre, Multi-Vendor, and Multi-Disease Cardiac Segmentation (M&Ms) Challenge. We achieved successful localization of the left ventricle in 89.63% of the cases and a Dice coefficient of 0.873 on diastole slices and 0.770 on systole slices. The proposed technique is based on traditional image processing techniques with a performance on par with the deep learning techniques.
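The two quantities this abstract reports can be made concrete with a short sketch. The Dice coefficient is 2|A∩B| / (|A| + |B|) for a predicted and a reference mask, and the ejection fraction is the stroke volume divided by the end-diastolic volume. The code is only an illustration of these standard definitions, not the authors' pipeline; representing masks as sets of pixel coordinates and the names `dice_coefficient` and `ejection_fraction` are assumptions.

```python
def dice_coefficient(pred_pixels, truth_pixels):
    """Dice overlap between two masks given as collections of pixel coordinates."""
    pred, truth = set(pred_pixels), set(truth_pixels)
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly by convention
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def ejection_fraction(end_diastolic_volume, end_systolic_volume):
    """Percentage of the end-diastolic volume ejected in one contraction."""
    edv, esv = end_diastolic_volume, end_systolic_volume
    if edv <= 0 or not 0 <= esv <= edv:
        raise ValueError("require 0 <= ESV <= EDV and EDV > 0")
    return 100.0 * (edv - esv) / edv
```

For example, two masks of two pixels each that share one pixel have a Dice coefficient of 0.5.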
A Dataset for Medical Instructional Video Classification and Question Answering
Authors: Deepak Gupta, Kush Attal, Dina Demner-Fushman
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2201.12888
Pdf link: https://arxiv.org/pdf/2201.12888
Abstract This paper introduces a new challenge and datasets to foster research toward designing systems that can understand medical videos and provide visual answers to natural language questions. We believe medical videos may provide the best possible answers to many first aids, medical emergency, and medical education questions. Toward this, we created the MedVidCL and MedVidQA datasets and introduce the tasks of Medical Video Classification (MVC) and Medical Visual Answer Localization (MVAL), two tasks that focus on cross-modal (medical language and medical video) understanding. The proposed tasks and datasets have the potential to support the development of sophisticated downstream applications that can benefit the public and medical practitioners. Our datasets consist of 6,117 annotated videos for the MVC task and 3,010 annotated questions and answers timestamps from 899 videos for the MVAL task. These datasets have been verified and corrected by medical informatics experts. We have also benchmarked each task with the created MedVidCL and MedVidQA datasets and proposed the multimodal learning methods that set competitive baselines for future research.
Rigidity Preserving Image Transformations and Equivariance in Perspective
Authors: Lucas Brynte, Georg Bökman, Axel Flinth, Fredrik Kahl
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2201.13065
Pdf link: https://arxiv.org/pdf/2201.13065
Abstract We characterize the class of image plane transformations which realize rigid camera motions and call these transformations `rigidity preserving'. In particular, 2D translations of pinhole images are not rigidity preserving. Hence, when using CNNs for 3D inference tasks, it can be beneficial to modify the inductive bias from equivariance towards translations to equivariance towards rigidity preserving transformations. We investigate how equivariance with respect to rigidity preserving transformations can be approximated in CNNs, and test our ideas on both 6D object pose estimation and visual localization. Experimentally, we improve on several competitive baselines.

