New submissions for Fri, 15 Apr 22

Keyword: SLAM

There is no result

Keyword: Visual inertial

There is no result

Keyword: livox

There is no result

Keyword: loam

There is no result

Keyword: Visual inertial odometry

There is no result

Keyword: lidar

OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data

Authors: David Schinagl, Georg Krispel, Horst Possegger, Peter M. Roth, Horst Bischof
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.06577
Pdf link: https://arxiv.org/pdf/2204.06577
Abstract While 3D object detection in LiDAR point clouds is well-established in academia and industry, the explainability of these models is a largely unexplored field. In this paper, we propose a method to generate attribution maps for the detected objects in order to better understand the behavior of such models. These maps indicate the importance of each 3D point in predicting the specific objects. Our method works with black-box models: We do not require any prior knowledge of the architecture nor access to the model's internals, like parameters, activations or gradients. Our efficient perturbation-based approach empirically estimates the importance of each point by testing the model with randomly generated subsets of the input point cloud. Our sub-sampling strategy takes into account the special characteristics of LiDAR data, such as the depth-dependent point density. We show a detailed evaluation of the attribution maps and demonstrate that they are interpretable and highly informative. Furthermore, we compare the attribution maps of recent 3D object detection architectures to provide insights into their decision-making processes.
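The perturbation idea in this abstract (score the model on random subsets of the input and attribute the result to the points that were kept) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: `toy_detector_score` is a hypothetical stand-in for a black-box detector, and the subsets are drawn uniformly, whereas the paper describes a sub-sampling strategy that accounts for depth-dependent point density.

```python
import random


def attribution_by_subsampling(points, score_fn, n_trials=2000, keep_prob=0.5, seed=0):
    """Estimate per-point importance for a black-box score function.

    Each trial keeps a random subset of the points and scores it. A point's
    attribution is the mean score over trials where it was kept minus the
    mean score over trials where it was dropped.
    """
    rng = random.Random(seed)
    n = len(points)
    kept_sum, kept_cnt = [0.0] * n, [0] * n
    drop_sum, drop_cnt = [0.0] * n, [0] * n
    for _ in range(n_trials):
        mask = [rng.random() < keep_prob for _ in range(n)]
        score = score_fn([p for p, keep in zip(points, mask) if keep])
        for i, keep in enumerate(mask):
            if keep:
                kept_sum[i] += score
                kept_cnt[i] += 1
            else:
                drop_sum[i] += score
                drop_cnt[i] += 1
    return [
        (kept_sum[i] / kept_cnt[i] if kept_cnt[i] else 0.0)
        - (drop_sum[i] / drop_cnt[i] if drop_cnt[i] else 0.0)
        for i in range(n)
    ]


def toy_detector_score(subset):
    # Hypothetical black box: confidence grows with the number of points
    # that fall inside a box around the origin (the "object").
    inside = sum(1 for (x, y, z) in subset if abs(x) <= 1 and abs(y) <= 1 and abs(z) <= 1)
    return min(1.0, inside / 4.0)


points = [
    (0.1, 0.2, 0.0), (-0.5, 0.3, 0.1), (0.4, -0.4, 0.2), (0.0, 0.9, -0.3),  # on the object
    (5.0, 5.0, 0.0), (-7.0, 2.0, 1.0),                                      # background
]
attr = attribution_by_subsampling(points, toy_detector_score)
```

In this toy setup the four object points receive clearly higher attribution than the two background points, because only they change the score when removed.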
CroCo: Cross-Modal Contrastive learning for localization of Earth Observation data
Authors: Wei-Hsin Tseng, Hoàng-Ân Lê, Alexandre Boulch, Sébastien Lefèvre, Dirk Tiede
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.07052
Pdf link: https://arxiv.org/pdf/2204.07052
Abstract It is of interest to localize a ground-based LiDAR point cloud on remote sensing imagery. In this work, we tackle a subtask of this problem, i.e., mapping a digital elevation model (DEM) rasterized from an aerial LiDAR point cloud onto aerial imagery. We propose a contrastive learning-based method that trains on DEM and high-resolution optical imagery, and we evaluate the framework under different data sampling strategies and hyperparameters. In the best scenario, a Top-1 score of 0.71 and a Top-5 score of 0.81 are obtained. The proposed method is promising for feature learning from RGB and DEM for localization and is potentially applicable to other data sources too. Source code will be released at https://github.com/wtseng530/AVLocalization.
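The abstract reports Top-1 and Top-5 scores without defining them. A common reading, assumed here, is retrieval accuracy: the fraction of queries whose true match ranks among the k most similar candidates. The sketch below uses a hypothetical similarity matrix in which row i holds the similarities of query i to every candidate, and the true match of query i is candidate i.

```python
def top_k_score(similarity, k):
    """Fraction of queries whose true match (same index) ranks within the top k."""
    hits = 0
    for i, row in enumerate(similarity):
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Hypothetical similarities between 4 queries (rows) and 4 candidates (columns).
sim = [
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.4, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.2],
    [0.3, 0.2, 0.1, 0.5],
]
top1 = top_k_score(sim, 1)  # 0.5: queries 0 and 3 rank their match first
top2 = top_k_score(sim, 2)  # 1.0: every match is within the top two
```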
Keyword: loop detection

There is no result

Keyword: autonomous driving

There is no result

Keyword: mapping

Agent-based Constraint Solving for Resource Allocation in Manycore Systems
Authors: Volker Wenzel, Lars Bauer, Wolfgang Schröder-Preikschat, Jörg Henkel
Subjects: Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2204.06603
Pdf link: https://arxiv.org/pdf/2204.06603
Abstract For efficiency reasons, manycore systems are increasingly heterogeneous, which makes the mapping of complex workloads a key problem with a high optimization potential. Constraints express the application requirements like which core type to choose, how many cores to choose, exclusively or non-exclusively, using a certain core, etc. In this work, we propose a decentralized solution for solving application resource constraints by means of an agent-based approach in order to obtain scalability. We translate the constraints into a Distributed Constraint Optimization Problem (DCOP) and propose a local search algorithm RESMGM to solve them. For the first time, we demonstrate the viability and efficiency of the DCOP approach for heterogeneous manycore systems. Our RESMGM algorithm supports a far wider range of constraints than state-of-the-art, leading to superior results, but still has comparable overheads w.r.t. computation and communication.
Wassmap: Wasserstein Isometric Mapping for Image Manifold Learning
Authors: Keaton Hamm, Nick Henscheid, Shujie Kang
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2204.06645
Pdf link: https://arxiv.org/pdf/2204.06645
Abstract In this paper, we propose Wasserstein Isometric Mapping (Wassmap), a parameter-free nonlinear dimensionality reduction technique that provides solutions to some drawbacks in existing global nonlinear dimensionality reduction algorithms in imaging applications. Wassmap represents images via probability measures in Wasserstein space, then uses pairwise quadratic Wasserstein distances between the associated measures to produce a low-dimensional, approximately isometric embedding. We show that the algorithm is able to exactly recover parameters of some image manifolds including those generated by translations or dilations of a fixed generating measure. Additionally, we show that a discrete version of the algorithm retrieves parameters from manifolds generated from discrete measures by providing a theoretical bridge to transfer recovery results from functional data to discrete data. Testing of the proposed algorithms on various image data manifolds shows that Wassmap yields good embeddings compared with other global techniques.
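The translation-recovery claim can be checked by hand in one dimension. The sketch below is a simplified illustration, not the authors' algorithm: it assumes equal-size 1D empirical measures with uniform weights, for which the quadratic Wasserstein distance is obtained by matching sorted samples. It covers only the pairwise-distance step; the subsequent embedding step is omitted.

```python
import math


def w2_1d(xs, ys):
    """Quadratic Wasserstein distance between two equal-size 1D empirical
    measures with uniform weights, computed by matching sorted samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size samples")
    a, b = sorted(xs), sorted(ys)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# A fixed generating measure and several translated copies of it.
base = [0.0, 0.5, 1.5, 2.0, 4.0]
shifts = [0.0, 1.0, 2.5, 4.0]
measures = [[x + t for x in base] for t in shifts]

# Pairwise distance matrix: entry (i, j) equals |shifts[i] - shifts[j]|
# up to floating-point error, so the translation parameter is recoverable.
dist = [[w2_1d(p, q) for q in measures] for p in measures]
```

Because the distance matrix coincides with the Euclidean distances between the shift parameters, any isometric embedding of it reproduces the shifts up to a rigid motion.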
Realistic Video Sequences for Subjective QoE Analysis
Authors: Kerim Hodzic, Mirsad Cosovic, Sasa Mrdovic, Jason J. Quinlan, Darijo Raca
Subjects: Multimedia (cs.MM)
Arxiv link: https://arxiv.org/abs/2204.06829
Pdf link: https://arxiv.org/pdf/2204.06829
Abstract Multimedia streaming over the Internet (live and on demand) is the cornerstone of the modern Internet, carrying more than 60% of all traffic. With such high demand, delivering outstanding user experience is a crucial and challenging task. To evaluate user QoE, many researchers deploy subjective quality assessments where participants watch and rate videos artificially infused with various temporal and spatial impairments. To aid current efforts in bridging the gap in mapping objective video QoE metrics to user experience, we developed DashReStreamer, an open-source framework for re-creating adaptively streamed video in real networks. DashReStreamer utilises a log created by a HAS algorithm run in an uncontrolled environment (i.e., wired or wireless networks), encoding visual changes and stall events in one video file. These videos are applicable for subjective QoE evaluation mimicking realistic network conditions. To supplement DashReStreamer, we re-create 234 realistic video clips, based on video logs collected from real mobile and wireless networks. In addition, our dataset contains both video logs with all decisions made by the HAS algorithm and network bandwidth profiles illustrating throughput distribution. We believe this dataset and framework will aid other researchers in their pursuit of the final frontier in understanding the impact of video QoE dynamics.
Gradient boosting for convex cone predict and optimize problems
Authors: Andrew Butler, Roy H. Kwon
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2204.06895
Pdf link: https://arxiv.org/pdf/2204.06895
Abstract Many problems in engineering and statistics involve both predictive forecasting and decision-based optimization. Traditionally, predictive models are optimized independently from the final decision-based optimization problem. In contrast, a 'smart, predict then optimize' (SPO) framework optimizes prediction models to explicitly minimize the final downstream decision loss. In this paper we present dboost, a gradient boosting algorithm for training prediction model ensembles to minimize decision regret. The dboost framework supports any convex optimization program that can be cast as a convex quadratic cone program, and gradient boosting is performed by implicit differentiation of a custom fixed-point mapping. To our knowledge, the dboost framework is the first general-purpose implementation of gradient boosting for predict-and-optimize problems. Experimental results comparing with state-of-the-art SPO methods show that dboost can further reduce out-of-sample decision regret.
Learning Invariances with Generalised Input-Convex Neural Networks
Authors: Vitali Nesterov, Fabricio Arend Torres, Monika Nagy-Huber, Maxim Samarin, Volker Roth
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2204.07009
Pdf link: https://arxiv.org/pdf/2204.07009
Abstract Considering smooth mappings from input vectors to continuous targets, our goal is to characterise subspaces of the input domain, which are invariant under such mappings. Thus, we want to characterise manifolds implicitly defined by level sets. Specifically, this characterisation should be of a global parametric form, which is especially useful for different informed data exploration tasks, such as building grid-based approximations, sampling points along the level curves, or finding trajectories on the manifold. However, global parameterisations can only exist if the level sets are connected. For this purpose, we introduce a novel and flexible class of neural networks that generalise input-convex networks. These networks represent functions that are guaranteed to have connected level sets forming smooth manifolds on the input space. We further show that global parameterisations of these level sets can be always found efficiently. Lastly, we demonstrate that our novel technique for characterising invariances is a powerful generative data exploration tool in real-world applications, such as computational chemistry.
OMAD: On-device Mental Anomaly Detection for Substance and Non-Substance Users
Authors: Emon Dey, Nirmalya Roy
Subjects: Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2204.07038
Pdf link: https://arxiv.org/pdf/2204.07038
Abstract Stay-at-home orders during the COVID-19 pandemic help flatten the curve but, ironically, instigate mental health problems among people who have Substance Use Disorders. Measuring the electrical activity signals in the brain using off-the-shelf consumer wearable devices such as smart wristwatches and mapping them in real time to underlying mood, behavioral and emotional changes plays a striking role in postulating mental health anomalies. In this work, we propose to implement a wearable On-device Mental Anomaly Detection (OMAD) system to detect anomalous behaviors and activities that lead to mental health problems and to help clinicians design effective intervention strategies. We propose an intrinsic artifact removal model on the Electroencephalogram (EEG) signal to better correlate the fine-grained behavioral changes. We design a model compression technique for the artifact removal and activity recognition (main) modules. We implement a magnitude-based weight pruning technique on both a convolutional neural network and a Multilayer Perceptron to run the inference phase on an Nvidia Jetson Nano, one of the most tightly resource-constrained devices for wearables. We experimented with three different combinations of feature extraction and artifact removal approaches. We evaluate the performance of OMAD in terms of accuracy, F1 score, memory usage and running time for both unpruned and compressed models using EEG data from both control and treatment (alcoholic) groups for different object recognition tasks. Our artifact removal model and main activity detection model achieved approximately 93% and 90% accuracy, respectively, with a significant reduction in model size (70%) and inference time (31%).
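Magnitude-based weight pruning, which the abstract names as its compression technique, can be sketched in a few lines. This is a generic illustration on a flat list of weights, not the authors' implementation; the function name, the example weights and the 50% sparsity level are all assumptions.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    `sparsity` is the fraction of weights to remove (0.0 keeps everything,
    1.0 removes everything); the count is rounded down.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Hypothetical layer weights; half of them are pruned.
w = [0.8, -0.05, 0.3, -0.9, 0.01, 0.2, -0.4, 0.07, 0.6, -0.02]
pruned = magnitude_prune(w, 0.5)
# pruned == [0.8, 0.0, 0.3, -0.9, 0.0, 0.0, -0.4, 0.0, 0.6, 0.0]
```

Zeroed weights only reduce model size and inference time if they are stored and executed sparsely, which is a separate engineering step not shown here.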
Keyword: localization

Illumination-Invariant Active Camera Relocalization for Fine-Grained Change Detection in the Wild
Authors: Nan Li, Wei Feng, Qian Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.06580
Pdf link: https://arxiv.org/pdf/2204.06580
Abstract Active camera relocalization (ACR) is a new problem in computer vision that significantly reduces the false alarms caused by image distortions due to camera pose misalignment in fine-grained change detection (FGCD). Despite the fruitful achievements that ACR can support, it remains a challenging problem because of the unstable results of relative pose estimation, especially for outdoor scenes, where the lighting condition is out of control, i.e., the two observations may have highly varied illuminations. This paper studies an illumination-invariant active camera relocalization method that improves both relative pose estimation and scale estimation. We use plane segments as an intermediate representation to facilitate feature matching, thus further boosting pose estimation robustness and reliability under lighting variances. Moreover, we construct a linear system to obtain the absolute scale in each ACR iteration by minimizing the image warping error, which significantly reduces the time consumption of the ACR process; it is nearly 1.6 times faster than the state-of-the-art ACR strategy. Our work greatly expands the feasibility of real-world fine-grained change monitoring tasks for cultural heritage. Extensive experiments and real-world applications verify the effectiveness and robustness of the proposed pose estimation method used for ACR tasks.
ViTOL: Vision Transformer for Weakly Supervised Object Localization
Authors: Saurav Gupta, Sourav Lakhotia, Abhay Rawat, Rahul Tallamraju
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.06772
Pdf link: https://arxiv.org/pdf/2204.06772
Abstract Weakly supervised object localization (WSOL) aims at predicting object locations in an image using only image-level category labels. Common challenges that image classification models encounter when localizing objects are, (a) they tend to look at the most discriminative features in an image that confines the localization map to a very small region, (b) the localization maps are class agnostic, and the models highlight objects of multiple classes in the same image and, (c) the localization performance is affected by background noise. To alleviate the above challenges we introduce the following simple changes through our proposed method ViTOL. We leverage the vision-based transformer for self-attention and introduce a patch-based attention dropout layer (p-ADL) to increase the coverage of the localization map and a gradient attention rollout mechanism to generate class-dependent attention maps. We conduct extensive quantitative, qualitative and ablation experiments on the ImageNet-1K and CUB datasets. We achieve state-of-the-art MaxBoxAcc-V2 localization scores of 70.47% and 73.17% on the two datasets respectively. Code is available on https://github.com/Saurav-31/ViTOL
On Random Number Generation for Kernel Applications
Authors: Kunal Abhishek, George Dharma Prakash Raj E
Subjects: Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2204.06882
Pdf link: https://arxiv.org/pdf/2204.06882
Abstract An operating system kernel uses a cryptographically secure pseudorandom number generator (CSPRNG) for creating address space localization randomization offsets to protect the memory addresses of processes from exploration, for storing users' passwords securely, and for creating cryptographic keys. The paper proposes a CSPRNG called KCS-PRNG which produces non-reproducible bitstreams. The proposed KCS-PRNG presents an efficient design uniquely configured with two new non-standard and verified elliptic curves and clock-controlled linear feedback shift registers, and a novel method to consistently generate non-reproducible random bits of arbitrary lengths. The generated bitstreams are statistically indistinguishable from true random bitstreams and are provably secure, resilient to important attacks, and exhibit backward and forward secrecy, exponential linear complexity, a large period and a huge key space.
Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection
Authors: Ze Chen, Zhihang Fu, Jianqiang Huang, Mingyuan Tao, Rongxin Jiang, Xiang Tian, Yaowu Chen, Xian-sheng Hua
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.06899
Pdf link: https://arxiv.org/pdf/2204.06899
Abstract Weakly supervised object detection (WSOD), which is an effective way to train an object detection model using only image-level annotations, has attracted considerable attention from researchers. However, most of the existing methods, which are based on multiple instance learning (MIL), tend to localize instances to the discriminative parts of salient objects instead of the entire content of all objects. In this paper, we propose a WSOD framework called the Spatial Likelihood Voting with Self-knowledge Distillation Network (SLV-SD Net). In this framework, we introduce a spatial likelihood voting (SLV) module to converge region proposal localization without bounding box annotations. Specifically, in every iteration during training, all the region proposals in a given image act as voters voting for the likelihood of each category in the spatial dimensions. After dilating the alignment on the area with large likelihood values, the voting results are regularized as bounding boxes, which are then used for the final classification and localization. Based on SLV, we further propose a self-knowledge distillation (SD) module to refine the feature representations of the given image. The likelihood maps generated by the SLV module are used to supervise the feature learning of the backbone network, encouraging the network to attend to wider and more diverse areas of the image. Extensive experiments on the PASCAL VOC 2007/2012 and MS-COCO datasets demonstrate the excellent performance of SLV-SD Net. In addition, SLV-SD Net produces new state-of-the-art results on these benchmarks.
CroCo: Cross-Modal Contrastive learning for localization of Earth Observation data
Authors: Wei-Hsin Tseng, Hoàng-Ân Lê, Alexandre Boulch, Sébastien Lefèvre, Dirk Tiede
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2204.07052
Pdf link: https://arxiv.org/pdf/2204.07052
Abstract It is of interest to localize a ground-based LiDAR point cloud on remote sensing imagery. In this work, we tackle a subtask of this problem, i.e., mapping a digital elevation model (DEM) rasterized from an aerial LiDAR point cloud onto aerial imagery. We propose a contrastive learning-based method that trains on DEM and high-resolution optical imagery, and we evaluate the framework under different data sampling strategies and hyperparameters. In the best scenario, a Top-1 score of 0.71 and a Top-5 score of 0.81 are obtained. The proposed method is promising for feature learning from RGB and DEM for localization and is potentially applicable to other data sources too. Source code will be released at https://github.com/wtseng530/AVLocalization.
Reflective Fiber Faults Detection and Characterization Using Long-Short-Term Memory
Authors: Khouloud Abdelli, Helmut Griesser, Peter Ehrle, Carsten Tropschug, Stephan Pachnicke
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2204.07058
Pdf link: https://arxiv.org/pdf/2204.07058
Abstract To reduce operation-and-maintenance expenses (OPEX) and to ensure optical network survivability, optical network operators need to detect and diagnose faults in a timely manner and with high accuracy. With the rapid advancement of telemetry technology and data analysis techniques, data-driven approaches leveraging telemetry data to tackle the fault diagnosis problem have been gaining popularity due to their quick implementation and deployment. In this paper, we propose a novel multi-task learning model based on long short-term memory (LSTM) to detect, locate, and estimate the reflectance of fiber reflective faults (events) including the connectors and the mechanical splices by extracting insights from monitored data obtained by the optical time domain reflectometry (OTDR) principle commonly used for troubleshooting of fiber optic cables or links. The experimental results prove that the proposed method: (i) achieves a good detection capability and high localization accuracy within short measurement time even for low SNR values; and (ii) outperforms conventionally employed techniques.
Machine Learning-based Anomaly Detection in Optical Fiber Monitoring
Authors: Khouloud Abdelli, Joo Yeon Cho, Florian Azendorf, Helmut Griesser, Carsten Tropschug, Stephan Pachnicke
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2204.07059
Pdf link: https://arxiv.org/pdf/2204.07059
Abstract Secure and reliable data communication in optical networks is critical for high-speed Internet. However, optical fibers, serving as the data transmission medium providing connectivity to billions of users worldwide, are prone to a variety of anomalies resulting from hard failures (e.g., fiber cuts) and malicious physical attacks (e.g., optical eavesdropping (fiber tapping)). Such anomalies may cause network disruption and thereby induce huge financial and data losses, compromise the confidentiality of optical networks by allowing unauthorized access to the carried data, or gradually degrade the network operations. Therefore, it is highly desirable to implement efficient anomaly detection, diagnosis, and localization schemes for enhancing the availability and reliability of optical networks. In this paper, we propose a data-driven approach to accurately and quickly detect, diagnose, and localize fiber anomalies including fiber cuts and optical eavesdropping attacks. The proposed method combines an autoencoder-based anomaly detection and an attention-based bidirectional gated recurrent unit algorithm, whereby the former is used for fault detection and the latter is adopted for fault diagnosis and localization once an anomaly is detected by the autoencoder. We verify the efficiency of our proposed approach by experiments under various anomaly scenarios using real operational data. The experimental results demonstrate that: (i) the autoencoder detects any fiber fault or anomaly with an F1 score of 96.86%; and (ii) the attention-based bidirectional gated recurrent unit algorithm identifies the detected anomalies with an average accuracy of 98.2%, and localizes the faults with an average root mean square error of 0.19 m.
Neighborhood Attention Transformer
Authors: Ali Hassani, Steven Walton, Jiachen Li, Shen Li, Humphrey Shi
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2204.07143
Pdf link: https://arxiv.org/pdf/2204.07143
Abstract We present Neighborhood Attention Transformer (NAT), an efficient, accurate and scalable hierarchical transformer that works well on both image classification and downstream vision tasks. It is built upon Neighborhood Attention (NA), a simple and flexible attention mechanism that localizes the receptive field for each query to its nearest neighboring pixels. NA is a localization of self-attention, and approaches it as the receptive field size increases. It is also equivalent in FLOPs and memory usage to Swin Transformer's shifted window attention given the same receptive field size, while being less constrained. Furthermore, NA includes local inductive biases, which eliminate the need for extra operations such as pixel shifts. Experimental results on NAT are competitive; NAT-Tiny reaches 83.2% top-1 accuracy on ImageNet with only 4.3 GFLOPs and 28M parameters, 51.4% mAP on MS-COCO and 48.4% mIoU on ADE20k. We will open-source our checkpoints, training script, configurations, and our CUDA kernel at: https://github.com/SHI-Labs/Neighborhood-Attention-Transformer .
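The core idea of Neighborhood Attention, restricting each query to its nearest neighbors, can be illustrated with a one-dimensional, single-head sketch in pure Python. This is not the authors' CUDA kernel or their 2D formulation. The border handling (shifting the window inward so that every query still sees `window` positions) is an assumption of this sketch, as are the function names and the toy inputs.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def neighborhood_attention_1d(q, k, v, window):
    """Scaled dot-product attention where query i attends only to the
    `window` positions nearest to i (window clamped at the sequence borders)."""
    n = len(q)
    if window < 1:
        raise ValueError("window must be at least 1")
    window = min(window, n)
    scale = math.sqrt(len(q[0]))
    out = []
    for i in range(n):
        # Centre the window on i, then shift it inward if it crosses a border.
        start = min(max(i - window // 2, 0), n - window)
        idx = range(start, start + window)
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / scale for j in idx]
        weights = softmax(scores)
        out.append([sum(w * v[j][d] for w, j in zip(weights, idx))
                    for d in range(len(v[0]))])
    return out


# Toy sequence of 5 tokens with 2-dimensional queries, keys and values.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5], [-1.0, 0.2]]
k = [[0.3, 0.7], [1.0, -1.0], [0.0, 0.5], [0.9, 0.1], [-0.4, 0.4]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

local = neighborhood_attention_1d(q, k, v, window=3)
```

With `window=1` each token simply returns its own value, and with `window=len(q)` the result matches ordinary self-attention, which mirrors the abstract's remark that NA approaches self-attention as the receptive field grows.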

zhuhu00 / Paper-Daily-Notice

New submissions for Fri, 15 Apr 22 #142