New submissions for Mon, 22 Jan 24

Keyword: detection

Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies

Authors: Maryam Ahang, Todd Charter, Oluwaseyi Ogunfowora, Maziyar Khadivi, Mostafa Abbasi, Homayoun Najjaran
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Signal Processing (eess.SP); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2401.10266
Pdf link: https://arxiv.org/pdf/2401.10266
Abstract Condition monitoring plays a significant role in the safety and reliability of modern industrial systems. Artificial intelligence (AI) approaches are gaining attention from academia and industry as a growing subject in industrial applications and as a powerful way of identifying faults. This paper provides an overview of intelligent condition monitoring and fault detection and diagnosis methods for industrial plants with a focus on the open-source benchmark Tennessee Eastman Process (TEP). In this survey, the most popular and state-of-the-art deep learning (DL) and machine learning (ML) algorithms for industrial plant condition monitoring, fault detection, and diagnosis are summarized and the advantages and disadvantages of each algorithm are studied. Challenges like imbalanced data, unlabelled samples and how deep learning models can handle them are also covered. Finally, a comparison of the accuracies and specifications of different algorithms utilizing the Tennessee Eastman Process (TEP) is conducted. This research will be beneficial for both researchers who are new to the field and experts, as it covers the literature on condition monitoring and state-of-the-art methods alongside the challenges and possible solutions to them.
HyperSense: Accelerating Hyper-Dimensional Computing for Intelligent Sensor Data Processing
Authors: Sanggeon Yun, Hanning Chen, Ryozo Masukawa, Hamza Errahmouni Barkam, Andrew Ding, Wenjun Huang, Arghavan Rezvani, Shaahin Angizi, Mohsen Imani
Subjects: Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2401.10267
Pdf link: https://arxiv.org/pdf/2401.10267
Abstract We introduce HyperSense, a co-designed hardware and software system that efficiently controls the data generation rate of Analog-to-Digital Converter (ADC) modules based on object presence predictions in sensor data. Addressing challenges posed by escalating sensor quantities and data rates, HyperSense reduces redundant digital data using energy-efficient low-precision ADCs, diminishing machine learning system costs. Leveraging neurally-inspired HyperDimensional Computing (HDC), HyperSense analyzes real-time raw low-precision sensor data, offering advantages in handling noise, memory-centricity, and real-time learning. Our proposed HyperSense model combines high-performance software for object detection with real-time hardware prediction, introducing the novel concept of Intelligent Sensor Control. Comprehensive software and hardware evaluations demonstrate our solution's superior performance, evidenced by the highest Area Under the Curve (AUC) and sharpest Receiver Operating Characteristic (ROC) curve among lightweight models. Hardware-wise, our FPGA-based domain-specific accelerator tailored for HyperSense achieves a 5.6x speedup compared to YOLOv4 on NVIDIA Jetson Orin while showing up to 92.1% energy saving compared to the conventional system. These results underscore HyperSense's effectiveness and efficiency, positioning it as a promising solution for intelligent sensing and real-time data processing across diverse applications.
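For readers unfamiliar with hyperdimensional computing, the following is a minimal, generic sketch of its core operations (random bipolar hypervectors, bundling by element-wise majority, and a normalized dot-product similarity). It is not the HyperSense encoder, which the abstract does not specify; the dimension `DIM`, the seed, and all function names are assumptions chosen for illustration.

```python
import random


def random_hv(dim, rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bundle(hvs):
    """Bundle hypervectors by element-wise majority vote (ties go to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*hvs)]


def similarity(a, b):
    """Normalized dot product; equals cosine similarity for bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


# Hypothetical setup: three samples of one class are bundled into a prototype.
rng = random.Random(0)
DIM = 2000  # assumed dimension, chosen only for illustration
samples = [random_hv(DIM, rng) for _ in range(3)]
prototype = bundle(samples)
unrelated = random_hv(DIM, rng)

# A bundled prototype stays close to its members and far from unrelated vectors.
member_similarity = similarity(prototype, samples[0])
unrelated_similarity = similarity(prototype, unrelated)
```

In high dimensions, random hypervectors are nearly orthogonal, so the prototype remains clearly more similar to its members than to an unrelated vector. This property is what makes HDC tolerant of noisy, low-precision inputs.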
CLAN: A Contrastive Learning based Novelty Detection Framework for Human Activity Recognition
Authors: Hyunju Kim, Dongman Lee
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2401.10288
Pdf link: https://arxiv.org/pdf/2401.10288
Abstract In ambient assisted living, human activity recognition from time series sensor data mainly focuses on predefined activities, often overlooking new activity patterns. We propose CLAN, a two-tower contrastive learning-based novelty detection framework with diverse types of negative pairs for human activity recognition. It is tailored to challenges with human activity characteristics, including the significance of temporal and frequency features, complex activity dynamics, shared features across activities, and sensor modality variations. The framework aims to construct invariant representations of known activity robust to the challenges. To generate suitable negative pairs, it selects data augmentation methods according to the temporal and frequency characteristics of each dataset. It derives the key representations against meaningless dynamics by contrastive and classification losses-based representation learning and score function-based novelty detection that accommodate dynamic numbers of the different types of augmented samples. The proposed two-tower model extracts the representations in terms of time and frequency, mutually enhancing expressiveness for distinguishing between new and known activities, even when they share common features. Experiments on four real-world human activity datasets show that CLAN surpasses the best performance of existing novelty detection methods, improving by 8.3%, 13.7%, and 53.3% in AUROC, balanced accuracy, and FPR@TPR0.95 metrics respectively.
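The FPR@TPR0.95 metric quoted above can be computed from novelty scores as in the sketch below. It assumes that higher scores mean "more novel" and that label 1 marks novel samples; the function name `fpr_at_tpr` and these conventions are illustrative assumptions, not taken from the CLAN paper.

```python
import math


def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """False-positive rate at the loosest threshold that reaches target_tpr.

    scores: novelty scores, higher = more novel (assumed convention).
    labels: 1 for novel (positive), 0 for known (negative).
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    # Number of positives that must be flagged to reach the target TPR.
    # The small epsilon guards against floating-point round-up in the product.
    needed = max(1, math.ceil(target_tpr * len(positives) - 1e-9))
    threshold = positives[needed - 1]  # flag every sample with score >= threshold
    false_positives = sum(1 for s in negatives if s >= threshold)
    return false_positives / len(negatives)
```

A lower value is better: it is the share of known activities wrongly flagged as novel once the detector is tuned to catch the target fraction of truly novel ones.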
Design and development of opto-neural processors for simulation of neural networks trained in image detection for potential implementation in hybrid robotics
Authors: Sanjana Shetty
Subjects: Emerging Technologies (cs.ET); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Arxiv link: https://arxiv.org/abs/2401.10289
Pdf link: https://arxiv.org/pdf/2401.10289
Abstract Neural networks have been employed for a wide range of processing applications like image processing, motor control, object detection and many others. Living neural networks offer advantages of lower power consumption, faster processing, and biological realism. Optogenetics offers high spatial and temporal control over biological neurons and presents potential in training live neural networks. This work proposes a simulated living neural network trained indirectly by backpropagating STDP based algorithms using precision activation by optogenetics achieving accuracy comparable to traditional neural network training algorithms.
A Hierarchical Framework with Spatio-Temporal Consistency Learning for Emergence Detection in Complex Adaptive Systems
Authors: Siyuan Chen, Xin Du, Jiahai Wang
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2401.10300
Pdf link: https://arxiv.org/pdf/2401.10300
Abstract Emergence, a global property of complex adaptive systems (CASs) constituted by interactive agents, is prevalent in real-world dynamic systems, e.g., network-level traffic congestion. Detecting its formation and evaporation helps to monitor the state of a system, making it possible to issue a warning signal for harmful emergent phenomena. Since there is no centralized controller in a CAS, detecting emergence based on each agent's local observation is desirable but challenging. Existing works are unable to capture emergence-related spatial patterns, and fail to model the nonlinear relationships among agents. This paper proposes a hierarchical framework with spatio-temporal consistency learning to solve these two problems by learning the system representation and agent representations, respectively. Specifically, spatio-temporal encoders are tailored to capture agents' nonlinear relationships and the system's complex evolution. Representations of the agents and the system are learned by preserving the intrinsic spatio-temporal consistency in a self-supervised manner. Our method achieves more accurate detection than traditional methods and deep learning methods on three datasets with well-known yet hard-to-detect emergent behaviors. Notably, our hierarchical framework is generic and can employ other deep learning methods for agent-level and system-level detection.
MELODY: Robust Semi-Supervised Hybrid Model for Entity-Level Online Anomaly Detection with Multivariate Time Series
Authors: Jingchao Ni, Gauthier Guinet, Peihong Jiang, Laurent Callot, Andrey Kan
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2401.10338
Pdf link: https://arxiv.org/pdf/2401.10338
Abstract In large IT systems, software deployment is a crucial process in online services as their code is regularly updated. However, a faulty code change may degrade the target service's performance and cause cascading outages in downstream services. Thus, software deployments should be comprehensively monitored, and their anomalies should be detected timely. In this paper, we study the problem of anomaly detection for deployments. We begin by identifying the challenges unique to this anomaly detection problem, which is at entity-level (e.g., deployments), relative to the more typical problem of anomaly detection in multivariate time series (MTS). The unique challenges include the heterogeneity of deployments, the low latency tolerance, the ambiguous anomaly definition, and the limited supervision. To address them, we propose a novel framework, semi-supervised hybrid Model for Entity-Level Online Detection of anomalY (MELODY). MELODY first transforms the MTS of different entities to the same feature space by an online feature extractor, then uses a newly proposed semi-supervised deep one-class model for detecting anomalous entities. We evaluated MELODY on real data of cloud services with 1.2M+ time series. The relative F1 score improvement of MELODY over the state-of-the-art methods ranges from 7.6% to 56.5%. The user evaluation suggests MELODY is suitable for monitoring deployments in large online systems.
Inconsistent dialogue responses and how to recover from them
Authors: Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2401.10353
Pdf link: https://arxiv.org/pdf/2401.10353
Abstract One critical issue for chat systems is staying consistent about their own preferences, opinions, beliefs, and facts, which has been shown to be a difficult problem. In this work, we study methods to assess and bolster utterance consistency of chat systems. A dataset is first developed for studying the inconsistencies, where inconsistent dialogue responses, explanations of the inconsistencies, and recovery utterances are authored by annotators. This covers the life span of inconsistencies, namely introduction, understanding, and resolution. Building on this, we introduce a set of tasks centered on dialogue consistency, specifically focused on its detection and resolution. Our experimental findings indicate that our dataset significantly helps the progress in identifying and resolving conversational inconsistencies, and that current popular large language models like ChatGPT, while good at resolving inconsistencies, still struggle with detection.
Keeping Deep Learning Models in Check: A History-Based Approach to Mitigate Overfitting
Authors: Hao Li, Gopi Krishnan Rajbahadur, Dayi Lin, Cor-Paul Bezemer, Zhen Ming (Jack) Jiang
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2401.10359
Pdf link: https://arxiv.org/pdf/2401.10359
Abstract In software engineering, deep learning models are increasingly deployed for critical tasks such as bug detection and code review. However, overfitting remains a challenge that affects the quality, reliability, and trustworthiness of software systems that utilize deep learning models. Overfitting can be (1) prevented (e.g., using dropout or early stopping) or (2) detected in a trained model (e.g., using correlation-based approaches). Both overfitting detection and prevention approaches that are currently used have constraints (e.g., requiring modification of the model structure, and high computing resources). In this paper, we propose a simple, yet powerful approach that can both detect and prevent overfitting based on the training history (i.e., validation losses). Our approach first trains a time series classifier on training histories of overfit models. This classifier is then used to detect if a trained model is overfit. In addition, our trained classifier can be used to prevent overfitting by identifying the optimal point to stop a model's training. We evaluate our approach on its ability to identify and prevent overfitting in real-world samples. We compare our approach against correlation-based detection approaches and the most commonly used prevention approach (i.e., early stopping). Our approach achieves an F1 score of 0.91 which is at least 5% higher than the current best-performing non-intrusive overfitting detection approach. Furthermore, our approach can stop training to avoid overfitting at least 32% of the times earlier than early stopping and has the same or a better rate of returning the best model.
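For context, the early-stopping baseline mentioned above is commonly implemented as a patience rule over the validation-loss history. The sketch below shows that conventional rule only; it is not the paper's time series classifier, and the names `early_stopping_epoch`, `patience`, and `min_delta` are illustrative assumptions.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Apply patience-based early stopping to a validation-loss history.

    Returns (stop_epoch, best_epoch), both 0-indexed. Training stops once the
    loss has failed to improve by more than min_delta for `patience`
    consecutive epochs. If the rule never fires, stop_epoch is the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

The gap between `stop_epoch` and `best_epoch` is the wasted training that a history-based detector aims to shorten.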
Agricultural Object Detection with You Look Only Once (YOLO) Algorithm: A Bibliometric and Systematic Literature Review
Authors: Chetan M Badgujar, Alwin Poulose, Hao Gan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2401.10379
Pdf link: https://arxiv.org/pdf/2401.10379
Abstract Vision is a major component in several digital technologies and tools used in agriculture. The object detector, You Look Only Once (YOLO), has gained popularity in agriculture in a relatively short span due to its state-of-the-art performance. YOLO offers real-time detection with good accuracy and is implemented in various agricultural tasks, including monitoring, surveillance, sensing, automation, and robotics. The research and application of YOLO in agriculture are accelerating rapidly but are fragmented and multidisciplinary. Moreover, the performance characteristics (i.e., accuracy, speed, computation) of the object detector influence the rate of technology implementation and adoption in agriculture. Thus, the study aims to collect extensive literature to document and critically evaluate the advances and application of YOLO for agricultural object recognition. First, we conducted a bibliometric review of 257 articles to understand the scholarly landscape of YOLO in agricultural domain. Secondly, we conducted a systematic review of 30 articles to identify current knowledge, gaps, and modifications in YOLO for specific agricultural tasks. The study critically assesses and summarizes the information on YOLO's end-to-end learning approach, including data acquisition, processing, network modification, integration, and deployment. We also discussed task-specific YOLO algorithm modification and integration to meet the agricultural object or environment-specific challenges. In general, YOLO-integrated digital tools and technologies show the potential for real-time, automated monitoring, surveillance, and object handling to reduce labor, production cost, and environmental impact while maximizing resource efficiency. The study provides detailed documentation and significantly advances the existing knowledge on applying YOLO in agriculture, which can greatly benefit the scientific community.
Bypassing a Reactive Jammer via NOMA-Based Transmissions in Critical Missions
Authors: Mohammadreza Amini, Ghazal Asemian, Michel Kulhandjian, Burak Kantarci, Claude D'Amours, Melike Erol-Kantarci
Subjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)
Arxiv link: https://arxiv.org/abs/2401.10387
Pdf link: https://arxiv.org/pdf/2401.10387
Abstract Wireless networks can be vulnerable to radio jamming attacks. The quality of service under a jamming attack is not guaranteed and the service requirements such as reliability, latency, and effective rate, specifically in mission-critical military applications, can be deeply affected by the jammer's actions. This paper analyzes the effect of a reactive jammer. Particularly, reliability, average transmission delay, and the effective sum rate (ESR) for a NOMA-based scheme with finite blocklength transmissions are mathematically derived taking the detection probability of the jammer into account. Furthermore, the effect of UEs' allocated power and blocklength on the network metrics is explored. Contrary to the existing literature, results show that gNB can mitigate the impact of reactive jamming by decreasing transmit power, making the transmissions covert at the jammer side. Finally, an optimization problem is formulated to maximize the ESR under reliability, delay, and transmit power constraints. It is shown that by adjusting the allocated transmit power to UEs by gNB, the gNB can bypass the jammer effect to fulfill the 0.99999 reliability and the latency of 5ms without the need for packet re-transmission.
Analyzing and Mitigating Bias for Vulnerable Classes: Towards Balanced Representation in Dataset
Authors: Dewant Katare, David Solans Noguero, Souneil Park, Nicolas Kourtellis, Marijn Janssen, Aaron Yi Ding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10397
Pdf link: https://arxiv.org/pdf/2401.10397
Abstract The accuracy and fairness of perception systems in autonomous driving are crucial, particularly for vulnerable road users. Mainstream research has looked into improving the performance metrics for classification accuracy. However, the hidden traits of bias inheritance in the AI models, class imbalances and disparities in the datasets are often overlooked. In this context, our study examines the class imbalances for vulnerable road users by focusing on class distribution analysis, performance evaluation, and bias impact assessment. We identify the concern of imbalances in class representation, leading to potential biases in detection accuracy. Utilizing popular CNN models and Vision Transformers (ViTs) with the nuScenes dataset, our performance evaluation reveals detection disparities for underrepresented classes. We propose a methodology for model optimization and bias mitigation, which includes data augmentation, resampling, and metric-specific learning. Using the proposed mitigation approaches, we see improvement in IoU(%) and NDS(%) metrics from 71.3 to 75.6 and 80.6 to 83.7 respectively, for the CNN model. Similarly, for ViT, we observe improvement in IoU and NDS metrics from 74.9 to 79.2 and 83.8 to 87.1 respectively. This research contributes to developing more reliable models and datasets, enhancing inclusiveness for minority classes.
Focaler-IoU: More Focused Intersection over Union Loss
Authors: Hao Zhang, Shuaijie Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10525
Pdf link: https://arxiv.org/pdf/2401.10525
Abstract Bounding box regression plays a crucial role in the field of object detection, and the positioning accuracy of object detection largely depends on the loss function of bounding box regression. Existing research improves regression performance by utilizing the geometric relationship between bounding boxes, while ignoring the impact of difficult and easy sample distribution on bounding box regression. In this article, we analyzed the impact of difficult and easy sample distribution on regression results, and then proposed Focaler-IoU, which can improve detector performance in different detection tasks by focusing on different regression samples. Finally, comparative experiments were conducted using existing advanced detectors and regression methods for different detection tasks, and the detection performance was further improved by using the method proposed in this paper. Code is available at https://github.com/malagoutou/Focaler-IoU.
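As background for the loss discussed above, the sketch below computes the standard Intersection over Union (IoU) of two axis-aligned boxes and the plain IoU loss `1 - IoU`. It is the baseline quantity only, not the Focaler-IoU reweighting, whose exact form the abstract does not give. The `(x1, y1, x2, y2)` corner format and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Plain IoU regression loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred_box, target_box)
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7.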
PhoGAD: Graph-based Anomaly Behavior Detection with Persistent Homology Optimization
Authors: Ziqi Yuan, Haoyi Zhou, Tianyu Chen, Jianxin Li
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2401.10547
Pdf link: https://arxiv.org/pdf/2401.10547
Abstract A multitude of toxic online behaviors, ranging from network attacks to anonymous traffic and spam, have severely disrupted the smooth operation of networks. Due to the inherent sender-receiver nature of network behaviors, graph-based frameworks are commonly used for detecting anomalous behaviors. However, in real-world scenarios, the boundary between normal and anomalous behaviors tends to be ambiguous. The local heterophily of graphs interferes with the detection, and existing methods based on nodes or edges introduce unwanted noise into representation results, thereby impacting the effectiveness of detection. To address these issues, we propose PhoGAD, a graph-based anomaly detection framework. PhoGAD leverages persistent homology optimization to clarify behavioral boundaries. Building upon this, the weights of adjacent edges are designed to mitigate the effects of local heterophily. Subsequently, to tackle the noise problem, we conduct a formal analysis and propose a disentangled representation-based explicit embedding method, ultimately achieving anomaly behavior detection. Experiments on intrusion, traffic, and spam datasets verify that PhoGAD has surpassed the performance of state-of-the-art (SOTA) frameworks in detection efficacy. Notably, PhoGAD demonstrates robust detection even with diminished anomaly proportions, highlighting its applicability to real-world scenarios. The analysis of persistent homology demonstrates its effectiveness in capturing the topological structure formed by normal edge features. Additionally, ablation experiments validate the effectiveness of the innovative mechanisms integrated within PhoGAD.
A Critical Reflection on the Use of Toxicity Detection Algorithms in Proactive Content Moderation Systems
Authors: Mark Warner, Angelika Strohmayer, Matthew Higgs, Lynne Coventry
Subjects: Human-Computer Interaction (cs.HC)
Arxiv link: https://arxiv.org/abs/2401.10629
Pdf link: https://arxiv.org/pdf/2401.10629
Abstract Toxicity detection algorithms, originally designed with reactive content moderation in mind, are increasingly being deployed into proactive end-user interventions to moderate content. Through a socio-technical lens and focusing on contexts in which they are applied, we explore the use of these algorithms in proactive moderation systems. Placing a toxicity detection algorithm in an imagined virtual mobile keyboard, we critically explore how such algorithms could be used to proactively reduce the sending of toxic content. We present findings from design workshops conducted with four distinct stakeholder groups and find concerns around how contextual complexities may exacerbate inequalities around content moderation processes. Whilst only specific user groups are likely to directly benefit from these interventions, we highlight the potential for other groups to misuse them to circumvent detection, validate and gamify hate, and manipulate algorithmic models to exacerbate harm.
An Effective Index for Truss-based Community Search on Large Directed Graphs
Authors: Wei Ai, CanHao Xie, Tao Meng, Yinghao Wu, KeQin Li
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2401.10641
Pdf link: https://arxiv.org/pdf/2401.10641
Abstract Community search is a derivative of community detection that enables online and personalized discovery of communities and has found extensive applications in massive real-world networks. Although substantial research has been carried out on undirected graphs, community search on directed graphs has received less attention. The recently proposed D-truss model has achieved good results in the quality of retrieved communities. However, existing D-truss-based work cannot perform efficient community searches on large graphs because it consumes too many computing resources to retrieve the maximal D-truss. To overcome this issue, we introduce an innovative merge relation known as D-truss-connected to capture the inherent density and cohesiveness of edges within D-truss. This relation allows us to partition all the edges in the original graph into a series of D-truss-connected classes. Then, we construct a concise and compact index, ConDTruss, based on D-truss-connected. Using ConDTruss, the efficiency of maximal D-truss retrieval is greatly improved, making it a theoretically optimal approach. Experimental evaluations conducted on large directed graphs confirm the effectiveness of our proposed method.
BadODD: Bangladeshi Autonomous Driving Object Detection Dataset
Authors: Mirza Nihal Baig, Rony Hajong, Mahdi Murshed Patwary, Mohammad Shahidur Rahman, Husne Ara Chowdhury
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10659
Pdf link: https://arxiv.org/pdf/2401.10659
Abstract We propose a comprehensive dataset for object detection in diverse driving environments across 9 districts in Bangladesh. The dataset, collected exclusively from smartphone cameras, provides a realistic representation of real-world scenarios, including day and night conditions. Most existing datasets lack suitable classes for autonomous navigation on Bangladeshi roads, making it challenging for researchers to develop models that can handle the intricacies of road scenarios. To address this issue, we propose a new set of classes based on characteristics rather than local vehicle names. The dataset aims to encourage the development of models that can handle the unique challenges of Bangladeshi road scenarios for the effective deployment of autonomous vehicles. The dataset contains no online images, so that it reflects the real-world conditions faced by autonomous vehicles. The classification of vehicles is challenging because of the diverse range of vehicles on Bangladeshi roads, including those not found elsewhere in the world. The proposed classification system is scalable and can accommodate future vehicles, making it a valuable resource for researchers in the autonomous vehicle sector.
PTPsec: Securing the Precision Time Protocol Against Time Delay Attacks Using Cyclic Path Asymmetry Analysis
Authors: Andreas Finkenzeller, Oliver Butowski, Emanuel Regnath, Mohammad Hamad, Sebastian Steinhorst
Subjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)
Arxiv link: https://arxiv.org/abs/2401.10664
Pdf link: https://arxiv.org/pdf/2401.10664
Abstract High-precision time synchronization is a vital prerequisite for many modern applications and technologies, including Smart Grids, Time-Sensitive Networking (TSN), and 5G networks. Although the Precision Time Protocol (PTP) can accomplish this requirement in trusted environments, it becomes unreliable in the presence of specific cyber attacks. Mainly, time delay attacks pose the highest threat to the protocol, enabling attackers to diverge targeted clocks undetected. With the increasing danger of cyber attacks, especially against critical infrastructure, there is a great demand for effective countermeasures to secure both time synchronization and the applications that depend on it. However, current solutions are not sufficiently capable of mitigating sophisticated delay attacks. For example, they lack proper integration into the PTP protocol, scalability, or sound evaluation with the required microsecond-level accuracy. This work proposes an approach to detect and counteract delay attacks against PTP based on cyclic path asymmetry measurements over redundant paths. For that, we provide a method to find redundant paths in arbitrary networks and show how this redundancy can be exploited to reveal and mitigate undesirable asymmetries on the synchronization path that cause the malicious clock divergence. Furthermore, we propose PTPsec, a secure PTP protocol and its implementation based on the latest IEEE 1588-2019 standard. With PTPsec, we advance the conventional PTP to support reliable delay attack detection and mitigation. We validate our approach on a hardware testbed, which includes an attacker capable of performing static and incremental delay attacks at a microsecond precision. Our experimental results show that all attack scenarios can be reliably detected and mitigated with minimal detection time.
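To see why path asymmetry matters, recall the standard PTP delay request-response estimate: with timestamps t1 (master sends Sync), t2 (slave receives it), t3 (slave sends Delay_Req), and t4 (master receives it), the slave assumes a symmetric path. The sketch below simulates one exchange and shows that an attacker-added one-way delay shifts the estimated offset by half the asymmetry. The simulation helper, its parameter names, and the nanosecond values are illustrative assumptions, not part of PTPsec.

```python
def ptp_estimate(t1, t2, t3, t4):
    """Standard PTP offset and mean path delay, assuming a symmetric path."""
    offset = ((t2 - t1) - (t4 - t3)) / 2
    mean_delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, mean_delay


def simulate_exchange(true_offset, delay_master_to_slave, delay_slave_to_master,
                      t1=0, turnaround=1000):
    """Hypothetical one-exchange simulation; all times in nanoseconds.

    t1 and t4 are read on the master clock, t2 and t3 on the slave clock,
    where slave_time = master_time + true_offset.
    """
    t2 = t1 + delay_master_to_slave + true_offset
    t3 = t2 + turnaround
    t4 = (t3 - true_offset) + delay_slave_to_master
    return t1, t2, t3, t4


# Symmetric path: the estimate recovers the true offset of 5000 ns.
honest_offset, honest_delay = ptp_estimate(*simulate_exchange(5000, 100_000, 100_000))

# Delay attack: 40 000 ns added to the master-to-slave direction only.
attacked_offset, attacked_delay = ptp_estimate(*simulate_exchange(5000, 140_000, 100_000))
offset_error = attacked_offset - honest_offset  # half of the injected asymmetry
```

Because a single exchange cannot distinguish a real clock offset from a one-way delay, the error is invisible to the slave, which is the gap that measurements over redundant paths are meant to close.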
Deep Learning-based Embedded Intrusion Detection System for Automotive CAN
Authors: Shashwat Khandelwal, Eashan Wadhwa, Shreejith Shanker
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2401.10674
Pdf link: https://arxiv.org/pdf/2401.10674
Abstract Rising complexity of in-vehicle electronics is enabling new capabilities like autonomous driving and active safety. However, rising automation also increases risk of security threats which is compounded by lack of in-built security measures in legacy networks like CAN, allowing attackers to observe, tamper and modify information shared over such broadcast networks. Various intrusion detection approaches have been proposed to detect and tackle such threats, with machine learning models proving highly effective. However, deploying machine learning models will require high processing power through high-end processors or GPUs to perform them close to line rate. In this paper, we propose a hybrid FPGA-based ECU approach that can transparently integrate IDS functionality through a dedicated off-the-shelf hardware accelerator that implements a deep-CNN intrusion detection model. Our results show that the proposed approach provides an average accuracy of over 99% across multiple attack datasets with 0.64% false detection rates while consuming 94% less energy and achieving 51.8% reduction in per-message processing latency when compared to IDS implementations on GPUs.
A Lightweight Multi-Attack CAN Intrusion Detection System on Hybrid FPGAs
Authors: Shashwat Khandelwal, Shreejith Shanker
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2401.10689
Pdf link: https://arxiv.org/pdf/2401.10689
Abstract Rising connectivity in vehicles is enabling new capabilities like connected autonomous driving and advanced driver assistance systems (ADAS) for improving the safety and reliability of next-generation vehicles. This increased access to in-vehicle functions compromises critical capabilities that use legacy in-vehicle networks like Controller Area Network (CAN), which has no inherent security or authentication mechanism. Intrusion detection and mitigation approaches, particularly using machine learning models, have shown promising results in detecting multiple attack vectors in CAN through their ability to generalise to new vectors. However, most deployments require dedicated computing units like GPUs to perform line-rate detection, consuming much higher power. In this paper, we present a lightweight multi-attack quantised machine learning model that is deployed using Xilinx's Deep Learning Processing Unit IP on a Zynq Ultrascale+ (XCZU3EG) FPGA, which is trained and validated using the public CAN Intrusion Detection dataset. The quantised model detects denial of service and fuzzing attacks with an accuracy of above 99% and a false positive rate of 0.07%, which are comparable to the state-of-the-art techniques in the literature. The Intrusion Detection System (IDS) execution consumes just 2.0 W with software tasks running on the ECU and achieves a 25% reduction in per-message processing latency over the state-of-the-art implementations. This deployment allows the ECU function to coexist with the IDS with minimal changes to the tasks, making it ideal for real-time IDS in in-vehicle systems.
Explainable and Transferable Adversarial Attack for ML-Based Network Intrusion Detectors
Authors: Hangsheng Zhang, Dongqi Han, Yinlong Liu, Zhiliang Wang, Jiyan Sun, Shangyuan Zhuang, Jiqiang Liu, Jinsong Dong
Subjects: Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2401.10691
Pdf link: https://arxiv.org/pdf/2401.10691
Abstract Despite being widely used in network intrusion detection systems (NIDSs), machine learning (ML) has proven to be highly vulnerable to adversarial attacks. White-box and black-box adversarial attacks of NIDS have been explored in several studies. However, white-box attacks unrealistically assume that the attackers have full knowledge of the target NIDSs. Meanwhile, existing black-box attacks cannot achieve a high attack success rate due to the weak adversarial transferability between models (e.g., neural networks and tree models). Additionally, neither of them explains why adversarial examples exist and why they can transfer across models. To address these challenges, this paper introduces ETA, an Explainable Transfer-based Black-Box Adversarial Attack framework. ETA aims to achieve two primary objectives: 1) create transferable adversarial examples applicable to various ML models and 2) provide insights into the existence of adversarial examples and their transferability within NIDSs. Specifically, we first provide a general transfer-based adversarial attack method applicable across the entire ML space. Following that, we exploit a unique insight based on cooperative game theory and perturbation interpretations to explain adversarial examples and adversarial transferability. On this basis, we propose an Important-Sensitive Feature Selection (ISFS) method to guide the search for adversarial examples, achieving stronger transferability and ensuring traffic-space constraints.
Real-Time Zero-Day Intrusion Detection System for Automotive Controller Area Network on FPGAs
Authors: Shashwat Khandelwal, Shreejith Shanker
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2401.10724
Pdf link: https://arxiv.org/pdf/2401.10724
Abstract Increasing automation in vehicles enabled by increased connectivity to the outside world has exposed vulnerabilities in previously siloed automotive networks like controller area networks (CAN). Attributes of CAN such as broadcast-based communication among electronic control units (ECUs) that lowered deployment costs are now being exploited to carry out active injection attacks like denial of service (DoS), fuzzing, and spoofing attacks. Research literature has proposed multiple supervised machine learning models deployed as Intrusion detection systems (IDSs) to detect such malicious activity; however, these are largely limited to identifying previously known attack vectors. With the ever-increasing complexity of active injection attacks, detecting zero-day (novel) attacks in these networks in real-time (to prevent propagation) becomes a problem of particular interest. This paper presents an unsupervised-learning-based convolutional autoencoder architecture for detecting zero-day attacks, which is trained only on benign (attack-free) CAN messages. We quantise the model using Vitis-AI tools from AMD/Xilinx targeting a resource-constrained Zynq Ultrascale platform as our IDS-ECU system for integration. The proposed model successfully achieves equal or higher classification accuracy (> 99.5%) on unseen DoS, fuzzing, and spoofing attacks from a publicly available attack dataset when compared to the state-of-the-art unsupervised learning-based IDSs. Additionally, by cleverly overlapping IDS operation on a window of CAN messages with the reception, the model is able to meet line-rate detection (0.43 ms per window) of high-speed CAN, which when coupled with the low energy consumption per inference, makes this architecture ideally suited for detecting zero-day attacks on critical CAN networks.
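A minimal sketch of how an autoencoder trained only on benign traffic can flag unseen attacks, assuming the common reconstruction-error approach: windows that reconstruct poorly are treated as anomalous. The abstract does not state the paper's decision rule, so the mean-plus-k-standard-deviations threshold, the value of `k`, and all function names here are illustrative assumptions; the autoencoder itself is not modelled.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between a window of values and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def fit_threshold(benign_errors, k=3.0):
    """Assumed rule: threshold = mean + k * std of errors measured on benign windows."""
    return statistics.mean(benign_errors) + k * statistics.pstdev(benign_errors)


def is_attack(error, threshold):
    """Flag a window whose reconstruction error exceeds the benign threshold."""
    return error > threshold


# Hypothetical errors collected on attack-free validation windows.
benign_errors = [0.010, 0.012, 0.011, 0.009, 0.013]
threshold = fit_threshold(benign_errors)
print(is_attack(0.2, threshold), is_attack(0.012, threshold))
```

Because the threshold is fitted on benign data alone, no attack samples are needed at training time, which is what allows previously unseen attack types to be flagged.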
Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion
Authors: Tianyi Zhao, Maoxun Yuan, Xingxing Wei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10731
Pdf link: https://arxiv.org/pdf/2401.10731
Abstract Object detection in visible (RGB) and infrared (IR) images has been widely applied in recent years. Leveraging the complementary characteristics of RGB and IR images, the object detector provides reliable and robust object localization from day to night. Existing fusion strategies directly inject RGB and IR images into convolutional neural networks, leading to inferior detection performance. Since the RGB and IR features have modality-specific noise, these strategies will worsen the fused features along with the propagation. Inspired by the mechanism of human brain processing multimodal information, this work introduces a new coarse-to-fine perspective to purify and fuse two modality features. Specifically, following this perspective, we design a Redundant Spectrum Removal module to coarsely remove interfering information within each modality and a Dynamic Feature Selection module to finely select the desired features for feature fusion. To verify the effectiveness of the coarse-to-fine fusion strategy, we construct a new object detector called Removal and Selection Detector (RSDet). Extensive experiments on three RGB-IR object detection datasets verify the superior performance of our method.
HiCD: Change Detection in Quality-Varied Images via Hierarchical Correlation Distillation
Authors: Chao Pang, Xingxing Weng, Jiang Wu, Qiang Wang, Gui-Song Xia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10752
Pdf link: https://arxiv.org/pdf/2401.10752
Abstract Advanced change detection techniques primarily target image pairs of equal and high quality. However, variations in imaging conditions and platforms frequently lead to image pairs with distinct qualities: one image being high-quality, while the other being low-quality. These disparities in image quality present significant challenges for understanding image pairs semantically and extracting change features, ultimately resulting in a notable decline in performance. To tackle this challenge, we introduce an innovative training strategy grounded in knowledge distillation. The core idea revolves around leveraging task knowledge acquired from high-quality image pairs to guide the model's learning process when dealing with image pairs that exhibit differences in quality. Additionally, we develop a hierarchical correlation distillation approach (involving self-correlation, cross-correlation, and global correlation). This approach compels the student model to replicate the correlations inherent in the teacher model, rather than focusing solely on individual features. This ensures effective knowledge transfer while maintaining the student model's training flexibility.
Starlit: Privacy-Preserving Federated Learning to Enhance Financial Fraud Detection
Authors: Aydin Abadi, Bradley Doyle, Francesco Gini, Kieron Guinamard, Sasi Kumar Murakonda, Jack Liddell, Paul Mellor, Steven J. Murdoch, Mohammad Naseri, Hector Page, George Theodorakopoulos, Suzanne Weller
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2401.10765
Pdf link: https://arxiv.org/pdf/2401.10765
Abstract Federated Learning (FL) is a data-minimization approach enabling collaborative model training across diverse clients with local data, avoiding direct data exchange. However, state-of-the-art FL solutions to identify fraudulent financial transactions exhibit a subset of the following limitations. They (1) lack a formal security definition and proof, (2) assume prior freezing of suspicious customers' accounts by financial institutions (limiting the solutions' adoption), (3) scale poorly, involving either $O(n^2)$ computationally expensive modular exponentiation (where $n$ is the total number of financial institutions) or highly inefficient fully homomorphic encryption, (4) assume the parties have already completed the identity alignment phase, hence excluding it from the implementation, performance evaluation, and security analysis, and (5) struggle to resist clients' dropouts. This work introduces Starlit, a novel scalable privacy-preserving FL mechanism that overcomes these limitations. It has various applications, such as enhancing financial fraud detection, mitigating terrorism, and enhancing digital health. We implemented Starlit and conducted a thorough performance analysis using synthetic data from a key player in global financial transactions. The evaluation indicates Starlit's scalability, efficiency, and accuracy.
Measuring the Impact of Scene Level Objects on Object Detection: Towards Quantitative Explanations of Detection Decisions
Authors: Lynn Vonder Haar, Timothy Elvira, Luke Newcomb, Omar Ochoa
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2401.10790
Pdf link: https://arxiv.org/pdf/2401.10790
Abstract Although accuracy and other common metrics can provide a useful window into the performance of an object detection model, they lack a deeper view of the model's decision process. Regardless of the quality of the training data and process, the features that an object detection model learns cannot be guaranteed. A model may learn a relationship between certain background context, i.e., scene level objects, and the presence of the labeled classes. Furthermore, standard performance verification and metrics would not identify this phenomenon. This paper presents a new black box explainability method for additional verification of object detection models by finding the impact of scene level objects on the identification of the objects within the image. By comparing the accuracies of a model on test data with and without certain scene level objects, the contributions of these objects to the model's performance become clearer. The experiment presented here will assess the impact of buildings and people in image context on the detection of emergency road vehicles by a fine-tuned YOLOv8 model. A large increase in accuracy in the presence of a scene level object will indicate the model's reliance on that object to make its detections. The results of this research lead to providing a quantitative explanation of the object detection model's decision process, enabling a deeper understanding of the model's performance.
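The comparison described in this abstract can be sketched as a difference of two accuracies: one on test images that contain a given scene level object, one on images that do not. The record layout (`scene_objects`, `correct`), the function names, and the toy data below are assumptions made for illustration; they are not taken from the paper.

```python
def accuracy(records):
    """Fraction of records whose detection was judged correct."""
    if not records:
        raise ValueError("cannot compute accuracy of an empty split")
    return sum(1 for r in records if r["correct"]) / len(records)


def scene_object_impact(records, scene_object):
    """Accuracy with the scene object present minus accuracy with it absent."""
    with_obj = [r for r in records if scene_object in r["scene_objects"]]
    without_obj = [r for r in records if scene_object not in r["scene_objects"]]
    return accuracy(with_obj) - accuracy(without_obj)


# Hypothetical per-image results for a detector under test.
records = [
    {"scene_objects": {"building"}, "correct": True},
    {"scene_objects": {"building", "person"}, "correct": True},
    {"scene_objects": {"person"}, "correct": False},
    {"scene_objects": set(), "correct": True},
]
print(scene_object_impact(records, "building"))
print(scene_object_impact(records, "person"))
```

A large positive value suggests the model leans on that object as context; a value near zero suggests its detections do not depend on it.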
Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation
Authors: Jared Lawson, Madison Veliky, Colette P. Abah, Mary S. Dietrich, Rohan Chitale, Nabil Simaan
Subjects: Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2401.10804
Pdf link: https://arxiv.org/pdf/2401.10804
Abstract Objective: The objective of this work is to introduce and demonstrate the effectiveness of a novel sensing modality for contact detection between an off-the-shelf aspiration catheter and a thrombus. Methods: A custom robotic actuator with a pressure sensor was used to generate an oscillatory vacuum excitation and sense the pressure inside the extracorporeal portion of the catheter. Vacuum pressure profiles and robotic motion data were used to train a support vector machine (SVM) classification model to detect contact between the aspiration catheter tip and a mock thrombus. Validation consisted of benchtop accuracy verification, as well as user study comparison to the current standard of angiographic presentation. Results: Benchtop accuracy of the sensing modality was shown to be 99.67%. The user study demonstrated statistically significant improvement in identifying catheter-thrombus contact compared to the current standard. The odds ratio of successful detection of clot contact was 2.86 (p=0.03) when using the proposed sensory method compared to without it. Conclusion: The results of this work indicate that the proposed sensing modality can offer intraoperative feedback to interventionalists that can improve their ability to detect contact between the distal tip of a catheter and a thrombus. Significance: By offering a relatively low-cost technology that affords off-the-shelf aspiration catheters as clot-detecting sensors, interventionalists can improve the first-pass effect of the mechanical thrombectomy procedure while reducing procedural times and mental burden.
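For readers unfamiliar with the odds ratio reported above, here is a minimal sketch of the textbook definition for a 2x2 table of outcomes. The counts are hypothetical and are not the study's data, and the study's actual statistical model may differ from this simple calculation.

```python
def odds_ratio(success_with, failure_with, success_without, failure_without):
    """Odds of success with the aid divided by odds of success without it."""
    odds_with = success_with / failure_with
    odds_without = success_without / failure_without
    return odds_with / odds_without


# Hypothetical counts: 20 successes / 10 failures with the sensing aid,
# 10 successes / 10 failures without it.
print(odds_ratio(20, 10, 10, 10))
```

An odds ratio above 1 means successful detection was more likely with the aid than without it; a value of 1 would mean no difference.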
Using LLMs to discover emerging coded antisemitic hate-speech emergence in extremist social media
Authors: Dhanush Kikkisetti, Raza Ul Mustafa, Wendy Melillo, Roberto Corizzo, Zois Boukouvalas, Jeff Gill, Nathalie Japkowicz
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2401.10841
Pdf link: https://arxiv.org/pdf/2401.10841
Abstract Online hate speech proliferation has created a difficult problem for social media platforms. A particular challenge relates to the use of coded language by groups interested in both creating a sense of belonging for its users and evading detection. Coded language evolves quickly and its use varies over time. This paper proposes a methodology for detecting emerging coded hate-laden terminology. The methodology is tested in the context of online antisemitic discourse. The approach considers posts scraped from social media platforms, often used by extremist users. The posts are scraped using seed expressions related to previously known discourse of hatred towards Jews. The method begins by identifying the expressions most representative of each post and calculating their frequency in the whole corpus. It filters out grammatically incoherent expressions as well as previously encountered ones so as to focus on emergent well-formed terminology. This is followed by an assessment of semantic similarity to known antisemitic terminology using a fine-tuned large language model, and subsequent filtering out of the expressions that are too distant from known expressions of hatred. Emergent antisemitic expressions containing terms clearly relating to Jewish topics are then removed to return only coded expressions of hatred.
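The filtering stages in this abstract can be sketched as a small pipeline: count candidate expressions, drop previously seen ones, keep those similar enough to known terminology, then drop those containing explicit topic terms. This is only an outline under stated assumptions: the paper uses a fine-tuned large language model for similarity, which is swapped here for a character-trigram Jaccard score; the grammatical-coherence filter is omitted; and the threshold, function names, and neutral placeholder strings are all hypothetical.

```python
from collections import Counter


def trigram_set(text):
    """Character trigrams of the lower-cased, padded text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Stand-in for LLM-based semantic similarity: Jaccard overlap of trigrams."""
    ta, tb = trigram_set(a), trigram_set(b)
    return len(ta & tb) / len(ta | tb)


def emerging_coded_expressions(candidates, seen, known, explicit_terms, min_sim=0.3):
    """Return (expression, frequency) pairs that survive every filter."""
    result = []
    for expr, freq in Counter(candidates).most_common():
        if expr in seen:  # previously encountered, so not emergent
            continue
        if max(similarity(expr, k) for k in known) < min_sim:
            continue  # too distant from known terminology
        if any(term in expr.lower().split() for term in explicit_terms):
            continue  # explicit rather than coded
        result.append((expr, freq))
    return result


# Neutral placeholder data; real inputs would come from the scraped corpus.
candidates = ["blue widget", "blue widget", "old phrase",
              "green gadget topicword", "unrelated thing"]
print(emerging_coded_expressions(
    candidates,
    seen={"old phrase"},
    known=["blue widgets", "green gadget"],
    explicit_terms={"topicword"},
))
```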
Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces
Authors: Ekin Ugurel, Steffen Coenen, Minda Zhou Chen, Cynthia Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2401.10890
Pdf link: https://arxiv.org/pdf/2401.10890
Abstract Rapid identification and response to breaking events, particularly those that pose a threat to human life such as natural disasters or conflicts, is of paramount importance. The prevalence of mobile devices and the ubiquity of network connectivity have generated a massive amount of temporally- and spatially-stamped data. Numerous studies have used mobile data to derive individual human mobility patterns for various applications. Similarly, the increasing number of orbital satellites has made it easier to gather high-resolution images capturing a snapshot of a geographical area in sub-daily temporal frequency. We propose a novel data fusion methodology integrating satellite imagery with privacy-enhanced mobile data to augment the event inference task, whether in real-time or historical. In the absence of boots on the ground, mobile data is able to give an approximation of human mobility, proximity to one another, and the built environment. On the other hand, satellite imagery can provide visual information on physical changes to the built and natural environment. The expected use cases for our methodology include small-scale disaster detection (e.g., tornadoes, wildfires, and floods) in rural regions, search and rescue operation augmentation for lost hikers in remote wilderness areas, and identification of active conflict areas and population displacement in war-torn states. Our implementation is open-source on GitHub: https://github.com/ekinugurel/SatMobFusion.
Keyword: face recognition

There is no result

Keyword: augmentation

CLAN: A Contrastive Learning based Novelty Detection Framework for Human Activity Recognition
Authors: Hyunju Kim, Dongman Lee
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2401.10288
Pdf link: https://arxiv.org/pdf/2401.10288
Abstract In ambient assisted living, human activity recognition from time series sensor data mainly focuses on predefined activities, often overlooking new activity patterns. We propose CLAN, a two-tower contrastive learning-based novelty detection framework with diverse types of negative pairs for human activity recognition. It is tailored to challenges with human activity characteristics, including the significance of temporal and frequency features, complex activity dynamics, shared features across activities, and sensor modality variations. The framework aims to construct invariant representations of known activity robust to the challenges. To generate suitable negative pairs, it selects data augmentation methods according to the temporal and frequency characteristics of each dataset. It derives the key representations against meaningless dynamics by contrastive and classification losses-based representation learning and score function-based novelty detection that accommodate dynamic numbers of the different types of augmented samples. The proposed two-tower model extracts the representations in terms of time and frequency, mutually enhancing expressiveness for distinguishing between new and known activities, even when they share common features. Experiments on four real-world human activity datasets show that CLAN surpasses the best performance of existing novelty detection methods, improving by 8.3%, 13.7%, and 53.3% in AUROC, balanced accuracy, and FPR@TPR0.95 metrics respectively.
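As a reference for the FPR@TPR0.95 metric quoted above, here is a minimal sketch of one common way to compute it: pick the highest score threshold that still detects at least 95% of the positive samples, then measure the fraction of negatives scoring at or above that threshold. Which class counts as positive, the score orientation (higher means more likely positive), and the toy scores are assumptions; the paper's evaluation code may differ in detail.

```python
import math


def fpr_at_tpr(scores_pos, scores_neg, target_tpr=0.95):
    """False positive rate at the strictest threshold reaching the target TPR."""
    ranked = sorted(scores_pos, reverse=True)
    # Number of positives that must score at or above the threshold.
    k = math.ceil(target_tpr * len(ranked))
    threshold = ranked[k - 1]
    false_pos = sum(1 for s in scores_neg if s >= threshold)
    return false_pos / len(scores_neg)


# Hypothetical detector scores for positive and negative samples.
pos = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.65, 0.88, 0.4]
neg = [0.1, 0.2, 0.3, 0.5, 0.45, 0.05, 0.15, 0.25, 0.35, 0.42]
print(fpr_at_tpr(pos, neg))
```

Lower is better for this metric, so an improvement means fewer false alarms at a fixed 95% detection rate.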
Analyzing and Mitigating Bias for Vulnerable Classes: Towards Balanced Representation in Dataset
Authors: Dewant Katare, David Solans Noguero, Souneil Park, Nicolas Kourtellis, Marijn Janssen, Aaron Yi Ding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10397
Pdf link: https://arxiv.org/pdf/2401.10397
Abstract The accuracy and fairness of perception systems in autonomous driving are crucial, particularly for vulnerable road users. Mainstream research has looked into improving the performance metrics for classification accuracy. However, the hidden traits of bias inheritance in the AI models, class imbalances and disparities in the datasets are often overlooked. In this context, our study examines the class imbalances for vulnerable road users by focusing on class distribution analysis, performance evaluation, and bias impact assessment. We identify the concern of imbalances in class representation, leading to potential biases in detection accuracy. Utilizing popular CNN models and Vision Transformers (ViTs) with the nuScenes dataset, our performance evaluation reveals detection disparities for underrepresented classes. We propose a methodology for model optimization and bias mitigation, which includes data augmentation, resampling, and metric-specific learning. Using the proposed mitigation approaches, we see improvement in IoU(%) and NDS(%) metrics from 71.3 to 75.6 and 80.6 to 83.7 respectively, for the CNN model. Similarly, for ViT, we observe improvement in IoU and NDS metrics from 74.9 to 79.2 and 83.8 to 87.1 respectively. This research contributes to developing more reliable models and datasets, enhancing inclusiveness for minority classes.
Learning High-Quality and General-Purpose Phrase Representations
Authors: Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2401.10407
Pdf link: https://arxiv.org/pdf/2401.10407
Abstract Phrase representations play an important role in data science and natural language processing, benefiting various tasks like Entity Alignment, Record Linkage, Fuzzy Joins, and Paraphrase Classification. The current state-of-the-art method involves fine-tuning pre-trained language models for phrasal embeddings using contrastive learning. However, we have identified areas for improvement. First, these pre-trained models tend to be unnecessarily complex and need to be pre-trained on a corpus with context sentences. Second, leveraging the phrase type and morphology gives phrase representations that are both more precise and more flexible. We propose an improved framework to learn phrase representations in a context-free fashion. The framework employs phrase type classification as an auxiliary task and incorporates character-level information more effectively into the phrase representation. Furthermore, we design three granularities of data augmentation to increase the diversity of training samples. Our experiments across a wide range of tasks show that our approach generates superior phrase embeddings compared to previous methods while requiring a smaller model size. The code is available at https://github.com/tigerchen52/PEARL.
Exploring Color Invariance through Image-Level Ensemble Learning
Authors: Yunpeng Gong, Jiaquan Li, Lifei Chen, Min Jiang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10512
Pdf link: https://arxiv.org/pdf/2401.10512
Abstract In the field of computer vision, the persistent presence of color bias, resulting from fluctuations in real-world lighting and camera conditions, presents a substantial challenge to the robustness of models. This issue is particularly pronounced in complex wide-area surveillance scenarios, such as person re-identification and industrial dust segmentation, where models often experience a decline in performance due to overfitting on color information during training, given the presence of environmental variations. Consequently, there is a need to effectively adapt models to cope with the complexities of camera conditions. To address this challenge, this study introduces a learning strategy named Random Color Erasing, which draws inspiration from ensemble learning. This strategy selectively erases partial or complete color information in the training data without disrupting the original image structure, thereby achieving a balanced weighting of color features and other features within the neural network. This approach mitigates the risk of overfitting and enhances the model's ability to handle color variation, thereby improving its overall robustness. The approach we propose serves as an ensemble learning strategy, characterized by robust interpretability. A comprehensive analysis of this methodology is presented in this paper. Across various tasks such as person re-identification and semantic segmentation, our approach consistently improves strong baseline methods. Notably, in comparison to existing methods that prioritize color robustness, our strategy significantly enhances performance in cross-domain scenarios. The code is available at https://github.com/layumi/Person_reID_baseline_pytorch/blob/master/random_erasing.py or https://github.com/finger-monkey/Data-Augmentation.
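One possible reading of "erasing partial or complete color information without disrupting image structure" is to replace the pixels in a random rectangle with their grayscale equivalent, so shapes and edges survive while hue is removed. The sketch below follows that reading on a plain nested-list image; it is an interpretation for illustration, not the authors' implementation (see the linked repositories for that). The probability `p`, the rectangle sampling, and the function names are assumptions.

```python
import random


def to_gray(pixel):
    """Replace an (r, g, b) pixel with its luma, using ITU-R BT.601 weights."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)


def random_color_erasing(image, p=0.5, rng=random):
    """With probability p, grey out a random rectangle of the image."""
    if rng.random() >= p:
        return [row[:] for row in image]  # unchanged copy
    height, width = len(image), len(image[0])
    top, left = rng.randrange(height), rng.randrange(width)
    bottom = rng.randrange(top, height) + 1  # exclusive bound, at least one row
    right = rng.randrange(left, width) + 1   # exclusive bound, at least one column
    return [
        [to_gray(px) if top <= y < bottom and left <= x < right else px
         for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]


# A 3x4 image of pure red pixels; some of them lose their color.
image = [[(255, 0, 0)] * 4 for _ in range(3)]
augmented = random_color_erasing(image, p=1.0, rng=random.Random(0))
print(augmented)
```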
Adversarially Robust Signed Graph Contrastive Learning from Balance Augmentation
Authors: Jialong Zhou, Xing Ai, Yuni Lai, Kai Zhou
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2401.10590
Pdf link: https://arxiv.org/pdf/2401.10590
Abstract Signed graphs consist of edges and signs, which can be separated into structural information and balance-related information, respectively. Existing signed graph neural networks (SGNNs) typically rely on balance-related information to generate embeddings. Nevertheless, the emergence of recent adversarial attacks has had a detrimental impact on the balance-related information. Similar to how structure learning can restore unsigned graphs, balance learning can be applied to signed graphs by improving the balance degree of the poisoned graph. However, this approach encounters the challenge "Irreversibility of Balance-related Information" - while the balance degree improves, the restored edges may not be the ones originally affected by attacks, resulting in poor defense effectiveness. To address this challenge, we propose a robust SGNN framework called Balance Augmented-Signed Graph Contrastive Learning (BA-SGCL), which combines Graph Contrastive Learning principles with balance augmentation techniques. Experimental results demonstrate that BA-SGCL not only enhances robustness against existing adversarial attacks but also achieves superior performance on the link sign prediction task across various datasets.
Data Augmentation for Traffic Classification
Authors: Chao Wang, Alessandro Finamore, Pietro Michiardi, Massimo Gallo, Dario Rossi
Subjects: Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)
Arxiv link: https://arxiv.org/abs/2401.10754
Pdf link: https://arxiv.org/pdf/2401.10754
Abstract Data Augmentation (DA) -- enriching training data by adding synthetic samples -- is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks to improve model performance. Yet, DA has struggled to gain traction in networking contexts, particularly in Traffic Classification (TC) tasks. In this work, we fill this gap by benchmarking 18 augmentation functions applied to 3 TC datasets using packet time series as input representation and considering a variety of training conditions. Our results show that (i) DA can reap benefits previously unexplored with (ii) augmentations acting on time series sequence order and masking being better suited to TC and (iii) simple latent space analysis can provide hints about why augmentations have positive or negative effects.
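To make the two augmentation families highlighted in this abstract concrete, here is a minimal sketch of one masking function and one sequence-order function applied to a packet-size time series. These are generic illustrations, not the 18 functions benchmarked in the paper; the function names, the mask ratio, the fill value, and the sample series are assumptions.

```python
import random


def random_mask(series, ratio=0.2, rng=random, fill=0):
    """Masking: overwrite a random subset of positions with a fill value."""
    n_mask = int(len(series) * ratio)
    masked = set(rng.sample(range(len(series)), n_mask))
    return [fill if i in masked else v for i, v in enumerate(series)]


def local_swap(series, rng=random):
    """Sequence order: swap one randomly chosen pair of adjacent packets."""
    out = list(series)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


# Hypothetical packet sizes (bytes) for the first ten packets of a flow.
flow = [100, 1500, 60, 1500, 40, 1200, 80, 900, 52, 300]
rng = random.Random(1)
print(random_mask(flow, rng=rng))
print(local_swap(flow, rng=rng))
```

Both functions return a new list of the same length, so the augmented sample can be fed to the same classifier input as the original flow.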
Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces
Authors: Ekin Ugurel, Steffen Coenen, Minda Zhou Chen, Cynthia Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2401.10890
Pdf link: https://arxiv.org/pdf/2401.10890
Abstract Rapid identification and response to breaking events, particularly those that pose a threat to human life such as natural disasters or conflicts, is of paramount importance. The prevalence of mobile devices and the ubiquity of network connectivity have generated a massive amount of temporally- and spatially-stamped data. Numerous studies have used mobile data to derive individual human mobility patterns for various applications. Similarly, the increasing number of orbital satellites has made it easier to gather high-resolution images capturing a snapshot of a geographical area in sub-daily temporal frequency. We propose a novel data fusion methodology integrating satellite imagery with privacy-enhanced mobile data to augment the event inference task, whether in real-time or historical. In the absence of boots on the ground, mobile data is able to give an approximation of human mobility, proximity to one another, and the built environment. On the other hand, satellite imagery can provide visual information on physical changes to the built and natural environment. The expected use cases for our methodology include small-scale disaster detection (e.g., tornadoes, wildfires, and floods) in rural regions, search and rescue operation augmentation for lost hikers in remote wilderness areas, and identification of active conflict areas and population displacement in war-torn states. Our implementation is open-source on GitHub: https://github.com/ekinugurel/SatMobFusion.
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Authors: Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2401.10891
Pdf link: https://arxiv.org/pdf/2401.10891
Abstract This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model dealing with any images under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to enforce the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos. It demonstrates impressive generalization ability. Further, through fine-tuning it with metric depth information from NYUv2 and KITTI, new SOTAs are set. Our better depth model also results in a better depth-conditioned ControlNet. Our models are released at https://github.com/LiheYoung/Depth-Anything.
