Abstract
Differential privacy is the gold standard for statistical data release. Used by governments, companies, and academics, its mathematically rigorous guarantees and worst-case assumptions on the strength and knowledge of attackers make it a robust and compelling framework for reasoning about privacy. However, even with landmark successes, differential privacy has not achieved widespread adoption in everyday data use and data protection. In this work we examine some of the practical obstacles that stand in the way.
Keyword: privacy
OFL-W3: A One-shot Federated Learning System on Web 3.0
Authors: Linshan Jiang, Moming Duan, Bingsheng He, Yulin Sun, Peishen Yan, Yang Hua, Tao Song
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
Federated Learning (FL) addresses the challenges posed by data silos, which arise from privacy, security regulations, and ownership concerns. Despite these barriers, FL enables isolated data repositories to participate in collaborative learning without compromising privacy or security. Concurrently, the advancement of blockchain technology and decentralized applications (DApps) within Web 3.0 heralds a new era of transformative possibilities in web development. As such, incorporating FL into Web 3.0 paves the way for overcoming the limitations of data silos through collaborative learning. However, given the transaction speed constraints of core blockchains such as Ethereum (ETH) and the latency of smart contracts, one-shot FL, which reduces the client-server interactions of traditional FL to a single exchange, is considered more apt for Web 3.0 environments. This paper presents a practical one-shot FL system for Web 3.0, termed OFL-W3. OFL-W3 capitalizes on blockchain technology by using smart contracts to manage transactions, and it uses the InterPlanetary File System (IPFS), coupled with Flask-based communication, so that the backend server can run existing one-shot FL algorithms. With an integrated incentive mechanism, OFL-W3 showcases an effective implementation of one-shot FL on Web 3.0, offering valuable insights and future directions for studies combining AI and Web 3.0.
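As a concrete illustration of the one-shot idea described above, the sketch below aggregates client models after a single upload, using a toy nearest-centroid classifier in plain NumPy. It deliberately omits the smart-contract and IPFS plumbing that OFL-W3 actually uses; all function names and data are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(X, y, n_classes):
    """One-shot local step: each client summarises its data as per-class means."""
    means = np.zeros((n_classes, X.shape[1]))
    counts = np.zeros(n_classes)
    for c in range(n_classes):
        mask = y == c
        counts[c] = mask.sum()
        if counts[c] > 0:
            means[c] = X[mask].mean(axis=0)
    return means, counts

def aggregate(updates):
    """Single server-side merge: count-weighted average of the client centroids."""
    total = sum(c for _, c in updates)
    merged = sum(m * c[:, None] for m, c in updates)
    return np.divide(merged, total[:, None],
                     out=np.zeros_like(merged), where=total[:, None] > 0)

# Two toy clients with non-identical data, one upload each (the "one shot").
clients = []
for shift in (0.0, 2.0):
    X = rng.normal(shift, 1.0, size=(100, 5))
    y = (X[:, 0] > shift).astype(int)
    clients.append(local_train(X, y, n_classes=2))

global_centroids = aggregate(clients)
x_test = rng.normal(1.0, 1.0, size=5)
pred = np.argmin(np.linalg.norm(global_centroids - x_test, axis=1))
print("predicted class:", pred)
```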
FedMADE: Robust Federated Learning for Intrusion Detection in IoT Networks Using a Dynamic Aggregation Method
Abstract
The rapid proliferation of Internet of Things (IoT) devices across multiple sectors has escalated serious network security concerns. This has prompted ongoing research in Machine Learning (ML)-based Intrusion Detection Systems (IDSs) for cyber-attack classification. Traditional ML models require data transmission from IoT devices to a centralized server for traffic analysis, raising severe privacy concerns. To address this issue, researchers have studied Federated Learning (FL)-based IDSs that train models across IoT devices while keeping their data localized. However, the heterogeneity of data, stemming from distinct vulnerabilities of devices and the complexity of attack vectors, poses a significant challenge to the effectiveness of FL models. While current research focuses on adapting various ML models within the FL framework, it fails to effectively address the issue of attack class imbalance among devices, which significantly degrades the classification accuracy of minority attacks. To overcome this challenge, we introduce FedMADE, a novel dynamic aggregation method, which clusters devices by their traffic patterns and aggregates local models based on their contributions towards overall performance. We evaluate FedMADE against other FL algorithms designed for non-IID data and observe up to 71.07% improvement in minority attack classification accuracy. We further show that FedMADE is robust to poisoning attacks and incurs only a 4.7% (5.03 seconds) latency overhead in each communication round compared to FedAvg, without increasing the computational load of IoT devices.
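The sketch below illustrates the general pattern of contribution-weighted aggregation in NumPy. FedMADE derives its weights from clustering devices by traffic patterns; here a held-out validation score stands in as the contribution signal, so the weighting rule and all values are illustrative assumptions rather than the paper's actual algorithm.

```python
import numpy as np

def dynamic_aggregate(local_weights, contributions):
    """Contribution-weighted average of client model weights (flattened vectors)."""
    contributions = np.asarray(contributions, dtype=float)
    contributions = np.clip(contributions, 0.0, None)
    if contributions.sum() == 0:
        contributions = np.ones_like(contributions)   # fall back to plain FedAvg
    alphas = contributions / contributions.sum()
    return sum(a * w for a, w in zip(alphas, local_weights))

# Toy round: three clients, the third holds mostly minority-attack traffic and
# scores best on a server-side validation slice, so it gets the largest weight.
local_weights = [np.array([0.9, 0.1]), np.array([1.1, -0.1]), np.array([0.2, 1.5])]
val_scores = [0.60, 0.55, 0.85]     # e.g. macro-F1 on held-out attack classes
global_w = dynamic_aggregate(local_weights, val_scores)
print(global_w)
```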
Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach
Authors: Tong Wang, K. Sudhir, Dat Hong
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
Advanced large language models (LLMs) like GPT-4 or Llama 3 provide superior performance in complex human-like interactions. However, they are costly, too large for edge devices such as smartphones, and harder to self-host, leading to security and privacy concerns. This paper introduces a novel interpretable knowledge distillation approach to enhance the performance of smaller, more economical LLMs that firms can self-host. We study this problem in the context of building a customer service agent aimed at achieving high customer satisfaction through goal-oriented dialogues. Unlike traditional knowledge distillation, where the "student" model learns directly from the "teacher" model's responses via fine-tuning, our interpretable "strategy" teaching approach involves the teacher providing strategies to improve the student's performance in various scenarios. This method alternates between a "scenario generation" step and a "strategies for improvement" step, creating a customized library of scenarios and optimized strategies for automated prompting. The method requires only black-box access to both student and teacher models; hence it can be used without manipulating model parameters. In our customer service application, the method improves performance, and the learned strategies are transferable to other LLMs and scenarios beyond the training set. The method's interpretability helps safeguard against potential harms through human audit.
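A minimal sketch of the alternating loop described above, with toy stand-ins for the black-box teacher, student, and judge calls. In practice these would be LLM API calls; every function name, prompt, and threshold here is a hypothetical placeholder, not the paper's implementation.

```python
# Toy stand-ins for black-box LLM endpoints (hypothetical; replace with real API calls).
def teacher(prompt: str) -> str:
    if "Propose" in prompt:
        return "angry customer wants a refund\ncustomer asks about a delayed order"
    return "Acknowledge the frustration first, then offer a concrete next step."

def student(prompt: str) -> str:
    return "I am sorry to hear that. Let me check your order."

def judge_score(dialogue: str) -> float:
    return 0.6   # pretend the reply is judged unsatisfactory

def distill_strategies(seed_scenarios, n_iters=3):
    """Alternate scenario generation and strategy refinement; no model weights are touched."""
    library = {}                      # scenario -> strategy text for automated prompting
    scenarios = list(seed_scenarios)
    for _ in range(n_iters):
        # 1) Scenario generation: the teacher proposes cases where the student struggles.
        new = teacher("Propose difficult customer-service scenarios given: " + "; ".join(scenarios))
        scenarios.extend(s.strip() for s in new.split("\n") if s.strip())
        # 2) Strategies for improvement: the teacher critiques the student's reply per scenario.
        for sc in scenarios:
            reply = student(f"Scenario: {sc}\nStrategy hints: {library.get(sc, 'none')}\nRespond:")
            if judge_score(reply) < 0.8:   # threshold is an illustrative choice
                library[sc] = teacher(f"Scenario: {sc}\nStudent reply: {reply}\n"
                                      "Give a concise strategy to improve it.")
    return library

print(distill_strategies(["customer reports a broken item"]))
```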
At Least Factor-of-Two Optimization for RWLE-Based Homomorphic Encryption
Authors: Jonathan Ly
Subjects: Cryptography and Security (cs.CR)
Abstract
Many modern applications that deal with sensitive data, such as healthcare and government services, outsource computation to cloud platforms. In such untrusted environments, privacy is of vital importance. One solution to this problem is homomorphic encryption (HE), a family of cryptographic schemes that support certain algebraic operations on encrypted data without the need for decryption. However, despite major advancements, encryption in modern HE schemes still comes with a non-trivial computational overhead that can hamper data-intensive workloads. To resolve this, recent research has shown that leveraging caching techniques, such as Rache, can significantly enhance the performance of HE schemes while maintaining security. Rache unfortunately displays a key limitation in the time complexity of its caching procedure, which scales with the size of the plaintext space. Smuche is another caching scheme that simultaneously improves the scalability of the caching procedure and turns the encryption process into a constant-time operation, utilizing only a single scalar multiplication. Even still, more can be done. In this paper, we present an encryption method we call "Zinc", which entirely forgoes the multi-ciphertext caching process, replacing it with a single scalar addition followed by an injection of randomness that takes constant time with respect to the plaintext space. This injection of randomness is similar to Smuche's and a great improvement over Rache's, allowing Zinc to achieve efficiency without compromising security. We implement the scheme using Microsoft SEAL and compare its performance to vanilla CKKS.
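Zinc is built on CKKS via Microsoft SEAL; the toy sketch below instead uses a Paillier-style scheme (with g = n + 1) purely to illustrate the general caching pattern the abstract describes: precompute an encryption of zero, fold the message in with cheap scalar arithmetic, and inject fresh randomness. The parameters are tiny demo values, and this is an analogy under our own assumptions, not a faithful or secure reimplementation of Zinc.

```python
import math
import secrets

# Toy Paillier setup with small demo primes (real systems use >=2048-bit moduli).
p, q = 104729, 104723          # known small primes, illustration only
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)   # Carmichael's lambda(n)
mu = pow(lam, -1, n)           # valid because we fix g = n + 1

def enc_zero():
    """Cacheable encryption of zero: r^n mod n^2 for a random r."""
    r = secrets.randbelow(n - 2) + 2
    return pow(r, n, n2)

def enc_from_cache(m, ct_zero):
    """Caching idea: fold the message in with one multiply-add on the cached
    ciphertext, then re-randomise; the message-dependent work is tiny."""
    fresh = enc_zero()                         # randomness injection
    return ((1 + m * n) % n2) * ct_zero % n2 * fresh % n2

def dec(ct):
    return (pow(ct, lam, n2) - 1) // n * mu % n

cached = enc_zero()
ct = enc_from_cache(42, cached)
assert dec(ct) == 42
print("decrypted:", dec(ct))
```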
Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator
Authors: Federico Nicolas Peccia, Svetlana Pavlitska, Tobias Fleck, Oliver Bringmann
Abstract
The growing concerns regarding energy consumption and privacy have prompted the development of AI solutions deployable on the edge, circumventing the substantial CO2 emissions associated with cloud servers and mitigating risks related to sharing sensitive data. However, deploying Convolutional Neural Networks (CNNs) on non-off-the-shelf edge devices remains a complex and labor-intensive task. In this paper, we present an end-to-end workflow for deploying CNNs on Field Programmable Gate Arrays (FPGAs) using the Gemmini accelerator, which we modified for efficient implementation on FPGAs. We describe how we leverage open-source software at each optimization step of the deployment process, the customizations we added, and their impact on the final system's performance. We achieved real-time performance by deploying a YOLOv7 model on a Xilinx ZCU102 FPGA with an energy efficiency of 36.5 GOP/s/W. Our FPGA-based solution demonstrates superior power efficiency compared with other embedded hardware devices, and even outperforms other FPGA reference implementations. Finally, we show how this kind of solution can be integrated into a wider system by testing our proposed platform in a traffic monitoring scenario.
Exploring the Impact of Passthrough on VR Exergaming in Public Environments: A Field Study
Authors: Zixuan Guo, Hanxiao Deng, Hongyu Wang, Angel J. Y. Tan, Wenge Xu, Hai-Ning Liang
Abstract
Sedentary behavior is becoming increasingly prevalent in daily work and study environments, and VR exergaming has emerged as a promising countermeasure in these settings. However, private spaces in such environments are hard to come by, and engaging in VR exergaming in public settings presents its own set of challenges (e.g., safety, social acceptance, isolation, and privacy protection). The recent development of Passthrough functionality in VR headsets allows users to maintain awareness of their surroundings, enhancing safety and convenience. Despite its potential benefits, little is known about how Passthrough affects user performance and experience and whether it can address the challenges of playing VR exergames in real-world public environments. To our knowledge, this work is the first to conduct a field study, in an underground passageway on a university campus, to explore the use of Passthrough in a real-world public environment, with a disturbance-free closed room as a baseline. Results indicate that enabling Passthrough in a public environment improves performance without compromising presence. Moreover, Passthrough can increase social acceptance, especially among individuals with higher levels of self-consciousness. These findings highlight Passthrough's potential to encourage VR exergaming adoption in public environments, with promising implications for overall health and well-being.
A First Look at Related Website Sets
Authors: Stephen McQuistin (University of St Andrews), Peter Snyder (Brave Software), Hamed Haddadi (Imperial College London, Brave Software), Gareth Tyson (Hong Kong University of Science & Technology (GZ))
Subjects: Networking and Internet Architecture (cs.NI)
Abstract
We present the first measurement of the user effect and privacy impact of "Related Website Sets," a recent proposal to reduce browser privacy protections between two sites if those sites are related to each other. An assumption (both explicit and implicit) underpinning the Related Website Sets proposal is that users can accurately determine whether two sites are operated by the same entity. In this work, we probe this assumption via measurements and a user study of 30 participants, assessing the ability of Web users to determine whether two sites are related to each other according to the Related Website Sets feature. We find that this is largely not the case. Our findings indicate that 42 (36.8%) of the user determinations in our study are incorrect in privacy-harming ways: users think that sites are not related, but the sites would be treated as related (and thus afforded fewer privacy protections) by the Related Website Sets feature. Additionally, 22 (73.3%) of participants made at least one incorrect evaluation during the study. We also characterise the Related Website Sets list, its composition over time, and its governance.
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher
Authors: Alessio Mora, Lorenzo Valerio, Paolo Bellavista, Andrea Passarella
Abstract
Federated Learning (FL) promises better privacy guarantees for individuals' data when machine learning models are collaboratively trained. When an FL participant exercises its right to be forgotten, i.e., to detach from the FL framework it has participated in and to remove its past contributions to the global model, the FL solution should perform all the necessary steps to make this possible without sacrificing the overall performance of the global model, a capability that current state-of-the-art solutions do not support. In this paper, we propose FedQUIT, a novel algorithm that uses knowledge distillation to scrub the contribution of the forgetting data from an FL global model while preserving its generalization ability. FedQUIT works directly on clients' devices, does not require sharing additional information compared with a regular FL process, and does not assume the availability of publicly available proxy data. Our solution is efficient, effective, and applicable in both centralized and federated settings. Our experimental results show that, on average, FedQUIT requires fewer than 2.5% additional communication rounds to recover generalization performance after unlearning, obtaining a sanitized global model whose predictions are comparable to those of a global model that has never seen the data to be forgotten.
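How FedQUIT constructs its quasi-competent virtual teacher is specific to the paper; the PyTorch sketch below only illustrates the general mechanism of distillation-based forgetting, using an assumed teacher that suppresses the true-class probability on the data to forget. All names, shapes, and the temperature are illustrative.

```python
import torch
import torch.nn.functional as F

def virtual_teacher_probs(global_logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Illustrative 'quasi-competent' teacher: take the global model's predictive
    distribution on the forget data, zero out the true-class probability, and
    renormalise, so the student is pushed away from memorised labels."""
    probs = F.softmax(global_logits, dim=1).clone()
    probs.scatter_(1, targets.unsqueeze(1), 0.0)
    return probs / probs.sum(dim=1, keepdim=True).clamp_min(1e-12)

def unlearning_loss(student_logits, global_logits, targets, T: float = 1.0):
    """KL distillation of the student toward the virtual teacher on the forget data."""
    teacher = virtual_teacher_probs(global_logits, targets)
    log_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_student, teacher, reduction="batchmean")

# Toy batch: 4 samples, 3 classes.
g = torch.randn(4, 3)                          # global-model logits on the forget data
s = torch.randn(4, 3, requires_grad=True)      # student logits (stand-in for a model)
y = torch.tensor([0, 2, 1, 0])
loss = unlearning_loss(s, g, y)
loss.backward()
print(float(loss))
```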
Practical Considerations for Differential Privacy
Authors: Kareem Amin, Alex Kulesza, Sergei Vassilvitskii
Subjects: Cryptography and Security (cs.CR)
Abstract
Differential privacy is the gold standard for statistical data release. Used by governments, companies, and academics, its mathematically rigorous guarantees and worst-case assumptions on the strength and knowledge of attackers make it a robust and compelling framework for reasoning about privacy. However, even with landmark successes, differential privacy has not achieved widespread adoption in everyday data use and data protection. In this work we examine some of the practical obstacles that stand in the way.
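For readers unfamiliar with the mechanics behind these guarantees, the snippet below shows the textbook Laplace mechanism on a counting query (sensitivity 1, noise scale 1/epsilon). It is a generic illustration, not an example taken from the paper.

```python
import numpy as np

def laplace_count(true_count: int, epsilon: float, rng) -> float:
    """epsilon-DP release of a counting query: sensitivity 1, Laplace noise of scale 1/epsilon."""
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

rng = np.random.default_rng(7)
# Two neighbouring datasets differ in one record, so the true counts differ by 1;
# the added noise keeps the two output distributions within an e^epsilon factor.
print(laplace_count(1000, epsilon=0.5, rng=rng))
print(laplace_count(1001, epsilon=0.5, rng=rng))
```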
Keyword: machine learning
Overcoming Imbalanced Safety Data Using Extended Accident Triangle
Authors: Kailai Sun, Tianxiang Lan, Yang Miang Goh, Yueng-Hsiang Huang
Abstract
There is growing interest in using safety analytics and machine learning to support the prevention of workplace incidents, especially in high-risk industries like construction and trucking. Although existing safety analytics studies have made remarkable progress, they suffer from imbalanced datasets, a common problem in safety analytics that results in prediction inaccuracies. This can lead to management problems, e.g., incorrect resource allocation and improper interventions. To overcome the imbalanced data problem, we extend the theory of the accident triangle to argue that the importance of data samples should be based on characteristics such as injury severity, accident frequency, and accident type. Thus, three oversampling methods are proposed based on assigning different weights to samples in the minority class. We find robust improvements across different machine learning algorithms. Given the lack of open-source safety datasets, we are sharing three imbalanced datasets, including a 9-year nationwide construction accident record dataset, along with their corresponding code.
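A minimal sketch of one severity-weighted oversampling variant in the spirit described above; the weighting scheme, severity scores, and data are illustrative assumptions, not the paper's exact methods.

```python
import numpy as np

def severity_weighted_oversample(X, y, severity, target_size, rng):
    """Oversample the minority (incident) class with probabilities proportional to a
    severity score, echoing the accident-triangle idea that rarer, more severe events
    should count for more. Scores and sizes here are illustrative."""
    minority = np.flatnonzero(y == 1)
    w = severity[minority].astype(float)
    w = w / w.sum()
    extra = rng.choice(minority, size=target_size - minority.size, replace=True, p=w)
    idx = np.concatenate([np.arange(len(y)), extra])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)]
severity = np.r_[np.zeros(950), rng.integers(1, 4, size=50)]   # 1 = minor ... 3 = severe
Xb, yb = severity_weighted_oversample(X, y, severity, target_size=950, rng=rng)
print(np.bincount(yb))   # balanced classes after oversampling
```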
The Potential of Combined Learning Strategies to Enhance Energy Efficiency of Spiking Neuromorphic Systems
Authors: Ali Shiri Sichani, Sai Kankatala
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
Ensuring energy-efficient design in neuromorphic computing systems necessitates a tailored architecture combined with algorithmic approaches. This manuscript focuses on enhancing brain-inspired perceptual computing machines through a novel combined learning approach for Convolutional Spiking Neural Networks (CSNNs). CSNNs present a promising alternative to traditional power-intensive and complex machine learning methods like backpropagation, offering energy-efficient spiking neuron processing inspired by the human brain. The proposed combined learning method integrates Pair-based Spike Timing-Dependent Plasticity (PSTDP) and power law-dependent Spike-timing-dependent plasticity (STDP) to adjust synaptic efficacies, enabling the utilization of stochastic elements like memristive devices to enhance energy efficiency and improve perceptual computing accuracy. By reducing learning parameters while maintaining accuracy, these systems consume less energy and have reduced area overhead, making them more suitable for hardware implementation. The research delves into neuromorphic design architectures, focusing on CSNNs to provide a general framework for energy-efficient computing hardware. Various CSNN architectures are evaluated to assess how less trainable parameters can maintain acceptable accuracy in perceptual computing systems, positioning them as viable candidates for neuromorphic architecture. Comparisons with previous work validate the achievements and methodology of the proposed architecture.
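The snippet below shows the standard pair-based STDP window plus a generic power-law weight-dependence factor, as background for the combined rule described above; the constants and exact functional forms are illustrative and may differ from the paper's.

```python
import numpy as np

def pair_stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window: potentiate when the presynaptic spike precedes the
    postsynaptic spike (delta_t = t_post - t_pre > 0), depress otherwise."""
    delta_t = np.asarray(delta_t, dtype=float)
    ltp = a_plus * np.exp(-delta_t / tau_plus)
    ltd = -a_minus * np.exp(delta_t / tau_minus)
    return np.where(delta_t >= 0, ltp, ltd)

def power_law_scale(w, w_max=1.0, mu_exp=0.5):
    """Generic power-law weight dependence that can modulate potentiation
    multiplicatively (the paper's exact rule may differ)."""
    return (1.0 - w / w_max) ** mu_exp

# Weight change for a few pre/post spike-time differences (in ms).
for dt in (-40, -10, 5, 30):
    print(dt, round(float(pair_stdp_dw(dt)), 5))
print("scale at w=0.8:", power_law_scale(0.8))
```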
FedMADE: Robust Federated Learning for Intrusion Detection in IoT Networks Using a Dynamic Aggregation Method
Abstract
The rapid proliferation of Internet of Things (IoT) devices across multiple sectors has escalated serious network security concerns. This has prompted ongoing research in Machine Learning (ML)-based Intrusion Detection Systems (IDSs) for cyber-attack classification. Traditional ML models require data transmission from IoT devices to a centralized server for traffic analysis, raising severe privacy concerns. To address this issue, researchers have studied Federated Learning (FL)-based IDSs that train models across IoT devices while keeping their data localized. However, the heterogeneity of data, stemming from distinct vulnerabilities of devices and the complexity of attack vectors, poses a significant challenge to the effectiveness of FL models. While current research focuses on adapting various ML models within the FL framework, it fails to effectively address the issue of attack class imbalance among devices, which significantly degrades the classification accuracy of minority attacks. To overcome this challenge, we introduce FedMADE, a novel dynamic aggregation method, which clusters devices by their traffic patterns and aggregates local models based on their contributions towards overall performance. We evaluate FedMADE against other FL algorithms designed for non-IID data and observe up to 71.07% improvement in minority attack classification accuracy. We further show that FedMADE is robust to poisoning attacks and incurs only a 4.7% (5.03 seconds) latency overhead in each communication round compared to FedAvg, without increasing the computational load of IoT devices.
A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis
Authors: Stephen Ni-Hahn, Weihan Xu, Jerry Yin, Rico Zhu, Simon Mak, Yue Jiang, Cynthia Rudin
Abstract
Schenkerian Analysis (SchA) is a uniquely expressive method of music analysis, combining elements of melody, harmony, counterpoint, and form to describe the hierarchical structure supporting a work of music. However, despite its powerful analytical utility and potential to improve music understanding and generation, SchA has rarely been utilized by the computer music community. This is in large part due to the paucity of available high-quality data in a computer-readable format. With a larger corpus of Schenkerian data, it may be possible to infuse machine learning models with a deeper understanding of musical structure, thus leading to more "human" results. To encourage further research in Schenkerian analysis and its potential benefits for music informatics and generation, this paper presents three main contributions: 1) a new and growing dataset of SchAs, the largest in human- and computer-readable formats to date (>140 excerpts), 2) a novel software for visualization and collection of SchA data, and 3) a novel, flexible representation of SchA as a heterogeneous-edge graph data structure.
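A minimal sketch of what a heterogeneous-edge graph for Schenkerian data could look like in code; the node fields and edge types here are illustrative guesses, not the paper's schema.

```python
from dataclasses import dataclass, field

@dataclass
class SchaGraph:
    """Toy heterogeneous-edge graph for a Schenkerian analysis: notes are nodes and
    each edge carries a type (e.g. 'prolongation', 'passing')."""
    notes: dict = field(default_factory=dict)   # node id -> {"pitch": ..., "onset": ...}
    edges: list = field(default_factory=list)   # (src, dst, edge_type)

    def add_note(self, nid, pitch, onset):
        self.notes[nid] = {"pitch": pitch, "onset": onset}

    def add_edge(self, src, dst, edge_type):
        self.edges.append((src, dst, edge_type))

    def edges_of_type(self, edge_type):
        return [(s, d) for s, d, t in self.edges if t == edge_type]

g = SchaGraph()
g.add_note("n1", "E5", 0.0); g.add_note("n2", "D5", 1.0); g.add_note("n3", "C5", 2.0)
g.add_edge("n1", "n3", "prolongation")   # n3 prolongs the structural tone begun at n1
g.add_edge("n2", "n1", "passing")        # n2 is a passing tone within that span
print(g.edges_of_type("prolongation"))
```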
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
Authors: Jorge Yero Salazar, Pablo Rivas, Renato Borras-Chavez, Sarah Kienle
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
This paper examines the challenges and advancements in recognizing seals within their natural habitats using conventional photography, underscored by the emergence of machine learning technologies. We used the leopard seal, Hydrurga leptonyx, a key species within Antarctic ecosystems, to review the different methods available. As apex predators, leopard seals play a significant ecological role and are elusive by nature, so studying them is crucial to understanding the health of their ecosystem. Traditional methods of monitoring seal species are often constrained by the labor-intensive and time-consuming processes required for collecting data, compounded by the limited insights these methods provide. The advent of machine learning, particularly through the application of vision transformers, heralds a new era of efficiency and precision in species monitoring. By leveraging state-of-the-art approaches in detection, segmentation, and recognition within digital imaging, this paper presents a synthesis of the current landscape, highlighting both the cutting-edge methodologies and the predominant challenges faced in accurately identifying seals through photographic data.
Scene-wise Adaptive Network for Dynamic Cold-start Scenes Optimization in CTR Prediction
Authors: Wenhao Li, Jie Zhou, Chuan Luo, Chao Tang, Kun Zhang, Shixiong Zhao
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Abstract
In the realm of modern mobile E-commerce, providing users with nearby commercial service recommendations through location-based online services has become increasingly vital. While machine learning approaches have shown promise in multi-scene recommendation, existing methodologies often struggle to address cold-start problems in unprecedented scenes: the increasing diversity of commercial choices, along with the short online lifespan of scenes, makes effective recommendation in online and dynamic scenes complex. In this work, we propose the Scene-wise Adaptive Network (SwAN), a novel approach that emphasizes high-performance cold-start online recommendations for new scenes. Our approach introduces several crucial capabilities, including scene similarity learning, user-specific scene transition cognition, scene-specific information construction for the new scene, and enhancement of the divergent logical information between scenes. We demonstrate SwAN's potential to optimize dynamic multi-scene recommendation problems by effectively handling cold-start recommendations online for any newly arrived scene. More encouragingly, SwAN has been successfully deployed in Meituan's online catering recommendation service, which serves millions of customers per day, where it has achieved a 5.64% CTR index improvement relative to the baselines and a 5.19% increase in daily order volume proportion.
A Quantum-Inspired Analysis of Human Disambiguation Processes
Authors: Daphne Wang
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO); Quantum Physics (quant-ph)
Abstract
Formal languages are essential for computer programming and are constructed to be easily processed by computers. In contrast, natural languages are much more challenging to process and gave rise to the field of Natural Language Processing (NLP). One major obstacle is the ubiquity of ambiguities. Recent advances in NLP have led to the development of large language models, which can resolve ambiguities with high accuracy. At the same time, quantum computers have gained much attention in recent years as they can solve some computational problems faster than classical computers. This new computing paradigm has reached the fields of machine learning and NLP, where hybrid classical-quantum learning algorithms have emerged. However, more research is needed to identify which NLP tasks could benefit from a genuine quantum advantage. In this thesis, we applied formalisms arising from foundational quantum mechanics, such as contextuality and causality, to study ambiguities arising in linguistics. In doing so, we also reproduced psycholinguistic results relating to the human disambiguation process. These results were subsequently used to predict human behaviour and outperformed current NLP methods.
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
Recent advancements in machine learning based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents a real-world validation of these methods, comparing them against model predictive control and a simple rule-based control benchmark. The experiments were conducted on the electrical installation of four reproductions of residential houses, each with its own battery, photovoltaic system, and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rule-based, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and by addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it is found beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.
Achieving Data Efficient Neural Networks with Hybrid Concept-based Models
Abstract
Most datasets used for supervised machine learning consist of a single label per data point. However, in cases where more information than just the class label is available, would it be possible to train models more efficiently? We introduce two novel model architectures, which we call hybrid concept-based models, that train using both class labels and additional information in the dataset referred to as concepts. In order to thoroughly assess their performance, we introduce ConceptShapes, an open and flexible class of datasets with concept labels. We show that the hybrid concept-based models outperform standard computer vision models and previously proposed concept-based models with respect to accuracy, especially in sparse data settings. We also introduce an algorithm for performing adversarial concept attacks, where an image is perturbed in a way that does not change a concept-based model's concept predictions, but changes the class prediction. The existence of such adversarial examples raises questions about the interpretable qualities promised by concept-based models.
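The sketch below implements a simplified version of the adversarial concept attack idea on a toy PyTorch model: perturb the input to flip the class prediction while penalizing any change in the predicted concepts. The architecture, loss weights, and data are all illustrative assumptions, not the paper's models or datasets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in for a concept-based model: shared trunk, a concept head, and a
# class head that reads the trunk features plus the concepts (layout is illustrative).
class ToyConceptModel(nn.Module):
    def __init__(self, d_in=16, n_concepts=4, n_classes=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU())
        self.concept_head = nn.Linear(32, n_concepts)
        self.class_head = nn.Linear(32 + n_concepts, n_classes)

    def forward(self, x):
        h = self.trunk(x)
        c = torch.sigmoid(self.concept_head(h))
        y = self.class_head(torch.cat([h, c], dim=1))
        return c, y

def concept_preserving_attack(model, x, steps=50, lr=0.05, lam=10.0):
    """Maximise change in the class prediction while penalising drift in the
    predicted concepts (a simplified adversarial concept attack)."""
    model.eval()
    with torch.no_grad():
        c0, y0 = model(x)
        target = y0.argmin(dim=1)             # push toward the least likely class
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        c, y = model(x + delta)
        loss = F.cross_entropy(y, target) + lam * F.mse_loss(c, c0)
        opt.zero_grad(); loss.backward(); opt.step()
    return (x + delta).detach()

torch.manual_seed(0)
model = ToyConceptModel()
x = torch.randn(8, 16)
x_adv = concept_preserving_attack(model, x)
c0, y0 = model(x); c1, y1 = model(x_adv)
print("class flipped:", (y0.argmax(1) != y1.argmax(1)).float().mean().item(),
      "concept drift:", F.l1_loss(c1, c0).item())
```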
Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals
Authors: Tobias A. Opsahl
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
Despite recent success in natural language processing (NLP), fact verification still remains a difficult task. Due to misinformation spreading increasingly fast, attention has been directed towards automatically verifying the correctness of claims. In the domain of NLP, this is usually done by training supervised machine learning models to verify claims by utilizing evidence from trustworthy corpora. We present efficient methods for verifying claims on a dataset where the evidence is in the form of structured knowledge graphs. We use the FactKG dataset, which is constructed from the DBpedia knowledge graph extracted from Wikipedia. By simplifying the evidence retrieval process, from fine-tuned language models to simple logical retrievals, we are able to construct models that both require less computational resources and achieve better test-set accuracy.
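A toy illustration of the kind of simplified, rule-like evidence retrieval described above: a one-hop lookup over a small triple store. The triples and relation names are made up for the example; FactKG and DBpedia are far larger and richer.

```python
from collections import defaultdict

# A miniature DBpedia-style triple store (entirely illustrative).
triples = [
    ("Ada_Lovelace", "birthPlace", "London"),
    ("Ada_Lovelace", "field", "Mathematics"),
    ("London", "country", "United_Kingdom"),
]

index = defaultdict(list)
for s, r, o in triples:
    index[s].append((r, o))

def one_hop_evidence(entities, relations=None):
    """Return every triple leaving a claim entity, optionally filtered by relation."""
    hits = []
    for e in entities:
        for r, o in index.get(e, []):
            if relations is None or r in relations:
                hits.append((e, r, o))
    return hits

# Claim: "Ada Lovelace was born in London."  ->  look up the birthPlace edge.
print(one_hop_evidence(["Ada_Lovelace"], relations={"birthPlace"}))
```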
Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification
Abstract
Accurate classification of blood cells is of vital significance in the diagnosis of hematological disorders. However, in real-world scenarios, domain shifts caused by variability in laboratory procedures and settings result in a rapid deterioration of a model's generalization performance. To address this issue, we propose a novel framework for domain-invariant representation learning (DoRL) via the segment anything model (SAM) for blood cell classification. DoRL comprises two main components: a LoRA-based SAM (LoRA-SAM) and a cross-domain autoencoder (CAE). The advantage of DoRL is that it can extract domain-invariant representations from various blood cell datasets in an unsupervised manner. Specifically, we first leverage the large-scale foundation model SAM, fine-tuned with LoRA, to learn general image embeddings and segment blood cells. Additionally, we introduce the CAE to learn domain-invariant representations across datasets from different domains while mitigating image artifacts. To validate the effectiveness of the domain-invariant representations, we employ five widely used machine learning classifiers to construct blood cell classification models. Experimental results on two public blood cell datasets and a private real-world dataset demonstrate that our proposed DoRL achieves new state-of-the-art cross-domain performance, surpassing existing methods by a significant margin. The source code is available at this https URL.
Image Scaling Attack Simulation: A Measure of Stealth and Detectability
Authors: Devon A. Kelly, Sarah A. Flanery, Christiana Chamon
Abstract
Cybersecurity practices require effort to be maintained, and one weakness is a lack of awareness regarding potential attacks, not only in the usage of machine learning models but also in their development process. Previous studies have determined that preprocessing attacks, such as image scaling attacks, are difficult to detect by humans (through visual response) and computers (through entropic algorithms). However, these studies fail to address the real-world performance and detectability of these attacks. The purpose of this work is to analyze the relationship between awareness of image scaling attacks and demographic background and experience. We conduct a survey in which we gather the subjects' demographics, analyze the subjects' experience in cybersecurity, record their responses to a poorly performing convolutional neural network model that has been unknowingly hindered by an image scaling attack on the dataset used, and document their reactions after it is revealed that the images used within the broken model have been attacked. We find in this study that the overall detection rate of the attack is low enough for such attacks to be viable in a workplace or academic setting, and that even after discovery, subjects cannot conclusively distinguish benign images from attacked images.
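For background, the snippet below demonstrates the core trick behind image scaling attacks in the easiest case: with naive nearest-neighbour downscaling, only the sampled pixels matter, so overwriting just those pixels hides a payload that appears after rescaling. Real attacks target smarter interpolation via optimization; this is a simplified stand-in, not the attack used in the study.

```python
import numpy as np

def embed_for_nearest_neighbor(source: np.ndarray, target: np.ndarray, factor: int) -> np.ndarray:
    """Craft an attack image that looks like `source` at full resolution but becomes
    `target` after naive nearest-neighbour downscaling by `factor`."""
    attack = source.copy()
    # Nearest-neighbour downscaling with this stride samples pixels (i*factor, j*factor),
    # so overwriting only those pixels controls the downscaled output while leaving
    # the vast majority of the full-resolution image untouched.
    attack[::factor, ::factor] = target
    return attack

def nn_downscale(img: np.ndarray, factor: int) -> np.ndarray:
    return img[::factor, ::factor]

rng = np.random.default_rng(1)
source = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)          # "benign" image
target = np.zeros((32, 32), dtype=np.uint8); target[8:24, 8:24] = 255   # hidden payload
attack = embed_for_nearest_neighbor(source, target, factor=8)
print("fraction of pixels modified:", round(float(np.mean(attack != source)), 4))  # ~1/64
print("payload recovered:", bool((nn_downscale(attack, 8) == target).all()))
```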
PolyCL: Contrastive Learning for Polymer Representation Learning via Explicit and Implicit Augmentations
Authors: Jiajun Zhou, Yijie Yang, Austin M. Mroz, Kim E. Jelfs
Abstract
Polymers play a crucial role in a wide array of applications due to their diverse and tunable properties. Establishing the relationship between polymer representations and their properties is crucial to the computational design and screening of potential polymers via machine learning. The quality of the representation significantly influences the effectiveness of these computational methods. Here, we present a self-supervised contrastive learning paradigm, PolyCL, for learning high-quality polymer representation without the need for labels. Our model combines explicit and implicit augmentation strategies for improved learning performance. The results demonstrate that our model achieves either better, or highly competitive, performances on transfer learning tasks as a feature extractor without an overcomplicated training strategy or hyperparameter optimisation. Further enhancing the efficacy of our model, we conducted extensive analyses on various augmentation combinations used in contrastive learning. This led to identifying the most effective combination to maximise PolyCL's performance.
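As background, the snippet below shows a standard SimCLR-style NT-Xent contrastive loss over two augmented views of a batch of embeddings; PolyCL's exact objective, encoder, and augmentations may differ, and the tensors here are random placeholders.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent contrastive loss over two augmented views of the same batch."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # 2N x d, unit-norm rows
    sim = z @ z.t() / temperature                         # cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                 # drop self-similarity
    # The positive for sample i is its other view: i + n (mod 2n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Toy batch of 16 polymer embeddings with two augmented views each.
torch.manual_seed(0)
z1, z2 = torch.randn(16, 64), torch.randn(16, 64)
print(float(nt_xent(z1, z2)))
```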
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher
Authors: Alessio Mora, Lorenzo Valerio, Paolo Bellavista, Andrea Passarella
Abstract
Federated Learning (FL) promises better privacy guarantees for individuals' data when machine learning models are collaboratively trained. When an FL participant exercises its right to be forgotten, i.e., to detach from the FL framework it has participated in and to remove its past contributions to the global model, the FL solution should perform all the necessary steps to make this possible without sacrificing the overall performance of the global model, a capability that current state-of-the-art solutions do not support. In this paper, we propose FedQUIT, a novel algorithm that uses knowledge distillation to scrub the contribution of the forgetting data from an FL global model while preserving its generalization ability. FedQUIT works directly on clients' devices, does not require sharing additional information compared with a regular FL process, and does not assume the availability of publicly available proxy data. Our solution is efficient, effective, and applicable in both centralized and federated settings. Our experimental results show that, on average, FedQUIT requires fewer than 2.5% additional communication rounds to recover generalization performance after unlearning, obtaining a sanitized global model whose predictions are comparable to those of a global model that has never seen the data to be forgotten.
Interpretable Graph Neural Networks for Heterogeneous Tabular Data
Abstract
Many machine learning algorithms for tabular data produce black-box models, which prevent users from understanding the rationale behind the model predictions. In their unconstrained form, graph neural networks fall into this category, and they have further limited abilities to handle heterogeneous data. To overcome these limitations, an approach is proposed, called IGNH (Interpretable Graph Neural Network for Heterogeneous tabular data), which handles both categorical and numerical features, while constraining the learning process to generate exact feature attributions together with the predictions. A large-scale empirical investigation is presented, showing that the feature attributions provided by IGNH align with Shapley values that are computed post hoc. Furthermore, the results show that IGNH outperforms two powerful machine learning algorithms for tabular data, Random Forests and TabNet, while reaching a similar level of performance as XGBoost.
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
Authors: Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Model merging is an efficient empowerment technique in the machine learning community that does not require the collection of raw training data and does not require expensive computation. As model merging becomes increasingly prevalent across various fields, it is crucial to understand the available model merging techniques comprehensively. However, there is a significant gap in the literature regarding a systematic and thorough review of these techniques. This survey provides a comprehensive overview of model merging methods and theories, their applications in various domains and settings, and future research directions. Specifically, we first propose a new taxonomic approach that exhaustively discusses existing model merging methods. Secondly, we discuss the application of model merging techniques in large language models, multimodal large language models, and 10+ machine learning subfields, including continual learning, multi-task learning, few-shot learning, etc. Finally, we highlight the remaining challenges of model merging and discuss future research directions. A comprehensive list of papers about model merging is available at this https URL.
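As the simplest concrete instance of what such a survey covers, the sketch below merges two same-architecture models by (weighted) parameter averaging; richer methods mentioned above (task arithmetic, Fisher-weighted merging, and so on) are not shown, and the models here are toy placeholders.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Entry-level merging baseline: (weighted) parameter averaging of models that
    share an architecture and parameter names."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy demonstration with two small "fine-tuned" models of identical shape.
m1 = torch.nn.Linear(4, 2); m2 = torch.nn.Linear(4, 2)
merged = merge_state_dicts([m1.state_dict(), m2.state_dict()])
m3 = torch.nn.Linear(4, 2); m3.load_state_dict(merged)
print(m3.weight)
```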
Keyword: differential privacy
Practical Considerations for Differential Privacy
Keyword: privacy
OFL-W3: A One-shot Federated Learning System on Web 3.0
FedMADE: Robust Federated Learning for Intrusion Detection in IoT Networks Using a Dynamic Aggregation Method
Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach
At Least Factor-of-Two Optimization for RWLE-Based Homomorphic Encryption
Efficient Edge AI: Deploying Convolutional Neural Networks on FPGA with the Gemmini Accelerator
Exploring the Impact of Passthrough on VR Exergaming in Public Environments: A Field Study
A First Look at Related Website Sets
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher
Practical Considerations for Differential Privacy
Keyword: machine learning
Overcoming Imbalanced Safety Data Using Extended Accident Triangle
The Potential of Combined Learning Strategies to Enhance Energy Efficiency of Spiking Neuromorphic Systems
FedMADE: Robust Federated Learning for Intrusion Detection in IoT Networks Using a Dynamic Aggregation Method
A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
Scene-wise Adaptive Network for Dynamic Cold-start Scenes Optimization in CTR Prediction
A Quantum-Inspired Analysis of Human Disambiguation Processes
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Achieving Data Efficient Neural Networks with Hybrid Concept-based Models
Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals
Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification
Image Scaling Attack Simulation: A Measure of Stealth and Detectability
PolyCL: Contrastive Learning for Polymer Representation Learning via Explicit and Implicit Augmentations
FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher
Interpretable Graph Neural Networks for Heterogeneous Tabular Data
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities