Abstract
This research paper presents the findings of two experimental studies that explore the use of ChatGPT as a tool for theory prototyping. The objective of the studies is to assess ChatGPT's ability to comprehend theoretical concepts and differentiate between constructs. During the experiments, duplicated responses were identified in both Study 1 and Study 2, with duplicate response rates of 26.25% and 40%, respectively. The results indicate that ChatGPT can generate responses aligned with the constructs of the Technology Acceptance Model (TAM). The loading and reliability coefficients demonstrate the validity of the models, with Study 1 achieving an R-squared value of 82% and Study 2 achieving 71%. In Study 2, two negatively worded items exhibited low loadings and were subsequently removed from the model. Both studies exhibit reasonable discriminant validity despite high correlations among the TAM constructs. The experiments reveal potential biases in the generated samples, particularly regarding gender and usage experience. These biases may influence construct responses and should be considered when interpreting ChatGPT's conceptual capabilities. In sum, ChatGPT shows promise as a tool for theory prototyping, generating relevant responses aligned with theoretical constructs. However, further investigation is needed to address limitations such as duplicated responses, sensitivity to prompt variations, and the generalizability of findings to other contexts.
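Since the duplicate-rate figures are central to the findings, a minimal sketch of how such a rate can be computed over generated survey responses; the column names and data below are hypothetical, not the study's instrument:

```python
import pandas as pd

# Hypothetical generated TAM survey responses: one row per simulated respondent.
responses = pd.DataFrame({
    "PU1": [5, 4, 5, 5], "PU2": [4, 4, 4, 4],
    "PEOU1": [5, 3, 5, 5], "PEOU2": [4, 4, 4, 4],
})

# A response counts as a duplicate if an identical item pattern appeared earlier.
n_dup = responses.duplicated().sum()
dup_rate = n_dup / len(responses) * 100
print(f"duplicate response rate: {dup_rate:.2f}%")
```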
Towards Environmentally Equitable AI via Geographical Load Balancing
Authors: Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Abstract
Fueled by the soaring popularity of large language and foundation models, the accelerated growth of artificial intelligence (AI) models' enormous environmental footprint has come under increased scrutiny. While many approaches have been proposed to make AI more energy-efficient and environmentally friendly, environmental inequity -- the fact that AI's environmental footprint can be disproportionately higher in certain regions than in others -- has emerged, raising social-ecological justice concerns. This paper takes a first step toward addressing AI's environmental inequity by balancing its regional negative environmental impact. Concretely, we focus on the carbon and water footprints of AI model inference and propose equity-aware geographical load balancing (GLB) to explicitly address AI's environmental impacts on the most disadvantaged regions. We run trace-based simulations by considering a set of 10 geographically-distributed data centers that serve inference requests for a large language AI model. The results demonstrate that existing GLB approaches may amplify environmental inequity while our proposed equity-aware GLB can significantly reduce the regional disparity in terms of carbon and water footprints.
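One natural way to pose an equity-aware GLB objective is to minimize the worst regional footprint subject to serving all load; the sketch below assumes that min-max formulation with made-up footprint weights, and the paper's actual optimization may differ:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical per-unit-load footprints of 4 data centers on 4 regions.
# W[r, d]: footprint imposed on region r when data center d serves one unit.
W = np.array([[3.0, 0.2, 0.1, 0.0],
              [0.1, 2.5, 0.2, 0.1],
              [0.0, 0.1, 1.8, 0.3],
              [0.2, 0.0, 0.1, 2.2]])
R, D = W.shape

# Variables: x (load share per data center) and t (worst regional footprint).
# minimize t  s.t.  W @ x <= t,  sum(x) = 1,  x >= 0.
c = np.concatenate([np.zeros(D), [1.0]])
A_ub = np.hstack([W, -np.ones((R, 1))])
b_ub = np.zeros(R)
A_eq = np.concatenate([np.ones(D), [0.0]]).reshape(1, -1)
b_eq = [1.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * D + [(None, None)])
print("load split:", res.x[:D], "worst regional footprint:", res.x[-1])
```

A footprint-minimizing (rather than equity-aware) GLB would instead minimize the sum of regional footprints, which can concentrate the burden on one region; the min-max term is what spreads it.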
Modelling human seat contact interaction for vibration comfort
Authors: Raj Desai, Marko Cvetković, Georgios Papaioannou, Riender Happee
Abstract
Seat-to-head vibration transmissibility depends on various characteristics of the seat and the human body. One of these is the contact interaction, which transmits vibrational energy from the seat to the body. To enhance ride comfort, seat designers should be able to accurately simulate seat contact without the need for extensive experiments. Here, the contact area, pressure, friction, and seat and body deformation in compression and shear play a significant role. To address these challenges, the aim of this paper is to define appropriate contact models that improve the prediction capabilities of a seated human body model with respect to experimental data. A computationally efficient multibody (MB) model is evaluated interacting with finite element (FE) and MB backrest models, using several contact models. Outcomes are evaluated in the frequency domain for 3D vibration transmission from seat to pelvis, trunk, head, and knees. Results illustrate that both FE and MB backrest models allowing compression and shear provide realistic results.
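For reference, seat-to-head transmissibility is commonly estimated from measured accelerations with an H1 cross-spectral estimate; a minimal sketch with placeholder signals and a hypothetical sampling rate (not the paper's measurement setup):

```python
import numpy as np
from scipy.signal import csd, welch

fs = 512  # Hz, hypothetical sampling rate
t = np.arange(0, 60, 1 / fs)
seat_acc = np.random.randn(t.size)                          # placeholder seat input
head_acc = 0.8 * seat_acc + 0.1 * np.random.randn(t.size)   # placeholder head response

# H1 estimator: transmissibility H(f) = S_xy(f) / S_xx(f)
f, S_xy = csd(seat_acc, head_acc, fs=fs, nperseg=1024)
_, S_xx = welch(seat_acc, fs=fs, nperseg=1024)
H = S_xy / S_xx
print("transmissibility magnitude near 4 Hz:", abs(H[np.argmin(abs(f - 4))]))
```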
Estimating See and Be Seen Performance with an Airborne Visual Acquisition Model
Authors: Ngaire Underhill, Evan Maki, Bilal Gill, Andrew Weinert
Subjects: Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)
Abstract
Separation provision and collision avoidance to avoid other air traffic are fundamental components of the layered conflict management system that ensures safe and efficient operations. Pilots have visual-based separation responsibilities to see and be seen to maintain separation between aircraft. To safely integrate into the airspace, drones should be required to have a minimum level of performance based on the safety achieved by see and be seen interactions between crewed aircraft as a baseline. Drone interactions with crewed aircraft should not be more hazardous than interactions between traditional aviation aircraft. Accordingly, there is a need for a methodology to design and evaluate detect and avoid systems, to be equipped by drones to mitigate the risk of a midair collision, where the methodology explicitly addresses, both semantically and mathematically, the appropriate operating rules associated with see and be seen. In response, we simulated how onboard pilots safely operate through see and be seen interactions using an updated visual acquisition model originally developed by J.W. Andrews decades ago. Monte Carlo simulations represented two aircraft flying under visual flight rules, and results were analyzed with respect to drone detect and avoid performance standards.
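Andrews-style visual acquisition models are often summarized as an instantaneous acquisition rate that scales with the target's apparent angular area; the sketch below assumes that functional form with made-up parameters and a simplified head-on geometry, not the paper's calibrated model:

```python
import numpy as np

rng = np.random.default_rng(0)

def acquisition_prob(ranges_m, dt=0.1, area_m2=10.0, beta=2.0e4):
    """Probability of visual acquisition somewhere along an encounter,
    assuming (hypothetically) an Andrews-style instantaneous acquisition
    rate lam = beta * A / r^2 and a nonhomogeneous Poisson detection model."""
    lam = beta * area_m2 / ranges_m**2
    return 1.0 - np.exp(-(lam * dt).sum())

# Monte Carlo over simplified head-on encounters (parameters made up).
hits, n_trials = 0, 10_000
for _ in range(n_trials):
    v_close = rng.uniform(100, 250)                 # m/s combined closing speed
    t = np.arange(0.0, 40.0, 0.1)                   # s, look window before CPA
    r = np.maximum(8000.0 - v_close * t, 100.0)     # m, range to traffic
    hits += rng.random() < acquisition_prob(r)
print("estimated acquisition probability:", hits / n_trials)
```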
On Pseudolinear Codes for Correcting Adversarial Errors
Authors: Eric Ruzomberka, Homa Nikbakht, Christopher G. Brinton, H. Vincent Poor
Abstract
We consider error-correction coding schemes for adversarial wiretap channels (AWTCs) in which the channel can a) read a fraction of the codeword bits up to a bound $r$ and b) flip a fraction of the bits up to a bound $p$. The channel can freely choose the locations of the bit reads and bit flips via a process with unbounded computational power. Codes for the AWTC are of broad interest in the area of information security, as they can provide data resiliency in settings where an attacker has limited access to a storage or transmission medium. We investigate a family of non-linear codes known as pseudolinear codes, which were first proposed by Guruswami and Indyk (FOCS 2001) for constructing list-decodable codes independent of the AWTC setting. Unlike general non-linear codes, pseudolinear codes admit efficient encoders and have succinct representations. We focus on unique decoding and show that random pseudolinear codes can achieve rates up to the binary symmetric channel (BSC) capacity $1-H_2(p)$ for any $p,r$ in the less noisy region: $p<1/2$ and $r<1-H_2(p)$ where $H_2(\cdot)$ is the binary entropy function. Thus, pseudolinear codes are the first known optimal-rate binary code family for the less noisy AWTC that admit efficient encoders. The above result can be viewed as a derandomization result of random general codes in the AWTC setting, which in turn opens new avenues for applying derandomization techniques to randomized constructions of AWTC codes. Our proof applies a novel concentration inequality for sums of random variables with limited independence which may be of interest as an analysis tool more generally.
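The less noisy region and the achievable rate are easy to evaluate numerically; a small sketch of the binary entropy function and the region check (the numeric values are illustrative):

```python
import numpy as np

def H2(p):
    """Binary entropy in bits."""
    p = np.asarray(p, dtype=float)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# Random pseudolinear codes achieve rate 1 - H2(p) in the less noisy
# region: p < 1/2 and r < 1 - H2(p).
p, r = 0.11, 0.4
assert p < 0.5 and r < 1 - H2(p), "outside the less noisy region"
print("achievable rate 1 - H2(p) =", 1 - H2(p))  # ~0.50 for p = 0.11
```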
Improved Efficiency and Accuracy of the Magnetic Polarizability Tensor Spectral Signature Object Characterisation for Metal Detection
Abstract
Magnetic polarizability tensors (MPTs) provide an economical characterisation of conducting metallic objects and can aid in the solution of metal detection inverse problems, such as scrap metal sorting, searching for unexploded ordnance in areas of former conflict, and security screening at event venues and transport hubs. Previous work has established explicit formulae for their coefficients and a rigorous mathematical theory for the characterisation they provide. To assist with efficient computation of MPT spectral signatures of different objects, enabling the construction of large dictionaries of characterisations for classification approaches, this work proposes a new, highly efficient strategy for predicting MPT coefficients. This is achieved by solving an eddy current type problem using hp-finite elements in combination with a proper orthogonal decomposition reduced order modelling (ROM) methodology, and it offers considerable computational savings over our previous approach. Furthermore, an adaptive approach is described for generating new frequency snapshots to further improve the accuracy of the ROM. To improve the resolution of highly conducting and magnetic objects, a recipe is proposed to choose the number and thicknesses of prismatic boundary layers for accurate resolution of thin skin depths in such problems. The paper includes a series of challenging examples to demonstrate the success of the proposed methodologies.
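A generic sketch of the POD step referenced above: compress frequency-snapshot solutions with a truncated SVD and solve in the reduced basis. The matrices here are random stand-ins for the hp-FEM discretized eddy current problem, not the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2000, 12                      # DOFs and number of frequency snapshots
D = rng.standard_normal((n, m))      # stand-in snapshot matrix (cols = solutions)

# POD: truncated SVD of the snapshots gives a small reduced basis U_k.
U, s, _ = np.linalg.svd(D, full_matrices=False)
k = int((s / s[0] > 1e-6).sum())     # keep modes above a relative tolerance
U_k = U[:, :k]

# Reduced-order solve at a new frequency: project, solve small system, lift.
A = rng.standard_normal((n, n)) + n * np.eye(n)   # stand-in for the FEM system
b = rng.standard_normal(n)
x_rom = U_k @ np.linalg.solve(U_k.T @ A @ U_k, U_k.T @ b)
```

The adaptive snapshot strategy in the paper would add new frequencies where the ROM error estimate is largest and recompute the basis.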
A Novel Approach to Identify Security Controls in Source Code
Authors: Ahmet Okutan, Ali Shokri, Viktoria Koscinski, Mohamad Fazelinia, Mehdi Mirakhorli
Abstract
Secure by Design has become the mainstream development approach ensuring that software systems are not vulnerable to cyberattacks. Architectural security controls need to be carefully monitored over the software development life cycle to avoid critical design flaws. Unfortunately, functional requirements usually get in the way of the security features, and the development team may not correctly address critical security requirements. Identifying tactic-related code pieces in a software project enables an efficient review of the security controls' implementation as well as a resilient software architecture. This paper enumerates a comprehensive list of commonly used security controls and creates a dataset for each one of them by pulling related and unrelated code snippets from the open API of the StackOverflow question and answer platform. It uses the state-of-the-art NLP technique Bidirectional Encoder Representations from Transformers (BERT) and the Tactic Detector from our prior work to show that code pieces that implement security controls could be identified with high confidence. The results show that our model trained on tactic-related and unrelated code snippets derived from StackOverflow is able to identify tactic-related code pieces with F-Measure values above 0.9.
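A minimal sketch of fine-tuning a BERT sequence classifier to flag tactic-related snippets; the example snippets, labels, and model choice are placeholders, not the authors' dataset or Tactic Detector setup:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # tactic-related vs. unrelated

snippets = ['session.setAttribute("user", authenticate(pwd))',  # related (authn)
            "for (int i = 0; i < n; i++) sum += a[i];"]          # unrelated
labels = torch.tensor([1, 0])

batch = tok(snippets, padding=True, truncation=True, return_tensors="pt")
out = model(**batch, labels=labels)
out.loss.backward()  # gradients for one training step; wrap in an optimizer loop
```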
Image Reconstruction using Enhanced Vision Transformer
Authors: Nikhil Verma, Deepkamal Kaur, Lydia Chau
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Removing noise from images is a challenging and fundamental problem in the field of computer vision. Images captured by modern cameras are inevitably degraded by noise which limits the accuracy of any quantitative measurements on those images. In this project, we propose a novel image reconstruction framework which can be used for tasks such as image denoising, deblurring or inpainting. The model proposed in this project is based on Vision Transformer (ViT) that takes 2D images as input and outputs embeddings which can be used for reconstructing denoised images. We incorporate four additional optimization techniques in the framework to improve the model reconstruction capability, namely Locality Sensitive Attention (LSA), Shifted Patch Tokenization (SPT), Rotary Position Embeddings (RoPE) and adversarial loss function inspired from Generative Adversarial Networks (GANs). LSA, SPT and RoPE enable the transformer to learn from the dataset more efficiently, while the adversarial loss function enhances the resolution of the reconstructed images. Based on our experiments, the proposed architecture outperforms the benchmark U-Net model by more than 3.5% structural similarity (SSIM) for the reconstruction tasks of image denoising and inpainting. The proposed enhancements further show an improvement of ~5% SSIM over the benchmark for both tasks.
$\mathrm{SAM^{Med}}$: A medical image annotation framework based on large vision model
Abstract
Recently, the large vision model Segment Anything Model (SAM) has revolutionized the computer vision field, especially image segmentation. SAM introduced a new promptable segmentation paradigm that exhibits remarkable zero-shot generalization ability. Extensive research has explored the potential and limits of SAM in various downstream tasks. In this study, we present $\mathrm{SAM^{Med}}$, an enhanced framework for medical image annotation that leverages the capabilities of SAM. The $\mathrm{SAM^{Med}}$ framework consists of two submodules, namely $\mathrm{SAM^{assist}}$ and $\mathrm{SAM^{auto}}$. $\mathrm{SAM^{assist}}$ demonstrates the generalization ability of SAM to downstream medical segmentation tasks using a prompt-learning approach. Results show a significant improvement in segmentation accuracy with only approximately 5 input points. The $\mathrm{SAM^{auto}}$ model aims to accelerate the annotation process by automatically generating input prompts. The proposed SAP-Net model achieves superior segmentation performance with only five annotated slices, achieving average Dice coefficients of 0.80 and 0.82 for kidney and liver segmentation, respectively. Overall, $\mathrm{SAM^{Med}}$ demonstrates promising results in medical image annotation. These findings highlight the potential of leveraging large-scale vision models in medical image annotation tasks.
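A sketch of the point-prompt interaction that an assistive SAM workflow builds on, using the public segment-anything API; the checkpoint path, placeholder image, and click coordinates are hypothetical, and this is not the paper's $\mathrm{SAM^{assist}}$ implementation:

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # hypothetical path
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for a CT/MR slice
predictor.set_image(image)

# ~5 foreground clicks on the organ of interest (coordinates made up).
points = np.array([[200, 250], [210, 260], [190, 255], [205, 240], [215, 250]])
labels = np.ones(len(points), dtype=int)  # 1 = foreground
masks, scores, _ = predictor.predict(point_coords=points, point_labels=labels)
print("best mask score:", scores.max())
```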
Area, Delay, and Energy-Efficient Full Dadda Multiplier
Authors: Muteen Munawar, Zain Shabbir, Muhammad Akram
Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR)
Abstract
The Dadda algorithm is a parallel structured multiplier that is considerably faster than array multipliers such as the Booth, Braun, and Baugh-Wooley designs. However, it consumes more power and needs a larger number of gates for hardware implementation. In this paper, a modified Dadda algorithm-based multiplier is designed using a proposed half-adder-based carry-select adder with a binary to excess-1 converter and an improved ripple-carry adder (RCA). The proposed design is simulated in different technologies, i.e., Taiwan Semiconductor Manufacturing Company (TSMC) 50nm, 90nm, and 120nm, and at different frequencies, i.e., 0.5, 1, 2, and 3.33GHz. Specifically, the 4-bit circuit of the proposed design in TSMC's 50nm technology consumes 25uW of power at 3.33GHz with 76ps of delay. The simulation results reveal that the design is faster, more power- and energy-efficient, and requires fewer transistors for implementation compared to some closely related works. The proposed design can be a promising candidate for low-power and low-cost digital controllers. Finally, the design is compared with recent relevant works in the literature.
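For context, the Dadda reduction schedule follows the classic height sequence $d_1 = 2$, $d_{j+1} = \lfloor 3 d_j / 2 \rfloor$: the partial-product matrix is reduced stage by stage until every column holds at most two bits for the final adder. A small sketch computing the stage targets for an n-bit multiplier:

```python
def dadda_heights(n_bits):
    """Column-height targets for Dadda reduction: 2, 3, 4, 6, 9, 13, ...
    Returns the targets (largest first) strictly below the initial
    partial-product height n_bits."""
    heights = [2]
    while heights[-1] * 3 // 2 < n_bits:
        heights.append(heights[-1] * 3 // 2)
    return heights[::-1]

print(dadda_heights(4))   # [3, 2] -> two reduction stages for a 4-bit multiplier
print(dadda_heights(16))  # [13, 9, 6, 4, 3, 2]
```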
Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Authors: Vladislav Lialin, Namrata Shivagunde, Sherin Muckatira, Anna Rumshisky
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Abstract
Despite the dominance and effectiveness of scaling, resulting in large networks with hundreds of billions of parameters, the necessity to train overparametrized models remains poorly understood, and alternative approaches do not necessarily make it cheaper to train high-performance models. In this paper, we explore low-rank training techniques as an alternative approach to training large neural networks. We introduce a novel method called ReLoRA, which utilizes low-rank updates to train high-rank networks. We apply ReLoRA to pre-training transformer language models with up to 350M parameters and demonstrate comparable performance to regular neural network training. Furthermore, we observe that the efficiency of ReLoRA increases with model size, making it a promising approach for training multi-billion-parameter networks efficiently. Our findings shed light on the potential of low-rank training techniques and their implications for scaling laws.
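The core ReLoRA move is to periodically merge the learned low-rank update into the full weight and restart a fresh low-rank pair, so successive rank-r updates can accumulate into a high-rank change. A minimal sketch for a single linear layer under simplifying assumptions; the shapes, initialization, and schedule are illustrative, not the paper's exact recipe:

```python
import torch

d, r = 512, 8
W = torch.randn(d, d) / d**0.5         # full weight (updated only via merges here)
A = torch.zeros(r, d)                   # low-rank factors, W_eff = W + B @ A
B = torch.randn(d, r) * 0.01

def merge_and_reinit():
    """ReLoRA restart: fold the learned low-rank update into W, then
    re-initialize A, B (one factor zeroed so the merged function is
    unchanged) and let the next segment learn a new rank-r direction."""
    global W, A, B
    W = W + B @ A                       # merge: W accumulates a high-rank change
    A = torch.zeros(r, d)
    B = torch.randn(d, r) * 0.01

# Training alternates: optimize only A, B for T steps, then restart.
for segment in range(3):
    # ... T optimizer steps on A, B would go here ...
    merge_and_reinit()
```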
A Personalized Reinforcement Learning Summarization Service for Learning Structure from Unstructured Data
Abstract
The exponential growth of textual data has created a crucial need for tools that assist users in extracting meaningful insights. Traditional document summarization approaches often fail to meet individual user requirements and lack structure for efficient information processing. To address these limitations, we propose Summation, a hierarchical personalized concept-based summarization approach. It synthesizes documents into a concise hierarchical concept map and actively engages users by learning and adapting to their preferences. Using a Reinforcement Learning algorithm, Summation generates personalized summaries for unseen documents on specific topics. This framework enhances comprehension, enables effective navigation, and empowers users to extract meaningful insights from large document collections aligned with their unique requirements.
A Machine-Learned Ranking Algorithm for Dynamic and Personalised Car Pooling Services
Authors: Mattia Giovanni Campana, Franca Delmastro, Raffaele Bruno
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG)
Abstract
Car pooling is expected to significantly help in reducing traffic congestion and pollution in cities by enabling drivers to share their cars with travellers with similar itineraries and time schedules. A number of car pooling matching services have been designed in order to efficiently find successful ride matches in a given pool of drivers and potential passengers. However, it is now recognised that many non-monetary aspects and social considerations, besides simple mobility needs, may influence the individual willingness of sharing a ride, and these are difficult to predict. To address this problem, in this study we propose GoTogether, a recommender system for car pooling services that leverages learning-to-rank techniques to automatically derive the personalised ranking model of each user from the history of her choices (i.e., the type of accepted or rejected shared rides). GoTogether then builds the list of recommended rides in order to maximise the success rate of the offered matches. To test the performance of our scheme we use real data from Twitter and Foursquare sources in order to generate a dataset of plausible mobility patterns and ride requests in a metropolitan area. The results show that the proposed solution quickly obtains an accurate prediction of the personalised user's choice model in both static and dynamic conditions.
SepHRNet: Generating High-Resolution Crop Maps from Remote Sensing imagery using HRNet with Separable Convolution
Abstract
The accurate mapping of crop production is crucial for ensuring food security, effective resource management, and sustainable agricultural practices. One way to achieve this is by analyzing high-resolution satellite imagery. Deep Learning has been successful in analyzing images, including remote sensing imagery. However, capturing intricate crop patterns is challenging due to their complexity and variability. In this paper, we propose a novel Deep Learning approach that integrates HRNet with Separable Convolutional layers to capture spatial patterns and Self-attention to capture temporal patterns of the data. The HRNet model acts as a backbone and extracts high-resolution features from crop images. Spatially separable convolution in the shallow layers of the HRNet model captures intricate crop patterns more effectively while reducing the computational cost. The multi-head attention mechanism captures long-term temporal dependencies from the encoded vector representation of the images. Finally, a CNN decoder generates a crop map from the aggregated representation. Adaboost is used on top of this to further improve accuracy. The proposed algorithm achieves a high classification accuracy of 97.5% and IoU of 55.2% in generating crop maps. We evaluate the performance of our pipeline on the Zuericrop dataset and demonstrate that our results outperform state-of-the-art models such as U-Net++, ResNet50, VGG19, InceptionV3, DenseNet, and EfficientNet. This research showcases the potential of Deep Learning for Earth Observation Systems.
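A spatially separable convolution factors one k x k kernel into a k x 1 followed by a 1 x k kernel, cutting per-output multiplies from k^2 to 2k. A PyTorch sketch with illustrative channel sizes (not the paper's exact HRNet configuration):

```python
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """k x k convolution factored as (k x 1) then (1 x k): ~2k/k^2 the cost."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.vertical = nn.Conv2d(in_ch, out_ch, (k, 1), padding=(k // 2, 0))
        self.horizontal = nn.Conv2d(out_ch, out_ch, (1, k), padding=(0, k // 2))

    def forward(self, x):
        return self.horizontal(self.vertical(x))

x = torch.randn(1, 4, 64, 64)   # a 4-band satellite patch (illustrative)
y = SeparableConv2d(4, 16)(x)
print(y.shape)                  # torch.Size([1, 16, 64, 64])
```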
Minimum Cost Loop Nests for Contraction of a Sparse Tensor with a Tensor Network
Authors: Raghavendra Kanakagiri, Edgar Solomonik
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Mathematical Software (cs.MS); Performance (cs.PF); Programming Languages (cs.PL)
Abstract
Sparse tensor decomposition and completion are common in numerous applications, ranging from machine learning to computational quantum chemistry. Typically, the main bottleneck in optimization of these models is contractions of a single large sparse tensor with a network of several dense matrices or tensors (SpTTN). Prior works on high-performance tensor decomposition and completion have focused on performance and scalability optimizations for specific SpTTN kernels. We present algorithms and a runtime system for identifying and executing the most efficient loop nest for any SpTTN kernel. We consider both enumeration of such loop nests for autotuning and efficient algorithms for finding the lowest-cost loop nest for simpler metrics, such as buffer size or cache miss models. Our runtime system identifies the best choice of loop nest without user guidance and also provides a distributed-memory parallelization of SpTTN kernels. We evaluate our framework using both real-world and synthetic tensors. Our results demonstrate that our approach outperforms available generalized state-of-the-art libraries and matches the performance of specialized codes.
Formal and Fuzzing Amplification: Targeting Vulnerability Detection in 5G and Beyond
Abstract
Softwarization and virtualization in 5G and beyond require rigorous testing against vulnerabilities and unintended emergent behaviors for critical infrastructure and network security assurance. Formal methods operate efficiently on protocol-level abstract specification models, while fuzz testing offers comprehensive experimental evaluation of system implementations. In this paper, we propose a novel framework that leverages the respective advantages and coverage of both formal and fuzzing methods to efficiently detect vulnerabilities from protocol logic down to implementation stacks. Attack traces detected by formal verification of critical protocols guide the case generation of fuzz testing, and feedback from fuzz testing further broadens the scope of the formal verification. We examine the proposed framework on the 5G Non-Standalone (NSA) security procedures, focusing on the Radio Resource Control (RRC) connection process. We first identify protocol-level vulnerabilities of user credentials via formal methods. Following this, we implement bit-level fuzzing to evaluate the potential impacts and risks of integrity-vulnerable identifier variation. Concurrently, we conduct command-level mutation-based fuzzing with the assumed identifier fixed to assess the potential impacts and risks of confidentiality-vulnerable identifiers. With this approach, we establish one attack model and detect 53 vulnerabilities. The identified vulnerabilities are used to fortify protocol-level assumptions, which further refines the search space for subsequent detection cycles. Consequently, the framework addresses the prevalent scalability challenges in detecting vulnerabilities and unintended emergent behaviors in large-scale systems in 5G and beyond.
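A minimal sketch of the bit-level mutation step used in such fuzzing: flip individual bits of an encoded message and feed each mutant to the system under test. The message bytes and the test harness below are placeholders, not the paper's 5G stack or oracle:

```python
import random

def bit_flip_mutants(payload: bytes, n_mutants: int = 8):
    """Yield copies of payload with one random bit flipped each."""
    for _ in range(n_mutants):
        buf = bytearray(payload)
        pos = random.randrange(len(buf) * 8)
        buf[pos // 8] ^= 1 << (pos % 8)
        yield bytes(buf)

# Hypothetical RRC message with an identifier field somewhere in the middle.
rrc_msg = bytes.fromhex("4c01deadbeef17")
for mutant in bit_flip_mutants(rrc_msg):
    # A harness would deliver each mutant to the stack under test and an
    # oracle would check for crashes or spec violations (placeholders here).
    print(mutant.hex())
```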
Making the Nyström method highly accurate for low-rank approximations
Abstract
The Nystr\"om method is a convenient heuristic method to obtain low-rank approximations to kernel matrices in nearly linear complexity. Existing studies typically use the method to approximate positive semidefinite matrices with low or modest accuracies. In this work, we propose a series of heuristic strategies to make the Nystr\"om method reach high accuracies for nonsymmetric and/or rectangular matrices. The resulting methods (called high-accuracy Nystr\"om methods) treat the Nystr\"om method and a skinny rank-revealing factorization as a fast pivoting strategy in a progressive alternating direction refinement process. Two refinement mechanisms are used: alternating the row and column pivoting starting from a small set of randomly chosen columns, and adaptively increasing the number of samples until a desired rank or accuracy is reached. A fast subset update strategy based on the progressive sampling of Schur complements is further proposed to accelerate the refinement process. Efficient randomized accuracy control is also provided. Relevant accuracy and singular value analysis is given to support some of the heuristics. Extensive tests with various kernel functions and data sets show how the methods can quickly reach prespecified high accuracies in practice, sometimes with quality close to SVDs, using only small numbers of progressive sampling steps.
Design of an energy aware petaflops class high performance cluster based on power architecture
Authors: W. A. Ahmad, A. Bartolini, F. Beneventi, L. Benini, A. Borghesi, M. Cicala, P. Forestieri, C. Gianfreda, D. Gregori, A. Libri, F. Spiga, S. Tinti
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
In this paper we present D.A.V.I.D.E. (Development for an Added Value Infrastructure Designed in Europe), an innovative and energy-efficient High Performance Computing cluster designed by E4 Computer Engineering for PRACE (Partnership for Advanced Computing in Europe). D.A.V.I.D.E. is built using best-in-class components (IBM's POWER8-NVLink CPUs, NVIDIA TESLA P100 GPUs, Mellanox InfiniBand EDR 100 Gb/s networking) plus custom hardware and innovative system middleware software. D.A.V.I.D.E. features (i) a dedicated power monitor interface, built around the BeagleBone Black board, that allows high-frequency sampling directly from the power backplane and scalable integration with the internal node telemetry and system-level power management software; (ii) a custom-built chassis, based on the OpenRack form factor, and liquid cooling that allow the system to be used in modern, energy-efficient datacenters; and (iii) software components designed to enable fine-grain power monitoring, power management (i.e., power capping and energy-aware job scheduling), and application power profiling, based on dedicated machine learning components. Software APIs are offered to developers and users to tune computing node performance and power consumption based on application requirements. The first pilot system, to be deployed at the beginning of 2017, will demonstrate key HPC applications from different fields ported and optimized for this innovative platform.
Neuro-Inspired Efficient Map Building via Fragmentation and Recall
Authors: Jaedong Hwang, Zhang-Wei Hong, Eric Chen, Akhilan Boopathy, Pulkit Agrawal, Ila Fiete
Abstract
Animals and robots navigate through environments by building and refining maps of the space. These maps enable functions including navigating back home, planning, search, and foraging. In large environments, exploration of the space is a hard problem: agents can become stuck in local regions. Here, we use insights from neuroscience to propose and apply the concept of Fragmentation-and-Recall (FarMap), with agents solving the mapping problem by building local maps via a surprisal-based clustering of space, which they use to set subgoals for spatial exploration. Agents build and use a local map to predict their observations; high surprisal leads to a "fragmentation event" that truncates the local map. At these events, the recent local map is placed into long-term memory (LTM) and a different local map is initialized. If observations at a fracture point match observations in one of the stored local maps, that map is recalled (and thus reused) from LTM. The fragmentation points induce a natural online clustering of the larger space, forming a set of intrinsic potential subgoals that are stored in LTM as a topological graph. Agents choose their next subgoal from the set of near and far potential subgoals from within the current local map or LTM, respectively. Thus, local maps guide exploration locally, while LTM promotes global exploration. We evaluate FarMap on complex procedurally generated spatial environments to demonstrate that this mapping strategy covers the environment much more rapidly (in both agent steps and wall-clock time) and is more efficient in active memory usage, without loss of performance.
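A minimal sketch of the surprisal-triggered fragmentation-and-recall rule: when the local map's predicted-observation probability drops (surprisal spikes), the current map is archived to LTM, or a matching stored map is recalled. The count-based predictive model and the threshold are stand-ins, not the paper's agent:

```python
import math

SURPRISAL_THRESHOLD = 4.0  # nats; hypothetical

long_term_memory = []      # archived local maps (nodes of the topological graph)
local_map = {}             # current local map: observation -> count

def observe(obs):
    global local_map
    # Surprisal under a simple Laplace-smoothed count model (stand-in).
    total = sum(local_map.values()) + len(local_map) + 1
    p = (local_map.get(obs, 0) + 1) / total
    if -math.log(p) > SURPRISAL_THRESHOLD:
        # Fragmentation event: recall a stored map if obs matches, else archive.
        for stored in long_term_memory:
            if obs in stored:
                local_map = stored            # recall (reuse) from LTM
                break
        else:
            long_term_memory.append(local_map)
            local_map = {}                    # initialize a fresh local map
    local_map[obs] = local_map.get(obs, 0) + 1
```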
Verifi-Chain: A Credentials Verifier using Blockchain and IPFS
Authors: Tasfia Rahman, Sumaiya Islam Mouno, Arunangshu Mojumder Raatul, Abul Kalam Al Azad, Nafees Mansoor
Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
Submitting fake certificates is a common problem in Southeast Asia, which prevents qualified candidates from getting the jobs they deserve. When applying for a job, students must provide academic credentials as proof of their qualifications, acquired both inside and outside the classroom. Verifying academic documents before hiring is crucial to prevent fraud. Employing blockchain technology has the potential to address this issue. Blockchain provides an electronic certificate that is tamper-proof and non-repudiable, making it difficult for students to manipulate their academic credentials. This paper presents a prototype for an academic credential verification model that leverages the security features of blockchain and IPFS (Interplanetary File System). Certificates are temporarily stored in a database before being transferred to IPFS, where a unique hash code is generated using a hashing algorithm. This hash code serves as the certificate's unique identity and is stored in the blockchain nodes. Companies can verify an applicant's credentials by searching for the applicant and accessing their already verified certificates. Utilizing IPFS as a middleman storage platform lowers the expenses of directly storing massive data on the blockchain. To sum it up, the proposed solution would make the process of certificate verification more efficient, secure, and cost-effective. It would save time and resources that would otherwise be used to manually verify certificates.
Differentiable Forward Projector for X-ray Computed Tomography
Abstract
Data-driven deep learning has been successfully applied to various computed tomographic reconstruction problems. The deep inference models may outperform existing analytical and iterative algorithms, especially in ill-posed CT reconstruction. However, those methods often predict images that do not agree with the measured projection data. This paper presents an accurate differentiable forward and back projection software library to ensure the consistency between the predicted images and the original measurements. The software library efficiently supports various projection geometry types while minimizing the GPU memory footprint requirement, which facilitates seamless integration with existing deep learning training and inference pipelines. The proposed software is available as open source: https://github.com/LLNL/LEAP.
DeepMapping: The Case for Learned Data Mapping for Compression and Efficient Query Processing
Abstract
Storing tabular data in a way that balances storage and query efficiencies is a long-standing research question in the database community. While there are several lossless compression techniques in the literature, in this work we argue and show that a novel Deep Learned Data Mapping (or DeepMapping) abstraction, which relies on the impressive memorization capabilities of deep neural networks, can provide better storage cost, better latency, and better run-time memory footprint, all at the same time. Our proposed DeepMapping abstraction transforms a data set into multiple key-value mappings and constructs a multi-tasking neural network model that outputs the corresponding values for a given input key. In order to deal with the memorization errors, DeepMapping couples the learned neural network with a light-weight auxiliary data structure capable of correcting errors. The auxiliary structure further enables DeepMapping to efficiently deal with insertions, deletions, and updates, without having to re-train the mapping. Since the shape of the network has a significant impact on the overall size of the DeepMapping structure, we further propose a multi-task hybrid architecture search strategy to identify DeepMapping architectures that strike a desirable balance among memorization capacity, size, and efficiency. Extensive experiments with synthetic and benchmark datasets, including TPC-H and TPC-DS, demonstrated that the proposed DeepMapping approach can significantly reduce the latency of the key-based queries, while simultaneously improving both offline and run-time storage requirements against several cutting-edge competitors.
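The DeepMapping idea in miniature: memorize key-to-value pairs with a small network and keep an auxiliary table only for the keys the network gets wrong, so lookups consult the table first and fall back to the model. The toy data and architecture below are stand-ins, not the paper's multi-tasking design:

```python
import torch
import torch.nn as nn

keys = torch.arange(100).float().unsqueeze(1)   # toy key column
vals = torch.randint(0, 5, (100,))               # toy categorical values

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(500):                             # memorization training
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(keys), vals)
    loss.backward()
    opt.step()

# Auxiliary structure: store exactly the misclassified keys.
pred = net(keys).argmax(1)
aux = {int(k): int(v) for k, v, p in zip(keys.squeeze(1), vals, pred) if v != p}

def lookup(key: int) -> int:
    if key in aux:                               # error-correction path
        return aux[key]
    return int(net(torch.tensor([[float(key)]])).argmax())

print(f"aux table holds {len(aux)} of 100 entries")
```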
Knowledge-Driven Resource Allocation for D2D Networks: A WMMSE Unrolled Graph Neural Network Approach
Authors: Hao Yang, Nan Cheng, Ruijin Sun, Wei Quan, Rong Chai, Khalid Aldubaikhy, Abdullah Alqasir, Xuemin Shen
Abstract
This paper proposes a novel knowledge-driven approach for resource allocation in device-to-device (D2D) networks using a graph neural network (GNN) architecture. To meet the millisecond-level timeliness and scalability required by the dynamic network environment, our proposed approach incorporates the deep unrolling of the weighted minimum mean square error (WMMSE) algorithm, referred to as domain knowledge, into the GNN, thereby reducing computational delay and sample complexity while adapting to various data distributions. Specifically, the aggregation and update functions in the GNN architecture are designed by utilizing the summation and power calculation components of the WMMSE algorithm, which leads to improved model generalization and interpretability. Theoretical analysis of the proposed approach reveals its capability to simplify intricate end-to-end mappings and diminish the model exploration space, resulting in increased network expressiveness and enhanced optimization performance. Simulation results demonstrate the robustness, scalability, and strong performance of the proposed knowledge-driven resource allocation approach across diverse communication topologies without retraining. Our findings contribute to the development of efficient and scalable wireless resource management solutions for distributed and dynamic networks with strict latency requirements.
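For reference, these are the scalar WMMSE updates whose summation and power-calculation terms can be folded into a GNN's aggregation and update functions; a numpy sketch for a K-user interference channel with unit rate weights (a textbook simplification, not the paper's unrolled architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
K, Pmax, sigma2 = 4, 1.0, 0.1
H = np.abs(rng.standard_normal((K, K)))   # H[k, j]: gain from tx j to rx k
v = np.sqrt(Pmax) * np.ones(K)            # sqrt of transmit powers

for _ in range(50):                       # WMMSE alternating updates
    # MMSE receive coefficient u_k and MSE weight w_k
    rx_power = (H**2) @ v**2 + sigma2     # total received power per receiver
    u = np.diag(H) * v / rx_power
    w = 1.0 / (1.0 - u * np.diag(H) * v)
    # Transmit update v_k, projected onto the per-user power constraint
    v = w * u * np.diag(H) / ((H.T**2) @ (w * u**2))
    v = np.clip(v, 0.0, np.sqrt(Pmax))

sig = (np.diag(H) * v)**2
rates = np.log2(1 + sig / ((H**2) @ v**2 + sigma2 - sig))
print("sum rate:", rates.sum())
```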
Efficient Task Offloading Algorithm for Digital Twin in Edge/Cloud Computing Environment
Abstract
In the era of the Internet of Things (IoT), the Digital Twin (DT) is envisioned to empower various areas as a bridge between physical objects and the digital world. Through virtualization and simulation techniques, multiple functions can be achieved by leveraging computing resources. In this process, Mobile Cloud Computing (MCC) and Mobile Edge Computing (MEC) have become two of the key factors in achieving real-time feedback. However, current works consider only edge servers or cloud servers in their DT system models. Moreover, these models ignore that a DT may draw on more than one data source. In this paper, we propose a new DT system model considering a heterogeneous MEC/MCC environment. Each DT in the model is maintained in one of the servers via multiple data collection devices. The offloading decision-making problem is also considered, and a new offloading scheme is proposed based on Distributed Deep Learning (DDL). Simulation results demonstrate that our proposed algorithm can effectively and efficiently decrease the system's average latency and energy consumption. Significant improvement is achieved compared with the baselines under the dynamic environment of DTs.
FIS-ONE: Floor Identification System with One Label for Crowdsourced RF Signals
Authors: Weipeng Zhuo, Ka Ho Chiu, Jierun Chen, Ziqi Zhao, S.-H. Gary Chan, Sangtae Ha, Chul-Ho Lee
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG); Signal Processing (eess.SP)
Abstract
Floor labels of crowdsourced RF signals are crucial for many smart-city applications, such as multi-floor indoor localization, geofencing, and robot surveillance. To build a prediction model to identify the floor number of a new RF signal upon its measurement, conventional approaches using crowdsourced RF signals assume that at least a few labeled signal samples are available on each floor. In this work, we push the envelope further and demonstrate that it is technically feasible to enable such floor identification with only one floor-labeled signal sample on the bottom floor while having the rest of the signal samples unlabeled. We propose FIS-ONE, a novel floor identification system with only one labeled sample. FIS-ONE consists of two steps, namely signal clustering and cluster indexing. We first build a bipartite graph to model the RF signal samples and obtain a latent representation of each node (each signal sample) using our attention-based graph neural network model so that the RF signal samples can be clustered more accurately. Then, we tackle the problem of indexing the clusters with proper floor labels by leveraging the observation that signals from an access point can be detected on different floors, i.e., signal spillover. Specifically, we formulate the cluster indexing problem as a combinatorial optimization problem and show that it is equivalent to solving a traveling salesman problem, whose (near-)optimal solution can be found efficiently. We have implemented FIS-ONE and validated its effectiveness on the Microsoft dataset and in three large shopping malls. Our results show that FIS-ONE outperforms other baseline algorithms significantly, with up to 23% improvement in adjusted rand index and 25% improvement in normalized mutual information using only one floor-labeled signal sample.
Prompt Generate Train (PGT): A framework for few-shot domain adaptation, alignment, and uncertainty calibration of a retriever augmented generation (RAG) model for domain specific open book question-answering
Abstract
We present a framework - Prompt, Generate, Train (PGT) - to efficiently develop a generative question-answering model for open-book question-answering over a proprietary collection of text documents. The framework adapts a retriever augmented generation model to the target domain using supervised finetuning and reinforcement learning with synthetic feedback in a few-shot setting. This yields an aligned, uncertainty calibrated model that is competitive with GPT-4 based in-context retrieval augmented generation in generating relevant answers at lower serving costs. The synthetic generation pipeline produces high quality synthetic training data using a medium sized LLM, Flan-T5 XXL, and a novel consistency filtering scheme. The pipeline is designed to generate both abstractive and extractive questions that span the entire corpus. The framework fine-tunes a smaller RAG model, comprising a dense retriever and a smaller sized LLM, on samples from this dataset. In parallel, the framework trains a reward model to score domain-grounded answers higher than hallucinated answers. In the next phase, the framework aligns the RAG model with the target domain using reinforcement learning. This step improves the RAG model's ability to generate grounded answers and ignore out-of-domain questions. In the final phase, the framework calibrates the model uncertainty for extractive question-answers. This is a desirable feature since the model can be integrated into a cascading system where the RAG model's answer is surfaced only when the model is confident of its answer.
SwiFT: Swin 4D fMRI Transformer
Authors: Peter Yongho Kim, Junbeom Kwon, Sunghwan Joo, Sangyoon Bae, Donggyu Lee, Yoonho Jung, Shinjae Yoo, Jiook Cha, Taesup Moon
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
The modeling of spatiotemporal brain dynamics from high-dimensional data, such as 4D functional MRI, is a formidable task in neuroscience. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from 4D functional brain MRI data in a memory- and computation-efficient manner. SwiFT achieves this by implementing a 4D window multi-head self-attention mechanism and absolute positional embeddings. We evaluate SwiFT on multiple large-scale human functional brain imaging datasets in tasks such as predicting sex, age, and cognitive intelligence. Our experimental outcomes reveal that SwiFT consistently outperforms recent state-of-the-art models. To the best of our knowledge, SwiFT is the first Swin Transformer architecture that can process high-dimensional spatiotemporal functional brain data in an end-to-end fashion. Furthermore, owing to this end-to-end learning capability, we show that contrastive loss-based self-supervised pre-training of SwiFT is also feasible and achieves improved performance on a downstream task. We believe that our work holds substantial potential for facilitating scalable learning of functional brain imaging in neuroscience research by reducing the hurdles associated with applying Transformer models to high-dimensional fMRI.
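4D window attention starts by partitioning the (x, y, z, t) volume into non-overlapping windows that attention then operates within; a sketch of the reshape, with an illustrative window size and tensor shape (not SwiFT's exact configuration):

```python
import torch

def window_partition_4d(x, w=(4, 4, 4, 4)):
    """x: (B, X, Y, Z, T, C) -> (num_windows*B, prod(w), C) for attention."""
    B, X, Y, Z, T, C = x.shape
    x = x.view(B, X // w[0], w[0], Y // w[1], w[1],
               Z // w[2], w[2], T // w[3], w[3], C)
    # Gather the window axes next to each other, then flatten per window.
    x = x.permute(0, 1, 3, 5, 7, 2, 4, 6, 8, 9).contiguous()
    return x.view(-1, w[0] * w[1] * w[2] * w[3], C)

fmri = torch.randn(2, 16, 16, 16, 8, 32)   # (batch, x, y, z, time, channels)
windows = window_partition_4d(fmri)
print(windows.shape)                        # torch.Size([256, 256, 32])
```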
Introducing Packet-Level Analysis in Programmable Data Planes to Advance Network Intrusion Detection
Authors: Roberto Doriguzzi-Corin, Luis Augusto Dias Knob, Luca Mendozzi, Domenico Siracusa, Marco Savi
Subjects: Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI)
Abstract
Programmable data planes offer precise control over the low-level processing steps applied to network packets, serving as a valuable tool for analysing malicious flows in the field of intrusion detection. Albeit with limitations on physical resources and capabilities, they allow for the efficient extraction of detailed traffic information, which can then be utilised by Machine Learning (ML) algorithms responsible for identifying security threats. In addressing resource constraints, existing solutions in the literature rely on compressing network data through the collection of statistical traffic features in the data plane. While this compression saves memory resources in switches and minimises the burden on the control channel between the data and the control plane, it also results in a loss of information available to the Network Intrusion Detection System (NIDS), limiting access to packet payload, categorical features, and the semantic understanding of network communications, such as the behaviour of packets within traffic flows. This paper proposes P4DDLe, a framework that exploits the flexibility of P4-based programmable data planes for packet-level feature extraction and pre-processing. P4DDLe leverages the programmable data plane to extract raw packet features from the network traffic, categorical features included, and to organise them in a way that the semantics of traffic flows is preserved. To minimise memory and control channel overheads, P4DDLe selectively processes and filters packet-level data, so that all and only the relevant features required by the NIDS are collected. The experimental evaluation with recent Distributed Denial of Service (DDoS) attack data demonstrates that the proposed approach is very efficient in collecting compact and high-quality representations of network flows, ensuring precise detection of DDoS attacks.
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Authors: Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Abstract
Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, based on the Mixture of Experts (MoE), many works have made good progress in multilingual and code-switching ASR, but their computational complexity grows rapidly with the number of supported languages. In this work, we propose a computation-efficient network named Language-Routing Mixture of Experts (LR-MoE) for multilingual and code-switching ASR. LR-MoE extracts language-specific representations through the Mixture of Language Experts (MLE), which is guided to learn by a frame-wise language routing mechanism. The weight-shared frame-level language identification (LID) network is jointly trained as the shared pre-router of each MoE layer. Experiments show that the proposed method significantly improves multilingual and code-switching speech recognition performance over the baseline with comparable computational efficiency.
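A minimal sketch of frame-wise language routing: a shared LID router assigns each frame to one language expert, so only that expert's FFN runs for the frame. The dimensions and the hard-routing choice are illustrative, not the paper's exact LR-MoE layer:

```python
import torch
import torch.nn as nn

class LanguageRoutedMoE(nn.Module):
    def __init__(self, d=256, n_langs=3):
        super().__init__()
        self.router = nn.Linear(d, n_langs)   # frame-level LID logits (shared)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
            for _ in range(n_langs))

    def forward(self, x):                      # x: (batch, frames, d)
        lang = self.router(x).argmax(-1)       # hard-route each frame by LID
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = lang == i
            out[mask] = expert(x[mask])        # each frame visits one expert
        return out, lang                       # lang can also feed an LID loss

y, lang = LanguageRoutedMoE()(torch.randn(2, 50, 256))
```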
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Abstract
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that can be extracted for robot manipulation in the form of reasoning and planning. Despite the progress, most still rely on pre-defined motion primitives to carry out the physical interactions with the environment, which remains a major bottleneck. In this work, we aim to synthesize robot trajectories, i.e., a dense sequence of 6-DoF end-effector waypoints, for a large variety of manipulation tasks given an open-set of instructions and an open-set of objects. We achieve this by first observing that LLMs excel at inferring affordances and constraints given a free-form language instruction. More importantly, by leveraging their code-writing capabilities, they can interact with a visual-language model (VLM) to compose 3D value maps to ground the knowledge into the observation space of the agent. The composed value maps are then used in a model-based planning framework to zero-shot synthesize closed-loop robot trajectories with robustness to dynamic perturbations. We further demonstrate how the proposed framework can benefit from online experiences by efficiently learning a dynamics model for scenes that involve contact-rich interactions. We present a large-scale study of the proposed method in both simulated and real-robot environments, showcasing the ability to perform a large variety of everyday manipulation tasks specified in free-form natural language. Project website: https://voxposer.github.io
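A toy sketch of composing an affordance map with a constraint (avoidance) map over a voxel grid and choosing the next waypoint, in the spirit of the composed 3D value maps; the grid, object locations, and weights are made up, and a real pipeline would obtain them from LLM-written code querying a VLM:

```python
import numpy as np

grid = (64, 64, 64)                        # voxelized workspace
xx, yy, zz = np.meshgrid(*[np.arange(n) for n in grid], indexing="ij")

def gaussian_bump(center, sigma=5.0):
    d2 = (xx - center[0])**2 + (yy - center[1])**2 + (zz - center[2])**2
    return np.exp(-d2 / (2 * sigma**2))

# "grasp the handle": affordance peak at the (hypothetical) handle voxel
affordance = gaussian_bump((40, 20, 10))
# "stay away from the vase": penalty around the (hypothetical) vase voxel
constraint = -1.5 * gaussian_bump((42, 24, 10), sigma=8.0)

value = affordance + constraint            # composed 3D value map
waypoint = np.unravel_index(np.argmax(value), grid)
print("next end-effector waypoint (voxel):", waypoint)
```

A model-based planner would roll this selection forward in a closed loop, re-composing the maps as the scene changes.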
Transformers in Reinforcement Learning: A Survey
Authors: Pranav Agarwal, Aamer Abdul Rahman, Pierre-Luc St-Charles, Simon J.D. Prince, Samira Ebrahimi Kahou
Abstract
Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability. We begin by providing a brief domain overview of RL, followed by a discussion on the challenges of classical RL algorithms. Next, we delve into the properties of the transformer and its variants and discuss the characteristics that make them well-suited to address the challenges inherent in RL. We examine the application of transformers to various aspects of RL, including representation learning, transition and reward function modeling, and policy optimization. We also discuss recent research that aims to enhance the interpretability and efficiency of transformers in RL, using visualization techniques and efficient training strategies. Often, the transformer architecture must be tailored to the specific needs of a given application. We present a broad overview of how transformers have been adapted for several applications, including robotics, medicine, language modeling, cloud computing, and combinatorial optimization. We conclude by discussing the limitations of using transformers in RL and assess their potential for catalyzing future breakthroughs in this field.
Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems
Authors: Julian Moosmann, Hanna Mueller, Nicky Zimmerman, Georg Rutishauser, Luca Benini, Michele Magno
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Abstract
This paper deploys and explores variants of TinyissimoYOLO, a highly flexible and fully quantized ultra-lightweight object detection network designed for edge systems with a power envelope of a few milliwatts. With experimental measurements, we present a comprehensive characterization of the network's detection performance, exploring the impact of various parameters, including input resolution, number of object classes, and hidden layer adjustments. We deploy variants of TinyissimoYOLO on state-of-the-art ultra-low-power extreme edge platforms, presenting an in-depth comparison of latency, energy efficiency, and the platforms' ability to efficiently parallelize the workload. In particular, the paper presents a comparison between a novel parallel RISC-V processor (GAP9 from Greenwaves) with and without use of its on-chip hardware accelerator, an ARM Cortex-M7 core (STM32H7 from ST Microelectronics), two ARM Cortex-M4 cores (STM32L4 from STM and Apollo4b from Ambiq), and a multi-core platform with a CNN hardware accelerator (Analog Devices MAX78000). Experimental results show that the GAP9's hardware accelerator achieves the lowest inference latency and energy at 2.12ms and 150uJ, respectively, which is around 2x faster and 20% more efficient than the next best platform, the MAX78000. The hardware accelerator of GAP9 can even run an increased-resolution version of TinyissimoYOLO with 112x112 pixels and 10 detection classes within 3.2ms, consuming 245uJ. To showcase the competitiveness of a versatile general-purpose system, we also deployed and profiled a multi-core implementation on GAP9 at different operating points, achieving 11.3ms with the lowest-latency configuration and 490uJ with the most energy-efficient one. With this paper, we demonstrate the suitability and flexibility of TinyissimoYOLO on state-of-the-art detection datasets for real-time ultra-low-power edge inference.
Unsupervised Optical Flow Estimation with Dynamic Timing Representation for Spike Camera
Abstract
Efficiently selecting an appropriate spike stream data length to extract precise information is the key to spike vision tasks. To address this issue, we propose a dynamic timing representation for spike streams. Based on a multi-layer architecture, it applies dilated convolutions along the temporal dimension to extract features on multiple temporal scales with few parameters. We also design layer attention to dynamically fuse these features. Moreover, we propose an unsupervised learning method for optical flow estimation in a spike-based manner to break the dependence on labeled data. In addition, to verify robustness, we build a spike-based synthetic validation dataset for extreme scenarios in autonomous driving, denoted the SSES dataset, which consists of various corner cases. Experiments show that our method can predict optical flow from spike streams in different high-speed scenes, including real scenes. For instance, our method achieves $15\%$ and $19\%$ error reduction relative to the best prior spike-based work, SCFlow, at $\Delta t=10$ and $\Delta t=20$, respectively, under the same settings as previous works.
An Effective and Efficient Time-aware Entity Alignment Framework via Two-aspect Three-view Label Propagation
Authors: Li Cai, Xin Mao, Youshao Xiao, Changxu Wu, Man Lan
Abstract
Entity alignment (EA) aims to find the equivalent entity pairs between different knowledge graphs (KGs), which is crucial to promote knowledge fusion. With the wide use of temporal knowledge graphs (TKGs), time-aware EA (TEA) methods appear to enhance EA. Existing TEA models are based on Graph Neural Networks (GNN) and achieve state-of-the-art (SOTA) performance, but it is difficult to transfer them to large-scale TKGs due to the scalability issue of GNN. In this paper, we propose an effective and efficient non-neural EA framework between TKGs, namely LightTEA, which consists of four essential components: (1) Two-aspect Three-view Label Propagation, (2) Sparse Similarity with Temporal Constraints, (3) Sinkhorn Operator, and (4) Temporal Iterative Learning. All of these modules work together to improve the performance of EA while reducing the time consumption of the model. Extensive experiments on public datasets indicate that our proposed model significantly outperforms the SOTA methods for EA between TKGs, and the time consumed by LightTEA is only dozens of seconds at most, no more than 10% of that of the most efficient TEA method.
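Component (3), the Sinkhorn operator, turns an entity-similarity matrix into a near-doubly-stochastic alignment matrix by alternating row and column normalization; a numpy sketch with a toy similarity matrix (temperature and iteration count are illustrative):

```python
import numpy as np

def sinkhorn(S, tau=0.05, n_iters=20):
    """Alternating row/column normalization of exp(S / tau)."""
    K = np.exp(S / tau)
    for _ in range(n_iters):
        K /= K.sum(axis=1, keepdims=True)   # rows sum to 1
        K /= K.sum(axis=0, keepdims=True)   # columns sum to 1
    return K

rng = np.random.default_rng(0)
S = rng.random((5, 5))
S[np.arange(5), np.arange(5)] += 1.0        # true matches on the diagonal
P = sinkhorn(S)
print(P.argmax(axis=1))                      # recovered alignment: [0 1 2 3 4]
```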
Semantic Communications System with Model Division Multiple Access and Controllable Coding Rate for Point Cloud
Abstract
Point cloud, as a 3D representation, is widely used in autonomous driving, virtual reality (VR), and augmented reality (AR). However, traditional communication systems treat the point cloud's semantic information as irrelevant to communication, which hinders the efficient transmission of point clouds in the era of artificial intelligence (AI). This paper proposes a point cloud based semantic communication system (PCSC), which uses AI-based encoding techniques to extract the semantic information of the point cloud and joint source-channel coding (JSCC) technology to overcome the distortion caused by noisy channels and solve the "cliff effect" of traditional communication. In addition, the system realizes a controllable coding rate without fine-tuning the network. The method analyzes the importance of the coded semantic vector and discards semantically unimportant information, thereby improving transmission efficiency. Besides, PCSC and the recently proposed non-orthogonal model division multiple access (MDMA) technology are combined to design a point cloud MDMA transmission system (M-PCSC) for multi-user transmission. Experimental results show that the proposed method outperforms the traditional method by 10dB at the same channel bandwidth ratio under the PSNR D1 and PSNR D2 metrics. In terms of transmission, the proposed method can effectively solve the "cliff effect" in traditional methods.
On the Design of Nonlinear MPC and LPVMPC for Obstacle Avoidance in Autonomous Driving
Authors: Maryam Nezami, Dimitrios S. Karachalios, Georg Schildbach, Hossam S. Abbas
Abstract
In this study, we are concerned with autonomous driving missions when a static obstacle blocks a given reference trajectory. To provide a realistic control design, we employ a model predictive control (MPC) utilizing nonlinear state-space dynamic models of a car with linear tire forces, allowing for optimal path planning and tracking to overtake the obstacle. We provide solutions with two different methodologies. Firstly, we solve a nonlinear MPC (NMPC) problem with a nonlinear optimization framework, capable of considering the nonlinear constraints. Secondly, by introducing scheduling signals, we embed the nonlinear dynamics in a linear parameter varying (LPV) representation with adaptive linear constraints for realizing the nonlinear constraints associated with the obstacle. Consequently, an LPVMPC optimization problem can be solved efficiently as a quadratic programming (QP) that constitutes the main novelty of this work. We test the two methods for a challenging obstacle avoidance task and provide qualitative comparisons. The LPVMPC shows a significant reduction in terms of the computational burden at the expense of a slight loss of performance.
Rate-Power Tradeoff in THz SWIPT Systems Employing Resonant Tunnelling Diode-based EH Circuits
Authors: Nikita Shanin, Simone Clochiatti, Kenneth M. Mayer, Laura Cottatellucci, Nils Weimann, Robert Schober
Abstract
In this paper, we study THz simultaneous wireless information and power transfer (SWIPT) systems. Since coherent information detection is challenging at THz frequencies and Schottky diodes may not be efficient for THz energy harvesting (EH) and information detection, we employ unipolar amplitude shift keying (ASK) modulation at the transmitter (TX) and a resonant tunnelling diode (RTD)-based EH circuit at the receiver (RX) to extract both information and power from the RX signal. We model the dependence of the instantaneous output power at the RX on the instantaneous received power by a non-linear piecewise function, whose parameters are adjusted to fit circuit simulation results. To determine the rate-power tradeoff in THz SWIPT systems, we derive the distribution of the TX signal that maximizes the mutual information between the TX and RX signals subject to constraints on the required average harvested power at the RX and the peak signal amplitude at the TX. Since the computational complexity of maximizing the mutual information may be too high for real-time THz SWIPT systems, for high and low required average harvested powers, we also obtain the suboptimal input signal distribution that maximizes the achievable information rate numerically and in closed form, respectively. Furthermore, based on the obtained results, we propose a suboptimal closed-form TX distribution which also achieves a desired harvested power at the RX. Our simulation results show that a lower reverse current flow and a higher breakdown voltage of the employed RTD are preferable when the input signal power at the RX is low and high, respectively. Finally, we demonstrate that for low and high received signal powers, the rate-power tradeoff of THz SWIPT systems is determined by the peak amplitude of the TX signal and the maximum instantaneous harvested power, respectively.
Acceleration of complex matrix multiplication using arbitrary precision floating-point arithmetic
Abstract
Efficient multiple-precision linear numerical computation libraries such as MPLAPACK are critical for dealing with ill-conditioned problems. Specifically, optimization methods for matrix multiplication, such as the Strassen algorithm and the Ozaki scheme, can be used to speed up computation. For complex matrix multiplication, the 3M method can also be used, which requires only three multiplications of real matrices instead of the four required by the 4M method. In this study, we extend these optimization methods to arbitrary-precision complex matrix multiplication and verify the resulting speedup through benchmark tests. The optimization methods are also applied to complex LU decomposition using matrix multiplication to demonstrate that the Ozaki scheme can achieve higher computation speeds.
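The 3M trick itself is standard and compact; a minimal NumPy sketch, where the third product recovers the imaginary part:

```python
import numpy as np

def matmul_3m(A, B):
    """Complex matrix product with three real matrix multiplications
    (Karatsuba-style) instead of the naive four."""
    Ar, Ai = A.real, A.imag
    Br, Bi = B.real, B.imag
    P1 = Ar @ Br
    P2 = Ai @ Bi
    P3 = (Ar + Ai) @ (Br + Bi)
    # real part = P1 - P2, imaginary part = P3 - P1 - P2
    return (P1 - P2) + 1j * (P3 - P1 - P2)

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
B = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
assert np.allclose(matmul_3m(A, B), A @ B)
```

The same structure carries over to arbitrary-precision real matrix products, which is where the Strassen and Ozaki optimizations the paper studies come into play.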
Exploring Millions of User Interactions with ICEBOAT: Big Data Analytics for Automotive User Interfaces
Authors: Patrick Ebel, Kim Julian Gülle, Christoph Lingenfelder, Andreas Vogelsang
Abstract
User Experience (UX) professionals need to be able to analyze large amounts of usage data on their own to make evidence-based design decisions. However, the design process for In-Vehicle Information Systems (IVIS) lacks data-driven support and effective tools for visualizing and analyzing user interaction data. We therefore propose ICEBOAT, an interactive visualization tool tailored to the needs of automotive UX experts to effectively and efficiently evaluate driver interactions with IVISs. ICEBOAT visualizes telematics data collected from production-line vehicles, allowing UX experts to perform task-specific analyses. Following a mixed-methods user-centered design (UCD) approach, we conducted an interview study (N=4) to extract the domain-specific information and interaction needs of automotive UX experts, and used a co-design approach (N=4) to develop an interactive analysis tool. Our evaluation (N=12) shows that ICEBOAT enables UX experts to efficiently generate knowledge that facilitates data-driven design decisions.
Fast Decoding of Lifted Interleaved Linearized Reed-Solomon Codes for Multishot Network Coding
Abstract
Martínez-Peñas and Kschischang (IEEE Trans. Inf. Theory, 2019) proposed lifted linearized Reed-Solomon codes as suitable codes for error control in multishot network coding. We show how to construct and decode lifted interleaved linearized Reed-Solomon (LILRS) codes. Compared to the construction by Martínez-Peñas and Kschischang, interleaving allows us to increase the decoding region significantly and decreases the overhead due to the lifting (i.e., increases the code rate), at the cost of an increased packet size. We propose two decoding schemes for LILRS codes that are both capable of correcting insertions and deletions beyond half the minimum distance of the code, by either allowing a list or a small decoding failure probability: a probabilistic unique Loidreau-Overbeck-like decoder, and an efficient interpolation-based decoding scheme that can be used either as a list decoder (with exponential worst-case list size) or as a probabilistic unique decoder. We derive upper bounds on the decoding failure probability of the probabilistic unique decoders, which show that the failure probability is very small for most channel realizations up to the maximal decoding radius. The tightness of the bounds is verified by Monte Carlo simulations.
Recognizing student identification numbers from the matrix templates using a modified U-net architecture
Authors: Filip Pavičić
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
This paper presents an innovative approach to student identification during exams and knowledge tests that overcomes the limitations of traditional personal-information entry. The proposed method employs a matrix template on a designated section of the exam, in which squares containing numbers are selectively blackened. A neural network based on a specially adapted U-Net architecture, trained on an extensive dataset of images of blackened tables, is developed to recognize students' personal identification numbers. The network demonstrates proficiency in recognizing the patterns and arrangement of blackened squares and accurately interprets the information inscribed in them. The model also exhibits high accuracy in identifying entered student personal numbers and effectively detects erroneous entries within the table. This approach offers multiple advantages. First, it significantly accelerates the exam marking process by automatically extracting identifying information from the blackened tables, eliminating manual entry and minimizing the potential for errors. Second, it automates the identification process, reducing administrative effort and expediting data processing. This identification system replaces the conventional manual entry of personal data with a streamlined, efficient, and accurate process, a notable advancement for exams and knowledge tests.
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression
Abstract
A light field is a type of image data that captures 3D scene information by recording light rays emitted from the scene at various orientations. It offers a more immersive perception than classic 2D images, but at the cost of huge data volume. In this paper, we draw inspiration from the visual characteristics of Sub-Aperture Images (SAIs) of light fields and design a compact neural network representation for the light field compression task. The network backbone takes randomly initialized noise as input and is supervised on the SAIs of the target light field. It is composed of two types of complementary kernels: descriptive kernels (descriptors) that store scene description information learned during training, and modulatory kernels (modulators) that control the rendering of different SAIs from the queried perspectives. To further enhance the compactness of the network while retaining high quality of the decoded light field, we introduce modulator allocation and kernel tensor decomposition mechanisms, followed by non-uniform quantization and lossless entropy coding, to form an efficient compression pipeline. Extensive experiments demonstrate that our method outperforms other state-of-the-art (SOTA) methods by a significant margin on the light field compression task. Moreover, after aligning descriptors, the modulators learned from one light field can be transferred to new light fields for rendering dense views, indicating a potential solution for the view synthesis task.
NetGPT: A Native-AI Network Architecture Beyond Provisioning Personalized Generative Services
Abstract
Large language models (LLMs) have achieved tremendous success in empowering daily life with generated information, and personalizing LLMs could further improve their applications through better alignment with human intents. Towards personalized generative services, a collaborative cloud-edge methodology is promising, as it facilitates the effective orchestration of heterogeneous distributed communication and computing resources. In this article, after discussing the pros and cons of several candidate cloud-edge collaboration techniques, we put forward NetGPT, which deploys appropriate LLMs at the edge and in the cloud in accordance with their computing capacity. In addition, edge LLMs can efficiently leverage location-based information for personalized prompt completion, thus benefiting the interaction with cloud LLMs. After deploying representative open-source LLMs (e.g., the GPT-2-base and LLaMA models) at the edge and in the cloud, we demonstrate the feasibility of NetGPT on the basis of low-rank adaptation-based lightweight fine-tuning. Subsequently, we highlight the essential changes required for a native artificial intelligence (AI) network architecture towards NetGPT, with special emphasis on deeper integration of communication and computing resources and careful calibration of the logical AI workflow. Furthermore, we demonstrate several by-product benefits of NetGPT, given the edge LLM's capability to predict trends and infer intents, which may lead to a unified solution for intelligent network management and orchestration. In a nutshell, we argue that NetGPT is a promising native-AI network architecture that goes beyond provisioning personalized generative services.
CellGAN: Conditional Cervical Cell Synthesis for Augmenting Cytopathological Image Classification
Abstract
Automatic examination of thin-prep cytologic test (TCT) slides can assist pathologists in finding cervical abnormalities for accurate and efficient cancer screening. Because whole slide images of TCT are extremely large, current solutions mostly need to localize suspicious cells and classify abnormality based on local patches, which requires many annotations of normal and abnormal cervical cells to supervise the training of the patch-level classifier. In this paper, we propose CellGAN to synthesize cytopathological images of various cervical cell types for augmenting patch-level cell classification. Built upon a lightweight backbone, CellGAN is equipped with a non-linear class-mapping network to effectively incorporate cell type information into image generation. We also propose a Skip-layer Global Context module to model the complex spatial relationships of the cells, and we attain high fidelity of the synthesized images through adversarial learning. Our experiments demonstrate that CellGAN can produce visually plausible TCT cytopathological images for different cell types, and we validate its effectiveness in substantially augmenting patch-level cell classification performance.
An Architecture for Control Plane Slicing in Beyond 5G Networks
Abstract
To accommodate various use cases with differing characteristics, the Fifth Generation (5G) mobile communications system intends to utilize network slicing, which enables the creation of multiple logical networks over a shared physical network infrastructure. While problems such as resource allocation for multiple slices in mobile networks have been explored in considerable detail in the existing literature, the suitability of the existing mobile network architecture to support network slicing has not been analysed adequately. We argue that the existing 5G System (5GS) architecture suffers from certain limitations, such as a lack of slice isolation in its control plane. This work focuses on the future evolution of the existing 5GS architecture from a slicing perspective, especially that of its control plane, addressing some of these limitations. We propose a new network architecture that enables efficient slicing in beyond-5G networks. The proposed architecture results in enhanced modularity and scalability of the control plane in sliced mobile networks, and it also brings slice isolation to the control plane, which is not feasible in the existing 5G system. We also present a performance evaluation that confirms the improved performance and scalability of the proposed system vis-à-vis the existing 5G system.
Power Loss Minimization of Distribution Network using Different Grid Strategies
Abstract
Power losses in electrical power systems, especially distribution systems, occur due to several environmental and technical factors. Transmission and distribution line losses are normally 17% and 50%, respectively. These losses are due to inappropriately sized conductors, long distribution lines, low power factor, overloading of lines, etc.; they cause economic loss and reduce the system's reliability. The reliability of electrical power systems can be improved by decreasing network power loss and by improving the voltage profile. In radial distribution systems, power loss can also be minimized through the placement of Distributed Generation (DG). In this thesis, three grid strategies, real power sharing, reactive power injection, and transformer tap changing, are discussed and used to minimize line losses. The three strategies are implemented using power flow studies based on the Newton-Raphson (NR) method and a Genetic Algorithm (GA); both methods are applied to each grid strategy. The test system used in this work is the IEEE 30-bus radial distribution system. Simulation results for each grid strategy using NR and GA show that real load sharing is more reliable with respect to line-loss minimization than the reactive power injection and transformer tap changing strategies. A comparative analysis between GA and NR for each grid strategy shows that the Genetic Algorithm is more reliable and efficient for loss minimization than Newton-Raphson: in the base-case optimum power flow solution, real line losses are 9.481475 MW with GA and 17.557 MW with NR. GA is therefore preferable to NR for minimizing line losses under each proposed grid strategy.
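As a sketch of the optimization side, here is a minimal real-coded GA skeleton of the kind that could drive such a study; the loss function is a placeholder for a Newton-Raphson power-flow evaluation, and all hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def line_loss(x):
    """Placeholder for a power-flow (e.g., Newton-Raphson) evaluation
    of real line losses at control setting x; a toy quadratic here."""
    return float(((x - 0.3) ** 2).sum())

def ga_minimize(fitness, dim, pop=30, gens=50, sigma=0.1):
    """Minimal real-coded GA: tournament selection, blend crossover,
    Gaussian mutation."""
    P = rng.uniform(0.0, 1.0, (pop, dim))
    for _ in range(gens):
        f = np.array([fitness(p) for p in P])
        children = []
        for _ in range(pop):
            i, j = rng.integers(pop, size=2)
            a = P[i] if f[i] < f[j] else P[j]      # tournament winner 1
            k, l = rng.integers(pop, size=2)
            b = P[k] if f[k] < f[l] else P[l]      # tournament winner 2
            w = rng.uniform(0.0, 1.0, dim)
            child = w * a + (1.0 - w) * b          # blend crossover
            child += rng.normal(0.0, sigma, dim)   # Gaussian mutation
            children.append(np.clip(child, 0.0, 1.0))
        P = np.array(children)
    f = np.array([fitness(p) for p in P])
    return P[f.argmin()], float(f.min())

best, loss = ga_minimize(line_loss, dim=4)
```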
Connectivity Labeling for Multiple Vertex Failures
Abstract
We present an efficient labeling scheme for answering connectivity queries in graphs subject to a specified number of vertex failures. Our first result is a randomized construction of a labeling function that assigns vertices $O(f^3\log^5 n)$-bit labels, such that given the labels of $F\cup \{s,t\}$ where $|F|\leq f$, we can correctly report, with probability $1-1/\mathrm{poly}(n)$, whether $s$ and $t$ are connected in $G-F$. However, it is possible that over all $n^{O(f)}$ distinct queries, some are answered incorrectly. Our second result is a deterministic labeling function that produces $O(f^7 \log^{13} n)$-bit labels such that all connectivity queries are answered correctly. Both upper bounds are polynomially off from an $\Omega(f)$-bit lower bound. Our labeling schemes are based on a new low degree decomposition that improves the Duan-Pettie decomposition, and facilitates its distributed representation. We make heavy use of randomization to construct hitting sets, fault-tolerant graph sparsifiers, and in constructing linear sketches. Our derandomized labeling scheme combines a variety of techniques: the method of conditional expectations, hit-miss hash families, and $\epsilon$-nets for axis-aligned rectangles. The prior labeling scheme of Parter and Petruschka shows that $f=1$ and $f=2$ vertex faults can be handled with $O(\log n)$- and $O(\log^3 n)$-bit labels, respectively, and for $f>2$ vertex faults, $\tilde{O}(n^{1-1/2^{f-2}})$-bit labels suffice.
A Comparative Analysis Between the Additive and the Multiplicative Extended Kalman Filter for Satellite Attitude Determination
Authors: Hamza A. Hassan, William Tolstrup, Johanes P. Suriana, Ibrahim D. Kiziloklu
Abstract
The general consensus, based on a wealth of theoretical evidence, is that the Multiplicative Extended Kalman Filter (MEKF) is superior to the Additive Extended Kalman Filter (AEKF). This paper presents a practical comparison between the two filters in simulation, with the goal of verifying whether the theoretical foundations hold. The AEKF and MEKF are two variants of the Extended Kalman Filter that differ in their approach to linearizing the system dynamics: the AEKF uses an additive correction term to update the state estimate, while the MEKF uses a multiplicative correction term. The two filters also differ in their state representation: the AEKF uses the quaternion as its state, while the MEKF uses the Gibbs vector. The results show that the MEKF consistently outperforms the AEKF in terms of estimation accuracy, with lower uncertainty. The AEKF is more computationally efficient, but the difference is so small as to be almost negligible and has no effect on a real-time application. Overall, the results suggest that the MEKF is the better choice for satellite attitude estimation due to its superior estimation accuracy and lower uncertainty, which agrees with the conclusions of previous work.
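The two correction steps are easy to contrast in code; a minimal sketch under the usual small-angle convention for the multiplicative error quaternion (the Kalman gain computation is omitted, and the 3-vector error below stands in for the Gibbs-vector state):

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of quaternions in [w, x, y, z] order."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def aekf_update(q, dq):
    """AEKF-style correction: add a 4-vector increment, renormalize."""
    q_new = q + dq
    return q_new / np.linalg.norm(q_new)

def mekf_update(q, dtheta):
    """MEKF-style correction: compose the estimate with a small-angle
    error quaternion built from a 3-vector attitude error."""
    dq = np.concatenate(([1.0], 0.5 * dtheta))
    q_new = quat_mul(q, dq)
    return q_new / np.linalg.norm(q_new)

q = np.array([1.0, 0.0, 0.0, 0.0])
print(aekf_update(q, np.array([0.0, 0.01, 0.0, 0.0])))
print(mekf_update(q, np.array([0.02, 0.0, 0.0])))
```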
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Authors: Mostafa Dehghani, Basil Mustafa, Josip Djolonga, Jonathan Heek, Matthias Minderer, Mathilde Caron, Andreas Steiner, Joan Puigcerver, Robert Geirhos, Ibrahim Alabdulmohsin, Avital Oliver, Piotr Padlewski, Alexey Gritsenko, Mario Lučić, Neil Houlsby
Abstract
The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged. However, models such as the Vision Transformer (ViT) offer flexible sequence-based modeling, and hence varying input sequence lengths. We take advantage of this with NaViT (Native Resolution ViT) which uses sequence packing during training to process inputs of arbitrary resolutions and aspect ratios. Alongside flexible model usage, we demonstrate improved training efficiency for large-scale supervised and contrastive image-text pretraining. NaViT can be efficiently transferred to standard tasks such as image and video classification, object detection, and semantic segmentation and leads to improved results on robustness and fairness benchmarks. At inference time, the input resolution flexibility can be used to smoothly navigate the test-time cost-performance trade-off. We believe that NaViT marks a departure from the standard, CNN-designed, input and modelling pipeline used by most computer vision models, and represents a promising direction for ViTs.
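The packing step can be illustrated independently of the model; a greedy first-fit sketch over per-image patch counts (not necessarily NaViT's exact policy):

```python
def pack_sequences(lengths, max_tokens):
    """Greedy first-fit-decreasing packing of variable-length patch
    sequences into fixed-capacity token buffers."""
    bins = []  # each bin: [remaining capacity, [sequence indices]]
    for idx, n in sorted(enumerate(lengths), key=lambda t: -t[1]):
        for b in bins:
            if b[0] >= n:          # fits into an existing buffer
                b[0] -= n
                b[1].append(idx)
                break
        else:                      # open a new buffer
            bins.append([max_tokens - n, [idx]])
    return [b[1] for b in bins]

# images at different resolutions/aspect ratios yield different patch counts
print(pack_sequences([196, 49, 256, 64, 144], max_tokens=300))
```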
Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes
Authors: Sohom Mukherjee, Nicolas Loizou, Sebastian U. Stich
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Abstract
State-of-the-art federated learning algorithms such as FedAvg require carefully tuned stepsizes to achieve their best performance. The improvements proposed by existing adaptive federated methods involve tuning of additional hyperparameters such as momentum parameters, and consider adaptivity only in the server aggregation round, but not locally. These methods can be inefficient in many practical scenarios because they require excessive tuning of hyperparameters and do not capture local geometric information. In this work, we extend the recently proposed stochastic Polyak stepsize (SPS) to the federated learning setting, and propose new locally adaptive and nearly parameter-free distributed SPS variants (FedSPS and FedDecSPS). We prove that FedSPS converges linearly in strongly convex and sublinearly in convex settings when the interpolation condition (overparametrization) is satisfied, and converges to a neighborhood of the solution in the general case. We extend our proposed method to a decreasing stepsize version FedDecSPS, that converges also when the interpolation condition does not hold. We validate our theoretical claims by performing illustrative convex experiments. Our proposed algorithms match the optimization performance of FedAvg with the best tuned hyperparameters in the i.i.d. case, and outperform FedAvg in the non-i.i.d. case.
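The capped stochastic Polyak stepsize at the heart of FedSPS has a simple closed form; a minimal local-step sketch (the constants c and gamma_b are illustrative):

```python
import numpy as np

def sps_stepsize(loss_i, grad_i, l_star=0.0, c=0.5, gamma_b=1.0):
    """Capped SPS: gamma = min{(f_i(x) - l_i*) / (c * ||grad||^2), gamma_b}."""
    g2 = float(np.dot(grad_i, grad_i))
    if g2 == 0.0:
        return 0.0
    return min((loss_i - l_star) / (c * g2), gamma_b)

# one locally adaptive SGD-style step on f_i(x) = ||x||^2 (so l_i* = 0)
x = np.array([1.0, -2.0])
grad = 2.0 * x
x = x - sps_stepsize(float(x @ x), grad) * grad
```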
SAGE -- A Tool for Optimal Deployments in Kubernetes Clusters
Authors: Vlad-Ioan Luca, Madalina Erascu
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
Cloud computing has brought a fundamental transformation in how organizations operate their applications, enabling them to achieve affordable high availability of services. Kubernetes has emerged as the preferred choice for container orchestration and service management across many Cloud computing platforms. The scheduler in Kubernetes plays a crucial role in determining the placement of newly deployed service containers. However, the default scheduler, while fast, often lacks optimization, leading to inefficient service placement or even deployment failures. This paper introduces SAGE, a tool that computes optimal deployments in Kubernetes clusters and can also assist the Kubernetes default scheduler, or any custom scheduler, in application deployment. SAGE computes an optimal deployment plan based on the constraints of the application to be deployed and the available Cloud resources. We show the potential benefits of using SAGE by considering test cases with various characteristics. SAGE surpasses other schedulers by comprehensively analyzing the application demands together with a complete image of the cluster; this allows it to better understand the needs of the pods, resulting in consistently optimal solutions across all scenarios. The accompanying material of this paper is publicly available at https://github.com/SAGE-Project/SAGE-Predeployer.
Information-Theoretically Private Federated Submodel Learning with Storage Constrained Databases
Authors: Sajani Vithana, Sennur Ulukus
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Abstract
In federated submodel learning (FSL), a machine learning model is divided into multiple submodels based on different types of data used for training. Each user involved in the training process only downloads and updates the submodel relevant to the user's local data, which significantly reduces the communication cost compared to classical federated learning (FL). However, the index of the submodel updated by the user and the values of the updates reveal information about the user's private data. In order to guarantee information-theoretic privacy in FSL, the model is stored at multiple non-colluding databases, and the user sends queries and updates to each database in such a way that no information is revealed on the updating submodel index or the values of the updates. In this work, we consider the practical scenario where the multiple non-colluding databases are allowed to have arbitrary storage constraints. The goal of this work is to develop read-write schemes and storage mechanisms for FSL that efficiently utilize the available storage in each database to store the submodel parameters in such a way that the total communication cost is minimized while guaranteeing information-theoretic privacy of the updating submodel index and the values of the updates. As the main result, we consider both heterogeneous and homogeneous storage constrained databases, and propose private read-write and storage schemes for the two cases.
Keyword: faster
Area, Delay, and Energy-Efficient Full Dadda Multiplier
Authors: Muteen Munawar, Zain Shabbir, Muhammad Akram
Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR)
Abstract
The Dadda algorithm is a parallel-structured multiplier that is considerably faster than array multipliers, e.g., Booth, Braun, and Baugh-Wooley. However, it consumes more power and needs a larger number of gates for hardware implementation. In this paper, a modified Dadda-algorithm-based multiplier is designed using a proposed half-adder-based carry-select adder with a binary to excess-1 converter and an improved ripple-carry adder (RCA). The proposed design is simulated in different technologies, i.e., Taiwan Semiconductor Manufacturing Company (TSMC) 50 nm, 90 nm, and 120 nm, and at different frequencies, i.e., 0.5, 1, 2, and 3.33 GHz. Specifically, the 4-bit circuit of the proposed design in TSMC's 50 nm technology consumes 25 uW of power at 3.33 GHz with 76 ps of delay. The simulation results reveal that the design is faster, more power- and energy-efficient, and requires fewer transistors than some closely related works. The proposed design can be a promising candidate for low-power and low-cost digital controllers. Finally, the design is compared with recent relevant works in the literature.
DSPC: Efficiently Answering Shortest Path Counting on Dynamic Graphs
Abstract
The widespread use of graph data in various applications and the highly dynamic nature of today's networks have made it imperative to analyze structural trends in dynamic graphs on a continual basis. The shortest path is a fundamental concept in graph analysis, and recent research shows that counting the number of shortest paths between two vertices is crucial in applications like potential friend recommendation and betweenness analysis. However, current studies that use hub labeling techniques for real-time shortest path counting are limited by their reliance on a pre-computed index, which cannot tackle frequent updates over dynamic graphs. To address this, we propose a novel approach for maintaining the index in response to changes in the graph structure and develop incremental (IncSPC) and decremental (DecSPC) update algorithms for inserting and deleting vertices/edges, respectively. The main idea of these two algorithms is to locate only the affected vertices when updating the index. Our experiments demonstrate that our dynamic algorithms process incremental updates up to four orders of magnitude faster, and hybrid updates up to three orders of magnitude faster, than index reconstruction.
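The static primitive being accelerated, counting shortest paths between two vertices, is a textbook BFS variant; a minimal sketch for unweighted graphs:

```python
from collections import deque

def count_shortest_paths(adj, s, t):
    """BFS that counts shortest s-t paths in an unweighted graph."""
    dist, cnt = {s: 0}, {s: 1}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in dist:                # first visit: set distance
                dist[v] = dist[u] + 1
                cnt[v] = cnt[u]
                q.append(v)
            elif dist[v] == dist[u] + 1:     # another shortest path into v
                cnt[v] += cnt[u]
    return cnt.get(t, 0)

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(count_shortest_paths(adj, 0, 3))       # 2 (0-1-3 and 0-2-3)
```

Hub-labeling indexes precompute enough of this information to answer such queries without a traversal; the paper's contribution is keeping that index correct as vertices and edges change.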
FGo: A Directed Grey-box Fuzzer with Probabilistic Exponential cut-the-loss Strategies
Abstract
Traditional coverage-guided grey-box fuzzers perform a breadth-first search of the state space of the Program Under Test (PUT). This aimlessness wastes a lot of computing resources. Directed grey-box fuzzing focuses on a target in the PUT and has become one of the most popular topics in software testing. Early termination of unreachable test cases is one way to improve directed grey-box fuzzing. However, existing solutions have two problems: first, reachability analysis needs to introduce extra technologies (e.g., static analysis); second, the performance of reachability analysis and the auxiliary technologies lacks versatility. We propose FGo, a probabilistic exponential cut-the-loss directed grey-box fuzzer. FGo terminates unreachable test cases early with exponentially increasing probability. Compared to other technologies, FGo makes full use of the unreachability information contained in the iCFG and incurs no additional overhead from reachability analysis. Moreover, it generalizes easily to any PUT. This probability-based strategy is perfectly suited to the randomness of fuzzing. Experimental results show that FGo is 106% faster than AFLGo in reproducing crashes. We compare multiple parameters of the probabilistic exponential cut-the-loss algorithm and analyze them in detail. In addition, to enhance the interpretability of FGo, we discuss the difference between the theoretical and practical performance of the probabilistic exponential cut-the-loss algorithm.
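The core cut-the-loss rule is simple to sketch; an illustrative version in which the termination probability grows exponentially with how far execution has strayed from any path toward the target (p0 and base are assumptions, not FGo's actual parameters):

```python
import random

def should_terminate(steps_off_target, p0=0.05, base=2.0):
    """Terminate an apparently unreachable test case early with
    exponentially increasing probability."""
    p = min(1.0, p0 * base ** steps_off_target)
    return random.random() < p

random.seed(0)
for k in range(6):
    print(k, should_terminate(k))
```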
Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems
Authors: Julian Moosmann, Hanna Mueller, Nicky Zimmerman, Georg Rutishauser, Luca Benini, Michele Magno
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Abstract
This paper deploys and explores variants of TinyissimoYOLO, a highly flexible and fully quantized ultra-lightweight object detection network designed for edge systems with a power envelope of a few milliwatts. With experimental measurements, we present a comprehensive characterization of the network's detection performance, exploring the impact of various parameters, including input resolution, number of object classes, and hidden layer adjustments. We deploy variants of TinyissimoYOLO on state-of-the-art ultra-low-power extreme edge platforms, presenting an in-depth comparison of latency, energy efficiency, and the platforms' ability to parallelize the workload efficiently. In particular, the paper compares a novel parallel RISC-V processor (GAP9 from Greenwaves), with and without use of its on-chip hardware accelerator, against an ARM Cortex-M7 core (STM32H7 from ST Microelectronics), two ARM Cortex-M4 cores (STM32L4 from STM and Apollo4b from Ambiq), and a multi-core platform with a CNN hardware accelerator (Analog Devices MAX78000). Experimental results show that the GAP9's hardware accelerator achieves the lowest inference latency and energy at 2.12 ms and 150 uJ respectively, which is around 2x faster and 20% more efficient than the next best platform, the MAX78000. The hardware accelerator of GAP9 can even run an increased-resolution version of TinyissimoYOLO with 112x112 pixels and 10 detection classes within 3.2 ms, consuming 245 uJ. To showcase the competitiveness of a versatile general-purpose system, we also deployed and profiled a multi-core implementation on GAP9 at different operating points, achieving 11.3 ms with the lowest-latency configuration and 490 uJ with the most energy-efficient configuration. With this paper, we demonstrate the suitability and flexibility of TinyissimoYOLO on state-of-the-art detection datasets for real-time ultra-low-power edge inference.
Abstract
Holographic Intellectual Voice Assistant (HIVA) aims to facilitate human-computer interaction using audiovisual effects and a 3D avatar. HIVA provides complete information about the university, covering requests of various kinds: admission, study issues, fees, departments, university structure and history, the canteen, human resources, the library, student life and events, information about the country and the city, etc. There are other channels for receiving the data listed above: the university's official website and other supporting apps, the HEI's (Higher Education Institution) official social media, asking the HEI staff directly, and others. However, HIVA provides the unique experience of "face-to-face" interaction with an animated 3D mascot, helping to convey a sense of "real-life" communication. The system includes many sub-modules and connects a family of applications such as mobile applications, a Telegram chatbot, suggestion categorization, and entertainment services. The voice assistant uses Russian-language NLP models and tools, which are pipelined for the best user experience.
UX Heuristics and Checklist for Deep Learning powered Mobile Applications with Image Classification
Authors: Christiane Gresse von Wangenheim, Gustavo Dirschnabel
Abstract
Advances in mobile applications providing image classification enabled by Deep Learning require innovative User Experience solutions to ensure their adequate use by users. To aid the design process, usability heuristics are typically customized for a specific kind of application. Therefore, based on a literature review and an analysis of existing mobile applications with image classification, we propose an initial set of AIX heuristics for Deep Learning-powered mobile applications with image classification, decomposed into a checklist. To facilitate the usage of the checklist, we also developed an online course presenting the concepts and heuristics, as well as a web-based tool to support evaluations using these heuristics. The results of this research can be used to guide the design of the interfaces of such applications, as well as to support heuristic evaluations, helping practitioners develop image classification apps that people can understand, trust, and engage with effectively.
Towards Mobility Data Science (Vision Paper)
Authors: Mohamed Mokbel (University of Minnesota, Minneapolis, USA), Mahmoud Sakr (Université Libre, Brussels, Belgium), Li Xiong (Emory University, Atlanta, USA), Andreas Züfle (Emory University, Atlanta, USA), Jussara Almeida (Federal University of Minas Gerais, Belo Horizonte, Brazil), Walid Aref (Purdue University, West Lafayette, USA), Gennady Andrienko (Fraunhofer IAIS, St. Augustin, Germany), Natalia Andrienko (Fraunhofer IAIS, St. Augustin, Germany), Yang Cao (Kyoto University, Kyoto, Japan), Sanjay Chawla (Qatar Computing Research Institute, Doha, Qatar), Reynold Cheng (University of Hong Kong, Hong Kong, China), Panos Chrysanthis (University of Pittsburgh, Pennsylvania, USA), Xiqi Fei (George Mason University, Fairfax, USA), Gabriel Ghinita (University of Massachusetts at Boston, Boston, USA), et al. (32 additional authors not shown)
Abstract
Mobility data captures the locations of moving objects such as humans, animals, and cars. With the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies, mobility data is collected ubiquitously. In recent years, the use of mobility data has demonstrated significant impact in various domains including traffic management, urban planning, and health sciences. In this paper, we present the emerging domain of mobility data science. Towards a unified approach to mobility data science, we envision a pipeline having the following components: mobility data collection, cleaning, analysis, management, and privacy. For each of these components, we explain how mobility data science differs from general data science, we survey the current state of the art and describe open challenges for the research community in the coming years.
SnakeSynth: New Interactions for Generative Audio Synthesis
Abstract
I present "SnakeSynth," a web-based lightweight audio synthesizer that combines audio generated by a deep generative model and real-time continuous two-dimensional (2D) input to create and control variable-length generative sounds through 2D interaction gestures. Interaction gestures are touch and mobile-compatible with analogies to strummed, bowed, and plucked musical instrument controls. Point-and-click and drag-and-drop gestures directly control audio playback length and I show that sound length and intensity are modulated by interactions with a programmable 2D coordinate grid. Leveraging the speed and ubiquity of browser-based audio and hardware acceleration in Google's TensorFlow.js we generate time-varying high-fidelity sounds with real-time interactivity. SnakeSynth adaptively reproduces and interpolates between sounds encountered during model training, notably without long training times, and I briefly discuss possible futures for deep generative models as an interactive paradigm for musical expression.
Efficient Task Offloading Algorithm for Digital Twin in Edge/Cloud Computing Environment
Abstract
In the era of the Internet of Things (IoT), the Digital Twin (DT) is envisioned to empower various areas as a bridge between physical objects and the digital world. Through virtualization and simulation techniques, multiple functions can be achieved by leveraging computing resources. In this process, Mobile Cloud Computing (MCC) and Mobile Edge Computing (MEC) have become two key enablers of real-time feedback. However, current works consider only edge servers or only cloud servers in their DT system models, and these models ignore DTs with more than one data source. In this paper, we propose a new DT system model that considers a heterogeneous MEC/MCC environment. Each DT in the model is maintained in one of the servers via multiple data collection devices. We also consider the offloading decision-making problem and propose a new offloading scheme based on Distributed Deep Learning (DDL). Simulation results demonstrate that our proposed algorithm can effectively and efficiently decrease the system's average latency and energy consumption, achieving significant improvement over the baselines in a dynamic DT environment.
Applying SDN to Mobile Networks: A New Perspective for 6G Architecture
Abstract
The upcoming Sixth Generation (6G) mobile communications system envisions supporting a variety of use cases with differing characteristics, e.g., very low to extremely high data rates, diverse latency needs, ultra-massive connectivity, sustainable communications, and ultra-wide coverage. To accommodate these diverse use cases, the 6G system architecture needs to be scalable, modular, and flexible, in both its user plane and its control plane. In this paper, we identify some limitations of the existing Fifth Generation System (5GS) architecture, especially that of its control plane. Further, we propose a novel architecture for the 6G System (6GS) employing Software Defined Networking (SDN) technology to address these limitations. The control plane in the existing 5GS supports two different categories of functionality: handling end-user signalling (e.g., user registration, authentication) and controlling user-plane functions. We propose to move the end-user signalling functionality out of the mobile network control plane and treat it as a user service, i.e., as payload or data. This proposal results in an evolved service-driven architecture for mobile networks, bringing increased simplicity, modularity, scalability, flexibility, and security to its control plane. The proposed architecture can also support service-specific signalling, if needed, making it better suited to diverse 6GS use cases. To demonstrate the advantages of the proposed architecture, we also compare its performance with the 5GS using a process algebra-based simulation tool.
Exact Resource Allocation over Fair Wireless Relay Networks
Authors: Edgar Arribas, Vicent Cholvi, Vincenzo Mancuso
Subjects: Networking and Internet Architecture (cs.NI)
Abstract
In relay-enabled cellular networks, the intertwined nature of network agents calls for complex schemes to allocate wireless resources. Resources need to be distributed among mobile users while considering how relay resources are allocated, and constrained by the traffic rate achievable by base stations and over backhaul links. In this work, we derive a resource allocation scheme that achieves max-min fairness across mobile users. Furthermore, the optimal allocation is found with linear complexity with respect to the number of mobile users and relays.
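Max-min fairness itself can be illustrated with the classic progressive-filling algorithm, here ignoring the relay and backhaul coupling that the paper's scheme actually handles:

```python
def max_min_fair(capacity, demands):
    """Progressive filling: repeatedly split the remaining capacity
    equally among unsatisfied users, capping each at its demand."""
    alloc = {u: 0.0 for u in demands}
    active = set(demands)
    remaining = capacity
    while active and remaining > 1e-12:
        share = remaining / len(active)
        for u in list(active):
            give = min(share, demands[u] - alloc[u])
            alloc[u] += give
            remaining -= give
            if alloc[u] >= demands[u]:
                active.discard(u)
    return alloc

print(max_min_fair(10.0, {"a": 2.0, "b": 4.0, "c": 9.0}))
# {'a': 2.0, 'b': 4.0, 'c': 4.0}
```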
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning
Authors: Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, Niko Suenderhauf
Abstract
Large language models (LLMs) have demonstrated impressive results in developing generalist planning agents for diverse tasks. However, grounding these plans in expansive, multi-floor, and multi-room environments presents a significant challenge for robotics. We introduce SayPlan, a scalable approach to LLM-based, large-scale task planning for robotics using 3D scene graph (3DSG) representations. To ensure the scalability of our approach, we: (1) exploit the hierarchical nature of 3DSGs to allow LLMs to conduct a semantic search for task-relevant subgraphs from a smaller, collapsed representation of the full graph; (2) reduce the planning horizon for the LLM by integrating a classical path planner; and (3) introduce an iterative replanning pipeline that refines the initial plan using feedback from a scene graph simulator, correcting infeasible actions and avoiding planning failures. We evaluate our approach on two large-scale environments spanning up to 3 floors, 36 rooms, and 140 objects, and show that it can ground large-scale, long-horizon task plans from abstract, natural language instructions for a mobile manipulator robot to execute.
Tackling Computational Heterogeneity in FL: A Few Theoretical Insights
Authors: Adnan Ben Mansour, Gaia Carenini, Alexandre Duplessis
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
The future of machine learning lies in moving both data collection and training to the edge. Federated Learning (FL) has recently been proposed to achieve this goal. The principle of this approach is to aggregate models learned over a large number of distributed clients, i.e., resource-constrained mobile devices that collect data from their environment, to obtain a new, more general model, which is subsequently redistributed to clients for further training. A key feature that distinguishes federated learning from data-center-based distributed training is its inherent heterogeneity. In this work, we introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneity in federated optimization, in terms of both heterogeneous data and local updates. The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
Keyword: pruning
There is no result
Keyword: diffusion
Augmenters at SemEval-2023 Task 1: Enhancing CLIP in Handling Compositionality and Ambiguity for Zero-Shot Visual WSD through Prompt Augmentation and Text-To-Image Diffusion
Authors: Jie S. Li, Yow-Ting Shiue, Yong-Siang Shih, Jonas Geiping
Abstract
This paper describes our zero-shot approaches for the Visual Word Sense Disambiguation (VWSD) task in English. Our preliminary study shows that a simple approach of matching candidate images with the phrase using CLIP suffers from the many-to-many nature of image-text pairs. We find that the CLIP text encoder may have limited ability to capture compositionality in natural language, while the descriptive focus of the phrase varies from instance to instance. We address these issues in our two systems, Augment-CLIP and Stable Diffusion Sampling (SD Sampling). Augment-CLIP augments the text prompt by generating sentences that contain the context phrase with the help of large language models (LLMs). We further explore CLIP models in other languages, as an ambiguous word may be translated into an unambiguous one in another language. SD Sampling uses text-to-image Stable Diffusion to generate multiple images from the given phrase, increasing the likelihood that a subset of the images matches the one paired with the text.
Merging multiple input descriptors and supervisors in a deep neural network for tractogram filtering
Authors: Daniel Jörgens, Pierre-Marc Jodoin, Maxime Descoteaux, Rodrigo Moreno
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
One of the main issues of current tractography methods is their high false-positive rate. Tractogram filtering is an option for removing false-positive streamlines from tractography data in a post-processing step. In this paper, we train a deep neural network for filtering tractography data in which every streamline of a tractogram is classified as plausible, implausible, or inconclusive. For this, we use four different tractogram filtering strategies as supervisors: TractQuerier, RecobundlesX, TractSeg, and an anatomy-inspired filter. Their outputs are combined to obtain the classification labels for the streamlines. We assessed the importance of different types of information along the streamlines for performing this classification task, including the coordinates of the streamlines, diffusion data, landmarks, T1-weighted information, and a brain parcellation. We found that the streamline coordinates are the most relevant, followed by the diffusion data, in this particular classification task.
DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation
Abstract
Diffusion probabilistic models (DPMs) have shown remarkable results on various image synthesis tasks such as text-to-image generation and image inpainting. However, compared to other generative methods like VAEs and GANs, DPMs lack a low-dimensional, interpretable, and well-decoupled latent code. Recently, diffusion autoencoders (Diff-AE) were proposed to explore the potential of DPMs for representation learning via autoencoding. Diff-AE provides an accessible latent space with remarkable interpretability, allowing image attributes to be manipulated based on latent codes from the space. However, previous works are not generic, as they operate on only a few limited attributes. To further explore the latent space of Diff-AE and achieve a generic editing pipeline, we propose a module called the Group-supervised AutoEncoder (GAE) for Diff-AE to achieve better disentanglement of the latent code. Our proposed GAE is trained via an attribute-swap strategy to acquire latent codes for multi-attribute image manipulation based on examples. We empirically demonstrate that our method enables multi-attribute manipulation and achieves convincing sample quality and attribute alignment, while significantly reducing computational requirements compared to pixel-based approaches to representational decoupling. Code will be released soon.
Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
Authors: Sanghyun Kim, Seohyeon Jung, Balhae Kim, Moonseok Choi, Jinwoo Shin, Juho Lee
Abstract
Large-scale image generation models, whose impressive quality is made possible by the vast amount of data available on the Internet, raise social concerns that they may generate harmful or copyrighted content. The biases and harmfulness arise throughout the entire training process and are hard to completely remove, which has become a significant hurdle to the safe deployment of these models. In this paper, we propose a method called SDD to prevent problematic content generation in text-to-image diffusion models. We self-distill the diffusion model to guide the noise estimate conditioned on the target concept for removal to match the unconditional one. Compared to previous methods, ours eliminates a much greater proportion of harmful content from the generated images without degrading overall image quality. Furthermore, our method allows the removal of multiple concepts at once, whereas previous works are limited to removing a single concept at a time.
Reduced basis method for non-symmetric eigenvalue problems: application to the multigroup neutron diffusion equations
Authors: Yonah Conjungo Taumhas, Geneviève Dusson, Virginie Ehrlacher, Tony Lelièvre, François Madiot (SERMA)
Abstract
In this article, we propose a reduced basis method for parametrized non-symmetric eigenvalue problems arising in the loading pattern optimization of a nuclear core in neutronics. To this end, we derive a posteriori error estimates for the eigenvalue and the left and right eigenvectors. The practical computation of these estimators requires estimating a constant, called the prefactor, which we can express as the spectral norm of some operator. We provide elements of theoretical analysis illustrating the link between the expression of the prefactor obtained here and its well-known expression in the case of symmetric eigenvalue problems, either using the notion of the numerical range of the operator or via a perturbative analysis. Lastly, we propose a practical method to estimate this prefactor, which yields interesting numerical results on actual test cases. We provide detailed numerical simulations on two-dimensional examples, including a multigroup neutron diffusion equation.
Navigating the Complexity of Generative AI Adoption in Software Engineering
Abstract
In this paper, the adoption patterns of Generative Artificial Intelligence (AI) tools within software engineering are investigated. Influencing factors at the individual, technological, and societal levels are analyzed using a mixed-methods approach for a comprehensive understanding of AI adoption. Initial structured interviews were conducted with 100 software engineers, employing the Technology Acceptance Model (TAM), the Diffusion of Innovations theory (DOI), and the Social Cognitive Theory (SCT) as guiding theories. A theoretical model named the Human-AI Collaboration and Adaptation Framework (HACAF) was deduced using the Gioia methodology, characterizing AI adoption in software engineering. The model's validity was subsequently tested through Partial Least Squares Structural Equation Modeling (PLS-SEM), using data collected from 183 software professionals. The results indicate that the adoption of AI tools in these early integration stages is primarily driven by their compatibility with existing development workflows, a finding that counters traditional theories of technology acceptance. Contrary to expectations, the influence of perceived usefulness, social aspects, and personal innovativeness on adoption appeared less significant. This paper yields significant insights for the design of future AI tools and supplies a structure for devising effective organizational implementation strategies.
Learning Stochastic Dynamical Systems as an Implicit Regularization with Graph Neural Networks
Abstract
Stochastic Gumbel graph networks (S-GGNs) are proposed to learn high-dimensional time series whose observed dimensions are often spatially correlated. To that end, the observed randomness and spatial correlations are captured by learning the drift and diffusion terms of a stochastic differential equation with a Gumbel matrix embedding. In particular, this novel framework enables us to investigate the implicit regularization effect of the noise terms in S-GGNs. We provide a theoretical guarantee for the proposed S-GGNs by deriving the difference between the two corresponding loss functions in a small neighborhood of the weights. We then employ Kuramoto's model to generate data for comparing the spectral density of the Hessian matrices of the two loss functions. Experimental results on real-world data demonstrate that S-GGNs exhibit superior convergence, robustness, and generalization compared with state-of-the-art methods.
Diffusion Based Multi-Agent Adversarial Tracking
Authors: Sean Ye, Manisha Natarajan, Zixuan Wu, Matthew Gombolay
Subjects: Robotics (cs.RO); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Abstract
Target tracking plays a crucial role in real-world scenarios, particularly in drug-trafficking interdiction, where the knowledge of an adversarial target's location is often limited. Improving autonomous tracking systems will enable unmanned aerial, surface, and underwater vehicles to better assist in interdicting smugglers that use manned surface, semi-submersible, and aerial vessels. As unmanned drones proliferate, accurate autonomous target estimation is even more crucial for security and safety. This paper presents Constrained Agent-based Diffusion for Enhanced Multi-Agent Tracking (CADENCE), an approach aimed at generating comprehensive predictions of adversary locations by leveraging past sparse state information. To assess the effectiveness of this approach, we evaluate predictions on single-target and multi-target pursuit environments, employing Monte-Carlo sampling of the diffusion model to estimate the probability associated with each generated trajectory. We propose a novel cross-attention based diffusion model that utilizes constraint-based sampling to generate multimodal track hypotheses. Our single-target model surpasses the performance of all baseline methods on Average Displacement Error (ADE) for predictions across all time horizons.
Exposing the Fake: Effective Diffusion-Generated Images Detection
Abstract
Image synthesis has seen significant advancements with the advent of diffusion-based generative models like Denoising Diffusion Probabilistic Models (DDPM) and text-to-image diffusion models. Despite their efficacy, there is a dearth of research dedicated to detecting diffusion-generated images, which could pose potential security and privacy risks. This paper addresses this gap by proposing a novel detection method called Stepwise Error for Diffusion-generated Image Detection (SeDID). Comprising statistical-based $\text{SeDID}_{\text{Stat}}$ and neural network-based $\text{SeDID}_{\text{NNs}}$, SeDID exploits the unique attributes of diffusion models, namely deterministic reverse and deterministic denoising computation errors. Our evaluations demonstrate SeDID's superior performance over existing methods when applied to diffusion models. Thus, our work makes a pivotal contribution to distinguishing diffusion model-generated images, marking a significant step in the domain of artificial intelligence security.
Keyword: adaptive
Adaptive Graph Convolution Networks for Traffic Flow Forecasting
Abstract
Traffic flow forecasting is a highly challenging task due to dynamic spatial-temporal road conditions. Graph neural networks (GNNs) have been widely applied to this task. However, most GNNs ignore the effects of time-varying road conditions due to the fixed range of the convolution receptive field. In this paper, we propose novel Adaptive Graph Convolution Networks (AGC-net) to address this issue. The AGC-net is built on the Adaptive Graph Convolution (AGC), which is based on a novel context attention mechanism and consists of a set of graph wavelets with various learnable scales. The AGC transforms spatial graph representations into time-sensitive features that account for temporal context. Moreover, a shifted graph convolution kernel is designed to enhance the AGC by correcting deviations caused by inaccurate topology. Experimental results on two public traffic datasets demonstrate the effectiveness of the AGC-net (code is available at https://github.com/zhengdaoli/AGC-net), which significantly outperforms other baseline models.
Improved Efficiency and Accuracy of the Magnetic Polarizability Tensor Spectral Signature Object Characterisation for Metal Detection
Abstract
Magnetic polarizability tensors (MPTs) provide an economical characterisation of conducting metallic objects and can aid in the solution of metal detection inverse problems, such as scrap metal sorting, searching for unexploded ordnance in areas of former conflict, and security screening at event venues and transport hubs. Previous work has established explicit formulae for their coefficients and a rigorous mathematical theory for the characterisation they provide. In order to assist with efficient computation of MPT spectral signatures of different objects to enable the construction of large dictionaries of characterisations for classification approaches, this work proposes a new, highly-efficient, strategy for predicting MPT coefficients. This is achieved by solving an eddy current type problem using hp-finite elements in combination with a proper orthogonal decomposition reduced order modelling (ROM) methodology and offers considerable computational savings over our previous approach. Furthermore, an adaptive approach is described for generating new frequency snapshots to further improve the accuracy of the ROM. To improve the resolution of highly conducting and magnetic objects, a recipe is proposed to choose the number and thicknesses of prismatic boundary layers for accurate resolution of thin skin depths in such problems. The paper includes a series of challenging examples to demonstrate the success of the proposed methodologies.
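The POD step of the reduced-order model can be illustrated in isolation; a minimal snapshot-SVD sketch (the tolerance and snapshot data are placeholders):

```python
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """Proper orthogonal decomposition: the truncated left singular
    vectors of the snapshot matrix form the reduced-order basis."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    keep = s / s[0] > tol          # retain modes above a relative tolerance
    return U[:, keep], s[keep]

# columns stand in for full-order solution snapshots at sampled frequencies
snaps = np.random.default_rng(0).standard_normal((500, 24))
basis, svals = pod_basis(snaps, tol=1e-2)
```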
Transaction Fraud Detection via an Adaptive Graph Neural Network
Abstract
Many machine learning methods have been proposed to achieve accurate transaction fraud detection, which is essential to the financial security of individuals and banks. However, most existing methods leverage original features only or require manual feature engineering. They lack the ability to learn discriminative representations from transaction data. Moreover, criminals often commit fraud by imitating cardholders' behaviors, which causes the poor performance of existing detection models. In this paper, we propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection. A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes. Specifically, we leverage cosine similarity and edge weights to adaptively select neighbors with similar behavior patterns for target nodes and then find multi-hop neighbors for fraudulent nodes. A neighbor diversity metric is designed by calculating the entropy among neighbors to tackle the camouflage issue of fraudsters and explicitly alleviate the over-smoothing phenomena. Extensive experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
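The neighbor selection rule can be sketched independently of the full GNN; an illustrative cosine-similarity top-k filter (the actual ASA-GNN additionally uses edge weights, multi-hop search, and a diversity metric):

```python
import numpy as np

def sample_similar_neighbors(target, neighbors, feats, k=3):
    """Keep the k neighbors whose feature vectors are most
    cosine-similar to the target node's features."""
    t = feats[target] / np.linalg.norm(feats[target])
    scored = []
    for v in neighbors:
        fv = feats[v] / np.linalg.norm(feats[v])
        scored.append((float(t @ fv), v))
    scored.sort(reverse=True)
    return [v for _, v in scored[:k]]

rng = np.random.default_rng(0)
feats = rng.standard_normal((6, 8))     # per-node transaction embeddings
print(sample_similar_neighbors(0, [1, 2, 3, 4, 5], feats, k=2))
```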
EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video
Authors: Matthias De Lange, Hamid Eghbalzadeh, Reuben Tan, Michael Iuzzolino, Franziska Meier, Karl Ridgeway
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
In egocentric action recognition a single population model is typically trained and subsequently deployed on a head-mounted device, such as an augmented reality headset. While this model remains static for new users and environments, we introduce an adaptive paradigm with two phases: after pretraining a population model, the model adapts on-device and online to the user's experience. This setting is highly challenging due to the shift from population to user domain and the distribution shifts in the user's data stream. Coping with the latter in-stream distribution shifts is the focus of continual learning, where progress has been rooted in controlled benchmarks while challenges faced in real-world applications often remain unaddressed. We introduce EgoAdapt, a benchmark for real-world egocentric action recognition that facilitates our two-phased adaptive paradigm, in which real-world challenges occur naturally in the egocentric video streams from Ego4D, such as long-tailed action distributions and large-scale classification over 2740 actions. We introduce an evaluation framework that directly exploits the user's data stream, with new metrics to measure the adaptation gain over the population model, online generalization, and hindsight performance. In contrast to the single-stream evaluation in existing works, our framework proposes a meta-evaluation that aggregates results from 50 independent user streams. We provide an extensive empirical study of finetuning and experience replay.
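The adaptation-gain idea can be phrased compactly. The sketch below is our own guess at a plausible formulation (per-stream mean gain of the adapted model over the frozen population model, then aggregation across streams), not the benchmark's actual metric definitions.

```python
import numpy as np

def online_adaptation_gain(stream_correct_user, stream_correct_pop):
    """Per-stream adaptation gain: how much better the online-adapted
    user model is than the frozen population model, averaged over the
    user's stream; positive values mean adaptation helped."""
    user = np.asarray(stream_correct_user, float)
    pop = np.asarray(stream_correct_pop, float)
    return float((user - pop).mean())

def meta_evaluate(per_stream_gains):
    """Aggregate over independent user streams (mean and spread)."""
    g = np.asarray(per_stream_gains, float)
    return {"mean_gain": float(g.mean()), "std_gain": float(g.std())}

# e.g. 50 user streams, 100 predictions each (random stand-in data)
gains = [online_adaptation_gain(np.random.rand(100) < 0.6,
                                np.random.rand(100) < 0.5)
         for _ in range(50)]
print(meta_evaluate(gains))
```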
Making the Nyström method highly accurate for low-rank approximations
Abstract
The Nystr\"om method is a convenient heuristic method to obtain low-rank approximations to kernel matrices in nearly linear complexity. Existing studies typically use the method to approximate positive semidefinite matrices with low or modest accuracies. In this work, we propose a series of heuristic strategies to make the Nystr\"om method reach high accuracies for nonsymmetric and/or rectangular matrices. The resulting methods (called high-accuracy Nystr\"om methods) treat the Nystr\"om method and a skinny rank-revealing factorization as a fast pivoting strategy in a progressive alternating direction refinement process. Two refinement mechanisms are used: alternating the row and column pivoting starting from a small set of randomly chosen columns, and adaptively increasing the number of samples until a desired rank or accuracy is reached. A fast subset update strategy based on the progressive sampling of Schur complements is further proposed to accelerate the refinement process. Efficient randomized accuracy control is also provided. Relevant accuracy and singular value analysis is given to support some of the heuristics. Extensive tests with various kernel functions and data sets show how the methods can quickly reach prespecified high accuracies in practice, sometimes with quality close to SVDs, using only small numbers of progressive sampling steps.
Implicit Adaptive Mesh Refinement for Dispersive Tsunami Propagation
Authors: Marsha J. Berger, Randall J. LeVeque
Subjects: Numerical Analysis (math.NA); Atmospheric and Oceanic Physics (physics.ao-ph)
Abstract
We present an algorithm to solve the dispersive depth-averaged Serre-Green-Naghdi (SGN) equations using patch-based adaptive mesh refinement. These equations augment the nonlinear shallow water equations with additional higher-order derivative terms. The algorithm has been implemented as a new component of the open source GeoClaw software, which is widely used for modeling tsunamis, storm surge, and related hazards, improving its accuracy on shorter-wavelength phenomena. The equations require the solution of an elliptic system at each time step. The adaptive algorithm allows different time steps on different refinement levels and solves the implicit equations level by level. Computational examples illustrate the stability and accuracy of the approach on a radially symmetric test case and two realistic tsunami modeling problems, including a hypothetical asteroid impact creating a short-wavelength tsunami for which the dispersive terms are necessary.
SnakeSynth: New Interactions for Generative Audio Synthesis
Abstract
I present "SnakeSynth," a web-based lightweight audio synthesizer that combines audio generated by a deep generative model and real-time continuous two-dimensional (2D) input to create and control variable-length generative sounds through 2D interaction gestures. Interaction gestures are touch and mobile-compatible with analogies to strummed, bowed, and plucked musical instrument controls. Point-and-click and drag-and-drop gestures directly control audio playback length and I show that sound length and intensity are modulated by interactions with a programmable 2D coordinate grid. Leveraging the speed and ubiquity of browser-based audio and hardware acceleration in Google's TensorFlow.js we generate time-varying high-fidelity sounds with real-time interactivity. SnakeSynth adaptively reproduces and interpolates between sounds encountered during model training, notably without long training times, and I briefly discuss possible futures for deep generative models as an interactive paradigm for musical expression.
GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation
Authors: Bruce X.B. Yu, Zhi Zhang, Yongxu Liu, Sheng-hua Zhong, Yan Liu, Chang Wen Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
3D human pose estimation has been researched for decades with promising results. 3D human pose lifting is one of the more promising research directions for the task, in which both estimated pose and ground truth pose data are used for training. Existing pose lifting works mainly focus on improving the performance of estimated pose, but they usually underperform when tested on ground truth pose data. We observe that the performance of the estimated pose can be easily improved by preparing good-quality 2D poses, such as fine-tuning the 2D pose or using advanced 2D pose detectors. As such, we concentrate on improving 3D human pose lifting via ground truth data, in anticipation of higher-quality estimated pose data becoming available. Toward this goal, we propose a simple yet effective model called Global-local Adaptive Graph Convolutional Network (GLA-GCN). Our GLA-GCN globally models the spatiotemporal structure via a graph representation and backtraces local joint features for 3D human pose estimation via individually connected layers. To validate our model design, we conduct extensive experiments on three benchmark datasets: Human3.6M, HumanEva-I, and MPI-INF-3DHP. Experimental results show that our GLA-GCN implemented with ground truth 2D poses significantly outperforms state-of-the-art methods (e.g., up to around 3%, 17%, and 13% error reductions on Human3.6M, HumanEva-I, and MPI-INF-3DHP, respectively).
FAIRO: Fairness-aware Adaptation in Sequential-Decision Making for Human-in-the-Loop Systems
Abstract
Achieving fairness in sequential-decision making systems within Human-in-the-Loop (HITL) environments is a critical concern, especially when multiple humans with different behaviors and expectations are affected by the same adaptation decisions in the system. This human variability adds complexity, since policies deemed fair at one point in time may become discriminatory over time due to variations in human preferences arising from inter- and intra-human variability. This paper addresses the fairness problem through an equity lens, considering human behavior variability and changes in human preferences over time. We propose FAIRO, a novel algorithm for fairness-aware sequential-decision making in HITL adaptation, which incorporates these notions into the decision-making process. In particular, FAIRO decomposes this complex fairness task into adaptive sub-tasks based on individual human preferences by leveraging the Options reinforcement learning framework. We design FAIRO to generalize to three types of HITL application setups that share the adaptation decision problem. Furthermore, we recognize that fairness-aware policies can sometimes conflict with the application's utility. To address this challenge, we provide a fairness-utility tradeoff in FAIRO, allowing system designers to balance the objectives of fairness and utility based on specific application requirements. Extensive evaluations of FAIRO on the three HITL applications demonstrate its generalizability and effectiveness in promoting fairness while accounting for human variability. On average, FAIRO improves fairness over other methods across all three applications by 35.36%.
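One simple way to realize a fairness-utility tradeoff of this kind is linear scalarization over candidate adaptation actions. The sketch below is an illustrative toy with made-up per-group benefits, not FAIRO's Options-based algorithm.

```python
import numpy as np

def select_action(utilities, group_benefit_history, lam):
    """Scalarized fairness-utility tradeoff: score each candidate
    adaptation action by its utility minus lam times the inequity it
    would create across human groups (spread of cumulative benefit)."""
    scores = []
    for a, util in enumerate(utilities):
        projected = group_benefit_history + benefit_of(a)
        inequity = projected.max() - projected.min()  # equity gap
        scores.append(util - lam * inequity)
    return int(np.argmax(scores))

def benefit_of(action):
    # hypothetical per-group benefit of each action (3 groups here)
    table = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.3, 0.3, 0.3]])
    return table[action]

history = np.array([5.0, 2.0, 2.0])  # group 0 already favored
print(select_action([1.0, 0.8, 0.6], history, lam=0.0))  # pure utility -> 0
print(select_action([1.0, 0.8, 0.6], history, lam=1.0))  # equity-aware -> 1
```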
An adaptive approach to remove tensile instability in SPH for weakly compressible fluids
Authors: Kanishka Bhattacharya, Tapan Jana, Amit Shaw, L. S. Ramachandra, Vishal Mehera
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Abstract
Smoothed Particle Hydrodynamics (SPH) is plagued by the phenomenon of tensile instability: the occurrence of short-wavelength zero-energy modes resulting in unphysical clustering of particles. The root cause of the instability is the shape of the derivative of the compactly supported kernel function, which may yield negative stiffness in the particle interaction under certain circumstances. In this work, an adaptive algorithm is developed to remove tensile instability in SPH for weakly compressible fluids. Herein, a B-spline function is used as the SPH kernel, and the knots of the B-spline are adapted to change the shape of the kernel, thereby satisfying the condition associated with stability. The knot-shifting criterion is based on the particle movement within the influence domain. This prevents instability in fluid problems where excessive rearrangement of particle positions occurs. A 1D dispersion analysis of an Oldroyd-B fluid material model is performed to show how the algorithm prevents instabilities at short wavelengths while ensuring accuracy at large wavelengths. The efficacy of the approach is demonstrated through a few benchmark fluid dynamics simulations in which a visco-elastic Oldroyd-B material model and a non-viscous Eulerian fluid material model are considered.
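For context, the standard cubic B-spline SPH kernel (with fixed, uniform knots) looks as follows; the paper's contribution is to shift these knots adaptively, which this sketch deliberately does not do.

```python
import numpy as np

def cubic_bspline_kernel(r, h):
    """Standard 1D cubic B-spline SPH kernel W(q), q = |r|/h, with
    uniformly spaced knots and 1D normalization sigma = 2/(3h). The
    adaptive scheme in the paper instead shifts the knots to keep the
    kernel derivative's shape from producing negative stiffness."""
    q = np.abs(r) / h
    sigma = 2.0 / (3.0 * h)
    w = np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3,
        np.where(q < 2.0, 0.25 * (2.0 - q)**3, 0.0))
    return sigma * w

r = np.linspace(-2.5, 2.5, 11)
print(cubic_bspline_kernel(r, h=1.0))
```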
On the Design of Nonlinear MPC and LPVMPC for Obstacle Avoidance in Autonomous Driving
Authors: Maryam Nezami, Dimitrios S. Karachalios, Georg Schildbach, Hossam S. Abbas
Abstract
In this study, we are concerned with autonomous driving missions in which a static obstacle blocks a given reference trajectory. To provide a realistic control design, we employ model predictive control (MPC) utilizing nonlinear state-space dynamic models of a car with linear tire forces, allowing for optimal path planning and tracking to overtake the obstacle. We provide solutions with two different methodologies. First, we solve a nonlinear MPC (NMPC) problem with a nonlinear optimization framework capable of handling the nonlinear constraints. Second, by introducing scheduling signals, we embed the nonlinear dynamics in a linear parameter varying (LPV) representation with adaptive linear constraints that realize the nonlinear constraints associated with the obstacle. Consequently, the LPVMPC optimization problem can be solved efficiently as a quadratic program (QP), which constitutes the main novelty of this work. We test the two methods on a challenging obstacle avoidance task and provide qualitative comparisons. The LPVMPC shows a significant reduction in computational burden at the expense of a slight loss of performance.
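To illustrate why the LPV embedding pays off computationally, here is a minimal linear MPC posed as a QP in cvxpy. The double-integrator model, horizon, weights, and the linear "obstacle" constraint are all stand-ins of our own choosing, not the paper's vehicle model or its adaptive constraint scheme.

```python
import numpy as np
import cvxpy as cp

# Minimal linear MPC solved as a QP; a temporary linear state bound
# plays the role of an obstacle constraint over part of the horizon.
dt, N = 0.1, 20
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

x = cp.Variable((2, N + 1))
u = cp.Variable((1, N))
x0 = np.array([0.0, 0.0])
x_ref = np.array([1.0, 0.0])

cost, cons = 0, [x[:, 0] == x0]
for k in range(N):
    cost += cp.quad_form(x[:, k] - x_ref, Q) + cp.quad_form(u[:, k], R)
    cons += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],
             cp.abs(u[0, k]) <= 2.0,
             # "obstacle" active for steps 5..10, inactive otherwise
             x[0, k] <= 0.8 if 5 <= k <= 10 else x[0, k] <= 10.0]

prob = cp.Problem(cp.Minimize(cost), cons)
prob.solve()
print(prob.status, float(u[0, 0].value))
```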
Security in Online Freelance Software Development: A case for Distributed Security Responsibility
Abstract
Secure software is a cornerstone of safe and resilient digital ecosystems: it offers a strong foundation for protecting users' sensitive data and guarding against cyber-threats. The rapidly expanding digital economy has encouraged developers from different socio-technical and socio-economic backgrounds to join online freelance marketplaces. While secure software practices help developers build secure software, there is a paucity of research on how freelance developers adhere to security practices and how they can be supported in improving their security behavior in under-resourced environments. Moreover, freelance developers are often held solely responsible for producing insecure code. In this position paper, we review existing literature and argue for distributed security responsibilities in online freelance environments. We propose a research agenda aimed at an organized and systematic effort by researchers to address the security needs and challenges of online freelance marketplaces, including: characterising software security and defining separation of responsibilities, building trust in online freelance development communities, leveraging the potential of online freelancing platforms to promote secure software development, and building adaptive security interventions for online freelance software development. This research has the potential to bring existing security solutions to a wider developer community and deliver substantial benefits to the broader security ecosystem.
AICT: An Adaptive Image Compression Transformer
Authors: Ahmed Ghorbel, Wassim Hamidouche, Luce Morin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Abstract
Motivated by an efficiency investigation of the Transformer-based transform coding framework SwinT-ChARM, we propose to enhance it, first, with a more straightforward yet effective Transformer-based channel-wise auto-regressive prior model, resulting in an image compression transformer (ICT). Current methods that still rely on ConvNet-based entropy coding are limited in modeling long-range dependencies due to their local connectivity and an increasing number of architectural biases and priors. In contrast, the proposed ICT can capture both global and local contexts from the latent representations and better parameterize the distribution of the quantized latents. Further, we leverage a learnable scaling module with a sandwich ConvNeXt-based pre/post-processor to extract a more compact latent representation while reconstructing higher-quality images. Extensive experimental results on benchmark datasets show that the proposed adaptive image compression transformer (AICT) framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural codec SwinT-ChARM.
Self-Adaptive Large Language Model (LLM)-Based Multiagent Systems
Authors: Nathalia Nascimento, Paulo Alencar, Donald Cowan
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Abstract
In autonomic computing, self-adaptation has been proposed as a fundamental paradigm to manage the complexity of multiagent systems (MASs). This is achieved by extending a system with support to monitor and adapt itself in order to achieve specific concerns of interest. Communication in these systems is key: in scenarios involving agent interaction, it enhances cooperation and reduces coordination challenges by enabling direct, clear information exchange. However, improving the expressiveness of agent communication within MASs is not without challenges. In this sense, the interplay between self-adaptive systems and effective communication is crucial for future MAS advancements. In this paper, we propose integrating large language models (LLMs), such as GPT-based technologies, into multiagent systems. We anchor our methodology on the MAPE-K model, which is renowned for its robust support for monitoring, analyzing, planning, and executing system adaptations in response to dynamic environments. We also present a practical illustration of the proposed approach, in which we implement and assess a basic MAS-based application. The approach significantly advances the state-of-the-art of self-adaptive systems by proposing a new paradigm for the self-adaptation of autonomous multiagent systems based on LLM capabilities.
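A skeletal MAPE-K loop with an LLM in the analyze and plan stages might look as follows; `ask_llm`, `ToyEnv`, and the prompts are placeholders of our own, not any particular API or the authors' implementation.

```python
# Minimal MAPE-K control loop sketch; `ask_llm` stands in for a
# GPT-style call and can be any callable from prompt to string.
class MapeKAgent:
    def __init__(self, ask_llm):
        self.ask_llm = ask_llm
        self.knowledge = []  # the shared "K" of MAPE-K

    def monitor(self, environment):
        reading = environment.sense()
        self.knowledge.append(reading)
        return reading

    def analyze(self, reading):
        prompt = (f"Given sensor reading {reading} and history "
                  f"{self.knowledge[-5:]}, is adaptation needed? "
                  "Answer yes/no and why.")
        return self.ask_llm(prompt)

    def plan(self, analysis):
        return self.ask_llm(f"Analysis: {analysis}. "
                            "Propose one concrete adaptation action.")

    def execute(self, action, environment):
        environment.apply(action)

    def step(self, environment):
        reading = self.monitor(environment)
        analysis = self.analyze(reading)
        action = self.plan(analysis)
        self.execute(action, environment)

class ToyEnv:
    def sense(self):
        return {"temperature": 31}
    def apply(self, action):
        print("applying:", action)

agent = MapeKAgent(ask_llm=lambda p: "yes: temperature above threshold"
                   if "yes/no" in p else "lower setpoint by 2 degrees")
agent.step(ToyEnv())
```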
Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes
Authors: Sohom Mukherjee, Nicolas Loizou, Sebastian U. Stich
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Abstract
State-of-the-art federated learning algorithms such as FedAvg require carefully tuned stepsizes to achieve their best performance. The improvements proposed by existing adaptive federated methods involve tuning additional hyperparameters such as momentum parameters, and consider adaptivity only in the server aggregation round, not locally. These methods can be inefficient in many practical scenarios because they require excessive hyperparameter tuning and do not capture local geometric information. In this work, we extend the recently proposed stochastic Polyak stepsize (SPS) to the federated learning setting and propose new locally adaptive and nearly parameter-free distributed SPS variants (FedSPS and FedDecSPS). We prove that FedSPS converges linearly in strongly convex and sublinearly in convex settings when the interpolation condition (overparametrization) is satisfied, and converges to a neighborhood of the solution in the general case. We extend our proposed method to a decreasing-stepsize variant, FedDecSPS, which converges even when the interpolation condition does not hold. We validate our theoretical claims with illustrative convex experiments. Our proposed algorithms match the optimization performance of FedAvg with the best tuned hyperparameters in the i.i.d. case and outperform FedAvg in the non-i.i.d. case.
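The stochastic Polyak stepsize itself is easy to state: gamma_t = min((f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_b). Below is a single-client sketch on a toy interpolating least-squares problem; the federated averaging wrapper that FedSPS adds around such local steps is omitted.

```python
import numpy as np

def sps_step(x, grad, loss, loss_star=0.0, c=0.5, gamma_b=1.0):
    """One stochastic Polyak step:
    gamma = min((f(x) - f*) / (c * ||g||^2), gamma_b); x <- x - gamma*g.
    In FedSPS each client runs such locally adaptive steps between
    communication rounds."""
    g2 = float(grad @ grad)
    gamma = min((loss - loss_star) / (c * g2 + 1e-12), gamma_b)
    return x - gamma * grad

# toy least-squares client: f(x) = 0.5*||a x - b||^2, f* = 0 under interpolation
rng = np.random.default_rng(0)
a = rng.standard_normal((50, 10))
x_true = rng.standard_normal(10)
b = a @ x_true
x = np.zeros(10)
for _ in range(200):
    r = a @ x - b
    x = sps_step(x, grad=a.T @ r, loss=0.5 * float(r @ r))
print(np.linalg.norm(x - x_true))  # should be near zero
```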
Keyword: quantization
Mixed-Precision Quantization with Cross-Layer Dependencies
Authors: Zihao Deng, Xin Wang, Sayeh Sharify, Michael Orshansky
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
Quantization is commonly used to compress and accelerate deep neural networks. Assigning the same bit-width to all layers leads to large accuracy degradation at low precision and is wasteful at high precision. Mixed-precision quantization (MPQ) instead assigns varied bit-widths to layers to optimize the accuracy-efficiency trade-off. Existing methods simplify the MPQ problem by assuming that quantization errors at different layers act independently. We show that this assumption does not reflect the true behavior of quantized deep neural networks. We propose the first MPQ algorithm that captures the cross-layer dependency of quantization error. Our algorithm (CLADO) enables a fast approximation of pairwise cross-layer error terms by solving linear equations that require only forward evaluations of the network on a small amount of data. Decisions on layerwise bit-width assignments are then made by optimizing a new MPQ formulation dependent on these cross-layer quantization errors via an Integer Quadratic Program (IQP), which can be solved within seconds. We conduct experiments on multiple networks on the ImageNet dataset and demonstrate improvements in top-1 classification accuracy of up to 27% over uniform-precision quantization and up to 15% over existing MPQ methods.
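The flavor of the optimization can be seen on a toy instance: the total error is a sum of per-layer terms plus pairwise cross-layer terms, minimized under a model-size budget. All coefficients below are made up, and brute force over a 3-layer example replaces the IQP solver used by CLADO.

```python
import itertools
import numpy as np

# Toy CLADO-style objective: unary errors e_i(b_i) plus pairwise
# cross-layer terms t_ij(b_i, b_j), under an average-bit budget.
bits = [2, 4, 8]
layer_sizes = np.array([1.0, 2.0, 1.5])  # relative number of weights
unary = {b: {0: 4.0 / b, 1: 6.0 / b, 2: 3.0 / b} for b in bits}  # made up
pair = lambda bi, bj: 0.5 / (bi * bj)                            # made up

budget = 4.0 * layer_sizes.sum()  # on average 4 bits per weight
best = None
for assign in itertools.product(bits, repeat=3):
    if np.dot(assign, layer_sizes) > budget:
        continue  # violates the size budget
    err = sum(unary[b][i] for i, b in enumerate(assign))
    err += sum(pair(assign[i], assign[j])
               for i in range(3) for j in range(i + 1, 3))
    if best is None or err < best[0]:
        best = (err, assign)
print(best)  # (total error, per-layer bit-widths)
```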
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
Authors: James O'Neill, Sourav Dutta
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Abstract
We investigate the effects of post-training quantization and quantization-aware training on the generalization of Transformer language models. We present a new method called self-distilled quantization (SDQ) that minimizes accumulative quantization errors and outperforms baselines. We apply SDQ to multilingual models XLM-R-Base and InfoXLM-Base and demonstrate that both models can be reduced from 32-bit floating point weights to 8-bit integer weights while maintaining a high level of performance on the XGLUE benchmark. Our results also highlight the challenges of quantizing multilingual models, which must generalize to languages they were not fine-tuned on.
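For orientation, here is the basic float-to-int8 round trip whose error SDQ is designed to keep from accumulating; this is a generic symmetric per-tensor scheme, not SDQ itself.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization of float weights to int8;
    the round-trip error below is the per-layer quantization error
    that SDQ's distillation objective targets."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())  # worst-case round-trip error
```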
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression
Abstract
A light field is a type of image data that captures 3D scene information by recording light rays emitted from a scene at various orientations. It offers a more immersive perception than classic 2D images, but at the cost of huge data volume. In this paper, we draw inspiration from the visual characteristics of the Sub-Aperture Images (SAIs) of a light field and design a compact neural network representation for the light field compression task. The network backbone takes randomly initialized noise as input and is supervised on the SAIs of the target light field. It is composed of two types of complementary kernels: descriptive kernels (descriptors) that store scene description information learned during training, and modulatory kernels (modulators) that control the rendering of different SAIs from the queried perspectives. To further enhance the compactness of the network while retaining the high quality of the decoded light field, we introduce modulator allocation and kernel tensor decomposition mechanisms, followed by non-uniform quantization and lossless entropy coding, to form an efficient compression pipeline. Extensive experiments demonstrate that our method outperforms state-of-the-art (SOTA) methods by a significant margin on the light field compression task. Moreover, after aligning descriptors, the modulators learned from one light field can be transferred to new light fields for rendering dense views, indicating a potential solution for the view synthesis task.
Keyword: efficient
Prototyping Theories with ChatGPT: Experiment with the Technology Acceptance Model
Towards Environmentally Equitable AI via Geographical Load Balancing
Modelling human seat contact interaction for vibration comfort
Estimating See and Be Seen Performance with an Airborne Visual Acquisition Model
On Pseudolinear Codes for Correcting Adversarial Errors
Improved Efficiency and Accuracy of the Magnetic Polarizability Tensor Spectral Signature Object Characterisation for Metal Detection
A Novel Approach to Identify Security Controls in Source Code
Image Reconstruction using Enhanced Vision Transformer
$\mathrm{SAM^{Med}}$: A medical image annotation framework based on large vision model
Area, Delay, and Energy-Efficient Full Dadda Multiplier
Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
A Personalized Reinforcement Learning Summarization Service for Learning Structure from Unstructured Data
A Machine-Learned Ranking Algorithm for Dynamic and Personalised Car Pooling Services
SepHRNet: Generating High-Resolution Crop Maps from Remote Sensing imagery using HRNet with Separable Convolution
Minimum Cost Loop Nests for Contraction of a Sparse Tensor with a Tensor Network
Formal and Fuzzing Amplification: Targeting Vulnerability Detection in 5G and Beyond
Making the Nyström method highly accurate for low-rank approximations
Design of an energy aware petaflops class high performance cluster based on power architecture
Neuro-Inspired Efficient Map Building via Fragmentation and Recall
Verifi-Chain: A Credentials Verifier using Blockchain and IPFS
Differentiable Forward Projector for X-ray Computed Tomography
DeepMapping: The Case for Learned Data Mapping for Compression and Efficient Query Processing
Knowledge-Driven Resource Allocation for D2D Networks: A WMMSE Unrolled Graph Neural Network Approach
Efficient Task Offloading Algorithm for Digital Twin in Edge/Cloud Computing Environment
FIS-ONE: Floor Identification System with One Label for Crowdsourced RF Signals
Prompt Generate Train (PGT): A framework for few-shot domain adaptation, alignment, and uncertainty calibration of a retriever augmented generation (RAG) model for domain specific open book question-answering
SwiFT: Swin 4D fMRI Transformer
Introducing Packet-Level Analysis in Programmable Data Planes to Advance Network Intrusion Detection
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Transformers in Reinforcement Learning: A Survey
Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems
Unsupervised Optical Flow Estimation with Dynamic Timing Representation for Spike Camera
An Effective and Efficient Time-aware Entity Alignment Framework via Two-aspect Three-view Label Propagation
Semantic Communications System with Model Division Multiple Access and Controllable Coding Rate for Point Cloud
On the Design of Nonlinear MPC and LPVMPC for Obstacle Avoidance in Autonomous Driving
Rate-Power Tradeoff in THz SWIPT Systems Employing Resonant Tunnelling Diode-based EH Circuits
Acceleration of complex matrix multiplication using arbitrary precision floating-point arithmetic
Exploring Millions of User Interactions with ICEBOAT: Big Data Analytics for Automotive User Interfaces
Fast Decoding of Lifted Interleaved Linearized Reed-Solomon Codes for Multishot Network Coding
Recognizing student identification numbers from the matrix templates using a modified U-net architecture
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression
NetGPT: A Native-AI Network Architecture Beyond Provisioning Personalized Generative Services
CellGAN: Conditional Cervical Cell Synthesis for Augmenting Cytopathological Image Classification
An Architecture for Control Plane Slicing in Beyond 5G Networks
Power Loss Minimization of Distribution Network using Different Grid Strategies
Connectivity Labeling for Multiple Vertex Failures
A Comparative Analysis Between the Additive and the Multiplicative Extended Kalman Filter for Satellite Attitude Determination
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes
SAGE -- A Tool for Optimal Deployments in Kubernetes Clusters
Information-Theoretically Private Federated Submodel Learning with Storage Constrained Databases
Keyword: faster
Area, Delay, and Energy-Efficient Full Dadda Multiplier
DSPC: Efficiently Answering Shortest Path Counting on Dynamic Graphs
FGo: A Directed Grey-box Fuzzer with Probabilistic Exponential cut-the-loss Strategies
Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems
Keyword: mobile
HIVA: Holographic Intellectual Voice Assistant
UX Heuristics and Checklist for Deep Learning powered Mobile Applications with Image Classification
Towards Mobility Data Science (Vision Paper)
SnakeSynth: New Interactions for Generative Audio Synthesis
Efficient Task Offloading Algorithm for Digital Twin in Edge/Cloud Computing Environment
Applying SDN to Mobile Networks: A New Perspective for 6G Architecture
Exact Resource Allocation over Fair Wireless Relay Networks
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning
An Architecture for Control Plane Slicing in Beyond 5G Networks
Tackling Computational Heterogeneity in FL: A Few Theoretical Insights
Keyword: pruning
There are no results.
Keyword: diffusion
Augmenters at SemEval-2023 Task 1: Enhancing CLIP in Handling Compositionality and Ambiguity for Zero-Shot Visual WSD through Prompt Augmentation and Text-To-Image Diffusion
Merging multiple input descriptors and supervisors in a deep neural network for tractogram filtering
DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation
Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
Reduced basis method for non-symmetric eigenvalue problems: application to the multigroup neutron diffusion equations
Navigating the Complexity of Generative AI Adoption in Software Engineering
Learning Stochastic Dynamical Systems as an Implicit Regularization with Graph Neural Networks
Diffusion Based Multi-Agent Adversarial Tracking
Exposing the Fake: Effective Diffusion-Generated Images Detection
Keyword: adaptive
Adaptive Graph Convolution Networks for Traffic Flow Forecasting
Improved Efficiency and Accuracy of the Magnetic Polarizability Tensor Spectral Signature Object Characterisation for Metal Detection
Transaction Fraud Detection via an Adaptive Graph Neural Network
EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video
Making the Nyström method highly accurate for low-rank approximations
Implicit Adaptive Mesh Refinement for Dispersive Tsunami Propagation
SnakeSynth: New Interactions for Generative Audio Synthesis
GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human Pose Estimation
FAIRO: Fairness-aware Adaptation in Sequential-Decision Making for Human-in-the-Loop Systems
An adaptive approach to remove tensile instability in SPH for weakly compressible fluids
On the Design of Nonlinear MPC and LPVMPC for Obstacle Avoidance in Autonomous Driving
Security in Online Freelance Software Development: A case for Distributed Security Responsibility
AICT: An Adaptive Image Compression Transformer
Self-Adaptive Large Language Model (LLM)-Based Multiagent Systems
Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes
Keyword: quantization
Mixed-Precision Quantization with Cross-Layer Dependencies
Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
Learning Kernel-Modulated Neural Representation for Efficient Light Field Compression