Abstract
An end-to-end machine learning (ML) lifecycle consists of many iterative processes, from data preparation and ML model design to model training and, finally, deployment of the trained model for inference. When building an end-to-end lifecycle for an ML problem, many ML pipelines must be designed and executed, producing a huge number of lifecycle versions. This paper therefore introduces VeML, a version management system dedicated to the end-to-end ML lifecycle. Our system tackles several crucial problems that other systems have not solved. First, we address the high cost of building an ML lifecycle, especially for large-scale, high-dimensional datasets. We solve this problem by transferring the lifecycle of similar datasets managed in our system to the new training data. We design an algorithm based on core sets to compute similarity for large-scale, high-dimensional data efficiently. Another critical issue is model accuracy degradation caused by the difference between training and testing data over the ML lifetime, which forces the lifecycle to be rebuilt. Our system helps detect this mismatch without requiring labeled testing data, and rebuilds the ML lifecycle for the new data version. To demonstrate our contributions, we conduct experiments on real-world, large-scale datasets of driving images and spatiotemporal sensor data and show promising results.
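The abstract does not spell out the core-set construction; as a minimal sketch of one common recipe, each dataset can be summarized by a greedy k-center core set and the two summaries compared with a Chamfer-style distance. All names below are illustrative, not VeML's API:

```python
import numpy as np

def greedy_kcenter_coreset(X, k, seed=0):
    """Greedy 2-approximation for the k-center problem: repeatedly add
    the point farthest from the current set of centers."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(X)))]
    dists = np.linalg.norm(X - X[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))
        centers.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return X[centers]

def coreset_similarity(X_a, X_b, k=64):
    """Compare two large datasets via the symmetric Chamfer distance
    between their much smaller core sets (lower = more similar)."""
    A = greedy_kcenter_coreset(X_a, k)
    B = greedy_kcenter_coreset(X_b, k)
    d_ab = np.min(np.linalg.norm(A[:, None] - B[None], axis=-1), axis=1).mean()
    d_ba = np.min(np.linalg.norm(B[:, None] - A[None], axis=-1), axis=1).mean()
    return 0.5 * (d_ab + d_ba)
```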
Diffusion Probabilistic Model Based Accurate and High-Degree-of-Freedom Metasurface Inverse Design
Abstract
Conventional meta-atom designs rely heavily on researchers' prior knowledge and trial-and-error searches using full-wave simulations, resulting in time-consuming and inefficient processes. Inverse design methods based on optimization algorithms, such as evolutionary algorithms and topology optimization, have been introduced to design metamaterials. However, none of these algorithms is general enough to fulfill multi-objective tasks. Recently, deep learning methods represented by Generative Adversarial Networks (GANs) have been applied to the inverse design of metamaterials and can directly generate high-degree-of-freedom meta-atoms based on S-parameter requirements. However, the adversarial training process of GANs makes the network unstable and results in high modeling costs. This paper proposes a novel metamaterial inverse design method based on diffusion probabilistic models. By learning the Markov process that transforms the original structure into a Gaussian distribution, the proposed method can gradually remove the noise starting from the Gaussian distribution and generate new high-degree-of-freedom meta-atoms that meet the S-parameter conditions. This avoids the model instability introduced by the adversarial training of GANs and yields more accurate, higher-quality generation results. Experiments show that our method outperforms representative GAN-based methods in model convergence speed, generation accuracy, and quality.
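The abstract describes the standard denoising-diffusion recipe; a minimal sketch of the forward noising jump and one conditional ancestral sampling step follows. The conditioning interface `model(x_t, t, s_params)` is a hypothetical placeholder, not the paper's architecture:

```python
import torch

def forward_noise(x0, t, alpha_bar):
    """q(x_t | x_0): jump directly to step t of the Markov noising chain.
    t is a batch of step indices; alpha_bar holds the cumulative products."""
    eps = torch.randn_like(x0)
    a = alpha_bar[t].sqrt().view(-1, 1, 1, 1)
    s = (1 - alpha_bar[t]).sqrt().view(-1, 1, 1, 1)
    return a * x0 + s * eps, eps

@torch.no_grad()
def reverse_step(model, x_t, t, s_params, alpha, alpha_bar, sigma):
    """One ancestral sampling step p(x_{t-1} | x_t), conditioned on the
    target S-parameters (here simply an extra input to the noise predictor)."""
    eps_hat = model(x_t, t, s_params)
    coef = (1 - alpha[t]) / (1 - alpha_bar[t]).sqrt()
    mean = (x_t - coef * eps_hat) / alpha[t].sqrt()
    return mean + sigma[t] * torch.randn_like(x_t) if t > 0 else mean
```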
Optimizing Deep Learning Models For Raspberry Pi
Authors: Salem Ameen, Kangaranmulle Siriwardana, Theo Theodoridis
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Performance (cs.PF)
Abstract
Deep learning models have become increasingly popular for a wide range of applications, including computer vision, natural language processing, and speech recognition. However, these models typically require large amounts of computational resources, making them challenging to run on low-power devices such as the Raspberry Pi. One approach to addressing this challenge is to use pruning techniques to reduce the size of the deep learning models. Pruning involves removing unimportant weights and connections from the model, resulting in a smaller and more efficient model. Pruning can be done during training or after the model has been trained. Another approach is to optimize the deep learning models specifically for the Raspberry Pi architecture. This can include optimizing the model's architecture and parameters to take advantage of the Raspberry Pi's hardware capabilities, such as its CPU and GPU. Additionally, the model can be optimized for energy efficiency by minimizing the amount of computation required. Pruning and optimizing deep learning models for the Raspberry Pi can help overcome the computational and energy constraints of low-power devices, making it possible to run deep learning models on a wider range of devices. In the following sections, we will explore these approaches in more detail and discuss their effectiveness for optimizing deep learning models for the Raspberry Pi.
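As a concrete illustration of the pruning approach (a generic sketch, not tied to any result in this paper), PyTorch's built-in magnitude pruning can shrink a model before deployment to a Raspberry Pi:

```python
import torch
import torch.nn.utils.prune as prune
from torchvision.models import mobilenet_v2

model = mobilenet_v2(weights=None)  # stand-in model; any small CNN works

# Magnitude-prune 40% of the weights in every conv/linear layer.
for module in model.modules():
    if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.4)
        prune.remove(module, "weight")   # bake the zeros into the tensor

# After fine-tuning, export for the Pi (e.g., TorchScript via tracing).
model.eval()
traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
traced.save("mobilenet_v2_pruned.pt")
```

Note that unstructured pruning only zeroes weights; turning that sparsity into actual speedups on the Pi's CPU additionally requires structured pruning or a sparsity-aware runtime.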
Organizational Governance of Emerging Technologies: AI Adoption in Healthcare
Authors: Jee Young Kim, William Boag, Freya Gulamali, Alifia Hasan, Henry David Jeffry Hogg, Mark Lifson, Deirdre Mulligan, Manesh Patel, Inioluwa Deborah Raji, Ajai Sehgal, Keo Shaw, Danny Tobey, Alexandra Valladares, David Vidal, Suresh Balu, Mark Sendak
Abstract
Private and public sector structures and norms refine how emerging technology is used in practice. In healthcare, despite a proliferation of AI adoption, the organizational governance surrounding its use and integration is often poorly understood. What the Health AI Partnership (HAIP) aims to do in this research is to better define the requirements for adequate organizational governance of AI systems in healthcare settings and support health system leaders to make more informed decisions around AI adoption. To work towards this understanding, we first identify how the standards for the AI adoption in healthcare may be designed to be used easily and efficiently. Then, we map out the precise decision points involved in the practical institutional adoption of AI technology within specific health systems. Practically, we achieve this through a multi-organizational collaboration with leaders from major health systems across the United States and key informants from related fields. Working with the consultancy IDEO.org, we were able to conduct usability-testing sessions with healthcare and AI ethics professionals. Usability analysis revealed a prototype structured around mock key decision points that align with how organizational leaders approach technology adoption. Concurrently, we conducted semi-structured interviews with 89 professionals in healthcare and other relevant fields. Using a modified grounded theory approach, we were able to identify 8 key decision points and comprehensive procedures throughout the AI adoption lifecycle. This is one of the most detailed qualitative analyses to date of the current governance structures and processes involved in AI adoption by health systems in the United States. We hope these findings can inform future efforts to build capabilities to promote the safe, effective, and responsible adoption of emerging technologies in healthcare.
Bridging graph data models: RDF, RDF-star, and property graphs as directed acyclic graphs
Authors: Ewout Gelling, George Fletcher, Michael Schmidt
Abstract
Graph database users today face a choice between two technology stacks: the Resource Description Framework (RDF), on one side, is a data model with built-in semantics that was originally developed by the W3C to exchange interconnected data on the Web; on the other side, Labeled Property Graphs (LPGs) are geared towards efficient graph processing and have strong roots in developer and engineering communities. The two models look at graphs from different abstraction layers (triples in RDF vs. edges connecting vertices with inlined properties in LPGs), expose - at least at the surface - distinct features, come with different query languages, and are embedded into their own software ecosystems. In this short paper, we introduce a novel unifying graph data model called Statement Graphs, which combines the traits of both RDF and LPG and achieves interoperability at different levels: it (a) provides the ability to manage RDF and LPG data as a single, interconnected graph, (b) supports querying over the integrated graph using any RDF or LPG query language, while (c) clearing the way for graph stack independent data exchange mechanisms and formats. We formalize our new model as directed acyclic graphs and sketch a system of bidirectional mappings between RDF, LPGs, and Statement Graphs. Our mappings implicitly define read query semantics for RDF and LPGs query languages over the unified data model, thus providing graph users with the flexibility to use the query language of their choice for their graph use cases. As a proof of concept for our ideas, we also present the 1G Playground; an in-memory DBMS built on the concepts of Statement Graphs, which facilitates storage of both RDF and LPG data, and allows for cross-model querying using both SPARQL and Gremlin.
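The formal model is developed in the paper; purely as a toy illustration of the underlying idea, statements as first-class, property-carrying, nestable edges, consider the following sketch (all names hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """A reified edge: readable as an RDF(-star) triple and, via labels
    and inlined properties, as an LPG edge. A Statement can itself be
    the subject or object of another Statement."""
    subject: object            # a vertex id or another Statement
    predicate: str
    obj: object
    labels: frozenset = frozenset()
    properties: tuple = ()     # immutable (key, value) pairs

graph = set()
knows = Statement("alice", "knows", "bob",
                  labels=frozenset({"KNOWS"}), properties=(("since", 2019),))
graph.add(knows)
graph.add(Statement(knows, "assertedBy", "crawler-42"))  # statement about a statement
```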
Exponentially Convergent Numerical Method for Abstract Cauchy Problem with Fractional Derivative of Caputo Type
Authors: Dmytro Sytnyk, Barbara Wohlmuth
Subjects: Numerical Analysis (math.NA); Mathematical Software (cs.MS); Analysis of PDEs (math.AP); Classical Analysis and ODEs (math.CA)
Abstract
We present an exponentially convergent numerical method to approximate the solution of the Cauchy problem for the inhomogeneous fractional differential equation with an unbounded operator coefficient and Caputo fractional derivative in time. The numerical method is based on the newly obtained solution formula that consolidates the mild solution representations of sub-parabolic, parabolic and sub-hyperbolic equations with sectorial operator coefficient $A$ and non-zero initial data. The involved integral operators are approximated using the sinc-quadrature formulas that are tailored to the spectral parameters of $A$, fractional order $\alpha$ and the smoothness of the first initial condition, as well as to the properties of the equation's right-hand side $f(t)$. The resulting method possesses exponential convergence for positive sectorial $A$, any finite $t$, including $t = 0$, and the whole range $\alpha \in (0,2)$. It is suitable for a practically important case, when no knowledge of $f(t)$ is available outside the considered interval $t \in [0, T]$. The algorithm of the method is capable of multi-level parallelism. We provide numerical examples that confirm the theoretical error estimates.
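The tailored quadrature is beyond an abstract-level sketch, but the underlying sinc-quadrature rule, $\int_{\mathbb{R}} g(x)\,dx \approx h \sum_{k=-N}^{N} g(kh)$, and its exponential convergence for functions analytic in a strip are easy to demonstrate (the step-size rule below is a typical a-priori choice, assumed for illustration):

```python
import numpy as np

def sinc_quadrature(g, N, h=None):
    """Truncated trapezoidal/sinc rule on the real line; the error decays
    like exp(-c*sqrt(N)) for g analytic in a strip around R with decay."""
    if h is None:
        h = np.pi / np.sqrt(N)
    k = np.arange(-N, N + 1)
    return h * np.sum(g(k * h))

# Example: the integral of exp(-x^2) over R is sqrt(pi).
for N in (4, 8, 16, 32):
    approx = sinc_quadrature(lambda x: np.exp(-x**2), N)
    print(N, abs(approx - np.sqrt(np.pi)))
```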
Application of Transformers for Nonlinear Channel Compensation in Optical Systems
Authors: Behnam Behinaein Hamgini, Hossein Najafi, Ali Bakhshali, Zhuhong Zhang
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP)
Abstract
In this paper, we introduce a new nonlinear channel equalization method for coherent long-haul transmission based on Transformers. We show that, due to their capability to attend directly to the memory across a sequence of symbols, Transformers can be used effectively with a parallelized structure. We present an implementation of the encoder part of the Transformer for nonlinear equalization and analyze its performance over a wide range of hyper-parameters. We show that by processing blocks of symbols at each iteration and carefully selecting subsets of the encoder's output to be processed together, efficient nonlinear compensation can be achieved. We also propose the use of a physics-informed mask inspired by nonlinear perturbation theory to reduce the computational complexity of Transformer-based nonlinear equalization.
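The paper's mask is derived from nonlinear perturbation theory; as a generic illustration of masked attention over a symbol sequence, restricting each symbol to a finite channel-memory window looks like this (the banded window is an assumed stand-in for the physics-derived pattern):

```python
import torch
import torch.nn.functional as F

def banded_attention(q, k, v, memory=16):
    """Self-attention where each symbol attends only to symbols within
    +/- `memory` positions, mimicking the finite memory of fiber
    nonlinearity. Shapes: (batch, seq, dim)."""
    L = q.shape[1]
    idx = torch.arange(L)
    mask = (idx[None, :] - idx[:, None]).abs() > memory   # True = blocked
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```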
Directed Chain Generative Adversarial Networks
Authors: Ming Min, Ruimeng Hu, Tomoyuki Ichiba
Subjects: Machine Learning (cs.LG); Probability (math.PR)
Abstract
Real-world data can be multimodally distributed, e.g., data describing the opinion divergence in a community, the interspike interval distribution of neurons, and the natural frequencies of oscillators. Generating such multimodally distributed real-world data has become a challenge for existing generative adversarial networks (GANs). For example, neural stochastic differential equations (Neural SDEs), treated as infinite-dimensional GANs, have demonstrated successful performance mainly in generating unimodal time series data. In this paper, we propose a novel time series generator, named directed chain GANs (DC-GANs), which inserts a time series dataset (called a neighborhood process of the directed chain, or input) into the drift and diffusion coefficients of directed chain SDEs with distributional constraints. DC-GANs can generate new time series with the same distribution as the neighborhood process, and the neighborhood process provides the key step in learning and generating multimodally distributed time series. The proposed DC-GANs are examined on four datasets, including two stochastic models from the social sciences and computational neuroscience, and two real-world datasets on stock prices and energy consumption. To the best of our knowledge, DC-GANs are the first work that can generate multimodal time series data, and they consistently outperform state-of-the-art benchmarks with respect to measures of distribution, data similarity, and predictive ability.
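Setting the GAN machinery aside, the directed-chain idea, drift and diffusion coefficients that depend on a fixed neighborhood path, can be sketched with an Euler-Maruyama discretization; the `drift`/`diffusion` callables below stand in for the paper's neural networks:

```python
import torch

def simulate_directed_chain(x0, neighbor, drift, diffusion, dt=0.01):
    """Euler-Maruyama for dX_t = b(X_t, Xtilde_t) dt + s(X_t, Xtilde_t) dW_t,
    where `neighbor` is a pre-sampled neighborhood path Xtilde of shape
    (batch, steps). `drift` and `diffusion` map (batch, 2) -> (batch, 1)."""
    steps = neighbor.shape[1]
    path = [x0]
    for t in range(steps - 1):
        state = torch.stack([path[-1], neighbor[:, t]], dim=-1)
        dW = torch.randn_like(x0) * dt ** 0.5
        path.append(path[-1] + drift(state).squeeze(-1) * dt
                             + diffusion(state).squeeze(-1) * dW)
    return torch.stack(path, dim=1)   # (batch, steps)

# Toy usage with linear "networks":
b, s = torch.nn.Linear(2, 1), torch.nn.Linear(2, 1)
out = simulate_directed_chain(torch.zeros(8), torch.randn(8, 100), b, s)
```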
ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Model
Authors: Ruan Lu, Zhou HangCheng, Ran Meng, Zhao Jin, Qin JiaoYu, Wei Feng, Wang ChenZi
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Abstract
Text classification with large label systems in natural language processing faces challenges that include multiple label systems, uneven data distribution, and high noise. To address these problems, the ESimCSE unsupervised contrastive learning model and the UDA semi-supervised learning model are combined through joint training. The ESimCSE model efficiently learns text vector representations from unlabeled data to achieve better classification results, while UDA is trained on unlabeled data through semi-supervised learning to improve the model's prediction performance and stability and to further improve its generalization ability. In addition, the adversarial training techniques FGM and PGD are used during training to improve the robustness and reliability of the model. The experimental results show accuracy improvements of 8% and 10% over the baseline on the public Reuters dataset and on an operational dataset, respectively, and a 15% improvement in manually validated accuracy on the operational dataset, indicating that the method is effective.
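Of the techniques named, FGM is the most mechanical: perturb the word embeddings along the loss gradient, run a second forward/backward pass, then restore. A standard PyTorch-style sketch (the embedding parameter name is an assumption):

```python
import torch

class FGM:
    """Fast Gradient Method on the embedding table: perturb along the
    normalized loss gradient, accumulate an adversarial loss, restore."""
    def __init__(self, model, emb_name="embedding", epsilon=1.0):
        self.model, self.emb_name, self.eps = model, emb_name, epsilon
        self.backup = {}

    def attack(self):
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()
                norm = p.grad.norm()
                if norm and not torch.isnan(norm):
                    p.data.add_(self.eps * p.grad / norm)

    def restore(self):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}

# Typical loop: loss.backward(); fgm.attack(); loss_adv.backward();
# fgm.restore(); optimizer.step()
```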
LumiGAN: Unconditional Generation of Relightable 3D Human Faces
Authors: Boyang Deng, Yifan Wang, Gordon Wetzstein
Abstract
Unsupervised learning of 3D human faces from unstructured 2D image data is an active research area. While recent works have achieved an impressive level of photorealism, they commonly lack control of lighting, which prevents the generated assets from being deployed in novel environments. To this end, we introduce LumiGAN, an unconditional Generative Adversarial Network (GAN) for 3D human faces with a physically based lighting module that enables relighting under novel illumination at inference time. Unlike prior work, LumiGAN can create realistic shadow effects using an efficient visibility formulation that is learned in a self-supervised manner. LumiGAN generates plausible physical properties for relightable faces, including surface normals, diffuse albedo, and specular tint without any ground truth data. In addition to relightability, we demonstrate significantly improved geometry generation compared to state-of-the-art non-relightable 3D GANs and notably better photorealism than existing relightable GANs.
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Abstract
We present a simple yet effective self-supervised pre-training method for image harmonization which can leverage large-scale unannotated image datasets. To achieve this goal, we first generate pre-training data online with our Label-Efficient Masked Region Transform (LEMaRT) pipeline. Given an image, LEMaRT generates a foreground mask and then applies a set of transformations to perturb various visual attributes, e.g., defocus blur, contrast, and saturation, of the region specified by the generated mask. We then pre-train image harmonization models by recovering the original image from the perturbed image. Secondly, we introduce an image harmonization model, namely SwinIH, by retrofitting the Swin Transformer [27] with a combination of local and global self-attention mechanisms. Pre-training SwinIH with LEMaRT results in a new state of the art for image harmonization, while being label-efficient, i.e., consuming less annotated data for fine-tuning than existing methods. Notably, on the iHarmony4 dataset [8], SwinIH outperforms the state of the art, i.e., SCS-Co [16], by a margin of 0.4 dB when it is fine-tuned on only 50% of the training data, and by 1.0 dB when it is trained on the full training dataset.
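A minimal sketch of the pre-training pair construction; the perturbation set here (brightness and saturation jitter only) is a simplified stand-in for the paper's full transform list:

```python
import torch

def lemart_pair(image, mask, brightness=0.3, saturation=0.3):
    """Build a (perturbed, original) pre-training pair by jittering the
    appearance of the masked foreground only.
    image: (3, H, W) in [0, 1]; mask: (1, H, W) binary foreground mask."""
    fg = image * mask
    b = 1 + (torch.rand(1) * 2 - 1) * brightness
    fg = (fg * b).clamp(0, 1)                       # brightness shift
    gray = fg.mean(dim=0, keepdim=True).expand_as(fg)
    s = 1 + (torch.rand(1) * 2 - 1) * saturation
    fg = (fg * s + gray * (1 - s)).clamp(0, 1)      # saturation shift
    perturbed = fg * mask + image * (1 - mask)
    return perturbed, image   # the model learns: perturbed -> original
```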
Connector 0.5: A unified framework for graph representation learning
Authors: Thanh Sang Nguyen, Jooho Lee, Van Thuy Hoang, O-Joun Lee
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Abstract
Graph representation learning models aim to represent the graph structure and its features as low-dimensional vectors in a latent space, which can benefit various downstream tasks, such as node classification and link prediction. Owing to the powerful modelling capabilities of graphs, various graph embedding models and libraries have been proposed to learn embeddings and to help researchers conduct experiments more easily. In this paper, we introduce Connector, a novel graph representation framework covering various graph embedding models, ranging from shallow to state-of-the-art. First, we consider graph generation by constructing various types of graphs with different structural relations, including homogeneous, signed, heterogeneous, and knowledge graphs. Second, we introduce various graph representation learning models, ranging from shallow to deep graph embedding models. Finally, we plan to build an efficient open-source framework that can provide deep graph embedding models to represent structural relations in graphs. The framework is available at https://github.com/NSLab-CUK/Connector.
Numerical methods for computing the discrete and continuous Laplace transforms
Authors: Yupeng Zhang, Yueyang Shen, Rongqian Zhang, Yuyao Liu, Yunjie Guo, Daxuan Deng, Ivo D. Dinov
Abstract
We propose a numerical method to spline-interpolate discrete signals and then apply the integral transforms to the corresponding analytical spline functions. This represents a robust and computationally efficient technique for estimating the Laplace transform for noisy data. We revisited a Meijer-G symbolic approach to compute the Laplace transform and alternative approaches to extend canonical observed time-series. A discrete quantization scheme provides the foundation for rapid and reliable estimation of the inverse Laplace transform. We derive theoretic estimates for the inverse Laplace transform of analytic functions and demonstrate empirical results validating the algorithmic performance using observed and simulated data. We also introduce a generalization of the Laplace transform in higher dimensional space-time. We tested the discrete LT algorithm on data sampled from analytic functions with known exact Laplace transforms. The validation of the discrete ILT involves using complex functions with known analytic ILTs.
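The first step is concrete enough to sketch: fit a cubic spline to the noisy samples and integrate $e^{-st}f(t)$ against it. This is a simplified illustration of the idea, not the authors' implementation:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.integrate import quad

def spline_laplace(t, y, s_values):
    """Estimate F(s) = int_0^T exp(-s*t) f(t) dt from discrete samples
    (t, y) by integrating against a cubic-spline interpolant."""
    f = CubicSpline(t, y)
    T = t[-1]
    return np.array([quad(lambda u: np.exp(-s * u) * f(u), 0.0, T,
                          limit=200)[0] for s in s_values])

# Example: f(t) = exp(-t) with noise; exact F(s) = (1 - exp(-(s+1)T))/(s+1).
t = np.linspace(0, 10, 200)
y = np.exp(-t) + 0.01 * np.random.randn(t.size)
print(spline_laplace(t, y, s_values=[0.5, 1.0, 2.0]))
```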
Cooperative Hierarchical Deep Reinforcement Learning based Joint Sleep, Power, and RIS Control for Energy-Efficient HetNet
Abstract
Energy efficiency (EE) is one of the most important metrics for 5G and future 6G networks to reduce energy costs and control the carbon footprint. Sleep control, as a cost-efficient approach, can significantly lower power consumption by switching off network devices selectively. Meanwhile, the reconfigurable intelligent surface (RIS) has emerged as a promising technique to enhance the EE of beyond-5G and 6G networks. In this work, we jointly consider sleep and transmission power control for RIS-aided energy-efficient heterogeneous networks (HetNets). In particular, we first propose a fractional programming (FP) method for RIS phase-shift control, which aims to maximize the sum-rate under given transmission power levels. Then, considering the timescale difference between sleep control and power control, we introduce a cooperative hierarchical deep reinforcement learning (Co-HDRL) algorithm, including a cross-entropy enabled meta-controller for sleep control and correlated equilibrium-based sub-controllers for power control. Moreover, we propose a surrogate optimization method as one baseline for RIS control, and conventional HDRL as another baseline for sleep and power control. Finally, simulations show that RIS-assisted sleep control can achieve more than 16% lower energy consumption and 30% higher energy efficiency than the baseline algorithms.
Generating Adversarial Examples with Task Oriented Multi-Objective Optimization
Authors: Anh Bui, Trung Le, He Zhao, Quan Tran, Paul Montague, Dinh Phung
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Abstract
Deep learning models, even state-of-the-art ones, are highly vulnerable to adversarial examples. Adversarial training is one of the most efficient methods for improving a model's robustness. The key factor in the success of adversarial training is the ability to generate qualified and diverse adversarial examples that satisfy some objectives/goals (e.g., finding adversarial examples that maximize the model losses for simultaneously attacking multiple models). Therefore, multi-objective optimization (MOO) is a natural tool for adversarial example generation to achieve multiple objectives/goals simultaneously. However, we observe that a naive application of MOO tends to maximize all objectives/goals equally, regardless of whether an objective/goal has already been achieved. This leads to wasted effort on further improving goal-achieved tasks, while placing less focus on goal-unachieved tasks. In this paper, we propose \emph{Task Oriented MOO} to address this issue, in the context where we can explicitly define the goal achievement for a task. Our principle is to only maintain the goal-achieved tasks, while letting the optimizer spend more effort on improving the goal-unachieved tasks. We conduct comprehensive experiments for our Task Oriented MOO on various adversarial example generation schemes. The experimental results firmly demonstrate the merit of our proposed approach. Our code is available at \url{https://github.com/tuananhbui89/TAMOO}.
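A minimal sketch of the stated principle, drop the gradient contribution of tasks whose goals are already met, in a PGD-style ascent step (the thresholding convention is assumed for illustration):

```python
import torch

def task_oriented_ascent_step(x_adv, losses, goals, lr=0.01):
    """One sign-gradient ascent step on per-task attack losses.
    Tasks whose loss already exceeds its goal threshold are dropped
    from the gradient, so effort concentrates on unachieved tasks.
    losses: scalar tensors computed from x_adv; goals: float thresholds."""
    unachieved = [l for l, g in zip(losses, goals) if l.item() < g]
    if not unachieved:
        return x_adv.detach()
    grad = torch.autograd.grad(sum(unachieved), x_adv)[0]
    return (x_adv + lr * grad.sign()).detach().requires_grad_(True)
```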
Numerical Approximation of Andrews Plots with Optimal Spatial-Spectral Smoothing
Abstract
Andrews plots provide aesthetically pleasing visualizations of high-dimensional datasets. This work proves that Andrews plots (when defined in terms of the principal component scores of a dataset) are ``optimally smooth'' on average, and solve an infinite-dimensional quadratic minimization program over the set of linear isometries from the Euclidean data space to $L^2([0,1])$. By building technical machinery that characterizes the solutions to general infinite-dimensional quadratic minimization programs over linear isometries, we further show that the solution set is (in the generic case) a manifold. To avoid the ambiguities presented by this manifold of solutions, we add ``spectral smoothing'' terms to the infinite-dimensional optimization program to induce Andrews plots with optimal spatial-spectral smoothing. We characterize the (generic) set of solutions to this program and prove that the resulting plots admit efficient numerical approximations. These spatial-spectral smooth Andrews plots tend to avoid some ``visual clutter'' that arises due to the oscillation of trigonometric polynomials.
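For reference, the classical Andrews curve of a point $x \in \mathbb{R}^d$, here parameterized on $[0,1]$ to match the $L^2([0,1])$ setting of the paper, is easy to compute from PCA scores:

```python
import numpy as np

def andrews_curves(X, t):
    """Classical Andrews curves
    f_x(t) = x1/sqrt(2) + x2 sin(2 pi t) + x3 cos(2 pi t) + x4 sin(4 pi t) + ...
    X: (n_points, d) array of PCA scores; t: 1-D grid on [0, 1].
    Returns an (n_points, len(t)) array of curve values."""
    n, d = X.shape
    basis = [np.full_like(t, 1 / np.sqrt(2))]
    for k in range(1, (d + 1) // 2 + 1):
        basis.append(np.sin(2 * np.pi * k * t))
        basis.append(np.cos(2 * np.pi * k * t))
    B = np.stack(basis[:d])          # (d, len(t))
    return X @ B

t = np.linspace(0, 1, 256)
curves = andrews_curves(np.random.randn(5, 4), t)
```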
Structure Diagram Recognition in Financial Announcements
Authors: Meixuan Qiao, Jun Wang, Junfu Xiang, Qiyu Hou, Ruixuan Li
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Accurately extracting structured data from structure diagrams in financial announcements is of great practical importance for building financial knowledge graphs and further improving the efficiency of various financial applications. First, we propose a new method for recognizing structure diagrams in financial announcements, which can better detect and extract different types of connecting lines, including straight lines, curves, and polylines of different orientations and angles. Second, we develop a two-stage method to efficiently generate the industry's first benchmark of structure diagrams from Chinese financial announcements: a large number of diagrams are synthesized and annotated using an automated tool to train a preliminary recognition model with fairly good performance, and a high-quality benchmark is then obtained by automatically annotating real-world structure diagrams with the preliminary model and making a few manual corrections. Finally, we experimentally verify the significant performance advantage of our structure diagram recognition method over previous methods.
ESCM: An Efficient and Secure Communication Mechanism for UAV Networks
Abstract
Unmanned aerial vehicles (UAVs) are gradually entering various areas of human activity and have become an important part of the satellite-air-ground-sea integrated network (SAGS) for 6G communication. To achieve high mobility, UAVs have strict communication latency requirements, and they must not be illegally controlled and used as weapons with malicious intent. Therefore, an efficient and secure communication method specifically designed for UAV networks is required. This paper proposes a communication mechanism named ESCM to meet these requirements. For communication efficiency, ESCM provides a routing protocol based on the artificial bee colony (ABC) algorithm to accelerate communication between UAVs. Meanwhile, we use blockchain to guarantee the communication security of UAV networks. However, blockchain suffers from unstable links in high-mobility network scenarios, resulting in low consensus efficiency and high communication overhead. Therefore, ESCM also introduces the concept of the digital twin, mapping UAVs from the physical world into cyberspace and transforming the UAV network into a static virtual network called CyberUAV. In CyberUAV, we design a blockchain system and propose a consensus algorithm based on network coding, named proof of network coding (PoNC). PoNC not only ensures the security of ESCM but also further improves its performance through network coding. Simulation results show that ESCM has clear advantages in communication efficiency and security. Moreover, encoding messages through PoNC consensus increases network throughput, and making the mobile blockchain static through the digital twin improves the consensus success rate.
CrowdCache: A Decentralized Game-Theoretic Framework for Mobile Edge Content Sharing
Abstract
Mobile edge computing (MEC) is a promising solution for enhancing the user experience, minimizing content delivery expenses, and reducing backhaul traffic. In this paper, we propose a novel privacy-preserving decentralized game-theoretic framework for resource crowdsourcing in MEC. Our framework models the interactions between a content provider (CP) and multiple mobile edge device users (MEDs) as a non-cooperative game, in which MEDs offer idle storage resources for content caching in exchange for rewards. We introduce efficient decentralized gradient play algorithms for Nash equilibrium (NE) computation by exchanging local information among neighboring MEDs only, thus preventing attackers from learning users' private information. The key challenge in designing such algorithms is that communication among MEDs is not fixed and is facilitated by a sequence of undirected time-varying graphs. Our approach achieves linear convergence to the NE without imposing assumptions on the parameter values of the local objective functions, such as requiring the strong monotonicity of a local objective to be stronger than its dependence on other MEDs' actions, which is commonly required in the existing literature when the graph is directed and time-varying. Extensive simulations demonstrate the effectiveness of our approach in achieving efficient resource outsourcing decisions while preserving the privacy of the edge devices.
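A toy sketch of the decentralized gradient-play pattern, each player mixes neighbors' action estimates over the current communication graph and then descends its own local gradient; the mixing matrix, step size, and bookkeeping are illustrative, not the paper's exact algorithm:

```python
import numpy as np

def gradient_play_round(actions, estimates, local_grad, W, step=0.05):
    """One round of decentralized gradient play.
    actions:    (n,) each MED's own action (e.g., cache share offered)
    estimates:  (n, n) row i = MED i's estimate of everyone's actions
    local_grad: callable local_grad(i, est_row) -> dJ_i/da_i
    W:          (n, n) doubly-stochastic mixing matrix of the current
                (time-varying) undirected communication graph."""
    n = len(actions)
    estimates = W @ estimates              # consensus/mixing step
    for i in range(n):
        estimates[i, i] = actions[i]       # own action is known exactly
        actions[i] -= step * local_grad(i, estimates[i])
    return actions, estimates
```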
Solution of planar elastic stress problems using stress basis functions
Abstract
The use of global displacement basis functions to solve boundary-value problems in linear elasticity is well established. No prior work uses a global stress tensor basis for such solutions. We present two such methods for solving stress problems in linear elasticity. In both methods, we split the sought stress $\sigma$ into two parts, where neither part is required to satisfy strain compatibility. The first part, $\sigma_p$, is any stress in equilibrium with the loading. The second part, $\sigma_h$, is a self-equilibrated stress field on the unloaded body. In both methods, $\sigma_h$ is expanded using tensor-valued global stress basis functions developed elsewhere. In the first method, the coefficients in the expansion are found by minimizing the strain energy based on the well-known complementary energy principle. For the second method, which is restricted to planar homogeneous isotropic bodies, we show that we merely need to minimize the squared $L^2$ norm of the trace of stress. For demonstration, we solve eight stress problems involving sharp corners, multiple-connectedness, non-zero net force and/or moment on an internal hole, body force, discontinuous surface traction, material inhomogeneity, and anisotropy. The first method presents a new application of a known principle. The second method presents a hitherto unreported principle, to the best of our knowledge.
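For the second method, expanding $\sigma_h$ in the self-equilibrated basis reduces the stated principle to a linear least-squares problem. The following is a hedged reconstruction from the abstract, not the paper's notation:

```latex
\sigma \;=\; \sigma_p + \sum_{i=1}^{n} a_i \phi_i, \qquad
\min_{a \in \mathbb{R}^n}\;
\Big\lVert \operatorname{tr}\sigma_p
  + \sum_{i=1}^{n} a_i \operatorname{tr}\phi_i \Big\rVert_{L^2(\Omega)}^{2},
```

whose normal equations $\sum_j \big(\int_\Omega \operatorname{tr}\phi_i\,\operatorname{tr}\phi_j\big)\, a_j = -\int_\Omega \operatorname{tr}\sigma_p\,\operatorname{tr}\phi_i$ form a small symmetric linear system for the coefficients $a$.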
C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference
Authors: Yuke Zhang, Dake Chen, Souvik Kundu, Haomei Liu, Ruiheng Peng, Peter A. Beerel
Abstract
Recently, private inference (PI) has addressed the rising concern over data and model privacy in machine learning inference as a service. However, existing PI frameworks suffer from high computational and communication costs due to expensive multi-party computation (MPC) protocols. The existing literature has developed lighter MPC protocols to yield more efficient PI schemes. We, in contrast, propose to lighten them by introducing an empirically-defined privacy evaluation. To that end, we reformulate the threat model of PI and use inference data privacy attacks (IDPAs) to evaluate data privacy. We then present an enhanced IDPA, named distillation-based inverse-network attack (DINA), for improved privacy evaluation. Finally, we leverage the findings from DINA and propose C2PI, a two-party PI framework presenting an efficient partitioning of the neural network model and requiring only the initial few layers to be performed with MPC protocols. Based on our experimental evaluations, by relaxing the formal data privacy guarantees, C2PI can speed up existing PI frameworks, including Delphi [1] and Cheetah [2], by up to 2.89x and 3.88x under LAN and WAN settings, respectively, and save up to 2.75x in communication costs.
Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference
Authors: Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Abstract
The large number of ReLU and multiply-accumulate (MAC) operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In particular, we leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its succeeding and preceding convolution layers into a shallow block. Unlike existing ReLU reduction methods, our joint reduction method can yield models with improved reductions of both ReLUs and linear operations, by up to 1.73x and 1.47x respectively, evaluated with ResNet18 on CIFAR-100 without any significant accuracy drop.
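The merging step rests on the fact that two convolutions with no nonlinearity between them compose into a single convolution whose kernel is the convolution of the two kernels. A one-dimensional numpy check of that identity (an illustration of the principle, not the paper's merging code):

```python
import numpy as np

# Two stacked linear convolutions with no ReLU in between...
x  = np.random.randn(64)
k1 = np.random.randn(3)
k2 = np.random.randn(5)
two_layer = np.convolve(np.convolve(x, k1), k2)

# ...equal one "shallow" convolution with the merged kernel (associativity).
merged = np.convolve(k1, k2)          # length 3 + 5 - 1 = 7
one_layer = np.convolve(x, merged)

assert np.allclose(two_layer, one_layer)
```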
Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks
Authors: Siqi Wang, Tee Hiang Cheng, Meng-Hiot Lim
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
As an emerging network model, spiking neural networks (SNNs) have attracted significant research attention in recent years. However, the energy-efficient binary spikes do not sit well with gradient descent-based training approaches. The surrogate gradient (SG) strategy has been investigated and applied to circumvent this issue and train SNNs from scratch. Due to the lack of a well-recognized SG selection rule, most SGs are chosen intuitively. We propose the parametric surrogate gradient (PSG) method to iteratively update the SG and eventually determine an optimal surrogate gradient parameter, which calibrates the shape of candidate SGs. In SNNs, the neural potential distribution tends to deviate unpredictably due to quantization error. We evaluate such potential shifts and propose a methodology for potential distribution adjustment (PDA) to minimize the loss of undesired pre-activations. Experimental results demonstrate that the proposed methods can be readily integrated with the backpropagation through time (BPTT) algorithm and help modulated SNNs to achieve state-of-the-art performance on both static and dynamic datasets with fewer timesteps.
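A sketch of a spiking activation with a parametric, sigmoid-shaped surrogate gradient; `alpha` plays the role of the shape parameter a method like PSG would calibrate (the exact surrogate family used in the paper is not specified in the abstract):

```python
import torch

class SpikePSG(torch.autograd.Function):
    """Heaviside spike in the forward pass; parametric sigmoid-shaped
    surrogate derivative alpha * s * (1 - s), s = sigmoid(alpha * u),
    in the backward pass."""
    @staticmethod
    def forward(ctx, membrane_potential, alpha):
        ctx.save_for_backward(membrane_potential)
        ctx.alpha = alpha
        return (membrane_potential >= 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (u,) = ctx.saved_tensors
        s = torch.sigmoid(ctx.alpha * u)
        surrogate = ctx.alpha * s * (1 - s)
        return grad_output * surrogate, None  # no grad w.r.t. alpha here

spikes = SpikePSG.apply(torch.randn(8, requires_grad=True), 4.0)
```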
Scene Graph Lossless Compression with Adaptive Prediction for Objects and Relations
Abstract
The scene graph is a new data structure describing objects and their pairwise relationships within image scenes. As scene graphs in vision applications grow in size, losslessly and efficiently storing such data on disk or transmitting it over the network becomes an unavoidable problem. However, the compression of scene graphs has seldom been studied because of their complicated data structures and distributions. Existing solutions usually involve general-purpose compressors or graph-structure compression methods, which are weak at reducing the redundancy in scene graph data. This paper introduces a new lossless compression framework with adaptive predictors for the joint compression of objects and relations in scene graph data. The proposed framework consists of a unified prior extractor and specialized element predictors adapted to the different data elements. Furthermore, to exploit the context information within and between graph elements, Graph Context Convolution is proposed to support different graph-context modeling schemes for different graph elements. Finally, a learned distribution model is devised to predict numerical data under complicated conditional constraints. Experiments conducted on labeled and generated scene graphs prove the effectiveness of the proposed framework in the scene graph lossless compression task.
Efficient Explainable Face Verification based on Similarity Score Argument Backpropagation
Authors: Marco Huber, Anh Thi Luu, Philipp Terhörst, Naser Damer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Explainable face recognition is gaining growing attention as the technology gains ground in security-critical applications. Understanding why two face images are matched or not matched by a given face recognition system is important to operators, users, and developers to increase trust and accountability, develop better systems, and highlight unfair behavior. In this work, we propose xSSAB, an approach that back-propagates similarity score-based arguments supporting or opposing the face-matching decision to visualize spatial maps indicating similar and dissimilar areas as interpreted by the underlying FR model. Furthermore, we present Patch-LFW, a new explainable face verification benchmark that, together with a novel evaluation protocol, enables the first quantitative evaluation of the validity of similarity and dissimilarity maps in explainable face recognition approaches. We compare our efficient approach to state-of-the-art approaches, demonstrating a superior trade-off between efficiency and performance. The code as well as the proposed Patch-LFW benchmark is publicly available at: https://github.com/marcohuber/xSSAB.
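xSSAB's exact argument decomposition is in the paper; the basic mechanics of back-propagating a similarity score to the probe image can be sketched as follows (a generic saliency sketch, not the authors' method):

```python
import torch

def similarity_saliency(model, img_a, img_b):
    """Gradient of the cosine similarity between two face embeddings
    w.r.t. the probe image: positive values support the match decision,
    negative values oppose it. Images: (3, H, W) tensors."""
    img_a = img_a.clone().requires_grad_(True)
    emb_a = model(img_a.unsqueeze(0)).squeeze(0)
    with torch.no_grad():
        emb_b = model(img_b.unsqueeze(0)).squeeze(0)
    score = torch.nn.functional.cosine_similarity(emb_a, emb_b, dim=0)
    score.backward()
    return img_a.grad.sum(dim=0)   # (H, W) signed saliency map
```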
Fair Selection of Edge Nodes to Participate in Clustered Federated Multitask Learning
Authors: Abdullatif Albaseer, Mohamed Abdallah, Ala Al-Fuqaha, Abegaz Mohammed, Aiman Erbad, Octavia A. Dobre
Subjects: Networking and Internet Architecture (cs.NI)
Abstract
Clustered federated multitask learning is introduced as an efficient technique when data is unbalanced and distributed amongst clients in a non-independent and identically distributed manner. While a similarity metric can provide client groups with specialized models according to their data distribution, this process can be time-consuming because the server needs to capture all data distributions from all clients first to perform the correct clustering. Due to resource and time constraints at the network edge, only a fraction of devices is selected every round, necessitating an efficient scheduling technique to address these issues. Thus, this paper introduces a two-phased client selection and scheduling approach to improve the convergence speed while capturing all data distributions. This approach ensures correct clustering and fairness between clients by leveraging bandwidth reuse for participants that spent a longer time training their models and by exploiting the heterogeneity in the devices to schedule participants according to their delay. The server then performs the clustering depending on predetermined thresholds and stopping criteria. When a specified cluster approximates a stopping point, the server employs a greedy selection for that cluster by picking the devices with lower delay and better resources. The convergence analysis is provided, showing the relationship between the proposed scheduling approach and the convergence rate of the specialized models to obtain convergence bounds under non-i.i.d. data distribution. We carry out extensive simulations, and the results demonstrate that the proposed algorithms reduce training time and improve convergence speed while equipping every user with a customized model tailored to its data distribution.
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
Abstract
Model-based reinforcement learning is a powerful tool, but collecting data to fit an accurate model of the system can be costly. Exploring an unknown environment in a sample-efficient manner is hence of great importance. However, the complexity of dynamics and the computational limitations of real systems make this task challenging. In this work, we introduce FLEX, an exploration algorithm for nonlinear dynamics based on optimal experimental design. Our policy maximizes the information of the next step and results in an adaptive exploration algorithm, compatible with generic parametric learning models and requiring minimal resources. We test our method on a number of nonlinear environments covering different settings, including time-varying dynamics. Keeping in mind that exploration is intended to serve an exploitation objective, we also test our algorithm on downstream model-based classical control tasks and compare it to other state-of-the-art model-based and model-free approaches. The performance achieved by FLEX is competitive and its computational cost is low.
An efficient multiple harmonic balance method for computing quasi-periodic responses of nonlinear systems
Abstract
Quasi-periodic responses composed of multiple base frequencies widely exist in science and engineering problems. The multiple harmonic balance (MHB) method is one of the most commonly used approaches for such problems. However, it is limited to low-order estimations due to complex symbolic operations in practical use. Many variants have been developed to improve the MHB method, among which the time-domain MHB-like methods are regarded as crucial improvements because of their high efficiency and simple derivation. However, one main drawback remains to be addressed: time-domain MHB-like methods suffer from non-physical solutions, which have been shown to be caused by aliasing (mixtures of the high-order into the low-order harmonics). Inspired by the collocation-based harmonic balancing framework recently established by our group, we herein propose a reconstruction multiple harmonic balance (RMHB) method that reconstructs the conventional MHB method using discrete time-domain collocations. Our study shows that the relation between the MHB and time-domain MHB-like methods is determined by an aliasing matrix, which is non-zero when aliasing occurs. On this basis, a conditional equivalence is established to form the RMHB method. Three numerical examples demonstrate that this new method is more robust and efficient than state-of-the-art methods.
An Improved Modular Addition Checksum Algorithm
Authors: Philip Koopman
Subjects: Data Structures and Algorithms (cs.DS); Networking and Internet Architecture (cs.NI)
Abstract
This paper introduces a checksum algorithm that provides a new point in the performance/complexity/effectiveness checksum tradeoff space. It has better fault detection properties than single-sum and dual-sum modular addition checksums. It is also simpler to compute efficiently than a cyclic redundancy check (CRC) due to exploiting commonly available hardware and programming language support for unsigned integer division. The key idea is to compute a single running sum, but introduce a left shift by the size (in bits) of the modulus before performing the modular reduction after each addition step. This approach provides a Hamming Distance of 3 for longer data word lengths than dual-sum approaches such as the Fletcher checksum. Moreover, it provides this capability using a single running sum that is only twice the size of the final computed check value, while providing fault detection capabilities even better than large-block variants of dual-sum approaches that require larger division operations.
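The key idea is stated concretely enough for a hedged sketch: one running sum, shifted left by the modulus width after each addition and before each modular reduction. The exact step ordering and the modulus choice below (the largest 16-bit prime, as used in Adler-32) are assumptions, not the paper's parameters:

```python
def shifted_modular_checksum(data: bytes, k: int = 16,
                             modulus: int = 65521) -> int:
    """Single running sum with a left shift by the modulus width before
    each modular reduction. The sum never exceeds 2k bits, matching the
    paper's claim of a running sum only twice the check-value size."""
    s = 0
    for b in data:
        s = ((s + b) << k) % modulus   # add block, shift, reduce
    return s

print(hex(shifted_modular_checksum(b"hello, checksum world")))
```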
Leveraging Compositional Methods for Modeling and Verification of an Autonomous Taxi System
Authors: Alessandro Pinto, Anthony Corso, Edward Schmerling
Abstract
We apply a compositional formal modeling and verification method to an autonomous aircraft taxi system. We provide insights into the modeling approach and we identify several research areas where further development is needed. Specifically, we identify the following needs: (1) semantics of composition of viewpoints expressed in different specification languages, and tools to reason about heterogeneous declarative models; (2) libraries of formal models for autonomous systems to speed up modeling and enable efficient reasoning; (3) methods to lift verification results generated by automated reasoning tools to the specification level; (4) probabilistic contract frameworks to reason about imperfect implementations; (5) standard high-level functional architectures for autonomous systems; and (6) a theory of higher-order contracts. We believe that addressing these research needs, among others, could improve the adoption of formal methods in the design of autonomous systems including learning-enabled systems, and increase confidence in their safe operations.
Design and Implementation of a Mobile Application for the Validation of Counterfeit-Proof Product Labels
Abstract
Due to increasing product piracy worldwide, a cost-effective method for verifying the origin of a product is to be developed. For this purpose, a certificate of authenticity can be created using precisely measurable, unique properties of special physical objects that are difficult to reconstruct. In the context of the present work, this is a counterfeit-proof label composed of randomly distributed gold nanospheres or nanorods in a semi-transparent material. The characteristic positioning of the label's elements can be precisely measured using a smartphone's camera and additional technologies. This enables an offline verification method usable by the general public without the need for an existing network connection. The present work provides the first part of the proof of concept that such a system, and especially the associated algorithmic computation method, can be implemented and used efficiently in a mobile application. In addition, a practically suitable method is determined for both transmitting and securing the required information. Furthermore, the results of the validation of counterfeit-proof product labels are analyzed in detail and existing weaknesses are pointed out.
Integrated Architecture for Neural Networks and Security Primitives using RRAM Crossbar
Abstract
This paper proposes an architecture that integrates neural networks (NNs) and hardware security modules using a single resistive random access memory (RRAM) crossbar. The proposed architecture enables using a single crossbar to implement NN, true random number generator (TRNG), and physical unclonable function (PUF) applications while exploiting the multi-state storage characteristic of the RRAM crossbar for the vector-matrix multiplication operation required for the implementation of NN. The TRNG is implemented by utilizing the crossbar's variation in device switching thresholds to generate random bits. The PUF is implemented using the same crossbar initialized as an entropy source for the TRNG. Additionally, the weights locking concept is introduced to enhance the security of NNs by preventing unauthorized access to the NN weights. The proposed architecture provides flexibility to configure the RRAM device in multiple modes to suit different applications. It shows promise in achieving a more efficient and compact design for the hardware implementation of NNs and security primitives.
A Two-Step Rule for Backpropagation
Authors: Ahmed Boughammoura
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
We present a simplified computational rule for the back-propagation formulas for artificial neural networks. In this work, we provide a generic two-step rule for the back-propagation algorithm in matrix notation. Moreover, this rule incorporates both the forward and backward phases of the computations involved in the learning process. Specifically, this recursive computing rule permits the propagation of the changes to all synaptic weights in the network, layer by layer, efficiently. In particular, we use this rule to compute both the up and down partial derivatives of the cost function of all the connections feeding into the output layer.
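For context, the matrix-notation recursion that such a rule compresses, one multiply by $W^\top$ plus one elementwise product per layer for the deltas, followed by an outer product for the weight gradients, looks like this in plain numpy (a generic illustration, not the paper's specific two-step formulation):

```python
import numpy as np

def forward_backward(x, y, Ws, f, df):
    """Matrix-form backprop for a fully-connected net a^(l) = f(W^(l) a^(l-1)):
    the forward pass stores pre-activations; the backward pass repeats
    delta = df(z) * (W^T delta_next), then dW = outer(delta, a_prev)."""
    a, zs, acts = x, [], [x]
    for W in Ws:
        z = W @ a
        zs.append(z)
        a = f(z)
        acts.append(a)
    delta = df(zs[-1]) * (a - y)          # squared-error output layer
    grads = [None] * len(Ws)
    for l in reversed(range(len(Ws))):
        grads[l] = np.outer(delta, acts[l])
        if l > 0:
            delta = df(zs[l - 1]) * (Ws[l].T @ delta)
    return grads

sigmoid = lambda z: 1 / (1 + np.exp(-z))
dsigmoid = lambda z: sigmoid(z) * (1 - sigmoid(z))
Ws = [np.random.randn(4, 3), np.random.randn(2, 4)]
grads = forward_backward(np.random.randn(3), np.random.randn(2),
                         Ws, sigmoid, dsigmoid)
```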
ElegansNet: a brief scientific report and initial experiments
Authors: Francesco Bardozzo, Andrea Terlizzi, Pietro Liò, Roberto Tagliaferri
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Abstract
This research report introduces ElegansNet, a neural network that mimics real-world neuronal network circuitry, with the goal of better understanding the interplay between connectome topology and deep learning systems. The proposed approach utilizes the powerful representational capabilities of living beings' neuronal circuitry to design and generate improved deep learning systems with a topology similar to natural networks. The Caenorhabditis elegans connectome is used as a reference due to its completeness, reasonable size, and functional neuron class annotations. It is demonstrated that the connectome of simple organisms exhibits specific functional relationships between neurons and, once transformed into learnable tensor networks and integrated into modern architectures, offers bio-plausible structures that efficiently solve complex tasks. The performance of the models is evaluated against randomly wired networks and compared to artificial networks ranked on global benchmarks. In the first case, ElegansNet outperforms randomly wired networks; interestingly, it performs comparably only to those based on the Watts-Strogatz small-world property. When compared to state-of-the-art artificial neural networks, such as transformers or attention-based autoencoders, ElegansNet outperforms well-known deep learning and traditional models in both supervised image classification and unsupervised handwritten digit reconstruction, achieving a top-1 accuracy of 99.99% on CIFAR-10 and 99.84% on MNIST Unsup on the validation sets.
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
Authors: Aditya Dhakal, Sameer G. Kulkarni, K. K. Ramakrishnan
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Systems and Control (eess.SY)
Abstract
Hardware accelerators such as GPUs are required for real-time, low-latency inference with Deep Neural Networks (DNNs). However, due to the inherent limits of the parallelism they can exploit, DNNs often under-utilize the capacity of today's high-end accelerators. Although spatial multiplexing of the GPU leads to higher GPU utilization and higher inference throughput, a number of challenges remain. Finding the GPU percentage for right-sizing the GPU for each DNN through profiling, determining an optimal batching of requests to balance throughput improvement against application-specific deadlines and service level objectives (SLOs), and maximizing throughput by appropriately scheduling DNNs are all significant challenges. This paper introduces a dynamic and fair spatio-temporal scheduler (D-STACK) that enables multiple DNNs to run on the GPU concurrently. To help allocate the appropriate GPU percentage (which we call the "Knee"), we develop and validate a model that estimates the parallelism each DNN can utilize. We also develop a lightweight optimization formulation to find an efficient batch size for each DNN operating with D-STACK. We bring together our optimizations and our spatio-temporal scheduler to provide a holistic inference framework, and we demonstrate its ability to provide high throughput while meeting application SLOs. Compared with an ideal scheduler that can allocate the right GPU percentage for every DNN kernel, D-STACK achieves more than 90 percent of the ideal scheduler's throughput and GPU utilization. We also compare D-STACK with other GPU multiplexing and scheduling methods (e.g., NVIDIA Triton, Clipper, Nexus) using popular DNN models. Our controlled experiments with multiplexing several popular DNN models achieve up to 1.6x improvement in GPU utilization and up to 4x improvement in inference throughput.
On the Order of Power Series and the Sum of Square Roots Problem
Abstract
This paper focuses on the study of the order of power series that are linear combinations of a given finite set of power series. The order of a formal power series, denoted $\textrm{ord}(f)$, is defined as the minimum exponent of $x$ that has a non-zero coefficient in $f(x)$. Our first result is that the order of the Wronskian of these power series is equivalent, up to a polynomial factor, to the maximum order that occurs in a linear combination of these power series. This implies that the Wronskian approach used in (Kayal and Saha, TOCT'2012) to upper bound the order of the sum of square roots is optimal up to a polynomial blowup. We also demonstrate upper bounds, similar to those of (Kayal and Saha, TOCT'2012), for the order of power series in a variety of other scenarios, and we solve a special case of the inequality testing problem outlined in (Etessami et al., TOCT'2014). In the second part of the paper, we study the equality variant of the sum of square roots problem, which is decidable in polynomial time due to (Bl\"omer, FOCS'1991). We investigate a natural generalization of this problem when the input integers are given as straight line programs. Under the assumption of the Generalized Riemann Hypothesis (GRH), we show that this problem can be reduced to the so-called ``one dimensional'' variant. We identify the key mathematical challenges for solving this ``one dimensional'' variant.
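For concreteness, with $f_1, \dots, f_n$ the given power series, the Wronskian in question is

```latex
W(f_1,\dots,f_n)(x) \;=\; \det
\begin{pmatrix}
f_1(x) & \cdots & f_n(x) \\
f_1'(x) & \cdots & f_n'(x) \\
\vdots & & \vdots \\
f_1^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x)
\end{pmatrix},
```

and the first result says that $\textrm{ord}(W)$ agrees with $\max_{c \neq 0} \textrm{ord}\big(\sum_i c_i f_i\big)$ up to a polynomial factor.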
The Roles of Symbols in Neural-based AI: They are Not What You Think!
Abstract
We propose that symbols are first and foremost external communication tools used between intelligent agents that allow knowledge to be transferred in a more efficient and effective manner than having to experience the world directly. But, they are also used internally within an agent through a form of self-communication to help formulate, describe and justify subsymbolic patterns of neural activity that truly implement thinking. Symbols, and our languages that make use of them, not only allow us to explain our thinking to others and ourselves, but also provide beneficial constraints (inductive bias) on learning about the world. In this paper we present relevant insights from neuroscience and cognitive science, about how the human brain represents symbols and the concepts they refer to, and how today's artificial neural networks can do the same. We then present a novel neuro-symbolic hypothesis and a plausible architecture for intelligent agents that combines subsymbolic representations for symbols and concepts for learning and reasoning. Our hypothesis and associated architecture imply that symbols will remain critical to the future of intelligent systems NOT because they are the fundamental building blocks of thought, but because they are characterizations of subsymbolic processes that constitute thought.
Experimental Validation of Model-less Robust Voltage Control using Measurement-based Estimated Voltage Sensitivity Coefficients
Abstract
The increasing adoption of smart meters and phasor measurement units (PMUs) in power distribution networks is enabling data-driven/model-less control schemes to mitigate grid issues such as over/under-voltages and power-flow congestion. However, such schemes can lead to infeasible or inaccurate control decisions due to measurement inaccuracies. In this context, the authors' previous work proposed a robust measurement-based control scheme accounting for the uncertainties of the estimated models. In this scheme, a recursive least squares (RLS)-based method estimates the grid model (in the form of voltage-magnitude sensitivity coefficients). Then, a robust control problem optimizes the power set-points of distributed energy resources (DERs) such that the nodal voltage limits are satisfied. The estimated voltage sensitivity coefficients are used to model the nodal voltages, and control robustness is achieved by accounting for their uncertainties. This work presents the first experimental validation of such a robust model-less control scheme on a real power distribution grid. The scheme is applied for voltage control by regulating two photovoltaic (PV) inverters connected in a real microgrid, which is a replica of the CIGRE benchmark microgrid network at the EPFL Distributed Electrical Systems Laboratory.
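The RLS building block is standard; a sketch of one per-node estimator for the model $\Delta V \approx \theta^\top \Delta P$ follows (the forgetting factor and initialization are illustrative, not the authors' tuning):

```python
import numpy as np

class RLSSensitivity:
    """Recursive least squares for voltage sensitivity coefficients:
    dV_i = theta_i^T dP + noise, one estimator per monitored node.
    lam: forgetting factor in (0, 1]."""
    def __init__(self, n_inputs, lam=0.98):
        self.theta = np.zeros(n_inputs)
        self.P = 1e3 * np.eye(n_inputs)   # large initial covariance
        self.lam = lam

    def update(self, dP, dV):
        """dP: DER power set-point changes (n_inputs,);
        dV: measured voltage-magnitude change (scalar)."""
        Pphi = self.P @ dP
        gain = Pphi / (self.lam + dP @ Pphi)
        self.theta = self.theta + gain * (dV - dP @ self.theta)
        self.P = (self.P - np.outer(gain, Pphi)) / self.lam
        return self.theta
```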
Abstract
Large-scale pre-trained transformers have demonstrated remarkable success in various computer vision tasks. However, it is still highly challenging to fully fine-tune these models for downstream tasks due to their high computational and storage costs. Recently, Parameter-Efficient Tuning (PETuning) techniques, e.g., Visual Prompt Tuning (VPT) and Low-Rank Adaptation (LoRA), have significantly reduced the computation and storage cost by inserting lightweight prompt modules into the pre-trained models and tuning these modules with a small number of trainable parameters, while keeping the transformer backbone frozen. Although only a few parameters need to be adjusted, most PETuning methods still require a significant amount of downstream task training data to achieve good results, and their performance is inadequate in low-data regimes, especially when there are only one or two examples per class. To this end, we first empirically identify that the poor performance is mainly due to the inappropriate way of initializing prompt modules, a finding that has also been verified in pre-trained language models. Next, we propose a Pre-trained Visual Parameter-efficient (PVP) Tuning framework, which pre-trains the parameter-efficient tuning modules first and then leverages the pre-trained modules, along with the pre-trained transformer backbone, to perform parameter-efficient tuning on downstream tasks. Experimental results on five Fine-Grained Visual Classification (FGVC) and VTAB-1k datasets demonstrate that our proposed method significantly outperforms state-of-the-art PETuning methods.
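Of the PETuning modules mentioned, LoRA is the simplest to sketch; under PVP's scheme, the `A` and `B` factors below would be initialized from a pre-training stage rather than from scratch (a generic sketch, not the paper's code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A x). Only A and B are tuned."""
    def __init__(self, base: nn.Linear, r=4, alpha=8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False             # freeze the backbone weight
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)           # start as a no-op update
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.B(self.A(x))

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
```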
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Markus Wulfmeier, Jan Humplik, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess
Abstract
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. We first trained individual skills in isolation and then composed those skills end-to-end in a self-play setting. The resulting policy exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking and more; and transitions between them in a smooth, stable, and efficient manner - well beyond what is intuitively expected from the robot. The agents also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. The full range of behaviors emerged from a small set of simple rewards. Our agents were trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer, despite significant unmodeled effects and variations across robot instances. Although the robots are inherently fragile, minor hardware modifications together with basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way. Indeed, even though the agents were optimized for scoring, in experiments they walked 156% faster, took 63% less time to get up, and kicked 24% faster than a scripted baseline, while efficiently combining the skills to achieve the longer term objectives. Examples of the emergent behaviors and full 1v1 matches are available on the supplementary website.
A Personalized Dense Retrieval Framework for Unified Information Access
Abstract
Developing a universal model that can efficiently and effectively respond to a wide range of information access requests -- from retrieval to recommendation to question answering -- has been a long-lasting goal in the information retrieval community. This paper argues that the flexibility, efficiency, and effectiveness brought by the recent development in dense retrieval and approximate nearest neighbor search have smoothed the path towards achieving this goal. We develop a generic and extensible dense retrieval framework that can handle a wide range of (personalized) information access requests, such as keyword search, query by example, and complementary item recommendation. Our proposed approach extends the capabilities of dense retrieval models for ad-hoc retrieval tasks by incorporating user-specific preferences through the development of a personalized attentive network. This allows for a more tailored and accurate personalized information access experience. Our experiments on real-world e-commerce data suggest the feasibility of developing universal information access models by demonstrating significant improvements even compared to competitive baselines specifically developed for each of these individual information access tasks. This work opens up a number of fundamental research directions for future exploration.
Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)
Authors: Xinyi Zheng, Weijie Zhao, Xiaoyun Li, Ping Li
Abstract
To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is desired to group users with similar characteristics. In this paper, we propose a scalable $K$-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of ($p$-powered) consistent weighted sampling and hierarchical clustering, so that $K$-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of $>70$M users and ad campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods including sign random projections (SignRP) and minwise hashing (MinHash), as well as the vanilla CWS.
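For readers unfamiliar with the primitive, below is a minimal sketch of one classical consistent weighted sampling scheme (in the style of Ioffe's CWS), where the probability that two vectors produce the same hash equals their weighted Jaccard similarity; users whose sketches mostly agree would land in the same cohort. The per-seed randomness and toy vectors are assumptions, and the paper's consecutive variant, $p$-powering, and hierarchical size enforcement are not reproduced here.

```python
import numpy as np

def cws_sample(weights, seed):
    """One consistent weighted sampling hash (Ioffe-style scheme).

    weights : (n,) non-negative feature weights
    returns : (index, integer) pair; for two vectors, the probability the
              pairs collide equals their weighted Jaccard similarity.
    """
    rng = np.random.default_rng(seed)   # same seed -> same hash function
    n = len(weights)
    r = rng.gamma(2.0, 1.0, n)
    c = rng.gamma(2.0, 1.0, n)
    beta = rng.uniform(0.0, 1.0, n)
    active = weights > 0
    t = np.floor(np.log(weights[active]) / r[active] + beta[active])
    y = np.exp(r[active] * (t - beta[active]))
    a = c[active] / (y * np.exp(r[active]))
    k_local = np.argmin(a)
    k = np.flatnonzero(active)[k_local]
    return int(k), int(t[k_local])

u1 = np.array([1.0, 2.0, 0.0, 4.0])
u2 = np.array([1.0, 2.5, 0.0, 4.0])
sketch1 = [cws_sample(u1, s) for s in range(64)]
sketch2 = [cws_sample(u2, s) for s in range(64)]
sim = np.mean([a == b for a, b in zip(sketch1, sketch2)])
print(sim)  # ≈ weighted Jaccard = (1 + 2 + 4) / (1 + 2.5 + 4) ≈ 0.93
```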
Hitting Subgraphs in Sparse Graphs and Geometric Intersection Graphs
Authors: Daniel Lokshtanov, Fahad Panolan, Saket Saurabh, Jie Xue, Meirav Zehavi
Abstract
We investigate a fundamental vertex-deletion problem called (Induced) Subgraph Hitting: given a graph $G$ and a set $\mathcal{F}$ of forbidden graphs, the goal is to compute a minimum-sized set $S$ of vertices of $G$ such that $G-S$ does not contain any graph in $\mathcal{F}$ as an (induced) subgraph. This is a generic problem that encompasses many well-known problems that were extensively studied on their own, particularly (but not only) from the perspectives of both approximation and parameterization. We focus on the design of efficient approximation schemes, i.e., with running time $f(\varepsilon,\mathcal{F}) \cdot n^{O(1)}$, which are also of significant interest to both communities. Technically, our main contribution is a linear-time approximation-preserving reduction from (Induced) Subgraph Hitting on any graph class $\mathcal{G}$ of bounded expansion to the same problem on bounded degree graphs within $\mathcal{G}$. This yields a novel algorithmic technique to design (efficient) approximation schemes for the problem on very broad graph classes, well beyond the state-of-the-art. Specifically, applying this reduction, we derive approximation schemes with (almost) linear running time for the problem on any graph classes that have strongly sublinear separators and many important classes of geometric intersection graphs (such as fat-object graphs, pseudo-disk graphs, etc.). Our proofs introduce novel concepts and combinatorial observations that may be of independent interest (and, which we believe, will find other uses) for studies of approximation algorithms, parameterized complexity, sparse graph classes, and geometric intersection graphs. As a byproduct, we also obtain the first robust algorithm for $k$-Subgraph Isomorphism on intersection graphs of fat objects and pseudo-disks, with running time $f(k) \cdot n \log n + O(m)$.
An Investigation into Active Control for Accessible Orbital Flight
Abstract
Recently, a practical and publicly accessible satellite standard called the SmallSat has amplified public involvement in orbital research. This allows for flexible and efficient deployments of impactful low-earth-orbit experiments that would otherwise never be flown. However, the launch industry responsible for flying these experiments is neither flexible nor efficient. This project aims to make orbital technologies accessible at the miniature scale, specifically thrust-vector control, through an iterative engineering process simplifying and miniaturizing technologies from launch vehicles such as the Space Shuttle and Falcon 9. An Arduino-based custom flight computer was developed alongside state machine control software and active-control hardware, all designed to scale. Together, these three major components emulate the methods used in the aerospace industry. Initial test flights and recent ground test data have indicated stable control with a maximum of 7° and 2.62° of deviation from the intended flight path, respectively, an acceptable stability range when compared to similar finned flights. Results show that scalable thrust vectoring is possible at a small scale, giving adaptability and control applicable to both small and large test vehicles. With accessible orbital flight, countless experiments can be completed concurrently, allowing for faster amateur rocket development and opening another path to space.
Association Rules Mining with Auto-Encoders
Authors: Théophile Berteloot, Richard Khoury, Audrey Durand
Abstract
Association rule mining is one of the most studied research fields of data mining, with applications ranging from grocery basket problems to explainable classification systems. Classical association rule mining algorithms have several limitations, especially with regard to their high execution times and the number of rules produced. Over the past decade, neural network solutions have been used to solve various optimization problems, such as classification, regression or clustering. However, there is still no efficient way to mine association rules using neural networks. In this paper, we present an auto-encoder solution to mine association rules, called ARM-AE. We compare our algorithm to FP-Growth and NSGAII on three categorical datasets, and show that our algorithm discovers high-support and high-confidence rule sets and has a better execution time than classical methods while preserving the quality of the rule set produced.
Controllable Image Generation via Collage Representations
Authors: Arantxa Casanova, Marlène Careil, Adriana Romero-Soriano, Christopher J. Pal, Jakob Verbeek, Michal Drozdzal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Recent advances in conditional generative image models have enabled impressive results. On the one hand, text-based conditional models have achieved remarkable generation quality, by leveraging large-scale datasets of image-text pairs. To enable fine-grained controllability, however, text-based models require long prompts, whose details may be ignored by the model. On the other hand, layout-based conditional models have also witnessed significant advances. These models rely on bounding boxes or segmentation maps for precise spatial conditioning in combination with coarse semantic labels. The semantic labels, however, cannot be used to express detailed appearance characteristics. In this paper, we approach fine-grained scene controllability through image collages which allow a rich visual description of the desired scene as well as the appearance and location of the objects therein, without the need for class or attribute labels. We introduce "mixing and matching scenes" (M&Ms), an approach that consists of an adversarially trained generative image model which is conditioned on appearance features and spatial positions of the different elements in a collage, and integrates these into a coherent image. We train our model on the OpenImages (OI) dataset and evaluate it on collages derived from OI and MS-COCO datasets. Our experiments on the OI dataset show that M&Ms outperforms baselines in terms of fine-grained scene controllability while being very competitive in terms of image quality and sample diversity. On the MS-COCO dataset, we highlight the generalization ability of our model by outperforming DALL-E in terms of the zero-shot FID metric, despite using two orders of magnitude fewer parameters and data. Collage-based generative models have the potential to advance content creation in an efficient and effective way as they are intuitive to use and yield high-quality generations.
Keyword: faster
GENIE-NF-AI: Identifying Neurofibromatosis Tumors using Liquid Neural Network (LTC) trained on AACR GENIE Datasets
Authors: Michael Bidollahkhani, Ferhat Atasoy, Elnaz Abedini, Ali Davar, Omid Hamza, Fırat Sefaoğlu, Amin Jafari, Muhammed Nadir Yalçın, Hamdan Abdellatef
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
In recent years, the field of medicine has been increasingly adopting artificial intelligence (AI) technologies to provide faster and more accurate disease detection, prediction, and assessment. In this study, we propose an interpretable AI approach to diagnose patients with neurofibromatosis using blood tests and pathogenic variables. We evaluated the proposed method using a dataset from the AACR GENIE project and compared its performance with modern approaches. Our proposed approach outperformed existing models with 99.86% accuracy. We also conducted NF1 and interpretable AI tests to validate our approach. Our work provides both an explainable glass-box model, based on logistic regression with explanatory stimuli, and a black-box model. The explainable models help to explain the predictions of the black-box models, while the glass-box models provide information about the best-fit features. Overall, our study presents an interpretable AI approach for diagnosing patients with neurofibromatosis and demonstrates the potential of AI in the medical field.
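As a small illustration of the glass-box idea, a logistic regression classifier exposes its best-fit features directly through coefficient magnitudes and signs. The sketch below uses synthetic stand-in data, not the AACR GENIE blood-test variables, and is a generic example rather than the study's pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in features; the study's actual clinical variables differ.
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
print("accuracy:", clf.score(Xte, yte))

# Glass-box readout: coefficient magnitude ranks the best-fit features,
# and sign shows the direction of each feature's contribution.
for i in np.argsort(-np.abs(clf.coef_[0])):
    print(f"feature {i}: coef = {clf.coef_[0][i]:+.3f}")
```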
From Chaos Comes Order: Ordering Event Representations for Object Detection
Authors: Nikola Zubić, Daniel Gehrig, Mathias Gehrig, Davide Scaramuzza
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Abstract
Today, state-of-the-art deep neural networks that process events first convert them into dense, grid-like input representations before using an off-the-shelf network. However, selecting the appropriate representation for the task traditionally requires training a neural network for each representation and selecting the best one based on the validation score, which is very time-consuming. In this work, we eliminate this bottleneck by selecting the best representation based on the Gromov-Wasserstein Discrepancy (GWD) between the raw events and their representation. It is approximately 200 times faster to compute than training a neural network and preserves the task performance ranking of event representations across multiple representations, network backbones, and datasets. This means that finding a representation with a high task score is equivalent to finding a representation with a low GWD. We use this insight to, for the first time, perform a hyperparameter search on a large family of event representations, revealing new and powerful representations that exceed the state-of-the-art. On object detection, our optimized representation outperforms existing representations by 1.9% mAP on the 1 Mpx dataset and 8.6% mAP on the Gen1 dataset and even outperforms the state-of-the-art by 1.8% mAP on Gen1 and state-of-the-art feed-forward methods by 6.0% mAP on the 1 Mpx dataset. This work opens a new unexplored field of explicit representation optimization for event-based learning methods.
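As a rough illustration of scoring a representation by its structural discrepancy with the raw events, the sketch below compares sorted pairwise-distance profiles, a classical lower bound on the Gromov-Wasserstein discrepancy rather than the solver-based estimate the authors compute; the toy event arrays and the linear map are assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist

def gw_proxy(raw_events, representation):
    """Crude proxy for the Gromov-Wasserstein discrepancy.

    Compares the sorted intra-set pairwise-distance profiles of the raw
    events and their dense representation (assumes equal sample counts so
    the profiles align). Lower means more structure is preserved.
    """
    d1 = np.sort(pdist(raw_events))
    d2 = np.sort(pdist(representation))
    # normalise scales so the comparison reflects shape, not units
    d1, d2 = d1 / d1.max(), d2 / d2.max()
    return float(np.mean((d1 - d2) ** 2))

rng = np.random.default_rng(1)
events = rng.normal(size=(128, 4))             # toy (x, y, t, polarity)
rep_good = events @ rng.normal(size=(4, 16))   # linear map: structure kept
rep_bad = rng.normal(size=(128, 16))           # unrelated: structure lost
print(gw_proxy(events, rep_good), gw_proxy(events, rep_bad))
# the structure-preserving representation typically scores far lower
```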
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Markus Wulfmeier, Jan Humplik, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess
Abstract
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. We first trained individual skills in isolation and then composed those skills end-to-end in a self-play setting. The resulting policy exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking and more; and transitions between them in a smooth, stable, and efficient manner - well beyond what is intuitively expected from the robot. The agents also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. The full range of behaviors emerged from a small set of simple rewards. Our agents were trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer, despite significant unmodeled effects and variations across robot instances. Although the robots are inherently fragile, minor hardware modifications together with basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way. Indeed, even though the agents were optimized for scoring, in experiments they walked 156% faster, took 63% less time to get up, and kicked 24% faster than a scripted baseline, while efficiently combining the skills to achieve the longer term objectives. Examples of the emergent behaviors and full 1v1 matches are available on the supplementary website.
An Investigation into Active Control for Accessible Orbital Flight
Abstract
Recently, a practical and publicly accessible satellite standard called the SmallSat has amplified public involvement in orbital research. This allows for flexible and efficient deployments of impactful low-earth-orbit experiments that would otherwise never be flown. However, the launch industry responsible for flying these experiments is neither flexible nor efficient. This project aims to make orbital technologies accessible at the miniature scale, specifically thrust-vector control, through an iterative engineering process simplifying and miniaturizing technologies from launch vehicles such as the Space Shuttle and Falcon 9. An Arduino-based custom flight computer was developed alongside state machine control software and active-control hardware, all designed to scale. Together, these three major components emulate the methods used in the aerospace industry. Initial test flights and recent ground test data have indicated stable control with a maximum of 7° and 2.62° of deviation from the intended flight path, respectively, an acceptable stability range when compared to similar finned flights. Results show that scalable thrust vectoring is possible at a small scale, giving adaptability and control applicable to both small and large test vehicles. With accessible orbital flight, countless experiments can be completed concurrently, allowing for faster amateur rocket development and opening another path to space.
Keyword: mobile
Learning to Predict Navigational Patterns from Partial Observations
Authors: Robin Karlsson, Alexander Carballo, Francisco Lepe-Salazar, Keisuke Fujii, Kento Ohtani, Kazuya Takeda
Abstract
Human beings cooperatively navigate rule-constrained environments by adhering to mutually known navigational patterns, which may be represented as directional pathways or road lanes. Inferring these navigational patterns from incompletely observed environments is required for intelligent mobile robots operating in unmapped locations. However, algorithmically defining these navigational patterns is nontrivial. This paper presents the first self-supervised learning (SSL) method for learning to infer navigational patterns in real-world environments from partial observations only. We explain how geometric data augmentation, predictive world modeling, and an information-theoretic regularizer enable our model to predict an unbiased local directional soft lane probability (DSLP) field in the limit of infinite data. We demonstrate how to infer global navigational patterns by fitting a maximum likelihood graph to the DSLP field. Experiments show that our SSL model outperforms two SOTA supervised lane graph prediction models on the nuScenes dataset. We propose our SSL method as a scalable and interpretable continual learning paradigm for navigation by perception. Code will be released upon publication.
ESCM: An Efficient and Secure Communication Mechanism for UAV Networks
Abstract
UAVs (unmanned aerial vehicles) are gradually entering various human activities. They have also become an important part of the satellite-air-ground-sea integrated network (SAGS) for 6G communication. To achieve high mobility, UAVs have strict requirements on communication latency, and they must not be illegally controlled and used as weapons of attack with malicious intent. Therefore, an efficient and secure communication method specifically designed for UAV networks is required. This paper proposes a communication mechanism named ESCM to meet these requirements. For communication efficiency, ESCM designs a routing protocol based on the artificial bee colony (ABC) algorithm to accelerate communication between UAVs. Meanwhile, we plan to use blockchain to guarantee the communication security of UAV networks. However, blockchain has unstable links in high-mobility network scenarios, resulting in low consensus efficiency and high communication overhead. Therefore, ESCM also introduces the concept of the digital twin, mapping the UAVs from the physical world into cyberspace and transforming the UAV network into a static network; this virtual UAV network is called CyberUAV. In CyberUAV, we design a blockchain system and propose a consensus algorithm based on network coding, named proof of network coding (PoNC). PoNC not only ensures the security of ESCM, but also further improves its performance through network coding. Simulation results show that ESCM has clear advantages in communication efficiency and security. Moreover, encoding messages through PoNC consensus can increase network throughput, and making the mobile blockchain static through the digital twin can improve the consensus success rate.
CrowdCache: A Decentralized Game-Theoretic Framework for Mobile Edge Content Sharing
Abstract
Mobile edge computing (MEC) is a promising solution for enhancing the user experience, minimizing content delivery expenses, and reducing backhaul traffic. In this paper, we propose a novel privacy-preserving decentralized game-theoretic framework for resource crowdsourcing in MEC. Our framework models the interactions between a content provider (CP) and multiple mobile edge device users (MEDs) as a non-cooperative game, in which MEDs offer idle storage resources for content caching in exchange for rewards. We introduce efficient decentralized gradient play algorithms for Nash equilibrium (NE) computation by exchanging local information among neighboring MEDs only, thus preventing attackers from learning users' private information. The key challenge in designing such algorithms is that communication among MEDs is not fixed and is facilitated by a sequence of undirected time-varying graphs. Our approach achieves linear convergence to the NE without imposing any assumptions on the values of parameters in the local objective functions, such as requiring strong monotonicity to be stronger than its dependence on other MEDs' actions, which is commonly required in existing literature when the graph is directed time-varying. Extensive simulations demonstrate the effectiveness of our approach in achieving efficient resource outsourcing decisions while preserving the privacy of the edge devices.
Digital technologies in the context of university transition and disability: Theoretical and empirical advances
Authors: Edgar Pacheco
Subjects: Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Abstract
Since transition to higher education emerged as a research topic in the early 1970s, scholarly inquiry has focused on students without impairments and, what is more, little attention has been paid to the role of digital technologies. This article seeks to address this knowledge gap by looking at the university experiences of a group of first-year students with vision impairments from New Zealand, and the way they use digital tools, such as social media and mobile devices, to manage their transition-related challenges. The article summarises the findings from a longitudinal qualitative project which was methodologically informed by action research (AR). The article explores and discusses scholarly inquiry of transition to university and introduces a conceptual framework which includes five overlapping stages, the transition issues faced by the students and the roles played by digital technologies. The article updates and expands the theoretical understanding of transition to higher education and provides empirical evidence for practitioners to support the needs, inclusion, and participation of young people with disabilities in the tertiary setting.
Design and Implementation of a Mobile Application for Validating Counterfeit-Proof Product Labels (original title: Konzeption und Umsetzung einer mobilen Applikation zur Validierung von fälschungssicheren Produktlabeln)
Abstract
Due to increasing numbers of product piracy worldwide, a cost-effective method for verifying the origin of a product is to be developed. For this purpose, a certificate of authenticity can be created using precisely measurable, unique properties of special physical objects that are difficult to reconstruct. In the context of the present work, this is a counterfeit-proof label composed of randomly distributed gold nanospheres or rods in a semi-transparent material. The characteristic positioning of the label's elements can be precisely measured using a smartphone's camera and additional technologies. This can create an offline usable verification method for the general public without the need for an existing network connection. The present work provides a first part of the proof of concept that such a system and especially the associated algorithmic computation method can be implemented and efficiently used in a mobile application. In addition, a method suitable in practice for transmitting and securing the required information is determined in each case. Furthermore, the results of the validation of counterfeit-proof product labels are analyzed in detail and existing weaknesses are pointed out.
Thermal Vision for Soil Assessment in a Multipurpose Environmental Chamber under Martian Conditions towards Robot Navigation
Authors: Raúl Castilla-Arquillo, Anthony Mandow, Carlos J. Pérez-del-Pulgar, César Álvarez-Llamas, José M. Vadillo, Javier Laserna
Abstract
Soil assessment is important for mobile robot planning and navigation on natural and planetary environments. Terramechanic characteristics can be inferred from the thermal behaviour of soils under the influence of sunlight using remote sensors such as Long-Wave Infrared cameras. However, this behaviour is greatly affected by the low atmospheric pressures of planets such as Mars, so practical models are needed to relate robot remote sensing data on Earth to target planetary exploration conditions. This article proposes a general framework based on multipurpose environmental chambers to generate representative diurnal cycle dataset pairs that can be useful to relate the thermal behaviour of a soil on Earth to the corresponding behaviour under planetary pressure conditions using remote sensing. Furthermore, we present an application of the proposed framework to generate datasets using the UMA-Laserlab chamber, which can replicate the atmospheric CO2 composition of Mars. In particular, we analyze the thermal behaviour of four soil samples of different granularity by comparing replicated Martian surface conditions and their Earth's diurnal cycle equivalent. Results indicate a correlation between granularity and thermal inertia that is consistent with available Mars surface measurements recorded by rovers. The resulting dataset pairs, consisting of representative diurnal cycle thermal images with heater, air, and subsurface temperatures, have been made available for the scientific community.
Keyword: pruning
Optimizing Deep Learning Models For Raspberry Pi
Authors: Salem Ameen, Kangaranmulle Siriwardana, Theo Theodoridis
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Performance (cs.PF)
Abstract
Deep learning models have become increasingly popular for a wide range of applications, including computer vision, natural language processing, and speech recognition. However, these models typically require large amounts of computational resources, making them challenging to run on low-power devices such as the Raspberry Pi. One approach to addressing this challenge is to use pruning techniques to reduce the size of the deep learning models. Pruning involves removing unimportant weights and connections from the model, resulting in a smaller and more efficient model. Pruning can be done during training or after the model has been trained. Another approach is to optimize the deep learning models specifically for the Raspberry Pi architecture. This can include optimizing the model's architecture and parameters to take advantage of the Raspberry Pi's hardware capabilities, such as its CPU and GPU. Additionally, the model can be optimized for energy efficiency by minimizing the amount of computation required. Pruning and optimizing deep learning models for the Raspberry Pi can help overcome the computational and energy constraints of low-power devices, making it possible to run deep learning models on a wider range of devices. In the following sections, we will explore these approaches in more detail and discuss their effectiveness for optimizing deep learning models for the Raspberry Pi.
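To ground the pruning discussion, here is a minimal sketch of unstructured magnitude pruning, the simplest way to remove unimportant weights; it is plain NumPy and deliberately not tied to any particular Raspberry Pi deployment toolchain.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a layer's weights.

    weights  : numpy array of weights
    sparsity : fraction in [0, 1) to remove, e.g. 0.9 keeps 10%
    returns  : pruned copy plus a binary mask (kept = 1)
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.partition(flat, k)[k] if k > 0 else -np.inf
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w_pruned, mask = magnitude_prune(w, 0.9)
print("kept:", mask.mean())  # ~0.10; sparse layers cut memory and FLOPs
```

Pruning during training would reapply the mask after each update; pruning after training (as above) is usually followed by a short fine-tuning pass to recover accuracy.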
Towards Compute-Optimal Transfer Learning
Authors: Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu
Abstract
The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark that offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
Machine Vision-Based Crop-Load Estimation Using YOLOv8
Authors: Dawood Ahmed, Ranjan Sapkota, Martin Churuvija, Manoj Karkee
Abstract
Labor shortages in fruit crop production have prompted the development of mechanized and automated machines as alternatives to labor-intensive orchard operations such as harvesting, pruning, and thinning. Agricultural robots capable of identifying tree canopy parts and estimating geometric and topological parameters, such as branch diameter, length, and angles, can optimize crop yields through automated pruning and thinning platforms. In this study, we proposed a machine vision system to estimate canopy parameters in apple orchards and determine an optimal number of fruit for individual branches, providing a foundation for robotic pruning, flower thinning, and fruitlet thinning to achieve the desired yield and quality. Using color and depth information from an RGB-D sensor (Microsoft Azure Kinect DK), a YOLOv8-based instance segmentation technique was developed to identify trunks and branches of apple trees during the dormant season. Principal Component Analysis was applied to estimate branch diameter (used to calculate limb cross-sectional area, or LCSA) and orientation. The estimated branch diameter was utilized to calculate LCSA, which served as an input for crop-load estimation, with larger LCSA values indicating a higher potential fruit-bearing capacity. RMSE for branch diameter estimation was 2.08 mm, and for crop-load estimation, 3.95. Based on commercial apple orchard management practices, the target crop-load (number of fruit) for each segmented branch was estimated with a mean absolute error (MAE) of 2.99 (ground-truth crop-load was 6 apples per LCSA). This study demonstrated a promising workflow with high performance in identifying trunks and branches of apple trees in dynamic commercial orchard environments and integrating farm management practices into automated decision-making.
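As a simplified illustration of the PCA step, the sketch below estimates a branch segment's orientation and pixel width from a binary instance mask; the synthetic mask and the uniform-slab width heuristic are assumptions, and the paper's full pipeline (YOLOv8 segmentation, depth fusion, LCSA conversion) is not reproduced.

```python
import numpy as np

def branch_axis_and_width(mask):
    """Estimate a branch segment's orientation and width via PCA.

    mask : (H, W) boolean instance-segmentation mask of one branch
    The principal axis gives the branch direction; the spread along the
    minor axis approximates the width in pixels (a uniform slab of width
    w has variance w^2 / 12 across its thickness).
    """
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    pts -= pts.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(pts.T))  # ascending eigenvalues
    major = evecs[:, 1]                           # largest-spread direction
    angle = np.degrees(np.arctan2(major[1], major[0]))
    width = np.sqrt(12.0 * evals[0])
    return angle, width

# toy mask: a 10-px-wide horizontal bar
mask = np.zeros((100, 100), dtype=bool)
mask[45:55, 10:90] = True
print(branch_axis_and_width(mask))
# angle ≈ 0° or 180° (eigenvector sign is arbitrary), width ≈ 10 px
```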
Concept-Monitor: Understanding DNN training through individual neurons
Authors: Mohammad Ali Khan, Tuomas Oikarinen, Tsui-Wei Weng
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Abstract
In this work, we propose a general framework called Concept-Monitor to help demystify black-box DNN training processes automatically using a novel unified embedding space and concept diversity metric. Concept-Monitor enables human-interpretable visualization and indicators of the DNN training processes and facilitates transparency as well as a deeper understanding of how DNNs develop during training. Inspired by these findings, we also propose a new training regularizer that incentivizes hidden neurons to learn diverse concepts, which we show improves training performance. Finally, we apply Concept-Monitor to conduct several case studies on different training paradigms, including adversarial training, fine-tuning, and network pruning via the Lottery Ticket Hypothesis.
Filter Pruning via Filters Similarity in Consecutive Layers
Abstract
Filter pruning is widely adopted to compress and accelerate Convolutional Neural Networks (CNNs), but most previous works ignore the relationship between filters and channels in different layers. Processing each layer independently fails to utilize the collaborative relationship across layers. In this paper, we propose a novel and intuitive pruning method that explicitly leverages the Filters Similarity in Consecutive Layers (FSCL). FSCL compresses models by pruning filters whose corresponding features contribute the least to the model. Extensive experiments demonstrate the effectiveness of FSCL: it yields remarkable improvements over the state-of-the-art in accuracy, FLOPs, and parameter reduction on several benchmark models and datasets.
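The abstract does not spell out the exact criterion, so the sketch below illustrates the general idea with a generic within-layer cosine-similarity redundancy score rather than FSCL's cross-layer measure; the filter shapes and the planted duplicate are toy assumptions.

```python
import numpy as np

def similarity_prune(filters, n_prune):
    """Drop the filters most similar to another filter in the same layer.

    filters : (C_out, C_in, kH, kW) conv weights
    n_prune : how many filters to remove
    returns : sorted indices of filters to keep
    """
    flat = filters.reshape(filters.shape[0], -1)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    sim = flat @ flat.T
    np.fill_diagonal(sim, -np.inf)
    # a filter's redundancy score = its highest similarity to any peer
    redundancy = sim.max(axis=1)
    order = np.argsort(redundancy)  # least redundant first
    return np.sort(order[: filters.shape[0] - n_prune])

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8, 3, 3))
w[1] = w[0] + 0.01 * rng.normal(size=(8, 3, 3))  # plant a near-duplicate
print(similarity_prune(w, n_prune=2))
# filters 0 and 1 rank most redundant; note a real method would keep one
# member of each duplicated pair rather than possibly dropping both
```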
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
Authors: Dominik Honegger, Konstantin Schürholt, Damian Borth
Abstract
With the growing size of Neural Networks (NNs), model sparsification to reduce the computational cost and memory demand of model inference has become of vital interest for both research and production. While many sparsification methods have been proposed and successfully applied to individual models, to the best of our knowledge their behavior and robustness have not yet been studied on large populations of models. With this paper, we address that gap by applying two popular sparsification methods to populations of models (so-called model zoos) to create sparsified versions of the original zoos. We investigate the performance of these two methods for each zoo, compare sparsification layer-wise, and analyse the agreement between original and sparsified populations. We find both methods to be very robust, with magnitude pruning able to outperform variational dropout with the exception of high sparsification ratios above 80%. Further, we find that sparsified models agree to a high degree with their original non-sparsified counterparts, and that the performance of original and sparsified models is highly correlated. Finally, all models of the model zoos and their sparsified model twins are publicly available: modelzoos.cc.
Keyword: voxel
VGOS: Voxel Grid Optimization for View Synthesis from Sparse Inputs
Authors: Jiakai Sun, Zhanjie Zhang, Jiafu Chen, Guangyuan Li, Boyan Ji, Lei Zhao, Wei Xing
Abstract
Neural Radiance Fields (NeRF) has shown great success in novel view synthesis due to its state-of-the-art quality and flexibility. However, NeRF requires dense input views (tens to hundreds) and a long training time (hours to days) for a single scene to generate high-fidelity images. Although using the voxel grids to represent the radiance field can significantly accelerate the optimization process, we observe that for sparse inputs, the voxel grids are more prone to overfitting to the training views and will have holes and floaters, which leads to artifacts. In this paper, we propose VGOS, an approach for fast (3-5 minutes) radiance field reconstruction from sparse inputs (3-10 views) to address these issues. To improve the performance of voxel-based radiance field in sparse input scenarios, we propose two methods: (a) We introduce an incremental voxel training strategy, which prevents overfitting by suppressing the optimization of peripheral voxels in the early stage of reconstruction. (b) We use several regularization techniques to smooth the voxels, which avoids degenerate solutions. Experiments demonstrate that VGOS achieves state-of-the-art performance for sparse inputs with super-fast convergence. Code will be available at https://github.com/SJoJoK/VGOS.
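To illustrate method (a), the sketch below builds a binary mask that unlocks voxels outward from the grid centre as optimization proceeds, so peripheral voxels are suppressed early on; the Chebyshev-distance shape, linear schedule, and toy grid are assumptions rather than VGOS's actual schedule.

```python
import numpy as np

def center_mask(shape, step, total_steps):
    """Binary mask over a 3D voxel grid that grows from the centre.

    Early steps optimize only central voxels; peripheral voxels unlock
    gradually, which discourages floaters near the frustum edges.
    """
    frac = min(1.0, step / total_steps)  # fraction of the half-width unlocked
    center = (np.array(shape) - 1) / 2.0
    idx = np.indices(shape).astype(float)
    # normalised Chebyshev distance from the centre, in [0, 1]
    dist = np.max(np.abs(idx - center.reshape(-1, 1, 1, 1)) /
                  center.reshape(-1, 1, 1, 1), axis=0)
    return dist <= frac

grid = np.zeros((32, 32, 32))  # toy density voxel grid
for step in [100, 500, 1000]:
    m = center_mask(grid.shape, step, total_steps=1000)
    print(step, m.mean())  # unlocked volume fraction grows with step
    # in a real loop: grad = grad * m, so only unlocked voxels update
```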
Keyword: lidar
Single-View Height Estimation with Conditional Diffusion Probabilistic Models
Authors: Isaac Corley, Peyman Najafirad
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Abstract
Digital Surface Models (DSM) offer a wealth of height information for understanding the Earth's surface as well as monitoring the existence or change of natural and man-made structures. Classical height estimation requires multi-view geospatial imagery or LiDAR point clouds, which can be expensive to acquire. Single-view height estimation using neural network based models shows promise; however, it can struggle with reconstructing high-resolution features. The latest advancements in diffusion models for high-resolution image synthesis and editing have yet to be utilized for remote sensing imagery, particularly height estimation. Our approach involves training a generative diffusion model to learn the joint distribution of optical and DSM images across both domains as a Markov chain. This is accomplished by minimizing a denoising score matching objective while being conditioned on the source image to generate realistic high-resolution 3D surfaces. In this paper we experiment with conditional denoising diffusion probabilistic models (DDPM) for height estimation from a single remotely sensed image and show promising results on the Vaihingen benchmark dataset.
Keyword: diffusion
Diffusion Probabilistic Model Based Accurate and High-Degree-of-Freedom Metasurface Inverse Design
Abstract
Conventional meta-atom designs rely heavily on researchers' prior knowledge and trial-and-error searches using full-wave simulations, resulting in time-consuming and inefficient processes. Inverse design methods based on optimization algorithms, such as evolutionary algorithms, and topological optimizations, have been introduced to design metamaterials. However, none of these algorithms are general enough to fulfill multi-objective tasks. Recently, deep learning methods represented by Generative Adversarial Networks (GANs) have been applied to inverse design of metamaterials, which can directly generate high-degree-of-freedom meta-atoms based on S-parameter requirements. However, the adversarial training process of GANs makes the network unstable and results in high modeling costs. This paper proposes a novel metamaterial inverse design method based on the diffusion probability theory. By learning the Markov process that transforms the original structure into a Gaussian distribution, the proposed method can gradually remove the noise starting from the Gaussian distribution and generate new high-degree-of-freedom meta-atoms that meet S-parameter conditions, which avoids the model instability introduced by the adversarial training process of GANs and ensures more accurate and high-quality generation results. Experiments have proven that our method is superior to representative methods of GANs in terms of model convergence speed, generation accuracy, and quality.
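To make the noising half of this Markov process concrete, the sketch below simulates the closed-form forward process of a standard DDPM on a stand-in 2D array; the linear beta schedule and toy pattern are generic textbook choices, not the paper's metasurface parameterization.

```python
import numpy as np

# Forward (noising) process of a DDPM: after t steps, a sample x0 is
# pushed toward a standard Gaussian in closed form.
T = 1000
betas = np.linspace(1e-4, 0.02, T)    # common linear schedule
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x0, (1 - abar_t) I)."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(64, 64))  # stand-in for a meta-atom pattern
for t in [0, 250, 999]:
    xt = q_sample(x0, t, rng)
    print(t, round(float(np.corrcoef(x0.ravel(), xt.ravel())[0, 1]), 3))
# correlation with x0 decays toward 0: the structure dissolves into noise.
# Generation runs the learned reverse chain, denoising from pure Gaussian
# noise while conditioning on the target S-parameters.
```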
Directed Chain Generative Adversarial Networks
Authors: Ming Min, Ruimeng Hu, Tomoyuki Ichiba
Subjects: Machine Learning (cs.LG); Probability (math.PR)
Abstract
Real-world data can be multimodally distributed, e.g., data describing opinion divergence in a community, the interspike interval distribution of neurons, and oscillators' natural frequencies. Generating multimodally distributed real-world data has become a challenge for existing generative adversarial networks (GANs). For example, neural stochastic differential equations (Neural SDEs), treated as infinite-dimensional GANs, have demonstrated successful performance mainly in generating unimodal time series data. In this paper, we propose a novel time series generator, named directed chain GANs (DC-GANs), which inserts a time series dataset (called a neighborhood process of the directed chain, or input) into the drift and diffusion coefficients of directed chain SDEs with distributional constraints. DC-GANs can generate new time series of the same distribution as the neighborhood process, and the neighborhood process provides the key step in learning and generating multimodally distributed time series. The proposed DC-GANs are examined on four datasets, including two stochastic models from social sciences and computational neuroscience, and two real-world datasets on stock prices and energy consumption. To the best of our knowledge, DC-GANs are the first work that can generate multimodal time series data, and they consistently outperform state-of-the-art benchmarks with respect to measures of distribution, data similarity, and predictive ability.
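Since the construction hinges on the drift and diffusion coefficients of an SDE, here is a minimal Euler-Maruyama simulation for reference; the Ornstein-Uhlenbeck coefficients are illustrative, and in a directed-chain setup the drift would additionally consume the neighborhood process's state.

```python
import numpy as np

def euler_maruyama(x0, drift, diffusion, T=1.0, n=1000, seed=0):
    """Simulate dX_t = drift(X_t, t) dt + diffusion(X_t, t) dW_t."""
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        t = i * dt
        dw = rng.normal(scale=np.sqrt(dt))  # Brownian increment
        x[i + 1] = x[i] + drift(x[i], t) * dt + diffusion(x[i], t) * dw
    return x

# Ornstein-Uhlenbeck toy path; a directed-chain drift would also take the
# neighborhood process's current state as an extra argument.
path = euler_maruyama(x0=1.0, drift=lambda x, t: -2.0 * x,
                      diffusion=lambda x, t: 0.3)
print(path[::250])
```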
Single-View Height Estimation with Conditional Diffusion Probabilistic Models
Authors: Isaac Corley, Peyman Najafirad
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Abstract
Digital Surface Models (DSM) offer a wealth of height information for understanding the Earth's surface as well as monitoring the existence or change of natural and man-made structures. Classical height estimation requires multi-view geospatial imagery or LiDAR point clouds, which can be expensive to acquire. Single-view height estimation using neural network based models shows promise; however, it can struggle with reconstructing high-resolution features. The latest advancements in diffusion models for high-resolution image synthesis and editing have yet to be utilized for remote sensing imagery, particularly height estimation. Our approach involves training a generative diffusion model to learn the joint distribution of optical and DSM images across both domains as a Markov chain. This is accomplished by minimizing a denoising score matching objective while being conditioned on the source image to generate realistic high-resolution 3D surfaces. In this paper we experiment with conditional denoising diffusion probabilistic models (DDPM) for height estimation from a single remotely sensed image and show promising results on the Vaihingen benchmark dataset.
Score-based Generative Modeling Through Backward Stochastic Differential Equations: Inversion and Generation
Abstract
The proposed BSDE-based diffusion model represents a novel approach to diffusion modeling, which extends the application of stochastic differential equations (SDEs) in machine learning. Unlike traditional SDE-based diffusion models, our model can determine the initial conditions necessary to reach a desired terminal distribution by adapting an existing score function. We demonstrate the theoretical guarantees of the model, the benefits of using Lipschitz networks for score matching, and its potential applications in various areas such as diffusion inversion, conditional diffusion, and uncertainty quantification. Our work represents a contribution to the field of score-based generative learning and offers a promising direction for solving real-world problems.
Preconditioned discontinuous Galerkin method and convection-diffusion-reaction problems with guaranteed bounds to resulting spectra
Authors: Liya Gaynutdinova, Martin Ladecký, Ivana Pultarová, Miloslav Vlasák, Jan Zeman
Abstract
This paper focuses on the design, analysis and implementation of a new preconditioning concept for linear second order partial differential equations, including convection-diffusion-reaction problems discretized by Galerkin or discontinuous Galerkin methods. We expand on the approach introduced by Gergelits et al. and adapt it to more general settings, assuming that both the original and preconditioning matrices are composed of sparse matrices of very low ranks, representing local contributions to the global matrices. When applied to a symmetric problem, the method provides bounds on all individual eigenvalues of the preconditioned matrix. We show that this preconditioning strategy works not only for Galerkin discretization, but also for the discontinuous Galerkin discretization, where local contributions are associated with individual edges of the triangulation. In the case of non-symmetric problems, the method yields guaranteed bounds on the real and imaginary parts of the resulting eigenvalues. We include some numerical experiments illustrating the method and its implementation, showcasing its effectiveness for the two variants of discretized (convection-)diffusion-reaction problems.
Event-triggered Boundary Control of a Class of Reaction-Diffusion PDEs with Time-dependent Reactivity
Abstract
This paper presents an event-triggered boundary control strategy for a class of reaction-diffusion PDEs with time-varying reactivity under Robin actuation. The control approach consists of a backstepping full-state feedback boundary controller and a dynamic event-triggering condition, which determines the time instants when the control input needs to be updated. It is proved that under the proposed event-triggered boundary control approach, there is a uniform minimal dwell-time between two event times. Furthermore, the well-posedness and the global exponential convergence of the closed-loop system to zero in $L^2$-sense are established. A simulation is conducted to validate the theoretical developments.
Mixed finite element methods for nonlinear reaction-diffusion equations with interfaces
Abstract
We develop mixed finite element methods for nonlinear reaction-diffusion equations with interfaces that have Robin-type interface conditions. We introduce the velocity of chemicals as new variables and reformulate the governing equations. The stability of semidiscrete solutions, and the existence and a priori error estimates of fully discrete solutions, are proved by a fixed-point theorem and continuous/discrete Grönwall inequalities. Numerical results illustrating our theoretical analysis are included.
DiffuseExpand: Expanding dataset for 2D medical image segmentation using diffusion models
Abstract
Dataset expansion can effectively alleviate the problem of data scarcity in medical image segmentation, which stems from privacy concerns and labeling difficulties. However, existing expansion algorithms still face great challenges due to their inability to guarantee the diversity of synthesized images with paired segmentation masks. In recent years, Diffusion Probabilistic Models (DPMs) have shown powerful image synthesis performance, even better than Generative Adversarial Networks. Based on this insight, we propose an approach called DiffuseExpand for expanding datasets for 2D medical image segmentation using DPMs, which first samples a variety of masks from Gaussian noise to ensure diversity, and then synthesizes images to ensure the alignment of images and masks. After that, DiffuseExpand chooses high-quality samples to further enhance the effectiveness of data expansion. Our comparison and ablation experiments on the COVID-19 and CGMH Pelvis datasets demonstrate the effectiveness of DiffuseExpand. Our code is released at https://anonymous.4open.science/r/DiffuseExpand.
Abstract
Current large-scale generative models can efficiently generate high-quality images from text prompts. However, they lack the ability to precisely control the size and position of objects in the generated image. In this study, we analyze the generative mechanism of the stable diffusion model and propose a new interactive generation paradigm that allows users to specify the position of generated objects without additional training. Moreover, we propose an object detection-based evaluation metric to assess the control capability of the location-aware generation task. Our experimental results show that our method outperforms state-of-the-art methods in both control capacity and image quality.
Keyword: dynamic
Model Extraction Attacks Against Reinforcement Learning Based Controllers
Abstract
We introduce the problem of model-extraction attacks in cyber-physical systems in which an attacker attempts to estimate (or extract) the feedback controller of the system. Extracting (or estimating) the controller provides an unmatched edge to attackers since it allows them to predict the future control actions of the system and plan their attack accordingly. Hence, it is important to understand the ability of the attackers to perform such an attack. In this paper, we focus on the setting when a Deep Neural Network (DNN) controller is trained using Reinforcement Learning (RL) algorithms and is used to control a stochastic system. We play the role of the attacker that aims to estimate such an unknown DNN controller, and we propose a two-phase algorithm. In the first phase, also called the offline phase, the attacker uses side-channel information about the RL-reward function and the system dynamics to identify a set of candidate estimates of the unknown DNN. In the second phase, also called the online phase, the attacker observes the behavior of the unknown DNN and uses these observations to shortlist the set of final policy estimates. We provide theoretical analysis of the error between the unknown DNN and the estimated one. We also provide numerical results showing the effectiveness of the proposed algorithm.
Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient
Authors: Yuhang Li, Youngeun Kim, Hyoungseob Park, Priyadarshini Panda
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
Spiking Neural Networks (SNNs) are recognized as a candidate for next-generation neural networks due to their bio-plausibility and energy efficiency. Recently, researchers have demonstrated that SNNs are able to achieve nearly state-of-the-art performance in image recognition tasks using surrogate gradient training. However, some essential questions pertaining to SNNs remain little studied: Do SNNs trained with surrogate gradients learn different representations from traditional Artificial Neural Networks (ANNs)? Does the time dimension in SNNs provide unique representation power? In this paper, we aim to answer these questions by conducting a representation similarity analysis between SNNs and ANNs using Centered Kernel Alignment (CKA). We start by analyzing the spatial dimension of the networks, including both the width and the depth. Furthermore, our analysis of residual connections shows that SNNs learn a periodic pattern, which rectifies the representations in SNNs to be ANN-like. We additionally investigate the effect of the time dimension on SNN representation, finding that deeper layers encourage more dynamics along the time dimension. We also investigate the impact of input data such as event-stream data and adversarial attacks. Our work uncovers a host of new findings about representations in SNNs. We hope this work will inspire future research to fully comprehend the representation power of SNNs. Code is released at https://github.com/Intelligent-Computing-Lab-Yale/SNNCKA.
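For reference, the linear variant of CKA used in such representation-similarity analyses has a compact closed form (Kornblith et al., 2019); below is a minimal NumPy sketch with stand-in activation matrices in place of actual SNN and ANN layer activations.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices.

    X : (n_samples, d1) activations from network A on shared inputs
    Y : (n_samples, d2) activations from network B on the same inputs
    returns similarity in [0, 1]; 1 means identical up to rotation/scale.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(num / den)

rng = np.random.default_rng(0)
ann = rng.normal(size=(256, 64))               # stand-in ANN features
snn = ann @ rng.normal(size=(64, 64)) * 0.1 + \
      0.5 * rng.normal(size=(256, 64))         # partially related "SNN"
print(linear_cka(ann, ann))  # 1.0
print(linear_cka(ann, snn))  # somewhere in (0, 1)
```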
Time-Selective RNN for Device-Free Multi-Room Human Presence Detection Using WiFi CSI
Abstract
Human presence detection is a crucial technology for various applications, including home automation, security, and healthcare. While camera-based systems have traditionally been used for this purpose, they raise privacy concerns. To address this issue, recent research has explored the use of channel state information (CSI) approaches that can be extracted from commercial WiFi access points (APs) and provide detailed channel characteristics. In this thesis, we propose a device-free human presence detection system for multi-room scenarios using a time-selective conditional dual feature extract recurrent network (TCD-FERN). Our system is designed to capture significant time features conditioned on current human features, using a dynamic and static (DaS) data preprocessing technique to extract moving and spatial features of people and differentiate between line-of-sight (LoS) path blocking and non-blocking cases. To mitigate the feature attenuation problem caused by room partitions, we employ a voting scheme. We conduct evaluation and real-time experiments to demonstrate that our proposed TCD-FERN system can achieve human presence detection in multi-room scenarios using fewer commodity WiFi APs.
Analysis and Mitigation of Shared Resource Contention on Heterogeneous Multicore: An Industrial Case Study
Abstract
In this paper, we address the industrial challenge put forth by ARM in ECRTS 2022. We systematically analyze the effect of shared resource contention on an augmented reality head-up display (AR-HUD) case-study application of the industrial challenge on a heterogeneous multicore platform, the NVIDIA Jetson Nano. We configure the AR-HUD application such that it can process incoming image frames in real-time at 20Hz on the platform. We use micro-architectural denial-of-service (DoS) attacks as the challenge's aggressor tasks and show that they can dramatically impact the latency and accuracy of the AR-HUD application, resulting in significant deviations of the estimated trajectories from the ground truth, despite our best effort to mitigate their influence by using cache partitioning and real-time scheduling of the AR-HUD application. We show that dynamic LLC (or DRAM, depending on the aggressor) bandwidth throttling of the aggressor tasks is an effective means to ensure real-time performance of the AR-HUD application without resorting to over-provisioning the system.
Roll-Drop: accounting for observation noise with a single parameter
Authors: Luigi Campanaro, Daniele De Martini, Siddhant Gangapurwala, Wolfgang Merkt, Ioannis Havoutis
Abstract
This paper proposes a simple strategy for sim-to-real in Deep Reinforcement Learning (DRL) -- called Roll-Drop -- that uses dropout during simulation to account for observation noise during deployment, without explicitly modelling its distribution for each state. DRL is a promising approach to controlling robots for highly dynamic and feedback-based manoeuvres, and accurate simulators are crucial for providing cheap and abundant data to learn the desired behaviour. Nevertheless, the simulated data are noiseless and generally show a distributional shift that challenges deployment on real machines, where sensor readings are affected by noise. The standard solution is to model the latter and inject it during training; while this requires thorough system identification, Roll-Drop enhances robustness to sensor noise by tuning only a single parameter. We demonstrate an 80% success rate when up to 25% noise is injected in the observations, with twice the robustness of the baselines. We deploy the controller trained in simulation on a Unitree A1 platform and assess this improved robustness on the physical system.
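A minimal reading of the idea, as a sketch: apply dropout to the observation vector during simulated training so the policy learns to tolerate corrupted readings, controlled by a single parameter. Whether the paper rescales the kept entries (as inverted dropout does) is not stated in the abstract, so this version simply zeroes entries.

```python
import numpy as np

def roll_drop(obs, p_drop, rng):
    """Randomly zero observation entries during simulated training.

    A single parameter p_drop stands in for an explicit per-sensor noise
    model; the policy learns to tolerate missing/corrupted readings.
    """
    mask = rng.random(obs.shape) >= p_drop
    return obs * mask

rng = np.random.default_rng(0)
obs = rng.normal(size=12)  # e.g. joint positions + velocities
print(roll_drop(obs, p_drop=0.1, rng=rng))
# at deployment no mask is applied; the learned robustness carries over
```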
The Limited Integrator Model Regulator And its Use in Vehicle Steering Control
Abstract
Unexpected yaw disturbances, such as braking on a unilaterally icy road, side wind forces, and tire rupture, are very difficult for the driver of a road vehicle to handle, due to the driver's panic reaction period, which ranges from 0.5 to 2 seconds. Automatic driver assist systems provide counteracting yaw moments during this panic reaction period to maintain the stability of the yaw dynamics of the vehicle. An active steering based driver assist system that uses the model regulator control architecture is introduced and used here for yaw dynamics stabilization in such situations. The model regulator, which is a special form of a two degree of freedom control architecture, is introduced and explained in detail in a tutorial fashion, whereby its integral action capability, among others, is also shown. An auxiliary steering actuation system is assumed, and a limited integrator version of the model regulator based steering controller is developed in order not to saturate the auxiliary steering actuator. This low frequency limited integrator implementation also allows the driver to take care of low frequency steering and disturbance rejection tasks. Linear simulation results are used to demonstrate the effectiveness of the proposed method.
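The limited integrator itself reduces to an integrator whose state is clamped. A schematic Python sketch (gain, limit, and step size are illustrative assumptions):

    import numpy as np

    class LimitedIntegrator:
        # Discrete-time integrator clamped to +/- limit, so the auxiliary
        # steering actuator is never commanded into saturation.
        def __init__(self, gain, limit, dt):
            self.gain, self.limit, self.dt = gain, limit, dt
            self.state = 0.0

        def step(self, error):
            self.state += self.gain * error * self.dt
            self.state = float(np.clip(self.state, -self.limit, self.limit))
            return self.state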
How to design, and tune, a computed torque controller: An introduction and a Matlab example
Abstract
This note briefly introduces the computed torque control method for trajectory tracking. The method is applicable to fully actuated robots, i.e., those whose inverse dynamics can be solved for any feasible acceleration. This includes many systems, like robot arms or hands, or any tree-like mechanism with all its joints actuated. Using simple explanations, we see how such a controller can be obtained using feedback linearization, and how its gains can be tuned to satisfy a desired settling time for the error signal. We end by discussing the advantages and shortcomings of the controller. A companion Matlab script that implements and tests the controller on a simple actuated pendulum can be downloaded from https://bit.ly/3QShxYi.
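For readers who want the gist without the Matlab script, here is a minimal Python sketch of computed torque control for an actuated pendulum; the parameters and the second-order gain-tuning rule of thumb are illustrative:

    import numpy as np

    # Pendulum dynamics: m*l**2*th_dd + b*th_d + m*g*l*sin(th) = tau
    m, l, b, g = 1.0, 0.5, 0.1, 9.81

    # Tune for a desired settling time t_s: impose critically damped error
    # dynamics e_dd + kd*e_d + kp*e = 0 with kp = wn**2, kd = 2*wn, and the
    # common rule of thumb wn ~ 4 / t_s.
    t_s = 0.5
    wn = 4.0 / t_s
    kp, kd = wn**2, 2.0 * wn

    def computed_torque(th, th_d, ref, ref_d, ref_dd):
        # Feedback linearization: cancel the nonlinear terms, then impose
        # the chosen linear error dynamics via a virtual acceleration v.
        e, e_d = ref - th, ref_d - th_d
        v = ref_dd + kd * e_d + kp * e
        return m * l**2 * v + b * th_d + m * g * l * np.sin(th)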
Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Authors: Xiao-Yang Liu, Ziyi Xia, Hongyang Yang, Jiechao Gao, Daochen Zha, Ming Zhu, Christina Dan Wang, Zhaoran Wang, Jian Guo
Abstract
The financial market is a particularly challenging playground for deep reinforcement learning due to its unique feature of dynamic datasets. Building high-quality market environments for training financial reinforcement learning (FinRL) agents is difficult due to major factors such as the low signal-to-noise ratio of financial data, survivorship bias of historical data, and model overfitting. In this paper, we present FinRL-Meta, a data-centric and openly accessible library that processes dynamic datasets from real-world markets into gym-style market environments and has been actively maintained by the AI4Finance community. First, following a DataOps paradigm, we provide hundreds of market environments through an automatic data curation pipeline. Second, we provide homegrown examples and reproduce popular research papers as stepping stones for users to design new trading strategies. We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance via community-wise competitions. Third, we provide dozens of Jupyter/Python demos organized into a curriculum and a documentation website to serve the rapidly growing community. The open-source code for the data curation pipeline is available at https://github.com/AI4Finance-Foundation/FinRL-Meta.
Splitting physics-informed neural networks for inferring the dynamics of integer- and fractional-order neuron models
Authors: Simin Shekarpaz, Fanhai Zeng, George Karniadakis
Abstract
We introduce a new approach for solving forward systems of differential equations using a combination of splitting methods and physics-informed neural networks (PINNs). The proposed method, splitting PINN, effectively addresses the challenge of applying PINNs to forward dynamical systems and demonstrates improved accuracy through its application to neuron models. Specifically, we apply operator splitting to decompose the original neuron model into sub-problems that are then solved using PINNs. Moreover, we develop an $L^1$ scheme for discretizing fractional derivatives in fractional neuron models, leading to improved accuracy and efficiency. The results of this study highlight the potential of splitting PINNs in solving both integer- and fractional-order neuron models, as well as other similar systems in computational science and engineering.
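For reference, the standard L1 discretization of a Caputo derivative of order $\alpha \in (0,1)$ on a uniform grid $t_n = n\,\Delta t$ has the form below; the paper's scheme may refine this, but the structure is the usual one:
$$ {}^{C}\!D_t^{\alpha} u(t_n) \;\approx\; \frac{\Delta t^{-\alpha}}{\Gamma(2-\alpha)} \sum_{k=0}^{n-1} b_k \big( u(t_{n-k}) - u(t_{n-k-1}) \big), \qquad b_k = (k+1)^{1-\alpha} - k^{1-\alpha}. $$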
Performance of the Gittins Policy in the G/G/1 and G/G/k, With and Without Setup Times
Authors: Yige Hong, Ziv Scully
Subjects: Performance (cs.PF); Probability (math.PR)
Abstract
How should we schedule jobs to minimize mean queue length? In the preemptive M/G/1 queue, we know the optimal policy is the Gittins policy, which uses any available information about jobs' remaining service times to dynamically prioritize jobs. For models more complex than the M/G/1, optimal scheduling is generally intractable. This leads us to ask: beyond the M/G/1, does Gittins still perform well? Recent results indicate that Gittins performs well in the M/G/k, meaning that its additive suboptimality gap is bounded by an expression which is negligible in heavy traffic. But allowing multiple servers is just one way to extend the M/G/1, and most other extensions remain open. Does Gittins still perform well with non-Poisson arrival processes? Or if servers require setup times when transitioning from idle to busy? In this paper, we give the first analysis of the Gittins policy that can handle any combination of (a) multiple servers, (b) non-Poisson arrivals, and (c) setup times. Our results thus cover the G/G/1 and G/G/k, with and without setup times, bounding Gittins's suboptimality gap in each case. Each of (a), (b), and (c) adds a term to our bound, but all the terms are negligible in heavy traffic, thus implying Gittins's heavy-traffic optimality in all the systems we consider. Another consequence of our results is that Gittins is optimal in the M/G/1 with setup times at all loads.
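For context, in the single-server setting the Gittins policy serves the job maximizing the index below, where $X$ is a job's service time and $a$ its attained service; the paper's contribution is bounding how far this policy is from optimal in the richer G/G/k-with-setup settings:
$$ G(a) \;=\; \sup_{b > a} \frac{\mathbb{P}[X \le b \mid X > a]}{\mathbb{E}[\min(X, b) - a \mid X > a]}. $$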
Analyzing In-browser Cryptojacking
Authors: Muhammad Saad, David Mohaisen
Subjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Machine Learning (cs.LG); Software Engineering (cs.SE)
Abstract
Cryptojacking is the permissionless use of a target device to covertly mine cryptocurrencies. With cryptojacking, attackers use malicious JavaScript code to force web browsers into solving proof-of-work puzzles, thus making money by exploiting the resources of the website visitors. To understand and counter such attacks, we systematically analyze the static, dynamic, and economic aspects of in-browser cryptojacking. For static analysis, we perform content-, currency-, and code-based categorization of cryptojacking samples to 1) measure their distribution across websites, 2) highlight their platform affinities, and 3) study their code complexities. We apply machine learning techniques to distinguish cryptojacking scripts from benign and malicious JavaScript samples with 100% accuracy. For dynamic analysis, we analyze the effect of cryptojacking on critical system resources, such as CPU and battery usage. We also perform web browser fingerprinting to analyze the information exchange between the victim node and the dropzone cryptojacking server. We further build an analytical model to empirically evaluate the feasibility of cryptojacking as an alternative to online advertisement. Our results show a sizeable negative profit and loss gap, indicating that the model is economically infeasible. Finally, leveraging insights from our analyses, we build countermeasures for in-browser cryptojacking that improve the existing remedies.
Bayesian Federated Learning: A Survey
Abstract
Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing, and learning in a privacy-preserving manner. However, the robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions, complexities including heterogeneities and uncertainties, and analytical explainability. Bayesian federated learning (BFL) has emerged as a promising approach to address these issues. This survey presents a critical overview of BFL, including its basic concepts, its relations to Bayesian learning in the context of FL, and a taxonomy of BFL from both Bayesian and federated perspectives. We categorize and discuss client-side, server-side, and FL-based BFL methods and their pros and cons. We further discuss the limitations of existing BFL methods and future directions of BFL research, addressing the intricate requirements of real-life FL applications.
Game-based Platforms for Artificial Intelligence Research
Authors: Chengpeng Hu, Yunlong Zhao, Ziqi Wang, Haocheng Du, Jialin Liu
Abstract
Games have long been ideal test-beds for artificial intelligence research because they exhibit characteristics that widely exist in real-world scenarios. Learning and optimisation, decision making in dynamic and uncertain environments, game theory, planning and scheduling, and design and education are common research areas shared between games and real-world problems. Numerous open-sourced games or game-based environments have been implemented for studying artificial intelligence. In addition to single- or multi-player, collaborative or adversarial games, there has also been growing interest in implementing platforms for creative design in recent years. These platforms provide ideal benchmarks for exploring and comparing artificial intelligence ideas and techniques. This paper reviews game-based platforms for artificial intelligence research, discusses the research trends induced by the evolution of those platforms, and gives an outlook.
Machine Vision-Based Crop-Load Estimation Using YOLOv8
Authors: Dawood Ahmed, Ranjan Sapkota, Martin Churuvija, Manoj Karkee
Abstract
Labor shortages in fruit crop production have prompted the development of mechanized and automated machines as alternatives to labor-intensive orchard operations such as harvesting, pruning, and thinning. Agricultural robots capable of identifying tree canopy parts and estimating geometric and topological parameters, such as branch diameter, length, and angles, can optimize crop yields through automated pruning and thinning platforms. In this study, we proposed a machine vision system to estimate canopy parameters in apple orchards and determine an optimal number of fruit for individual branches, providing a foundation for robotic pruning, flower thinning, and fruitlet thinning to achieve desired yield and quality. Using color and depth information from an RGB-D sensor (Microsoft Azure Kinect DK), a YOLOv8-based instance segmentation technique was developed to identify trunks and branches of apple trees during the dormant season. Principal Component Analysis was applied to estimate branch diameter (used to calculate limb cross-sectional area, or LCSA) and orientation. The estimated branch diameter was utilized to calculate LCSA, which served as an input for crop-load estimation, with larger LCSA values indicating a higher potential fruit-bearing capacity. RMSE for branch diameter estimation was 2.08 mm, and for crop-load estimation, 3.95. Based on commercial apple orchard management practices, the target crop-load (number of fruit) for each segmented branch was estimated with a mean absolute error (MAE) of 2.99 (ground truth crop-load was 6 apples per LCSA). This study demonstrated a promising workflow with high performance in identifying trunks and branches of apple trees in dynamic commercial orchard environments and integrating farm management practices into automated decision-making.
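The PCA step is standard enough to sketch. A minimal Python illustration, assuming mask_xy holds the (x, y) pixel coordinates of one segmented branch (the paper's exact pipeline may differ):

    import numpy as np

    def branch_orientation_and_diameter(mask_xy, mm_per_px):
        # First principal axis of the mask pixels gives the branch
        # orientation; spread along the second axis gives a diameter proxy.
        pts = mask_xy - mask_xy.mean(axis=0)
        eigval, eigvec = np.linalg.eigh(np.cov(pts.T))  # ascending eigenvalues
        major = eigvec[:, -1]
        angle_deg = np.degrees(np.arctan2(major[1], major[0]))
        width_px = 4.0 * np.sqrt(eigval[0])  # ~2 std devs on each side
        return angle_deg, width_px * mm_per_px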
Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks
Authors: Siqi Wang, Tee Hiang Cheng, Meng-Hiot Lim
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Abstract
As an emerging network model, spiking neural networks (SNNs) have attracted significant research attention in recent years. However, their energy-efficient binary spikes are not readily compatible with gradient descent-based training approaches. The surrogate gradient (SG) strategy has been investigated and applied to circumvent this issue and train SNNs from scratch. Due to the lack of a well-recognized SG selection rule, most SGs are chosen intuitively. We propose the parametric surrogate gradient (PSG) method to iteratively update the SG and eventually determine an optimal surrogate gradient parameter, which calibrates the shape of candidate SGs. In SNNs, the neural potential distribution tends to deviate unpredictably due to quantization error. We evaluate this potential shift and propose a potential distribution adjustment (PDA) methodology to minimize the loss of undesired pre-activations. Experimental results demonstrate that the proposed methods can be readily integrated with the backpropagation through time (BPTT) algorithm and help modulated SNNs to achieve state-of-the-art performance on both static and dynamic datasets with fewer timesteps.
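The PSG idea can be sketched as a spike function whose surrogate gradient has a learnable shape parameter. A minimal PyTorch illustration (the sigmoid-shaped surrogate and updating alpha by backpropagation are assumptions made for the sketch, not the paper's exact rule):

    import torch

    class SpikeFn(torch.autograd.Function):
        # Heaviside spike in the forward pass; sigmoid-shaped surrogate
        # gradient of parametric sharpness `alpha` (a learnable 0-dim
        # tensor) in the backward pass.
        @staticmethod
        def forward(ctx, v, alpha):
            ctx.save_for_backward(v, alpha)
            return (v > 0).float()

        @staticmethod
        def backward(ctx, grad_out):
            v, alpha = ctx.saved_tensors
            sig = torch.sigmoid(alpha * v)
            grad_v = grad_out * alpha * sig * (1 - sig)          # d/dv sigmoid(alpha*v)
            grad_alpha = (grad_out * v * sig * (1 - sig)).sum()  # lets alpha calibrate itself
            return grad_v, grad_alpha

    # usage inside a neuron model: spike = SpikeFn.apply(v_mem - v_th, alpha)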
Systems Modeling for novice engineers to comprehend software products better
Abstract
One of the key challenges for a novice engineer in a product company is to comprehend the product sufficiently and quickly. It can take anywhere from six months to several years to attain mastery, but they need to start delivering results much sooner. SaaS (Software-as-a-Service) products have sophisticated system architectures, which adds to the time and effort of understanding them. On the other hand, the time available to new hires for product understanding continues to be short and is getting shorter, given the pressure to deliver more in less time. Constructivist theory views learning as a personal process in which the learner constructs new knowledge for themselves. Building and refining a mental model is the key way in which they learn, similar to how the brain operates. This paper presents an approach to improve the system comprehension process by using a system model that a) acts as a transitional object to aid and refine the mental model of the learner, and b) captures the current understanding of the dynamics of the software system in a way that can be reasoned with and simulated. We have adapted discrete systems modeling techniques and used a transition system as a lightweight modeling language. Such a model can be used by novice engineers during their product ramp-up phase to capture their knowledge of the software system and aid their mental model. The paper also presents a learning approach in which learners create and refine these models iteratively, using the available and newly uncovered knowledge about the software system. We hypothesize that by leveraging this modeling language and approach, novice engineers can reduce the time it takes to achieve the desired level of system comprehension. This paper presents early ideas on this language and approach.
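Because the modeling language is a plain transition system, a learner's model can be executable from day one. A minimal Python sketch with hypothetical states for a document-review feature:

    transitions = {
        ("draft", "submit"): "in_review",
        ("in_review", "approve"): "published",
        ("in_review", "reject"): "draft",
    }

    def simulate(state, events):
        # Replay observed events against the model; a missing transition
        # flags a gap in the learner's current understanding.
        for ev in events:
            if (state, ev) not in transitions:
                raise ValueError(f"no transition for {ev!r} from {state!r}")
            state = transitions[(state, ev)]
        return state

    assert simulate("draft", ["submit", "approve"]) == "published"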
HiQ -- A Declarative, Non-intrusive, Dynamic and Transparent Observability and Optimization System
Abstract
This paper proposes a non-intrusive, declarative, dynamic and transparent system called HiQ to track Python program runtime information without compromising runtime performance or losing insight. HiQ can be used for monolithic and distributed systems, and for offline and online applications. HiQ was developed while optimizing our large deep neural network (DNN) models, which are written in Python, but it can be generalized to any Python program or distributed system, and even to other languages such as Java. We have implemented the system and adopted it in our deep learning model life cycle management system to catch bottlenecks while keeping our production code clean and highly performant. The implementation is open-sourced at: https://github.com/oracle/hiq.
The physical Church thesis and the sensitivity to initial conditions
Abstract
The physical Church thesis is a thesis about nature expressing that everything that can be computed by a physical system (a machine) is computable in the sense of computability theory. At first glance, this thesis seems to contradict the existence, in nature, of chaotic dynamical systems, that is, systems whose evolution cannot be "computed" because of their sensitivity to initial conditions. The goal of this note is to show that there exist dynamical systems that are both computable and chaotic, and thus that the existence in nature of chaotic dynamical systems is not, per se, a refutation of the physical Church thesis. Chaos, then, seems to be compatible with computability, in the same way that it is compatible with determinism.
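A concrete example of the note's point: the logistic map at r = 4 is computable by a two-line program yet chaotic, so nearby initial conditions diverge rapidly:

    def logistic(x, n, r=4.0):
        for _ in range(n):
            x = r * x * (1.0 - x)
        return x

    # initial conditions differing by 1e-6 are uncorrelated after ~50 steps
    print(logistic(0.300000, 50), logistic(0.300001, 50))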
Event-triggered Boundary Control of a Class of Reaction-Diffusion PDEs with Time-dependent Reactivity
Abstract
This paper presents an event-triggered boundary control strategy for a class of reaction-diffusion PDEs with time-varying reactivity under Robin actuation. The control approach consists of a backstepping full-state feedback boundary controller and a dynamic event-triggering condition, which determines the time instants when the control input needs to be updated. It is proved that under the proposed event-triggered boundary control approach, there is a uniform minimal dwell-time between two event times. Furthermore, the well-posedness and the global exponential convergence of the closed-loop system to zero in $L^2$-sense are established. A simulation is conducted to validate the theoretical developments.
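Dynamic triggering conditions of this kind typically take the schematic form below, where $d(t)$ is the deviation between the continuously evaluated backstepping control and the value last applied at time $t_j$, $\gamma > 0$ is a design weight, and $m(t) \ge 0$ is the internal dynamic variable; the paper's precise condition and coefficients differ:
$$ t_{j+1} \;=\; \inf\big\{ t > t_j \;:\; d^2(t) \ge \gamma \,\| u(t) \|_{L^2}^2 + m(t) \big\}. $$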
Evaluation of Regularization-based Continual Learning Approaches: Application to HAR
Authors: Bonpagna Kann (UGA, M-PSI), Sandra Castellanos-Paez (UGA, M-PSI), Philippe Lalanda (UGA, M-PSI)
Abstract
Pervasive computing allows the provision of services in many important areas, including the relevant and dynamic field of health and well-being. In this domain, Human Activity Recognition (HAR) has gained a lot of attention in recent years. Current solutions rely on Machine Learning (ML) models and achieve impressive results. However, the evolution of these models remains difficult unless a complete retraining is performed. To overcome this problem, the concept of Continual Learning, and more particularly the techniques based on regularization, is very promising today. These techniques are particularly interesting for their simplicity and low cost. Initial studies have been conducted and have shown promising outcomes. However, they remain very specific and difficult to compare. In this paper, we provide a comprehensive comparison of three regularization-based methods that we adapted to the HAR domain, highlighting their strengths and limitations. Our experiments were conducted on the UCI HAR dataset, and the results showed that no single technique outperformed all the others in every scenario considered.
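The abstract does not name the three methods, but regularization-based continual learning is typified by Elastic Weight Consolidation (EWC), whose penalty is easy to sketch in PyTorch (shown here only as an illustration of the family):

    def ewc_penalty(model, fisher, old_params, lam=100.0):
        # Penalize drift of parameters that the (diagonal) Fisher
        # information marked as important for previously learned tasks.
        loss = 0.0
        for name, p in model.named_parameters():
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
        return 0.5 * lam * loss

    # total loss on the new task: task_loss + ewc_penalty(model, F, theta_old)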
Group Equivariant BEV for 3D Object Detection
Abstract
Recently, 3D object detection has attracted significant attention and achieved continuous improvement in real road scenarios. Environmental information is collected from a single sensor or multi-sensor fusion to detect objects of interest. However, most current 3D object detection approaches focus on developing advanced network architectures to improve detection precision rather than considering dynamic driving scenes, where data collected from the sensors equipped in the vehicle contain various perturbation features. As a result, existing work still cannot tackle the perturbation issue. To solve this problem, we propose a group equivariant bird's eye view network (GeqBevNet) based on group equivariant theory, which introduces the concept of group equivariance into the BEV fusion object detection network. The group equivariant network is embedded into the fused BEV feature map to facilitate BEV-level rotational equivariant feature extraction, leading to lower average orientation error. To demonstrate the effectiveness of GeqBevNet, the network is verified on the nuScenes validation dataset, on which mAOE decreases to 0.325. Experimental results demonstrate that GeqBevNet can extract more rotational equivariant features in 3D object detection in actual road scenes and improve the performance of object orientation prediction.
Acceleration for Timing-Aware Gate-Level Logic Simulation with One-Pass GPU Parallelism
Abstract
As chip designs grow in scale and complexity, and benefiting from high-performance computation technologies, the simulation of Very Large Scale Integration (VLSI) circuits imposes an increasing requirement for acceleration through parallel computing with GPU devices. However, conventional parallel strategies do not fully align with modern GPU capabilities, leading to new challenges in the parallelism of VLSI simulation on GPUs, despite some previous successful demonstrations of significant acceleration. In this paper, we propose a novel approach to accelerate 4-value logic timing-aware gate-level logic simulation using waveform-based GPU parallelism. Our approach utilizes a new strategy that can effectively handle the dependencies between tasks during parallelization, reducing the synchronization required between CPU and GPU when parallelizing the simulation of combinational circuits. This approach requires only one round of data transfer and hence achieves one-pass parallelism. Moreover, to overcome the difficulties of adopting our strategy on GPU devices, we design a series of data structures and tune them to dynamically allocate and store newly generated output of uncertain size. Finally, experiments are carried out on industrial-scale open-source benchmarks to demonstrate the performance gain of our approach compared to several state-of-the-art baselines.
Secure Communication Model For Quantum Federated Learning: A Post Quantum Cryptography (PQC) Framework
Authors: Dev Gurung, Shiva Raj Pokhrel, Gang Li
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Abstract
We design a model of Post Quantum Cryptography (PQC) Quantum Federated Learning (QFL). We develop a framework with dynamic server selection and study convergence and security conditions. The implementation and results are publicly available.
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
Abstract
Model-based reinforcement learning is a powerful tool, but collecting data to fit an accurate model of the system can be costly. Exploring an unknown environment in a sample-efficient manner is hence of great importance. However, the complexity of dynamics and the computational limitations of real systems make this task challenging. In this work, we introduce FLEX, an exploration algorithm for nonlinear dynamics based on optimal experimental design. Our policy maximizes the information of the next step and results in an adaptive exploration algorithm, compatible with generic parametric learning models and requiring minimal resources. We test our method on a number of nonlinear environments covering different settings, including time-varying dynamics. Keeping in mind that exploration is intended to serve an exploitation objective, we also test our algorithm on downstream model-based classical control tasks and compare it to other state-of-the-art model-based and model-free approaches. The performance achieved by FLEX is competitive and its computational cost is low.
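The one-step information-maximizing policy can be sketched for a linear-in-features dynamics model; the criterion below (leverage under the inverse Gram matrix, a standard optimal-design score) is a schematic stand-in for FLEX's exact objective:

    import numpy as np

    def most_informative_action(phi, candidates, gram, reg=1e-3):
        # Pick the candidate action whose features are least explained by
        # past data, i.e. largest phi^T (G + reg*I)^{-1} phi.
        P = np.linalg.inv(gram + reg * np.eye(gram.shape[0]))
        scores = [f @ P @ f for f in (phi(u) for u in candidates)]
        return candidates[int(np.argmax(scores))]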
On MPC-based Strategies for Optimal Voltage References in DC Microgrids
Authors: Pol Jané-Soneira, Ionela Prodan, Albertus Johannes Malan, Sören Hohmann
Abstract
Modern power systems are characterized by low inertia and fast voltage dynamics due to the increase of sources connecting via power electronics and the removal of large traditional thermal generators. Power electronics are commonly equipped with fast controllers that are able to reach a desired voltage setpoint within seconds. In this paper, we propose and compare two approaches using Model Predictive Control (MPC) to compute optimal voltage references for the power electronic devices in order to minimize the losses in a DC microgrid: i) a traditional setpoint-tracking MPC which receives a previously computed optimal setpoint; ii) an economic MPC which does not require a priori computed setpoints. We show that the economic MPC outperforms the setpoint-tracking MPC in simulations with the CIGRE benchmark system when multiple load disturbances occur. Some insights and discussions related to the stability of the closed-loop system using its dissipativity properties are highlighted for both approaches.
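The distinction between the two controllers is clearest in their stage costs: the setpoint-tracking MPC minimizes a quadratic distance to a precomputed reference, while the economic MPC minimizes the operating cost (here, network losses) directly:
$$ J_{\mathrm{track}} = \sum_{k=0}^{N-1} \|x_k - x^{\mathrm{ref}}\|_Q^2 + \|u_k - u^{\mathrm{ref}}\|_R^2, \qquad J_{\mathrm{econ}} = \sum_{k=0}^{N-1} \ell_{\mathrm{loss}}(x_k, u_k), $$
so the economic variant needs no a priori computed optimal setpoint $(x^{\mathrm{ref}}, u^{\mathrm{ref}})$.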
Techno-Economic Assessment in Communications: New Challenges
Authors: Carlos Bendicho, Daniel Bendicho
Subjects: Networking and Internet Architecture (cs.NI)
Abstract
This article gives a brief history of Techno-Economic Assessment (TEA) in communications, proposes a redefinition of TEA, and surveys the new challenges arising from a dynamic context with cloud-native virtualized networks, the Helium Network and similar blockchain-based decentralized networks, the new network-as-a-platform (NaaP) paradigm, carbon pricing, network sharing, and web3, metaverse, and blockchain technologies. The authors formulate the research question and show the need to improve TEA models to integrate and manage all this increasing complexity. This paper also proposes the characteristics TEA models should have and their current degree of compliance for several use cases: 5G and beyond, software-defined wide area network (SD-WAN), secure access service edge (SASE), secure service edge (SSE), and cloud cybersecurity risk assessment. The authors also present the extensibility of TEA to request for proposals (RFP) processes and to other industries, and conclude that there is an urgent need for agile and effective TEA in communications that allows the industrialization of agile decision-making, so that all market stakeholders can choose the optimal solution for any technology, scenario, and use case.
A Secure Medical Record Sharing Scheme Based on Blockchain and Two-fold Encryption
Authors: Md. Ahsan Habib, Kazi Md. Rokibul Alam, Yasuhiko Morimoto
Abstract
Usually, a medical record (MR) contains a patient's sensitive, disease-oriented information. In addition, the MR needs to be shared among different bodies, e.g., diagnostic centres, hospitals, and physicians. Hence, retaining the privacy and integrity of the MR is crucial. A blockchain-based secure MR sharing system can manage these aspects properly. This paper proposes a blockchain-based electronic (e-) MR sharing scheme that (i) considers both medical images and text as input, (ii) enriches data privacy through a two-fold encryption mechanism consisting of an asymmetric cryptosystem and dynamic DNA encoding, (iii) assures data integrity by storing the encrypted e-MR in the distinct block designated for each user in the blockchain, and (iv) eventually enables authorized entities to recover the e-MR through decryption. Preliminary evaluations, analyses, and comparisons with state-of-the-art works imply the efficacy of the proposed scheme.
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
Authors: Aditya Dhakal, Sameer G. Kulkarni, K. K. Ramakrishnan
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF); Systems and Control (eess.SY)
Abstract
Hardware accelerators such as GPUs are required for real-time, low-latency inference with Deep Neural Networks (DNNs). However, due to the inherent limits to the parallelism they can exploit, DNNs often under-utilize the capacity of today's high-end accelerators. Although spatial multiplexing of the GPU leads to higher GPU utilization and higher inference throughput, a number of challenges remain. Finding the GPU percentage for right-sizing the GPU for each DNN through profiling, determining an optimal batching of requests to balance throughput improvement against application-specific deadlines and service level objectives (SLOs), and maximizing throughput by appropriately scheduling DNNs are still significant challenges. This paper introduces a dynamic and fair spatio-temporal scheduler (D-STACK) that enables multiple DNNs to run on the GPU concurrently. To help allocate the appropriate GPU percentage (we call it the "Knee"), we develop and validate a model that estimates the parallelism each DNN can utilize. We also develop a lightweight optimization formulation to find an efficient batch size for each DNN operating with D-STACK. We bring together our optimizations and our spatio-temporal scheduler to provide a holistic inference framework, and demonstrate its ability to provide high throughput while meeting application SLOs. Compared with an ideal scheduler that can allocate the right GPU percentage for every DNN kernel, D-STACK achieves more than 90 percent of the ideal throughput and GPU utilization. We also compare D-STACK with other GPU multiplexing and scheduling methods (e.g., NVIDIA Triton, Clipper, Nexus) using popular DNN models. Our controlled experiments with multiplexing several popular DNN models achieve up to a 1.6x improvement in GPU utilization and up to a 4x improvement in inference throughput.
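The "Knee" can be illustrated with a common curve-distance heuristic over a measured throughput-vs-GPU-share profile; D-STACK fits its own validated parallelism model, so this is only a schematic stand-in:

    import numpy as np

    def find_knee(gpu_pct, throughput):
        # Knee = point of maximum gap above the chord joining the endpoints
        # of the normalized profile (diminishing returns set in there).
        x = (gpu_pct - gpu_pct[0]) / (gpu_pct[-1] - gpu_pct[0])
        y = (throughput - throughput[0]) / (throughput[-1] - throughput[0])
        return gpu_pct[int(np.argmax(y - x))]

    print(find_knee(np.array([10., 30., 50., 70., 90.]),
                    np.array([100., 400., 520., 560., 575.])))  # -> 50.0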
FLCC: Efficient Distributed Federated Learning on IoMT over CSMA/CA
Authors: Abdelaziz Salama, Syed Ali Zaidi, Des McLernon, Mohammed M. H. Qazzaz
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Systems and Control (eess.SY)
Abstract
Federated Learning (FL) has emerged as a promising approach for privacy preservation, allowing the sharing of model parameters between users and the cloud server rather than raw local data. FL approaches have been adopted as a cornerstone of distributed machine learning (ML) to solve several complex use cases. FL presents an interesting interplay between communication and ML performance when implemented over distributed wireless nodes, where the dynamics of both networking and learning play an important role. In this article, we investigate the performance of FL on an application that might be used to improve a remote healthcare system over ad hoc networks that employ CSMA/CA to schedule their transmissions. Our FL over CSMA/CA (FLCC) model is designed to eliminate untrusted devices and harness frequency reuse and spatial clustering techniques to improve the throughput required for coordinating a distributed implementation of FL in the wireless network. In our proposed model, frequency allocation is performed on the basis of spatial clustering using virtual cells. Each cell is assigned an FL server and dedicated carrier frequencies for exchanging the updated model parameters within the cell. We present two metrics to evaluate the network performance: 1) the probability of successful transmission while minimizing interference, and 2) the performance of the distributed FL model in terms of accuracy and loss while accounting for the networking dynamics. We benchmark the proposed approach using the well-known MNIST dataset. We demonstrate that the proposed approach outperforms the baseline FL algorithms in terms of explicitly defining the chosen users' criteria and achieving high accuracy in a robust network.
Turning block-sequential automata networks into smaller parallel networks with isomorphic limit dynamics
Abstract
We state an algorithm that, given an automata network and a block-sequential update schedule, produces an automata network of the same size or smaller with the same limit dynamics under the parallel update schedule. We then focus on the family of automata cycles that share a unique path of automata, called tangential cycles, and show that a restriction of our algorithm reduces any instance of these networks under a block-sequential update schedule into a smaller parallel network of the family, while characterizing the number of reductions performed and conserving the limit dynamics. We also show that any tangential cycles reduced by our main algorithm are transformed into a network whose size is that of the largest cycle of the initial network. We end by showing that the restricted algorithm allows the direct characterization of block-sequential double cycles as parallel ones.
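For readers unfamiliar with update schedules, the difference is easy to see on a toy Boolean network (the local functions and blocks below are illustrative):

    fns = {0: lambda s: s[2], 1: lambda s: s[0] and s[2], 2: lambda s: not s[1]}

    def parallel_step(s):
        # every automaton reads the same old state
        return tuple(fns[i](s) for i in range(len(s)))

    def block_sequential_step(s, blocks=((0, 1), (2,))):
        # blocks fire in order; automata inside a block fire in parallel
        s = list(s)
        for block in blocks:
            new = {i: fns[i](tuple(s)) for i in block}
            for i, v in new.items():
                s[i] = v
        return tuple(s)

    start = (True, False, True)
    print(parallel_step(start))          # (True, True, True)
    print(block_sequential_step(start))  # (True, True, False): schedules differ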
Latency Target based Analysis of the DASH.js Player
Authors: Piers O'Hanlon, Adil Aslam
Subjects: Multimedia (cs.MM); Networking and Internet Architecture (cs.NI); Performance (cs.PF)
Abstract
We analyse the low latency performance of the three Adaptive Bitrate (ABR) algorithms in the dash.js Dynamic Adaptive Streaming over HTTP (DASH) player with respect to a range of latency targets and configuration options. We perform experiments on our DASH Testbed which allows for testing with a range of real world derived network profiles. Our experiments enable a better understanding of how latency targets affect quality of experience (QoE), and how well the different algorithms adhere to their targets. We find that with dash.js v4.5.0 the default Dynamic algorithm achieves the best overall QoE. We show that whilst the other algorithms can achieve higher video quality at lower latencies, they do so only at the expense of increased stalling. We analyse the poor performance of L2A-LL in our tests and develop modifications which demonstrate significant improvements. We also highlight how some low latency configuration settings can be detrimental to performance.
Leapfrog methods for relativistic charged-particle dynamics
Authors: Ernst Hairer, Christian Lubich, Yanyan Shi
Abstract
A basic leapfrog integrator and its energy-preserving and variational / symplectic variants are proposed and studied for the numerical integration of the equations of motion of relativistic charged particles in an electromagnetic field. The methods are based on a four-dimensional formulation of the equations of motion. Structure-preserving properties of the numerical methods are analysed, in particular conservation and long-time near-conservation of energy and mass shell as well as preservation of volume in phase space. In the non-relativistic limit, the considered methods reduce to the Boris algorithm for non-relativistic charged-particle dynamics and its energy-preserving and variational / symplectic variants.
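In that non-relativistic limit, the basic scheme is the classical Boris leapfrog, sketched below; the relativistic variants studied in the paper add the four-dimensional formulation and the mass-shell bookkeeping described in the abstract:

    import numpy as np

    def boris_step(x, v, E, B, q_m, dt):
        # Half electric kick, magnetic rotation, half electric kick,
        # then a position drift (x, v, E, B are 3-vectors; q_m = q/m).
        v_minus = v + 0.5 * dt * q_m * E
        t = 0.5 * dt * q_m * B
        s = 2.0 * t / (1.0 + t @ t)
        v_plus = v_minus + np.cross(v_minus + np.cross(v_minus, t), s)
        v_new = v_plus + 0.5 * dt * q_m * E
        return x + dt * v_new, v_new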
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Markus Wulfmeier, Jan Humplik, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess
Abstract
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. We first trained individual skills in isolation and then composed those skills end-to-end in a self-play setting. The resulting policy exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking and more; and transitions between them in a smooth, stable, and efficient manner - well beyond what is intuitively expected from the robot. The agents also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. The full range of behaviors emerged from a small set of simple rewards. Our agents were trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer, despite significant unmodeled effects and variations across robot instances. Although the robots are inherently fragile, minor hardware modifications together with basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way. Indeed, even though the agents were optimized for scoring, in experiments they walked 156% faster, took 63% less time to get up, and kicked 24% faster than a scripted baseline, while efficiently combining the skills to achieve the longer term objectives. Examples of the emergent behaviors and full 1v1 matches are available on the supplementary website.
Learning battery model parameter dynamics from data with recursive Gaussian process regression
Authors: Antti Aitio, Dominik Jöst, Dirk Uwe Sauer, David A. Howey
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
Abstract
Estimating state of health is a critical function of a battery management system but remains challenging due to the variability of operating conditions and usage requirements of real applications. As a result, techniques based on fitting equivalent circuit models may exhibit inaccuracy at extremes of performance and over long-term ageing, or instability of parameter estimates. Pure data-driven techniques, on the other hand, suffer from lack of generality beyond their training dataset. In this paper, we propose a hybrid approach combining data- and model-driven techniques for battery health estimation. Specifically, we demonstrate a Bayesian data-driven method, Gaussian process regression, to estimate model parameters as functions of states, operating conditions, and lifetime. Computational efficiency is ensured through a recursive approach yielding a unified joint state-parameter estimator that learns parameter dynamics from data and is robust to gaps and varying operating conditions. Results show the efficacy of the method, on both simulated and measured data, including accurate estimates and forecasts of battery capacity and internal resistance. This opens up new opportunities to understand battery ageing in real applications.
A Control-Centric Benchmark for Video Prediction
Abstract
Video is a promising source of knowledge for embodied agents to learn models of the world's dynamics. Large deep networks have become increasingly effective at modeling complex video data in a self-supervised manner, as evaluated by metrics based on human perceptual similarity or pixel-wise comparison. However, it remains unclear whether current metrics are accurate indicators of performance on downstream tasks. We find empirically that for planning robotic manipulation, existing metrics can be unreliable at predicting execution success. To address this, we propose a benchmark for action-conditioned video prediction in the form of a control benchmark that evaluates a given model for simulated robotic manipulation through sampling-based planning. Our benchmark, Video Prediction for Visual Planning ($VP^2$), includes simulated environments with 11 task categories and 310 task instance definitions, a full planning implementation, and training datasets containing scripted interaction trajectories for each task category. A central design goal of our benchmark is to expose a simple interface -- a single forward prediction call -- so it is straightforward to evaluate almost any action-conditioned video prediction model. We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling by analyzing five highly-performant video prediction models, finding that while scale can improve perceptual quality when modeling visually diverse settings, other attributes such as uncertainty awareness can also aid planning performance.
Keyword: efficient
VeML: An End-to-End Machine Learning Lifecycle for Large-scale and High-dimensional Data
Diffusion Probabilistic Model Based Accurate and High-Degree-of-Freedom Metasurface Inverse Design
Optimizing Deep Learning Models For Raspberry Pi
Organizational Governance of Emerging Technologies: AI Adoption in Healthcare
Bridging graph data models: RDF, RDF-star, and property graphs as directed acyclic graphs
Exponentially Convergent Numerical Method for Abstract Cauchy Problem with Fractional Derivative of Caputo Type
Application of Transformers for Nonlinear Channel Compensation in Optical Systems
Directed Chain Generative Adversarial Networks
ESimCSE Unsupervised Contrastive Learning Jointly with UDA Semi-Supervised Learning for Large Label System Text Classification Mode
LumiGAN: Unconditional Generation of Relightable 3D Human Faces
LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
Connector 0.5: A unified framework for graph representation learning
Numerical methods for computing the discrete and continuous Laplace transforms
Cooperative Hierarchical Deep Reinforcement Learning based Joint Sleep, Power, and RIS Control for Energy-Efficient HetNet
Generating Adversarial Examples with Task Oriented Multi-Objective Optimization
Numerical Approximation of Andrews Plots with Optimal Spatial-Spectral Smoothing
Structure Diagram Recognition in Financial Announcements
ESCM: An Efficient and Secure Communication Mechanism for UAV Networks
CrowdCache: A Decentralized Game-Theoretic Framework for Mobile Edge Content Sharing
Solution of planar elastic stress problems using stress basis functions
C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference
Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference
Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks
Scene Graph Lossless Compression with Adaptive Prediction for Objects and Relations
Efficient Explainable Face Verification based on Similarity Score Argument Backpropagation
Fair Selection of Edge Nodes to Participate in Clustered Federated Multitask Learning
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
An efficient multiple harmonic balance method for computing quasi-periodic responses of nonlinear systems
An Improved Modular Addition Checksum Algorithm
Leveraging Compositional Methods for Modeling and Verification of an Autonomous Taxi System
Design and Implementation of a Mobile Application for Validating Counterfeit-Proof Product Labels
Integrated Architecture for Neural Networks and Security Primitives using RRAM Crossbar
A Two-Step Rule for Backpropagation
ElegansNet: a brief scientific report and initial experiments
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
On the Order of Power Series and the Sum of Square Roots Problem
The Roles of Symbols in Neural-based AI: They are Not What You Think!
Experimental Validation of Model-less Robust Voltage Control using Measurement-based Estimated Voltage Sensitivity Coefficients
PVP: Pre-trained Visual Parameter-Efficient Tuning
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
A Personalized Dense Retrieval Framework for Unified Information Access
Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)
Hitting Subgraphs in Sparse Graphs and Geometric Intersection Graphs
An Investigation into Active Control for Accessible Orbital Flight
Association Rules Mining with Auto-Encoders
Controllable Image Generation via Collage Representations
Keyword: faster
GENIE-NF-AI: Identifying Neurofibromatosis Tumors using Liquid Neural Network (LTC) trained on AACR GENIE Datasets
From Chaos Comes Order: Ordering Event Representations for Object Detection
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
An Investigation into Active Control for Accessible Orbital Flight
Keyword: mobile
Learning to Predict Navigational Patterns from Partial Observations
ESCM: An Efficient and Secure Communication Mechanism for UAV Networks
CrowdCache: A Decentralized Game-Theoretic Framework for Mobile Edge Content Sharing
Digital technologies in the context of university transition and disability: Theoretical and empirical advances
Design and Implementation of a Mobile Application for Validating Counterfeit-Proof Product Labels
Thermal Vision for Soil Assessment in a Multipurpose Environmental Chamber under Martian Conditions towards Robot Navigation
Keyword: pruning
Optimizing Deep Learning Models For Raspberry Pi
Towards Compute-Optimal Transfer Learning
Machine Vision-Based Crop-Load Estimation Using YOLOv8
Concept-Monitor: Understanding DNN training through individual neurons
Filter Pruning via Filters Similarity in Consecutive Layers
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
Keyword: voxel
VGOS: Voxel Grid Optimization for View Synthesis from Sparse Inputs
Keyword: lidar
Single-View Height Estimation with Conditional Diffusion Probabilistic Models
Keyword: diffusion
Diffusion Probabilistic Model Based Accurate and High-Degree-of-Freedom Metasurface Inverse Design
Directed Chain Generative Adversarial Networks
Single-View Height Estimation with Conditional Diffusion Probabilistic Models
Score-based Generative Modeling Through Backward Stochastic Differential Equations: Inversion and Generation
Preconditioned discontinuous Galerkin method and convection-diffusion-reaction problems with guaranteed bounds to resulting spectra
Event-triggered Boundary Control of a Class of Reaction-Diffusion PDEs with Time-dependent Reactivity
Mixed finite element methods for nonlinear reaction-diffusion equations with interfaces
DiffuseExpand: Expanding dataset for 2D medical image segmentation using diffusion models
Training-Free Location-Aware Text-to-Image Synthesis
Keyword: dynamic
Model Extraction Attacks Against Reinforcement Learning Based Controllers
Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient
Time-Selective RNN for Device-Free Multi-Room Human Presence Detection Using WiFi CSI
Analysis and Mitigation of Shared Resource Contention on Heterogeneous Multicore: An Industrial Case Study
Roll-Drop: accounting for observation noise with a single parameter
The Limited Integrator Model Regulator And its Use in Vehicle Steering Control
How to design, and tune, a computed torque controller: An introduction and a Matlab example
Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Splitting physics-informed neural networks for inferring the dynamics of integer- and fractional-order neuron models
Performance of the Gittins Policy in the G/G/1 and G/G/k, With and Without Setup Times
Analyzing In-browser Cryptojacking
Bayesian Federated Learning: A Survey
Game-based Platforms for Artificial Intelligence Research
Machine Vision-Based Crop-Load Estimation Using YOLOv8
Membrane Potential Distribution Adjustment and Parametric Surrogate Gradient in Spiking Neural Networks
Systems Modeling for novice engineers to comprehend software products better
HiQ -- A Declarative, Non-intrusive, Dynamic and Transparent Observability and Optimization System
The physical Church thesis and the sensitivity to initial conditions
Event-triggered Boundary Control of a Class of Reaction-Diffusion PDEs with Time-dependent Reactivity
Evaluation of Regularization-based Continual Learning Approaches: Application to HAR
Group Equivariant BEV for 3D Object Detection
Acceleration for Timing-Aware Gate-Level Logic Simulation with One-Pass GPU Parallelism
Secure Communication Model For Quantum Federated Learning: A Post Quantum Cryptography (PQC) Framework
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
On MPC-based Strategies for Optimal Voltage References in DC Microgrids
Techno-Economic Assessment in Communications: New Challenges
A Secure Medical Record Sharing Scheme Based on Blockchain and Two-fold Encryption
D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
FLCC: Efficient Distributed Federated Learning on IoMT over CSMA/CA
Turning block-sequential automata networks into smaller parallel networks with isomorphic limit dynamics
Latency Target based Analysis of the DASH.js Player
Leapfrog methods for relativistic charged-particle dynamics
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Learning battery model parameter dynamics from data with recursive Gaussian process regression
A Control-Centric Benchmark for Video Prediction