Abstract
Differential Privacy (DP) is a key property to protect data and models from integrity attacks. In the Deep Learning (DL) field, it is commonly implemented through Differentially Private Stochastic Gradient Descent (DP-SGD). However, when a model is shared or released, there is no way to check whether it is differentially private; that is, one is required to trust the model provider. This situation poses a problem when data privacy is mandatory, especially with current data regulations, as the presence of DP cannot be certified consistently by any third party. Thus, we face the challenge of determining whether a DL model has been trained with DP, according to the title question: Can we infer the presence of Differential Privacy in Deep Learning models' weights? Since DP-SGD significantly changes the training process of a DL model, we hypothesize that DP leaves an imprint in the weights of a DL model, which can be used to predict whether a model has been trained with DP regardless of its architecture and the training dataset. In this paper, we propose to use the imprint that DP leaves in model weights to infer the presence of DP training in a DL model. To substantiate our hypothesis, we developed an experimental methodology based on two datasets of weights of DL models, each containing models trained with and without DP, and a meta-classifier that infers whether DP was used in the training process of a DL model by accessing its weights. We accomplish both the removal of the requirement of a trusted model provider and a strong foundation for this interesting line of research. Thus, our contribution is an additional layer of security on top of the strict privacy requirements of DP training in DL models.
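To make concrete how DP-SGD changes training, the following is a minimal sketch of one update step in its usual textbook form: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping bound is added, and the result is averaged. The function name `dp_sgd_step` and its parameters are illustrative assumptions, not taken from the paper, and no privacy accounting is performed.

```python
import numpy as np

def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update (illustrative): clip per-example gradients, add noise, average."""
    # L2 norm of each example's gradient (rows are examples).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Shrink any gradient whose norm exceeds clip_norm; leave the others unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with standard deviation proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]
    return weights - lr * noisy_mean, clipped

# Hypothetical toy batch of two per-example gradients.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
new_w, clipped = dp_sgd_step(np.zeros(2), grads, lr=1.0, clip_norm=1.0,
                             noise_multiplier=0.0, rng=rng)
```

With `noise_multiplier=0.0` the step reduces to plain SGD on the clipped gradients, which makes the clipping effect easy to inspect in isolation.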
Keyword: optimization
Towards an Automatic AI Agent for Reaction Condition Recommendation in Chemical Synthesis
Abstract
Artificial intelligence (AI) for reaction condition optimization has become an important topic in the pharmaceutical industry, given that a data-driven AI model can assist drug discovery and accelerate reaction design. However, existing AI models lack the chemical insights and real-time knowledge acquisition abilities of experienced human chemists. This paper proposes a Large Language Model (LLM) empowered AI agent to bridge this gap. We put forth a novel three-phase paradigm and applied advanced intelligence-enhancement methods like in-context learning and multi-LLM debate so that the AI agent can borrow human insight and update its knowledge by searching the latest chemical literature. Additionally, we introduce a novel Coarse-label Contrastive Learning (CCL) based chemical fingerprint that greatly enhances the agent's performance in optimizing the reaction condition. With the above efforts, the proposed AI agent can autonomously generate the optimal reaction condition recommendation without any human interaction. Further, the agent is highly professional in terms of chemical reactions. It demonstrates close-to-human performance and strong generalization capability in both dry-lab and wet-lab experiments. As the first attempt in the chemical AI agent, this work goes a step further in the field of "AI for chemistry" and opens up new possibilities for computer-aided synthesis planning.
The Next 700 ML-Enabled Compiler Optimizations
Authors: S. VenkataKeerthy, Siddharth Jain, Umesh Kalvakuntla, Pranav Sai Gorantla, Rajiv Shailesh Chitale, Eugene Brevdo, Albert Cohen, Mircea Trofin, Ramakrishna Upadrasta
Subjects: Programming Languages (cs.PL); Machine Learning (cs.LG); Performance (cs.PF)
Abstract
There is a growing interest in enhancing compiler optimizations with ML models, yet interactions between compilers and ML frameworks remain challenging. Some optimizations require tightly coupled models and compiler internals, raising issues with modularity, performance and framework independence. Practical deployment and transparency for the end-user are also important concerns. We propose ML-Compiler-Bridge to enable ML model development within a traditional Python framework while making end-to-end integration with an optimizing compiler possible and efficient. We evaluate it on both research and production use cases, for training and inference, over several optimization problems, multiple compilers and their versions, and gym infrastructures.
SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos
Abstract
We propose SplatArmor, a novel approach for recovering detailed and animatable human models by `armoring' a parameterized body model with 3D Gaussians. Our approach represents the human as a set of 3D Gaussians within a canonical space, whose articulation is defined by extending the skinning of the underlying SMPL geometry to arbitrary locations in the canonical space. To account for pose-dependent effects, we introduce an SE(3) field, which allows us to capture both the location and anisotropy of the Gaussians. Furthermore, we propose the use of a neural color field to provide color regularization and 3D supervision for the precise positioning of these Gaussians. We show that Gaussian splatting provides an interesting alternative to neural rendering based methods by leveraging a rasterization primitive without facing any of the non-differentiability and optimization challenges typically faced in such approaches. The rasterization paradigm allows us to leverage forward skinning, and does not suffer from the ambiguities associated with inverse skinning and warping. We show compelling results on the ZJU MoCap and People Snapshot datasets, which underscore the effectiveness of our method for controllable human synthesis.
Compact and Intuitive Airfoil Parameterization Method through Physics-aware Variational Autoencoder
Abstract
Airfoil shape optimization plays a critical role in the design of high-performance aircraft. However, the high-dimensional nature of airfoil representation causes the challenging problem known as the "curse of dimensionality". To overcome this problem, numerous airfoil parameterization methods have been developed, which can be broadly classified as polynomial-based and data-driven approaches. Each of these methods has desirable characteristics such as flexibility, parsimony, feasibility, and intuitiveness, but a single approach that encompasses all of these attributes has yet to be found. For example, polynomial-based methods struggle to balance parsimony and flexibility, while data-driven methods lack in feasibility and intuitiveness. In recent years, generative models, such as generative adversarial networks and variational autoencoders, have shown promising potential in airfoil parameterization. However, these models still face challenges related to intuitiveness due to their black-box nature. To address this issue, we developed a novel airfoil parameterization method using physics-aware variational autoencoder. The proposed method not only explicitly separates the generation of thickness and camber distributions to produce smooth and non-intersecting airfoils, thereby improving feasibility, but it also directly aligns its latent dimensions with geometric features of the airfoil, significantly enhancing intuitiveness. Finally, extensive comparative studies were performed to demonstrate the effectiveness of our approach.
Bridging Data-Driven and Knowledge-Driven Approaches for Safety-Critical Scenario Generation in Automated Vehicle Validation
Authors: Kunkun Hao, Lu Liu, Wen Cui, Jianxing Zhang, Songyang Yan, Yuxi Pan, Zijiang Yang
Abstract
Automated driving vehicles (ADV) promise to enhance driving efficiency and safety, yet they face intricate challenges in safety-critical scenarios. As a result, validating ADV within generated safety-critical scenarios is essential for both development and performance evaluations. This paper investigates the complexities of employing two major scenario-generation solutions: data-driven and knowledge-driven methods. Data-driven methods derive scenarios from recorded datasets, efficiently generating scenarios by altering the existing behavior or trajectories of traffic participants but often falling short in considering ADV perception; knowledge-driven methods provide effective coverage through expert-designed rules, but they may lead to inefficiency in generating safety-critical scenarios within that coverage. To overcome these challenges, we introduce BridgeGen, a safety-critical scenario generation framework designed to bridge the benefits of both methodologies. Specifically, by utilizing ontology-based techniques, BridgeGen models the five scenario layers in the operational design domain (ODD) from knowledge-driven methods, ensuring broad coverage, and incorporates data-driven strategies to efficiently generate safety-critical scenarios. An optimized scenario generation toolkit is developed within BridgeGen. This expedites the crafting of safety-critical scenarios through a combination of traditional optimization and reinforcement learning schemes. Extensive experiments conducted using the Carla simulator demonstrate the effectiveness of BridgeGen in generating diverse safety-critical scenarios.
Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models
Authors: Haoran Zhao, Jake Ryland Williams
Abstract
While Large Language Models (LLMs) become ever more dominant, classic pre-trained word embeddings sustain their relevance through computational efficiency and nuanced linguistic interpretation. Drawing from recent studies demonstrating that GloVe and word2vec optimizations all tend to converge towards log-co-occurrence matrix variants, we construct a novel word representation system called Bit-cipher that eliminates the need for backpropagation while leveraging contextual information and hyper-efficient dimensionality reduction techniques based on unigram frequency, providing strong interpretability alongside efficiency. We use the bit-cipher algorithm to train word vectors via a two-step process that critically relies on a hyperparameter -- bits -- that controls the vector dimension. While the first step trains the bit-cipher, the second utilizes it under two different aggregation modes -- summation or concatenation -- to produce contextually rich representations from word co-occurrences. We extend our investigation into bit-cipher's efficacy, performing probing experiments on part-of-speech (POS) tagging and named entity recognition (NER) to assess its competitiveness with classic embeddings like word2vec and GloVe. Additionally, we explore its applicability in LM training and fine-tuning. By replacing embedding layers with cipher embeddings, our experiments illustrate the notable efficiency of cipher in accelerating the training process and attaining better optima compared to conventional training paradigms. Experiments on the integration of bit-cipher embedding layers with RoBERTa, T5, and OPT, prior to or as a substitute for fine-tuning, showcase a promising enhancement to transfer learning, allowing rapid model convergence while preserving competitive performance.
Implicit Event-RGBD Neural SLAM
Authors: Delin Qu, Chi Yan, Dong Wang, Jie Yin, Dan Xu, Bin Zhao, Xuelong Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Implicit neural SLAM has achieved remarkable progress recently. Nevertheless, existing methods face significant challenges in non-ideal scenarios, such as motion blur or lighting variation, which often leads to issues like convergence failures, localization drifts, and distorted mapping. To address these challenges, we propose $\textbf{EN-SLAM}$, the first event-RGBD implicit neural SLAM framework, which effectively leverages the high rate and high dynamic range advantages of event data for tracking and mapping. Specifically, EN-SLAM proposes a differentiable CRF (Camera Response Function) rendering technique to generate distinct RGB and event camera data via a shared radiance field, which is optimized by learning a unified implicit representation with the captured event and RGBD supervision. Moreover, based on the temporal difference property of events, we propose a temporal aggregating optimization strategy for the event joint tracking and global bundle adjustment, capitalizing on the consecutive difference constraints of events, significantly enhancing tracking accuracy and robustness. Finally, we construct the simulated dataset $\textbf{DEV-Indoors}$ and real captured dataset $\textbf{DEV-Reals}$ containing 6 scenes, 17 sequences with practical motion blur and lighting changes for evaluations. Experimental results show that our method outperforms the SOTA methods in both tracking ATE and mapping ACC with a real-time $17$ FPS in various challenging environments. The code and dataset will be released upon the paper publication.
SNI-SLAM: Semantic Neural Implicit SLAM
Authors: Siting Zhu, Guangming Wang, Hermann Blum, Jiuming Liu, Liang Song, Marc Pollefeys, Hesheng Wang
Abstract
We propose SNI-SLAM, a semantic SLAM system utilizing neural implicit representation, that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce hierarchical semantic representation to allow multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry and semantic features through cross-attention for feature collaboration. This strategy enables a more multifaceted understanding of the environment, thereby allowing SNI-SLAM to remain robust even when a single attribute is defective. Then, we design an internal fusion-based decoder to obtain semantic, RGB, and Truncated Signed Distance Field (TSDF) values from multi-level features for accurate decoding. Furthermore, we propose a feature loss to update the scene representation at the feature level. Compared with low-level losses such as RGB loss and depth loss, our feature loss is capable of guiding the network optimization at a higher level. Our SNI-SLAM method demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on Replica and ScanNet datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping.
Capacity Maximization for FAS-assisted Multiple Access Channels
Authors: Hao Xu, Kai-Kit Wong, Wee Kiat New, Gui Zhou, Ross Murch, Chan-Byoung Chae, Yongxu Zhu, Shi Jin
Abstract
This paper investigates a multiuser millimeter-wave (mmWave) uplink system in which each user is equipped with a multi-antenna fluid antenna system (FAS) while the base station (BS) has multiple fixed-position antennas. Our primary objective is to maximize the system capacity by optimizing the transmit covariance matrices and the antenna position vectors of the users jointly. To gain deeper insights, we commence by deriving upper bounds and approximations for the maximum capacity. Then we delve into the capacity maximization problem. Beginning with the simple scenario of a single user equipped with a single-antenna FAS, we reveal that a closed-form optimal solution exists when there are only two propagation paths between the user and the BS. In the case where multiple propagation paths are present, a near-optimal solution can be obtained through a one-dimensional search method. Expanding our focus to multiuser cases, where users are equipped with either single- or multi-antenna FAS, we show that the original capacity maximization problems can be reformulated into distinct rank-one programmings. Then, we propose alternating optimization algorithms to deal with the transformed problems. Simulation results indicate that FAS can improve the capacity of the multiple access (MAC) system greatly, and the proposed algorithms outperform all the benchmarks.
CLIPSwarm: Converting text into formations of robots
Authors: Pablo Pueyo, Eduardo Montijano, Ana C. Murillo, Mac Schwager
Abstract
We present CLIPSwarm, an algorithm to generate robot swarm formations from natural language descriptions. CLIPSwarm receives an input text and finds the positions of the robots that form a shape corresponding to the given text. To do so, we implement a variation of the Monte Carlo particle filter to obtain a matching formation iteratively. In every iteration, we generate a set of new formations and evaluate their CLIP similarity with the given text, selecting the best formations according to this metric. This metric is obtained using CLIP [1], an existing foundation model trained to encode images and texts into vectors within a common latent space. The comparison between these vectors determines how likely it is that the given text describes the shapes. Our initial proof of concept shows the potential of this solution to generate robot swarm formations just from natural language descriptions and demonstrates a novel application of foundation models, such as CLIP, in the field of multi-robot systems. In this first approach, we create formations using a Convex-Hull approach. Next steps include more robust and generic representation and optimization steps in the process of obtaining a suitable swarm formation.
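The iterative generate, score, and select loop described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' algorithm: `placeholder_score` is a hypothetical stand-in for the CLIP similarity (it merely rewards robots lying on a unit circle), and the resampling scheme, function names, and all parameter values are assumptions.

```python
import numpy as np

def placeholder_score(formation):
    """Hypothetical stand-in for CLIP similarity: rewards robots on the unit circle."""
    radii = np.linalg.norm(formation, axis=1)
    return -float(np.mean((radii - 1.0) ** 2))

def search_formation(score, n_robots=8, n_particles=30, n_keep=5,
                     n_iters=40, sigma=0.1, seed=0):
    """Iteratively sample candidate formations and keep the best-scoring ones."""
    rng = np.random.default_rng(seed)
    # Each particle is one candidate formation: an (n_robots, 2) array of positions.
    particles = rng.uniform(-2.0, 2.0, size=(n_particles, n_robots, 2))
    history = []
    for _ in range(n_iters):
        scores = np.array([score(p) for p in particles])
        history.append(float(scores.max()))
        # Select the n_keep best formations under the metric.
        best = particles[np.argsort(scores)[-n_keep:]]
        # Generate new formations by perturbing randomly chosen survivors.
        parents = best[rng.integers(0, n_keep, size=n_particles - n_keep)]
        children = parents + rng.normal(0.0, sigma, size=parents.shape)
        # Survivors are carried over unchanged, so the best score never decreases.
        particles = np.concatenate([best, children])
    scores = np.array([score(p) for p in particles])
    return particles[int(np.argmax(scores))], history

formation, history = search_formation(placeholder_score)
```

In a real pipeline the score would come from rendering each formation to an image and comparing its embedding with the text embedding; that step is deliberately left out here.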
SBTRec- A Transformer Framework for Personalized Tour Recommendation Problem with Sentiment Analysis
Abstract
When traveling to an unfamiliar city for holidays, tourists often rely on guidebooks, travel websites, or recommendation systems to plan their daily itineraries and explore popular points of interest (POIs). However, these approaches may lack optimization in terms of time feasibility, localities, and user preferences. In this paper, we propose the SBTRec algorithm: a BERT-based Trajectory Recommendation with sentiment analysis, for recommending personalized sequences of POIs as itineraries. The key contributions of this work include analyzing users' check-ins and uploaded photos to understand the relationship between POI visits and distance. We introduce SBTRec, which encompasses sentiment analysis to improve recommendation accuracy by understanding users' preferences and satisfaction levels from reviews and comments about different POIs. Our proposed algorithms are evaluated against other sequence prediction methods using datasets from 8 cities. The results demonstrate that SBTRec achieves an average F1 score of 61.45%, outperforming baseline algorithms. The paper further discusses the flexibility of the SBTRec algorithm, its ability to adapt to different scenarios and cities without modification, and its potential for extension by incorporating additional information for more reliable predictions. Overall, SBTRec provides personalized and relevant POI recommendations, enhancing tourists' overall trip experiences. Future work includes fine-tuning personalized embeddings for users, with evaluation of users' comments on POIs, to further enhance prediction accuracy.
6G Fresnel Spot Beamfocusing using Large-Scale Metasurfaces: A Distributed DRL-Based Approach
Authors: Mehdi Monemi, Mohammad Amir Fallah, Mehdi Rasti, Matti Latva-Aho
Abstract
In this paper, we introduce the concept of spot beamfocusing (SBF) in the Fresnel zone through extremely large-scale programmable metasurfaces (ELPMs) as a key enabling technology for 6G networks. A smart SBF scheme aims to adaptively concentrate the aperture's radiating power exactly at a desired focal point (DFP) in the 3D space utilizing some Machine Learning (ML) method. This offers numerous advantages for next-generation networks including efficient wireless power transfer (WPT), interference mitigation, reduced RF pollution, and improved information security. SBF necessitates ELPMs with precise channel state information (CSI) for all ELPM elements. However, obtaining exact CSI for ELPMs is not feasible in all environments; we alleviate this by proposing a novel adaptive CSI-independent ML scheme based on the TD3 deep-reinforcement-learning (DRL) method. While the proposed ML-based scheme is well-suited for relatively small-size arrays, the computational complexity is unaffordable for ELPMs. To overcome this limitation, we introduce a modular, highly scalable structure composed of multiple sub-arrays, each equipped with a TD3-DRL optimizer. This setup enables collaborative optimization of the radiated power at the DFP, significantly reducing computational complexity while enhancing learning speed. The proposed structure's benefits in terms of 3D spot-like power distribution, convergence rate, and scalability are validated through simulation results.
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization
Authors: Tyler M. Paine, Michael R. Benjamin
Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
Abstract
This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a non-linear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized and the communication cost scales with the number of group options, not agents. We demonstrated the effectiveness of this approach using a hypothetical `explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
LOSTU: Fast, Scalable, and Uncertainty-Aware Triangulation
Authors: Sébastien Henry, John A. Christian
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Triangulation algorithms often aim to minimize the reprojection ($L_2$) error, but this only provides the maximum likelihood estimate when there are no errors in the camera parameters or camera poses. Although recent advancements have yielded techniques to estimate camera parameters accounting for 3D point uncertainties, most structure from motion (SfM) pipelines still use older triangulation algorithms. This work leverages recent discoveries to provide a fast, scalable, and statistically optimal way to triangulate called LOSTU. Results show that LOSTU consistently produces lower 3D reconstruction errors than conventional $L_2$ triangulation methods -- often allowing LOSTU to successfully triangulate more points. Moreover, in addition to providing a better 3D reconstruction, LOSTU can be substantially faster than Levenberg-Marquardt (or similar) optimization schemes.
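For reference, the reprojection ($L_2$) error that conventional triangulation minimizes can be sketched as below. This illustrates only the standard objective, not LOSTU itself; the pinhole camera matrices `P1` and `P2`, the test point, and the function names are hypothetical values chosen for the example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P to 2D image coordinates."""
    x_h = P @ np.append(X, 1.0)      # homogeneous image point
    return x_h[:2] / x_h[2]

def reprojection_error(cameras, observations, X):
    """Sum of squared 2D residuals between observed and reprojected points."""
    return sum(float(np.sum((project(P, X) - u) ** 2))
               for P, u in zip(cameras, observations))

# Two hypothetical cameras: one at the origin, one shifted one unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
obs = [project(P1, X_true), project(P2, X_true)]

err_exact = reprojection_error([P1, P2], obs, X_true)
err_shifted = reprojection_error([P1, P2], obs, X_true + np.array([0.1, 0.0, 0.0]))
```

The objective treats `P1` and `P2` as exact, which is the assumption the abstract points to: when camera parameters or poses carry their own errors, minimizing this quantity is no longer the maximum likelihood estimate.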
Abstract
We introduce and study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization. The goal is to design an online algorithm to find a minimizer $\mathbf{x}^{*}$ for a function $f\colon X \to \mathbb{R}$, where $X \subseteq \mathbb{R}^d$. In each round, the algorithm submits a pair of guesses, i.e., $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, and the adversary responds with any point in the space that is at least as good as both guesses. The cost of each query is the suboptimality of the worse of the two guesses; i.e., ${\max} \left( f(\mathbf{x}^{(1)}), f(\mathbf{x}^{(2)}) \right) - f(\mathbf{x}^{*})$. The goal is to minimize the number of iterations required to find an $\varepsilon$-optimal point and to minimize the total cost (regret) of the guesses over many rounds. Our main result is an efficient randomized algorithm for several natural choices of the function $f$ and set $X$ that incurs cost $O(d)$ and iteration complexity $O(d\log(1/\varepsilon)^2)$. Moreover, our dependence on $d$ is asymptotically optimal, as we show examples in which any randomized algorithm for this problem must incur $\Omega(d)$ cost and iteration complexity.
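The per-round cost and the constraint on the adversary's response can be made concrete with a small sketch. The objective `f` (squared Euclidean norm, minimized at the origin), the guesses, and the function names are hypothetical choices for illustration; the abstract does not fix a particular function.

```python
import numpy as np

def f(x):
    """Hypothetical objective: squared Euclidean norm, minimized at the origin."""
    return float(np.dot(x, x))

def query_cost(f, x1, x2, x_star):
    """Suboptimality of the worse of the two guesses."""
    return max(f(x1), f(x2)) - f(x_star)

def is_valid_response(f, x1, x2, response):
    """A monotone adversary may return any point at least as good as both guesses."""
    return f(response) <= min(f(x1), f(x2))

x_star = np.zeros(2)
x1, x2 = np.array([1.0, 0.0]), np.array([0.0, 2.0])
cost = query_cost(f, x1, x2, x_star)   # max(1, 4) - 0
ok = is_valid_response(f, x1, x2, np.array([0.5, 0.0]))
bad = is_valid_response(f, x1, x2, np.array([0.0, 1.5]))
```

The second response is rejected because it beats only the worse guess; the adversary's point must be no worse than the better of the two.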
Robust Network Slicing: Multi-Agent Policies, Adversarial Attacks, and Defensive Strategies
Authors: Feng Wang, M. Cenk Gursoy, Senem Velipasalar
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Multiagent Systems (cs.MA)
Abstract
In this paper, we present a multi-agent deep reinforcement learning (deep RL) framework for network slicing in a dynamic environment with multiple base stations and multiple users. In particular, we propose a novel deep RL framework with multiple actors and centralized critic (MACC) in which actors are implemented as pointer networks to fit the varying dimension of input. We evaluate the performance of the proposed deep RL algorithm via simulations to demonstrate its effectiveness. Subsequently, we develop a deep RL based jammer with limited prior information and limited power budget. The goal of the jammer is to minimize the transmission rates achieved with network slicing and thus degrade the network slicing agents' performance. We design a jammer with both listening and jamming phases and address jamming location optimization as well as jamming channel optimization via deep RL. We evaluate the jammer at the optimized location, generating interference attacks in the optimized set of channels by switching between the jamming phase and listening phase. We show that the proposed jammer can significantly reduce the victims' performance without direct feedback or prior knowledge of the network slicing policies. Finally, we devise a Nash-equilibrium-supervised policy ensemble mixed strategy profile for network slicing (as a defensive measure) and jamming. We evaluate the performance of the proposed policy ensemble algorithm by applying it to the network slicing agents and the jammer agent in simulations to show its effectiveness.
Multi-Timescale Control and Communications with Deep Reinforcement Learning -- Part I: Communication-Aware Vehicle Control
Authors: Tong Liu, Lei Lei, Kan Zheng, Xuemin (Sherman) Shen
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)
Abstract
An intelligent decision-making system enabled by Vehicle-to-Everything (V2X) communications is essential to achieve safe and efficient autonomous driving (AD), where two types of decisions have to be made at different timescales, i.e., vehicle control and radio resource allocation (RRA) decisions. The interplay between RRA and vehicle control necessitates their collaborative design. In this two-part paper (Part I and Part II), taking platoon control (PC) as an example use case, we propose a joint optimization framework of multi-timescale control and communications (MTCC) based on Deep Reinforcement Learning (DRL). In this paper (Part I), we first decompose the problem into a communication-aware DRL-based PC sub-problem and a control-aware DRL-based RRA sub-problem. Then, we focus on the PC sub-problem assuming an RRA policy is given, and propose the MTCC-PC algorithm to learn an efficient PC policy. To improve the PC performance under random observation delay, the PC state space is augmented with the observation delay and PC action history. Moreover, the reward function with respect to the augmented state is defined to construct an augmented state Markov Decision Process (MDP). It is proved that the optimal policy for the augmented state MDP is optimal for the original PC problem with observation delay. Different from most existing works on communication-aware control, the MTCC-PC algorithm is trained in a delayed environment generated by the fine-grained embedded simulation of C-V2X communications rather than by a simple stochastic delay model. Finally, experiments are performed to compare the performance of MTCC-PC with those of the baseline DRL algorithms.
What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization
Authors: Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, Pradeep K. Murukannaiah
Abstract
We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms. As MOO is applied to solve diverse problems, approaches for analyzing the trade-offs offered by MOO algorithms are scattered across fields. We provide an overview of the advances on this topic, including methods for visualization, mining the solution set, and uncertainty exploration as well as emerging research directions, including interactivity, explainability, and ethics. We synthesize these methods drawing from different fields of research to build a unified approach, independent of the application. Our goals are to reduce the entry barrier for researchers and practitioners on using MOO algorithms and to provide novel research directions.
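As context for the solution sets that such decision-support methods operate on, here is a minimal sketch of extracting the non-dominated (Pareto-optimal) candidates from a list of objective vectors. It assumes every objective is minimized; the function name and the candidate values are hypothetical.

```python
import numpy as np

def pareto_front(points):
    """Return indices of non-dominated rows, assuming every objective is minimized."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        # p is dominated if another point is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(points <= p, axis=1)
        better = np.any(points < p, axis=1)
        if not np.any(no_worse & better):
            keep.append(i)
    return keep

# Hypothetical two-objective candidates (e.g. cost and emissions).
candidates = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0]]
front = pareto_front(candidates)
```

Here the third candidate is dropped because the second is better in both objectives; the remaining three are mutually incomparable, and choosing among them is where the surveyed methods come in.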
Threshold-Based Algorithms for an Online Rolling Horizon Framework Under Uncertainty -- With an Application to Energy Management
Authors: Jens Hönen, Johann L. Hurink, Bert Zwart
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Abstract
Decision problems encountered in practice often possess a highly dynamic and uncertain nature. In particular, fast-changing forecasts for parameters (e.g., photovoltaic generation forecasts in the context of energy management) pose large challenges for the classical rolling horizon framework. Within this work, we propose an online scheduling algorithm for a rolling horizon framework, which directly uses short-term forecasts and observations of the uncertainty. The online scheduling algorithm is based on insights and results from combinatorial online optimization problems and makes use of key properties of robust optimization. Applied within a robust energy management approach, we show that the online scheduling algorithm is able to reduce the total electricity costs within a local microgrid by more than 85% compared to a classical rolling horizon framework and by more than 50% compared to a tailor-made dynamic, yet still offline, rolling horizon framework. A detailed analysis provides insights into the working of the online scheduling algorithm under different underlying forecast error distributions.
LABCAT: Locally adaptive Bayesian optimization using principal component-aligned trust regions
Authors: E. Visser, C.E. van Daalen, J.C. Schoeman
Abstract
Bayesian optimization (BO) is a popular method for optimizing expensive black-box functions. BO has several well-documented shortcomings, including computational slowdown with longer optimization runs, poor suitability for non-stationary or ill-conditioned objective functions, and poor convergence characteristics. Several algorithms have been proposed that incorporate local strategies, such as trust regions, into BO to mitigate these limitations; however, none address all of them satisfactorily. To address these shortcomings, we propose the LABCAT algorithm, which extends trust-region-based BO by adding principal-component-aligned rotation and an adaptive rescaling strategy based on the length-scales of a local Gaussian process surrogate model with automatic relevance determination. Through extensive numerical experiments using a set of synthetic test functions and the well-known COCO benchmarking software, we show that the LABCAT algorithm outperforms several state-of-the-art BO and other black-box optimization algorithms.
On the Communication Complexity of Decentralized Bilevel Optimization
Authors: Yihan Zhang, My T. Thai, Jie Wu, Hongchang Gao
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Abstract
Decentralized bilevel optimization has been actively studied in the past few years since it has widespread applications in machine learning. However, existing algorithms suffer from large communication complexity caused by the estimation of the stochastic hypergradient, limiting their application to real-world tasks. To address this issue, we develop a novel decentralized stochastic bilevel gradient descent algorithm under the heterogeneous setting, which enjoys a small communication cost in each round and a small number of communication rounds. As such, it can achieve a much better communication complexity than existing algorithms. Moreover, we extend our algorithm to the more challenging decentralized multi-level optimization. To the best of our knowledge, this is the first work to achieve these theoretical results under the heterogeneous setting. Finally, the experimental results confirm the efficacy of our algorithm.
Make me an Offer: Forward and Reverse Auctioning Problems in the Tourism Industry
Authors: Ioannis T. Christou, Dimitris Doukas, Konstantina Skouri, Gerasimos Meletiou
Subjects: Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computers and Society (cs.CY)
Abstract
Most tourist destinations face regular and consistent seasonality with significant economic and social impacts. This phenomenon is more pronounced in the post-covid era, where demand for travel has increased but unevenly among different geographic areas. To counter these problems that both customers and hoteliers are facing, we have developed two auctioning systems. The first allows hoteliers in lower popularity tier areas or during low season periods to auction their rooms, in what we call a forward auction model. The second allows customers to initiate a bidding process whereby hoteliers in an area may make offers to the customer for their rooms, in what constitutes a reverse auction model initiated by the customer, similar to the bidding concept of priceline.com. We develop mathematical programming models that define explicitly both types of auctions, and show that in each type, there are significant benefits to be gained both on the side of the hotelier and on the side of the customer. We discuss algorithmic techniques for the approximate solution of these optimization problems, and present results using exact optimization solvers to solve them to guaranteed optimality. These techniques could be beneficial to both customer and hotelier, reducing seasonality during middle and low season and providing the customer with attractive offers.
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
Abstract
This paper endeavors to advance the precision of snapshot compressive imaging (SCI) reconstruction for multispectral images (MSIs). To achieve this, we integrate the advantageous attributes of established SCI techniques and an image generative model, and propose a novel structured zero-shot diffusion model, dubbed DiffSCI. DiffSCI leverages the structural insights from the deep prior and optimization-based methodologies, complemented by the generative capabilities offered by the contemporary denoising diffusion model. First, we employ a pre-trained diffusion model, which has been trained on a substantial corpus of RGB images, as the generative denoiser within the Plug-and-Play framework for the first time. This integration allows for the successful completion of SCI reconstruction, especially in cases that current methods struggle to address effectively. Second, we systematically account for spectral band correlations and introduce a robust methodology to mitigate wavelength mismatch, thus enabling seamless adaptation of the RGB diffusion model to MSIs. Third, an accelerated algorithm is implemented to expedite the resolution of the data subproblem. This augmentation not only accelerates the convergence rate but also elevates the quality of the reconstruction process. We present extensive testing to show that DiffSCI exhibits discernible performance enhancements over prevailing self-supervised and zero-shot approaches, surpassing even supervised transformer counterparts across both simulated and real datasets. Our code will be available.
Appearance Codes using Joint Embedding Learning of Multiple Modalities
Authors: Alex Zhang, Evan Dogariu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Abstract
The use of appearance codes in recent work on generative modeling has enabled novel view renders with variable appearance and illumination, such as day-time and night-time renders of a scene. A major limitation of this technique is the need to re-train new appearance codes for every scene at inference. In this work we address this problem by proposing a framework that learns a joint embedding space for the appearance and structure of the scene by enforcing a contrastive loss constraint between different modalities. We apply our framework to a simple Variational Auto-Encoder model on the RADIATE dataset \cite{sheeny2021radiate} and qualitatively demonstrate that we can generate new renders of night-time photos using day-time appearance codes without additional optimization iterations. Additionally, we compare our model to a baseline VAE that uses the standard per-image appearance code technique and show that our approach achieves generations of similar quality without learning appearance codes for any unseen images at inference.
Establishing Dynamic Secure Sessions for ECQV Implicit Certificates in Embedded Systems
Authors: Fikret Basic, Christian Steger, Robert Kofler
Abstract
Be it in the IoT or automotive domain, implicit certificates are gaining ever more prominence in constrained embedded devices. They present a resource-efficient security solution against common threat concerns. The computational requirements are no longer the main issue. The focus is now placed on determining a good balance between the provided security level and the derived threat model. A security aspect that often gets overlooked is the establishment of secure communication sessions, as most design solutions are based only on the use of static key derivation and therefore lack perfect forward secrecy. This leaves the transmitted data open to potential future exposure, because keys are tied to the certificates rather than to the communication sessions. We aim to close this gap by presenting a design that utilizes the Station to Station (STS) protocol with implicit certificates. In addition, we propose potential protocol optimization implementation steps and run a comprehensive study on the performance and security level of the proposed design and the state-of-the-art key derivation protocols. In our comparative study, we show that with a slight computational increase of 20\% compared to a static ECDSA key derivation, we are able to mitigate many session-related security vulnerabilities that would otherwise remain open.
Abstract
We note that decoupled weight decay regularization is a particular case of weight norm control where the target norm of weights is set to 0. Any optimization method (e.g., Adam) which uses decoupled weight decay regularization (respectively, AdamW) can be viewed as a particular case of a more general algorithm with weight norm control (respectively, AdamWN). We argue that setting the target norm of weights to 0 can be suboptimal and other target norm values can be considered. For instance, any training run where AdamW achieves a particular norm of weights can be challenged by AdamWN scheduled to achieve a comparable norm of weights. We discuss various implications of introducing weight norm control instead of weight decay.
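The relationship described above can be illustrated with a small sketch that contrasts a decoupled weight decay step with a norm-control step. The specific form of the norm-control update used here (pulling the Euclidean norm of the weights toward `target_norm` with strength `k`) is an assumption made for illustration and is not taken from the abstract. The function names are hypothetical, and the gradient-based part of the optimizer update is omitted. Under this assumed form, setting `target_norm = 0` reproduces the weight decay step.

```python
import math

def decay_step(w, lr, wd):
    """Decoupled weight decay: shrink every weight toward 0."""
    return [wi - lr * wd * wi for wi in w]

def norm_control_step(w, lr, k, target_norm):
    """Assumed illustrative form: pull the norm of w toward target_norm."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        return list(w)  # direction undefined at the origin, leave unchanged
    scale = lr * k * (1.0 - target_norm / norm)
    return [wi - scale * wi for wi in w]
```

When the current norm already equals `target_norm`, the assumed step leaves the weights unchanged, whereas weight decay keeps shrinking them.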
Controlling Grid-Connected Inverters under Time-Varying Voltage Constraints
Abstract
Inverter-based resources (IBRs) are becoming increasingly prevalent in power systems. Due to the inherently low inertia of inverters, there is a heightened risk of disruptive voltage oscillations. A particular challenge in the operation of grid-connected IBRs is the variation in the grid-side voltage. Changes in the grid-side voltage introduce nonlinear and time-varying constraints on the inverter voltages themselves. For an operator, it would be useful to know the set of active and reactive powers that can be tracked under these time-varying conditions. This paper introduces an optimization model designed to assess the achievability of power setpoints within the framework of constrained static state-feedback power control. Additionally, we present a Monte Carlo simulation-based method to optimize the set of achievable power setpoints. The efficacy of the proposed approach is validated through simulation results.
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment
Abstract
Serving foundation model inference is a pivotal component of contemporary AI applications, where this service is usually hosted in a centralized data center on a group of homogeneous high-performance GPUs. In this paper, we explore how to deploy such a service in an environment that is heterogeneous in both computation capacity and network connection, as an alternative that reduces the high inference cost. We propose HexGen, a distributed inference engine that supports asymmetric partitioning of the inference computation according to tensor model parallelism and pipeline parallelism. HexGen can be deployed with a set of different GPUs connected by a fully heterogeneous network, where the key technical contribution is a scheduling algorithm that allocates the asymmetric inference tasklets among these GPUs connected by different networks. We define the scheduling problem as a constrained optimization problem and further propose an efficient evolutionary algorithm to find the optimal allocation strategy. We conduct an extensive empirical study to evaluate the efficiency of HexGen by serving the state-of-the-art Llama-2 (70B) model. The experimental results suggest that HexGen can choose to achieve up to 2.3 times lower latency deadlines or tolerate up to 4 times more traffic request rates compared with the homogeneous baseline given the same budget. Our implementation is available at https://github.com/Relaxed-System-Lab/HexGen.
Cryogenic quasi-static embedded DRAM for energy-efficient compute-in-memory applications
Abstract
Compute-in-memory (CIM) presents an attractive approach for energy-efficient computing in data-intensive applications. However, the development of suitable memory designs to achieve high-performance CIM remains a challenging task. Here, we propose a cryogenic quasi-static embedded DRAM to address the logic-memory mismatch of CIM. Guided by the re-calibrated cryogenic device model, the designed four-transistor bit-cell achieves full-swing data storage, low power consumption, and extended retention time at cryogenic temperatures. Combined with the adoption of cryogenic write bitline biasing technique and readout circuitry optimization, our 4Kb cryogenic eDRAM chip demonstrates a 1.37$\times$10$^6$ times improvement in retention time, while achieving a 75 times improvement in retention variability, compared to room-temperature operation. Moreover, it also achieves outstanding power performance with a retention power of 112 fW and a dynamic power of 108 $\mu$W at 4.2 K, which can be further decreased by 7.1% and 13.6% using the dynamic voltage scaling technique. This work reveals the great potential of cryogenic CMOS for high-density data storage and lays a solid foundation for energy-efficient CIM implementations.
A Framework on Complex Matrix Derivatives with Special Structure Constraints for Wireless Systems
Abstract
Matrix-variate optimization plays a central role in advanced wireless system designs. In this paper, we aim to explore optimal solutions of matrix variables under two special structure constraints using complex matrix derivatives, including diagonal structure constraints and constant modulus constraints, both of which are closely related to the state-of-the-art wireless applications. Specifically, for diagonal structure constraints mostly considered in the uplink multi-user single-input multiple-output (MU-SIMO) system and the amplitude-adjustable intelligent reflecting surface (IRS)-aided multiple-input multiple-output (MIMO) system, the capacity maximization problem, the mean-squared error (MSE) minimization problem and their variants are rigorously investigated. By leveraging complex matrix derivatives, the optimal solutions of these problems are directly obtained in closed forms. Nevertheless, for constant modulus constraints with the intrinsic nature of element-wise decomposability, which are often seen in the hybrid analog-digital MIMO system and the fully-passive IRS-aided MIMO system, we firstly explore inherent structures of the element-wise phase derivatives associated with different optimization problems. Then, we propose a novel alternating optimization (AO) algorithm with the aid of several arbitrary feasible solutions, which avoids the complicated matrix inversion and matrix factorization involved in conventional element-wise iterative algorithms. Numerical simulations reveal that the proposed algorithm can dramatically reduce the computational complexity without loss of system performance.
Multi-stage optimisation towards transformation pathways for municipal energy systems
Authors: Paul Maximilian Röhrig, Nils Körber, Julius Zocher, Andreas Ulbig
Subjects: Systems and Control (eess.SY); General Economics (econ.GN)
Abstract
An essential facet of achieving climate neutrality by 2045 is the decarbonization of municipal energy systems. To accomplish this, it is necessary to establish implementation concepts that detail the timing, location, and specific measures required to achieve decarbonization. This restructuring process involves identifying the measures that offer the most compelling techno-economic and ecological advantages. In particular, measures that contribute to the interconnection of energy vectors and domains, e.g. heating, cooling, and electricity supply, in the sense of decentralized multi-energy systems are a promising future development option. Due to the high complexity resulting from a multitude of decision options as well as a temporal coupling across the transformation path, the use of optimization methods is required, which enable a bottom-up identification of suitable transformation solutions in a high spatial resolution. For the design of reasonable concepts, we develop a multistage optimization problem for the derivation of transformation pathways in the context of a multi-location structure, expansion, and operation problem. The results show that the heat supply in the future will mainly be provided by heat pumps with a share of 60%. It can also be shown that an early dismantling of the gas network will lead to the need for transitional technologies such as pellet heating. Overall, the conversion of the municipal energy system can significantly reduce emissions (97%).
Asymptotic CRB Analysis of Random RIS-Assisted Large-Scale Localization Systems
Abstract
This paper studies the performance of a randomly RIS-assisted multi-target localization system, in which the configurations of the RIS are randomly set to avoid high-complexity optimization. We first focus on the scenario where the number of RIS elements is significantly large, and then obtain the scaling law of the Cramér-Rao bound (CRB) under certain conditions, which shows that the CRB decreases in the third or fourth order as the RIS dimension increases. Second, we extend our analysis to large systems where the numbers of targets and sensors are both substantial. Under this setting, we explore two common RIS models: the constant module model and the discrete amplitude model, and illustrate how the random RIS configuration impacts the value of the CRB. Numerical results demonstrate that the asymptotic formulas provide a good approximation to the exact CRB in the proposed randomly configured RIS systems.
Deep Equilibrium Diffusion Restoration with Parallel Sampling
Authors: Jiezhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Diffusion-based image restoration (IR) methods aim to use diffusion models to recover high-quality (HQ) images from degraded images and achieve promising performance. Due to the inherent property of diffusion models, most of these methods need long serial sampling chains to restore HQ images step-by-step. This leads to expensive sampling time and high computation costs. Moreover, such long sampling chains hinder understanding the relationship between the restoration results and the inputs, since it is hard to compute the gradients through the whole chain. In this work, we aim to rethink the diffusion-based IR models from a different perspective, i.e., a deep equilibrium (DEQ) fixed point system. Specifically, we derive an analytical solution by modeling the entire sampling chain in diffusion-based IR models as a joint multivariate fixed point system. With the help of the analytical solution, we are able to conduct single-image sampling in a parallel way and restore HQ images without training. Furthermore, we compute fast gradients in DEQ and find that initialization optimization can boost performance and control the generation direction. Extensive experiments on benchmarks demonstrate the effectiveness of our proposed method on typical IR tasks and real-world settings. The code and models will be made publicly available.
ART-Owen Scrambling
Authors: Abdalla G. M. Ahmed, Matt Pharr, Peter Wonka
Abstract
We present a novel algorithm for implementing Owen-scrambling, combining the generation and distribution of the scrambling bits in a single self-contained compact process. We employ a context-free grammar to build a binary tree of symbols, and equip each symbol with a scrambling code that affects all descendant nodes. We nominate the grammar of adaptive regular tiles (ART) derived from the repetition-avoiding Thue-Morse word, and we discuss its potential advantages and shortcomings. Our algorithm has many advantages, including random access to samples, fixed time complexity, GPU friendliness, and scalability to any memory budget. Further, it provides two unique features over known methods: it admits optimization, and it is invertible, enabling screen-space scrambling of the high-dimensional Sobol sampler.
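For readers unfamiliar with Owen scrambling itself, the following is a minimal reference sketch of the general idea: each bit of a sample index is flipped by a pseudo-random bit that depends only on the more significant bits above it, that is, on the node of the binary tree being visited. This is a generic hash-based formulation and not the ART grammar-based algorithm of the abstract. The names `owen_scramble` and `_flip_bit`, the `seed` parameter, and the use of SHA-256 as the source of per-node bits are assumptions chosen for illustration.

```python
import hashlib

def _flip_bit(prefix, level, seed):
    """Pseudo-random bit determined by the tree node (level, prefix) and seed."""
    digest = hashlib.sha256(f"{seed}:{level}:{prefix}".encode()).digest()
    return digest[0] & 1

def owen_scramble(x, n_bits, seed=0):
    """Scramble an n_bits-wide integer, most significant bit first."""
    out = 0
    for level in range(n_bits):
        shift = n_bits - 1 - level
        prefix = x >> (shift + 1)   # original bits above the current one
        bit = (x >> shift) & 1
        out = (out << 1) | (bit ^ _flip_bit(prefix, level, seed))
    return out
```

Because each flip depends only on the higher-order input bits, the mapping is a permutation of the `n_bits`-wide integers, and two inputs that share their leading bits map to outputs that share the same number of leading bits.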
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Authors: Chi Yan, Delin Qu, Dong Wang, Dan Xu, Zhigang Wang, Bin Zhao, Xuelong Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
In this paper, we introduce $\textbf{GS-SLAM}$, which is the first to utilize a 3D Gaussian representation in a Simultaneous Localization and Mapping (SLAM) system. It facilitates a better balance between efficiency and accuracy. Compared to recent SLAM methods employing neural implicit representations, our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D re-rendering. Specifically, we propose an adaptive expansion strategy that adds new 3D Gaussians or deletes noisy ones in order to efficiently reconstruct newly observed scene geometry and improve the mapping of previously observed areas. This strategy is essential to extend the 3D Gaussian representation to reconstruct the whole scene rather than synthesize a static object as in existing methods. Moreover, in the pose tracking process, an effective coarse-to-fine technique is designed to select reliable 3D Gaussian representations to optimize camera pose, resulting in runtime reduction and robust estimation. Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica and TUM-RGBD datasets. The source code will be released upon acceptance.
Configuring an heterogeneous smartgrid network: complexity and approximations for tree topologies
Abstract
We address the problem of configuring a power distribution network with reliability and resilience objectives by satisfying the demands of the consumers and saturating each production source as little as possible. We consider power distribution networks containing source nodes producing electricity, nodes representing electricity consumers and switches between them. Configuring this network consists in deciding the orientation of the links between the nodes of the network. The electric flow is a direct consequence of the chosen configuration and can be computed in polynomial time. It is valid if it satisfies the demand of each consumer and capacity constraints on the network. In such a case, we study the problem of determining a feasible solution that balances the loads of the sources, that is, their production rates. We use three metrics to measure the quality of a solution: minimizing the maximum load, maximizing the minimum load and minimizing the difference between the maximum and the minimum loads. This defines optimization problems called respectively min-M, max-m and min-R. In the case where the graph of the network is a tree, it is known that the problem of building a valid configuration is polynomial. We show that the three optimization variants have distinct properties regarding the theoretical complexity and the approximability. In particular, we show that min-M is polynomial, that max-m is NP-Hard but belongs to the class FPTAS, and that min-R is NP-Hard and cannot be approximated to within any exponential relative ratio but, for any $\epsilon > 0$, there exists an algorithm for which the value of the returned solution equals the value of an optimal solution shifted by at most $\epsilon$.
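The three quality metrics named above can be made concrete with a short sketch. It only evaluates the metrics for a given list of source loads and does not solve any of the three optimization problems. The function name `load_metrics` and the example load values are hypothetical.

```python
def load_metrics(loads):
    """Return (maximum load, minimum load, range) for a list of source loads.

    min-M seeks a configuration minimizing the first value, max-m one
    maximizing the second, and min-R one minimizing the third.
    """
    highest = max(loads)
    lowest = min(loads)
    return highest, lowest, highest - lowest

# Two hypothetical valid configurations of the same network.
config_a = [0.5, 0.25, 0.75]
config_b = [0.5, 0.5, 0.5]
print(load_metrics(config_a))
print(load_metrics(config_b))
```

In this made-up example the second configuration is better under all three metrics, but in general the three objectives need not agree on which configuration is best.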
Operator Learning for Continuous Spatial-Temporal Model with A Hybrid Optimization Scheme
Abstract
Partial differential equations are often used in the spatial-temporal modeling of complex dynamical systems in many engineering applications. In this work, we build on the recent progress of operator learning and present a data-driven modeling framework that is continuous in both space and time. A key feature of the proposed model is the resolution-invariance with respect to both spatial and temporal discretizations. To improve the long-term performance of the calibrated model, we further propose a hybrid optimization scheme that leverages both gradient-based and derivative-free optimization methods and efficiently trains on both short-term time series and long-term statistics. We investigate the performance of the spatial-temporal continuous learning framework with three numerical examples, including the viscous Burgers' equation, the Navier-Stokes equations, and the Kuramoto-Sivashinsky equation. The results confirm the resolution-invariance of the proposed modeling framework and also demonstrate stable long-term simulations with only short-term time series data. In addition, we show that the proposed model can better predict long-term statistics via the hybrid optimization scheme with a combined use of short-term and long-term data.
Improving Real Estate Appraisal with POI Integration and Areal Embedding
Authors: Sumin Han, Youngjun Park, Sonia Sabir, Jisun An, Dongman Lee
Abstract
Despite advancements in real estate appraisal methods, this study primarily focuses on two pivotal challenges. Firstly, we explore the often-underestimated impact of Points of Interest (POI) on property values, emphasizing the necessity for a comprehensive, data-driven approach to feature selection. Secondly, we integrate road-network-based Areal Embedding to enhance spatial understanding for real estate appraisal. We first propose a revised method for POI feature extraction, and discuss the impact of each POI for house price appraisal. Then we present the Areal embedding-enabled Masked Multihead Attention-based Spatial Interpolation for House Price Prediction (AMMASI) model, an improvement upon the existing ASI model, which leverages masked multi-head attention on geographic neighbor houses and similar-featured houses. Our model outperforms current baselines and also offers promising avenues for future optimization in real estate appraisal methodologies.
Abstract
Recently, Zhuang, Roth, & Sudhakar [1] proposed a method that allows simultaneous computation of the rigid transformations from world frame to robot base frame and from hand frame to camera frame. Their method attempts to solve a homogeneous matrix equation of the form AX=ZB. They use quaternions to derive explicit linear solutions for X and Z. In this short paper, we present two new solutions that attempt to solve the homogeneous matrix equation mentioned above: (i) a closed-form method which uses quaternion algebra and a positive quadratic error function associated with this representation and (ii) a method based on non-linear constrained minimization which simultaneously solves for rotations and translations. These results may be useful for other problems that can be formulated in the same mathematical form. We perform a sensitivity analysis for both of our methods and the linear method developed by Zhuang et al. This analysis allows the comparison of the three methods. In the light of this comparison, the non-linear optimization method, which solves for rotations and translations simultaneously, seems to be the most stable one with respect to noise and to measurement errors.
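To make the equation AX=ZB concrete, the sketch below builds a consistent set of 4x4 homogeneous transforms and evaluates the residual of the equation. It does not implement any of the solution methods discussed in the abstract; it only shows the quantity such methods drive toward zero. For brevity the rotations are restricted to the z-axis, which is a simplification, and all function names and numeric values are hypothetical.

```python
import math

def matmul(P, Q):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid(angle, t):
    """Homogeneous transform: rotation by angle about z, translation t."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, t[0]],
            [s, c, 0.0, t[1]],
            [0.0, 0.0, 1.0, t[2]],
            [0.0, 0.0, 0.0, 1.0]]

def rigid_inverse(T):
    """Inverse of a rigid transform: rotation R^T, translation -R^T t."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def residual(A, X, Z, B):
    """Frobenius norm of AX - ZB; zero when (X, Z) satisfies the equation."""
    AX, ZB = matmul(A, X), matmul(Z, B)
    return math.sqrt(sum((AX[i][j] - ZB[i][j]) ** 2
                         for i in range(4) for j in range(4)))

# Choose X, Z, B freely, then set A = Z B X^{-1} so that AX = ZB holds.
X = rigid(0.3, [1.0, 2.0, 3.0])
Z = rigid(-0.7, [0.5, 0.0, -1.0])
B = rigid(1.1, [0.0, 1.0, 0.0])
A = matmul(matmul(Z, B), rigid_inverse(X))
print(residual(A, X, Z, B))
```

With measured, noisy A and B the residual is no longer exactly zero, and comparing how much the recovered X and Z change under such noise is what a sensitivity analysis examines.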
Zero redundancy distributed learning with differential privacy
Authors: Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis
Subjects: Machine Learning (cs.LG); Computational Complexity (cs.CC); Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC)
Abstract
Deep learning using large models has achieved great success in a wide range of domains. However, training these models on billions of parameters is very challenging in terms of the training speed, memory cost, and communication efficiency, especially under the privacy-preserving regime with differential privacy (DP). On the one hand, DP optimization has comparable efficiency to the standard non-private optimization on a single GPU, but on multiple GPUs, existing DP distributed learning (such as pipeline parallel) has suffered from significantly worse efficiency. On the other hand, the Zero Redundancy Optimizer (ZeRO) is a state-of-the-art solution to the standard distributed learning, exhibiting excellent training efficiency on large models, but making it work compatibly with DP is technically complicated. In this work, we develop a new systematic solution, DP-ZeRO, (I) to scale up the trainable DP model size, e.g. to GPT-100B, (II) to obtain the same computation and communication efficiency as the standard ZeRO, and (III) to enable mixed-precision DP training. Our DP-ZeRO, like the standard ZeRO, has the potential to train models with arbitrary size and is evaluated on the world's largest DP models in terms of the number of trainable parameters.
Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning
Authors: Zixuan Xie, Rengan Xie, Rong Li, Kai Huang, Pengju Qiao, Jingsen Zhu, Xu Yin, Qi Ye, Wei Hua, Yuchi Huo, Hujun Bao
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Abstract
In this work, we use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs). Without the requirement of complex equipment, our method only takes simple RGB images captured by a drone as inputs to enable physically based and photorealistic novel-view rendering, relighting, and editing. However, a real-world facade usually has complex appearances ranging from diffuse rocks with subtle details to large-area glass windows with specular reflections, making it hard to attend to everything. As a result, previous methods can preserve the geometry details but fail to reconstruct smooth glass windows, or vice versa. In order to address this challenge, we introduce three spatial- and semantic-adaptive optimization strategies, including a semantic regularization approach based on zero-shot segmentation techniques to improve material consistency, a frequency-aware geometry regularization to balance surface smoothness and details in different surfaces, and a visibility probe-based scheme to enable efficient modeling of the local lighting in large-scale outdoor environments. In addition, we capture a real-world facade aerial 3D scanning image set and corresponding point clouds for training and benchmarking. The experiment demonstrates the superior quality of our method on facade holistic inverse rendering, novel view synthesis, and scene editing compared to state-of-the-art baselines.
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions
Authors: Songhao Han, Le Zhuo, Yue Liao, Si Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
Abstract
Vision-language models (VLMs) offer a promising paradigm for image classification by comparing the similarity between images and class embeddings. A critical challenge lies in crafting precise textual representations for class names. While previous studies have leveraged recent advancements in large language models (LLMs) to enhance these descriptors, their outputs often suffer from ambiguity and inaccuracy. We identify two primary causes: 1) The prevalent reliance on textual interactions with LLMs, leading to a mismatch between the generated text and the visual content in VLMs' latent space - a phenomenon we term the "explain without seeing" dilemma. 2) The oversight of the inter-class relationships, resulting in descriptors that fail to differentiate similar classes effectively. To address these issues, we propose a novel image classification framework combining VLMs with LLMs, named Iterative Optimization with Visual Feedback. In particular, our method develops an LLM-based agent, employing an evolutionary optimization strategy to refine class descriptors. Crucially, we incorporate visual feedback from VLM classification metrics, thereby guiding the optimization process with concrete visual data. Our method improves accuracy on a wide range of image classification benchmarks, with 3.47\% average gains over state-of-the-art methods. We also highlight that the resulting descriptions serve as explainable and robust features that can consistently improve the performance across various backbone models.
Certification of Distributional Individual Fairness
Authors: Matthew Wicker, Vihari Piratia, Adrian Weller
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)
Abstract
Providing formal guarantees of algorithmic fairness is of paramount importance to socially responsible deployment of machine learning algorithms. In this work, we study formal guarantees, i.e., certificates, for individual fairness (IF) of neural networks. We start by introducing a novel convex approximation of IF constraints that exponentially decreases the computational cost of providing formal guarantees of local individual fairness. We highlight that prior methods are constrained by their focus on global IF certification and can therefore only scale to models with a few dozen hidden neurons, thus limiting their practical impact. We propose to certify distributional individual fairness which ensures that for a given empirical distribution and all distributions within a $\gamma$-Wasserstein ball, the neural network has guaranteed individually fair predictions. Leveraging developments in quasi-convex optimization, we provide novel and efficient certified bounds on distributional individual fairness and show that our method allows us to certify and regularize neural networks that are several orders of magnitude larger than those considered by prior works. Moreover, we study real-world distribution shifts and find our bounds to be a scalable, practical, and sound source of IF guarantees.
Adaptive Training Distributions with Scalable Online Bilevel Optimization
Authors: David Grangier, Pierre Ablin, Awni Hannun
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Abstract
Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this paradigm, the distribution of the large, heterogeneous pretraining data rarely matches that of the application domain. This work considers modifying the pretraining distribution in the case where one has a small sample of data reflecting the targeted test conditions. We propose an algorithm motivated by a recent formulation of this setting as an online, bilevel optimization problem. With scalability in mind, our algorithm prioritizes computing gradients at training points which are likely to most improve the loss on the targeted distribution. Empirically, we show that in some cases this approach is beneficial over existing strategies from the domain adaptation literature but may not succeed in other cases. We propose a simple test to evaluate when our approach can be expected to work well and point towards further research to address current limitations.
Rate-Independent Gradient Crystal Plasticity Theory -- Robust Algorithmic Formulations based on Incremental Energy Minimization
Authors: Volker Fohrmeister, Jörn Mosler
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Abstract
Numerically robust algorithmic formulations suitable for rate-independent crystal plasticity are presented. They cover classic local models as well as gradient-enhanced theories in which the gradients of the plastic slips are incorporated by means of the micromorphic approach. The elaborated algorithmic formulations rely on the underlying variational structure of (associative) crystal plasticity. To be more precise and in line with so-called variational constitutive updates or incremental energy minimization principles, an incrementally defined energy derived from the underlying time-continuous constitutive model represents the starting point of the novel numerically robust algorithmic formulations. This incrementally defined potential allows to compute all variables jointly as minimizers of this energy. While such discrete variational constitutive updates are not new in general, they are considered here in order to employ powerful techniques from non-linear constrained optimization theory in order to compute robustly the aforementioned minimizers. The analyzed prototype models are based on (1) nonlinear complementarity problem (NCP) functions as well as on (2) the augmented Lagrangian formulation. Numerical experiments show the numerical robustness of the resulting algorithmic formulations. Furthermore, it is shown that the novel algorithmic ideas can also be integrated into classic, non-variational, return-mapping schemes.
Keyword: adam
Weight Norm Control
Abstract
We note that decoupled weight decay regularization is a particular case of weight norm control where the target norm of weights is set to 0. Any optimization method (e.g., Adam) which uses decoupled weight decay regularization (respectively, AdamW) can be viewed as a particular case of a more general algorithm with weight norm control (respectively, AdamWN). We argue that setting the target norm of weights to 0 can be suboptimal and other target norm values can be considered. For instance, any training run where AdamW achieves a particular norm of weights can be challenged by AdamWN scheduled to achieve a comparable norm of weights. We discuss various implications of introducing weight norm control instead of weight decay.
Optimal Hyperparameter $ε$ for Adaptive Stochastic Optimizers through Gradient Histograms
Abstract
Optimizers are essential components for successfully training deep neural network models. In order to achieve the best performance from such models, designers need to carefully choose the optimizer hyperparameters. However, this can be a computationally expensive and time-consuming process. Although it is known that all optimizer hyperparameters must be tuned for maximum performance, there is still a lack of clarity regarding the individual influence of minor priority hyperparameters, including the safeguard factor $\epsilon$ and momentum factor $\beta$, in leading adaptive optimizers (specifically, those based on the Adam optimizers). In this manuscript, we introduce a new framework based on gradient histograms to analyze and justify important attributes of adaptive optimizers, such as their optimal performance and the relationships and dependencies among hyperparameters. Furthermore, we propose a novel gradient histogram-based algorithm that automatically estimates a reduced and accurate search space for the safeguard hyperparameter $\epsilon$, where the optimal value can be easily found.
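To make the role of the safeguard factor $\epsilon$ concrete, the following is a minimal sketch of the standard scalar Adam update (not the histogram-based algorithm proposed in the paper). The function name `adam_update` and the default values are illustrative assumptions. It shows that when gradient magnitudes fall well below $\epsilon$, the denominator is dominated by $\epsilon$ and the step shrinks far below the nominal learning rate.

```python
import math

def adam_update(grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam step; returns (step, new_m, new_v).

    Hypothetical helper for illustration only.
    """
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # eps guards against division by zero, but it also caps the
    # effective step whenever sqrt(v_hat) is much smaller than eps.
    step = lr * m_hat / (math.sqrt(v_hat) + eps)
    return step, m, v

# Large gradient: the step is close to the learning rate.
step_large, _, _ = adam_update(grad=1.0, m=0.0, v=0.0, t=1)
# Tiny gradient (well below eps): the step is strongly damped.
step_small, _, _ = adam_update(grad=1e-10, m=0.0, v=0.0, t=1)
```

In this sketch, the ratio between the two steps depends only on how the gradient magnitude compares to `eps`, which is one reason the distribution of gradient magnitudes is informative when choosing a search range for $\epsilon$.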
Keyword: gradient
Token-level Adaptation of LoRA Adapters for Downstream Task Generalization
Authors: Joshua Belofsky
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Abstract
This paper introduces a method for adapting LoRA adapters in smaller-sized language models to arbitrary downstream tasks. Unlike standard mixture-of-expert architectures, our method employs a gradient-free routing function to choose a weighted combination of experts without increasing the compute requirements for training or inference. The results show that token-level adaptation of LoRA adapters outperforms the base Llama-2-7b model across mathematical (GSM8K), scientific (ARC-Challenge), reading comprehension (SQuAD), and coding (CodeAlpaca-20k) tasks. Further evaluations also show that the average performance of token-level adaptation outperforms individual models fine-tuned for each of the tasks with the best performance observed in adaptation of every-other token during inference. The code for this study is made available through a public repository.
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions
Abstract
We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain. When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution using $\widetilde{O}(n \epsilon^{-1/3} + \epsilon^{-2})$ gradient and function evaluations, and $\widetilde{O}(n \epsilon^{-4/3})$ additional runtime. For large $n$, our evaluation complexity is optimal up to polylogarithmic factors. In the special case where each $f_i$ is linear -- which corresponds to finding a near-optimal primal strategy in a matrix game -- our method finds an $\epsilon$-approximate solution in runtime $\widetilde{O}(n (d/\epsilon)^{2/3} + nd + d\epsilon^{-2})$. For $n>d$ and $\epsilon=1/\sqrt{n}$ this improves over all existing first-order methods. When additionally $d = \omega(n^{8/11})$ our runtime also improves over all known interior point methods. Our algorithm combines three novel primitives: (1) A dynamic data structure which enables efficient stochastic gradient estimation in small $\ell_2$ or $\ell_1$ balls. (2) A mirror descent algorithm tailored to our data structure implementing an oracle which minimizes the objective over these balls. (3) A simple ball oracle acceleration framework suitable for non-Euclidean geometry.
The Hidden Linear Structure in Score-Based Models and its Application
Abstract
Score-based models have achieved remarkable results in the generative modeling of many domains. By learning the gradient of the smoothed data distribution, they can iteratively generate samples from complex distributions, e.g., natural images. However, is there any universal structure in the gradient field that will eventually be learned by any neural network? Here, we aim to find such structures through a normative analysis of the score function. First, we derived the closed-form solution to the score-based model with a Gaussian score. We claimed that for well-trained diffusion models, the learned score at a high noise scale is well approximated by the linear score of a Gaussian. We demonstrated this through empirical validation on pre-trained image diffusion models and theoretical analysis of the score function. This finding enabled us to precisely predict the initial diffusion trajectory using the analytical solution and to accelerate image sampling by 15-30% by skipping the initial phase without sacrificing image quality. Our finding of the linear structure in the score-based model has implications for better model design and data pre-processing.
User-Centric Interactive AI for Distributed Diffusion Model-based AI-Generated Content
Authors: Hongyang Du, Ruichen Zhang, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shuguang Cui, Xuemin Shen, Dong In Kim
Subjects: Networking and Internet Architecture (cs.NI)
Abstract
Distributed Artificial Intelligence-Generated Content (AIGC) has attracted increasing attention. However, it faces two significant challenges: how to maximize the subjective Quality of Experience (QoE) and how to enhance the energy efficiency, which are particularly pronounced in widely adopted Generative Diffusion Model (GDM)-based AIGC services for image generation. In this paper, we propose a novel user-centric Interactive AI (IAI) approach for service management, with a distributed GDM-based AIGC framework, prioritizing efficient and collaborative GDM deployment. Specifically, we restructure the GDM's inference process, i.e., the denoising chain, to enable users' semantically similar prompts to share a portion of diffusion steps. Furthermore, to maximize the users' subjective QoE, we propose an IAI approach, i.e., Reinforcement Learning With Large Language Models Interaction (RLLI), which utilizes Large Language Model (LLM)-empowered generative agents to replicate users interaction, providing real-time and subjective QoE feedback that reflects a spectrum of user personalities. Lastly, we present the GDM-based Deep Deterministic Policy Gradient (G-DDPG) algorithm, adapted to the proposed RLLI framework, for effective communication and computing resource allocation while considering user subjective personalities and dynamic wireless environments in decision-making. Simulation results show that G-DDPG can increase the sum QoE by 15%, compared with the conventional DDPG algorithm.
A Novel Perspective Process Simulation Framework Based on Automatic Differentiation
Abstract
Thermodynamic and flash equilibrium calculations are the cornerstones of process simulation. The iterative approach, a widely used nonlinear problem-solving technique, relies on derivative calculations throughout the procedure that directly affect the stability and effectiveness of the solution. In this study, we use state-of-the-art automatic differentiation frameworks for thermodynamic calculations to obtain precise derivatives without altering the logic of the algorithm. In contrast to traditional numerical differentiation algorithms, this significantly improves the convergence and computational efficiency of process simulations. Standard chemical phase equilibrium calculations such as PT, PV, and PH flash are used to evaluate the automatic differentiation approach with respect to numerical stability and iteration counts. The experimental results show that the automatic differentiation method has a more uniform gradient distribution and requires fewer iterations to converge. The gradient distribution and computational convergence curves help to highlight the improvements provided by automatic differentiation. In addition, this method shows greater generalizability and can be used more easily in the calculation of various other chemical simulation modules.
TimeSQL: Improving Multivariate Time Series Forecasting with Multi-Scale Patching and Smooth Quadratic Loss
Authors: Site Mo, Haoxin Wang, Bixiong Li, Songhai Fan, Yuankai Wu, Xianggen Liu
Abstract
Time series is a special type of sequence data, a sequence of real-valued random variables collected at even intervals of time. The real-world multivariate time series comes with noises and contains complicated local and global temporal dynamics, making it difficult to forecast the future time series given the historical observations. This work proposes a simple and effective framework, coined as TimeSQL, which leverages multi-scale patching and smooth quadratic loss (SQL) to tackle the above challenges. The multi-scale patching transforms the time series into two-dimensional patches with different length scales, facilitating the perception of both locality and long-term correlations in time series. SQL is derived from the rational quadratic kernel and can dynamically adjust the gradients to avoid overfitting to the noises and outliers. Theoretical analysis demonstrates that, under mild conditions, the effect of the noises on the model with SQL is always smaller than that with MSE. Based on the two modules, TimeSQL achieves new state-of-the-art performance on the eight real-world benchmark datasets. Further ablation studies indicate that the key modules in TimeSQL could also enhance the results of other models for multivariate time series forecasting, standing as plug-and-play techniques.
On the Communication Complexity of Decentralized Bilevel Optimization
Authors: Yihan Zhang, My T. Thai, Jie Wu, Hongchang Gao
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Abstract
Decentralized bilevel optimization has been actively studied in the past few years since it has widespread applications in machine learning. However, existing algorithms suffer from large communication complexity caused by the estimation of the stochastic hypergradient, limiting their application to real-world tasks. To address this issue, we develop a novel decentralized stochastic bilevel gradient descent algorithm under the heterogeneous setting, which enjoys a small communication cost in each round and a small number of communication rounds. As such, it can achieve a much better communication complexity than existing algorithms. Moreover, we extend our algorithm to the more challenging decentralized multi-level optimization. To the best of our knowledge, this is the first time these theoretical results have been achieved under the heterogeneous setting. Finally, the experimental results confirm the efficacy of our algorithm.
Optimal Hyperparameter $ε$ for Adaptive Stochastic Optimizers through Gradient Histograms
Abstract
Optimizers are essential components for successfully training deep neural network models. In order to achieve the best performance from such models, designers need to carefully choose the optimizer hyperparameters. However, this can be a computationally expensive and time-consuming process. Although it is known that all optimizer hyperparameters must be tuned for maximum performance, there is still a lack of clarity regarding the individual influence of minor priority hyperparameters, including the safeguard factor $\epsilon$ and momentum factor $\beta$, in leading adaptive optimizers (specifically, those based on the Adam optimizers). In this manuscript, we introduce a new framework based on gradient histograms to analyze and justify important attributes of adaptive optimizers, such as their optimal performance and the relationships and dependencies among hyperparameters. Furthermore, we propose a novel gradient histogram-based algorithm that automatically estimates a reduced and accurate search space for the safeguard hyperparameter $\epsilon$, where the optimal value can be easily found.
Multilevel Picard approximations overcome the curse of dimensionality in the numerical approximation of general semilinear PDEs with gradient-dependent nonlinearities
Authors: Ariel Neufeld, Tuan Anh Nguyen, Sizhou Wu
Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP); Probability (math.PR)
Abstract
Neufeld and Wu (arXiv:2310.12545) developed a multilevel Picard (MLP) algorithm which can approximately solve general semilinear parabolic PDEs with gradient-dependent nonlinearities, allowing also for coefficient functions of the corresponding PDE to be non-constant. By introducing a particular stochastic fixed-point equation (SFPE) motivated by the Feynman-Kac representation and the Bismut-Elworthy-Li formula and identifying the first and second component of the unique fixed-point of the SFPE with the unique viscosity solution of the PDE and its gradient, they proved convergence of their algorithm. However, it remained an open question whether the proposed MLP schema in arXiv:2310.12545 does not suffer from the curse of dimensionality. In this paper, we prove that the MLP algorithm in arXiv:2310.12545 indeed can overcome the curse of dimensionality, i.e. that its computational complexity only grows polynomially in the dimension $d\in \mathbb{N}$ and the reciprocal of the accuracy $\varepsilon$, under some suitable assumptions on the nonlinear part of the corresponding PDE.
Deep Equilibrium Diffusion Restoration with Parallel Sampling
Authors: Jiezhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Abstract
Diffusion-based image restoration (IR) methods aim to use diffusion models to recover high-quality (HQ) images from degraded images and achieve promising performance. Due to the inherent property of diffusion models, most of these methods need long serial sampling chains to restore HQ images step-by-step. As a result, it leads to expensive sampling time and high computation costs. Moreover, such long sampling chains hinder understanding the relationship between the restoration results and the inputs since it is hard to compute the gradients in the whole chains. In this work, we aim to rethink the diffusion-based IR models through a different perspective, i.e., a deep equilibrium (DEQ) fixed point system. Specifically, we derive an analytical solution by modeling the entire sampling chain in diffusion-based IR models as a joint multivariate fixed point system. With the help of the analytical solution, we are able to conduct single-image sampling in a parallel way and restore HQ images without training. Furthermore, we compute fast gradients in DEQ and found that initialization optimization can boost performance and control the generation direction. Extensive experiments on benchmarks demonstrate the effectiveness of our proposed method on typical IR tasks and real-world settings. The code and models will be made publicly available.
Sparse Low-rank Adaptation of Pre-trained Language Models
Authors: Ning Ding, Xingtai Lv, Qiaosen Wang, Yulin Chen, Bowen Zhou, Zhiyuan Liu, Maosong Sun
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Abstract
Fine-tuning pre-trained large language models in a parameter-efficient manner is widely studied for its effectiveness and efficiency. The popular method of low-rank adaptation (LoRA) offers a notable approach, hypothesizing that the adaptation process is intrinsically low-dimensional. Although LoRA has demonstrated commendable performance, it is implemented with a fixed and unalterable intrinsic rank that might not always be the ideal choice. Recognizing the need for more flexible adaptation, we extend the methodology of LoRA to an innovative approach we call sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process. We achieve this through the incorporation of a gate unit optimized with proximal gradient method in the training stage, controlling the cardinality of rank under the sparsity of the gate. In the subsequent inference stage, we eliminate the parameter blocks corresponding to the zeroed-out ranks, to reduce each SoRA module back to a concise yet rank-optimal LoRA. Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters via updating in a sparse way. We further introduce a sparsifying scheduler for SoRA, aiming to examine the impact of the number of non-zero parameters on the model's memorization and generalization. Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
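As a rough illustration of how a proximal gradient step can zero out entries of a gate vector and thereby reduce the effective rank, consider the sketch below. It assumes an L1 penalty on the gate, whose proximal operator is soft-thresholding; the abstract does not state the exact regularizer, and the names `soft_threshold`, `proximal_gate_step`, and `active_ranks` are hypothetical, not taken from the paper's code.

```python
def soft_threshold(x, thresh):
    """Proximal operator of thresh * |x| (assumed L1 penalty)."""
    if x > thresh:
        return x - thresh
    if x < -thresh:
        return x + thresh
    return 0.0

def proximal_gate_step(gate, grad, lr, l1_strength):
    """Gradient step on the gate followed by soft-thresholding."""
    return [soft_threshold(g - lr * dg, lr * l1_strength)
            for g, dg in zip(gate, grad)]

def active_ranks(gate):
    """Indices of rank components that survive (non-zero gate entries)."""
    return [i for i, g in enumerate(gate) if g != 0.0]

# Three rank components; the second has a small gate value.
gate = [1.0, 0.05, -0.5]
grad = [0.0, 0.0, 0.0]  # zero task gradient, to isolate the shrinkage effect
new_gate = proximal_gate_step(gate, grad, lr=0.1, l1_strength=1.0)
kept = active_ranks(new_gate)
```

Here the small gate entry is set exactly to zero, so only components 0 and 2 remain; at inference, the parameter blocks for the zeroed component could then be dropped, which is the pruning behaviour the abstract describes.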
Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning
Authors: Jiménez-López, Daniel, Rodríguez-Barroso, Nuria, Luzón, M. Victoria, Herrera, Francisco
Abstract
Differential Privacy (DP) is a key property to protect data and models from integrity attacks. In the Deep Learning (DL) field, it is commonly implemented through Differentially Private Stochastic Gradient Descent (DP-SGD). However, when a model is shared or released, there is no way to check whether it is differentially private; that is, one is required to trust the model provider. This situation poses a problem when data privacy is mandatory, especially under current data regulations, as the presence of DP cannot be consistently certified by any third party. Thus, we face the challenge of determining whether a DL model has been trained with DP, according to the title question: Can we infer the presence of Differential Privacy in Deep Learning models' weights? Since DP-SGD significantly changes the training process of a DL model, we hypothesize that DP leaves an imprint in the weights of a DL model, which can be used to predict whether a model has been trained with DP regardless of its architecture and the training dataset. In this paper, we propose to employ the imprint that DP leaves in model weights to infer the presence of DP training in a DL model. To substantiate our hypothesis, we developed an experimental methodology based on two datasets of weights of DL models, each containing models trained with and without DP, and a meta-classifier that infers whether DP was used in the training process of a DL model by accessing its weights. We accomplish both the removal of the requirement of a trusted model provider and a strong foundation for this interesting line of research. Thus, our contribution is an additional layer of security on top of the strict privacy requirements of DP training in DL models, towards more secure DL models.
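For readers unfamiliar with how DP-SGD alters training, a single update of the generic algorithm can be sketched as follows: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled by the clipping norm is added, and the noisy average drives the step. This is a minimal pure-Python illustration under assumed names (`dp_sgd_step`, `clip_norm`, `noise_multiplier`), not the authors' code or any particular library's API.

```python
import math
import random

def dp_sgd_step(weights, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0, rng=None):
    """One DP-SGD update on a flat list of weights (illustrative sketch)."""
    rng = rng or random.Random(0)
    dim = len(weights)
    summed = [0.0] * dim
    for g in per_example_grads:
        # Clip each example's gradient to L2 norm at most clip_norm.
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += g[i] * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to the summed gradient before averaging.
    sigma = noise_multiplier * clip_norm
    batch = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch for s in summed]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]

# With the noise switched off, only the clipping effect remains.
updated = dp_sgd_step([0.0, 0.0], [[3.0, 4.0], [0.0, 0.0]],
                      lr=0.1, clip_norm=1.0, noise_multiplier=0.0)
```

The clipping and the injected noise are the two places where DP-SGD departs from plain SGD, which is the kind of systematic change to training that the abstract hypothesizes may leave a detectable imprint in the final weights.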
Operator Learning for Continuous Spatial-Temporal Model with A Hybrid Optimization Scheme
Abstract
Partial differential equations are often used in the spatial-temporal modeling of complex dynamical systems in many engineering applications. In this work, we build on the recent progress of operator learning and present a data-driven modeling framework that is continuous in both space and time. A key feature of the proposed model is the resolution-invariance with respect to both spatial and temporal discretizations. To improve the long-term performance of the calibrated model, we further propose a hybrid optimization scheme that leverages both gradient-based and derivative-free optimization methods and efficiently trains on both short-term time series and long-term statistics. We investigate the performance of the spatial-temporal continuous learning framework with three numerical examples, including the viscous Burgers' equation, the Navier-Stokes equations, and the Kuramoto-Sivashinsky equation. The results confirm the resolution-invariance of the proposed modeling framework and also demonstrate stable long-term simulations with only short-term time series data. In addition, we show that the proposed model can better predict long-term statistics via the hybrid optimization scheme with a combined use of short-term and long-term data.
AMES: A Differentiable Embedding Space Selection Framework for Latent Graph Inference
Authors: Yuan Lu, Haitz Sáez de Ocáriz Borde, Pietro Liò
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
Abstract
In real-world scenarios, although data entities may possess inherent relationships, the specific graph illustrating their connections might not be directly accessible. Latent graph inference addresses this issue by enabling Graph Neural Networks (GNNs) to operate on point cloud data, dynamically learning the necessary graph structure. These graphs are often derived from a latent embedding space, which can be modeled using Euclidean, hyperbolic, spherical, or product spaces. However, currently, there is no principled differentiable method for determining the optimal embedding space. In this work, we introduce the Attentional Multi-Embedding Selection (AMES) framework, a differentiable method for selecting the best embedding space for latent graph inference through backpropagation, considering a downstream task. Our framework consistently achieves comparable or superior results compared to previous methods for latent graph inference across five benchmark datasets. Importantly, our approach eliminates the need for conducting multiple experiments to identify the optimal embedding space. Furthermore, we explore interpretability techniques that track the gradient contributions of different latent graphs, shedding light on how our attention-based, fully differentiable approach learns to choose the appropriate latent space. In line with previous works, our experiments emphasize the advantages of hyperbolic spaces in enhancing performance. More importantly, our interpretability framework provides a general approach for quantitatively comparing embedding spaces across different tasks based on their contributions, a dimension that has been overlooked in previous literature on latent graph inference.
Adaptive Training Distributions with Scalable Online Bilevel Optimization
Authors: David Grangier, Pierre Ablin, Awni Hannun
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Abstract
Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this paradigm, the distribution of the large, heterogeneous pretraining data rarely matches that of the application domain. This work considers modifying the pretraining distribution in the case where one has a small sample of data reflecting the targeted test conditions. We propose an algorithm motivated by a recent formulation of this setting as an online, bilevel optimization problem. With scalability in mind, our algorithm prioritizes computing gradients at training points which are likely to most improve the loss on the targeted distribution. Empirically, we show that in some cases this approach is beneficial over existing strategies from the domain adaptation literature but may not succeed in other cases. We propose a simple test to evaluate when our approach can be expected to work well and point towards further research to address current limitations.
Rate-Independent Gradient Crystal Plasticity Theory -- Robust Algorithmic Formulations based on Incremental Energy Minimization
Authors: Volker Fohrmeister, Jörn Mosler
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Abstract
Numerically robust algorithmic formulations suitable for rate-independent crystal plasticity are presented. They cover classic local models as well as gradient-enhanced theories in which the gradients of the plastic slips are incorporated by means of the micromorphic approach. The elaborated algorithmic formulations rely on the underlying variational structure of (associative) crystal plasticity. To be more precise and in line with so-called variational constitutive updates or incremental energy minimization principles, an incrementally defined energy derived from the underlying time-continuous constitutive model represents the starting point of the novel numerically robust algorithmic formulations. This incrementally defined potential allows to compute all variables jointly as minimizers of this energy. While such discrete variational constitutive updates are not new in general, they are considered here in order to employ powerful techniques from non-linear constrained optimization theory in order to compute robustly the aforementioned minimizers. The analyzed prototype models are based on (1) nonlinear complementarity problem (NCP) functions as well as on (2) the augmented Lagrangian formulation. Numerical experiments show the numerical robustness of the resulting algorithmic formulations. Furthermore, it is shown that the novel algorithmic ideas can also be integrated into classic, non-variational, return-mapping schemes.
Keyword: sgd
Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning
Keyword: optimization
Towards an Automatic AI Agent for Reaction Condition Recommendation in Chemical Synthesis
The Next 700 ML-Enabled Compiler Optimizations
SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos
Compact and Intuitive Airfoil Parameterization Method through Physics-aware Variational Autoencoder
Bridging Data-Driven and Knowledge-Driven Approaches for Safety-Critical Scenario Generation in Automated Vehicle Validation
Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models
Implicit Event-RGBD Neural SLAM
SNI-SLAM: Semantic Neural Implicit SLAM
Capacity Maximization for FAS-assisted Multiple Access Channels
CLIPSwarm: Converting text into formations of robots
SBTRec- A Transformer Framework for Personalized Tour Recommendation Problem with Sentiment Analysis
6G Fresnel Spot Beamfocusing using Large-Scale Metasurfaces: A Distributed DRL-Based Approach
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization
LOSTU: Fast, Scalable, and Uncertainty-Aware Triangulation
Dueling Optimization with a Monotone Adversary
Robust Network Slicing: Multi-Agent Policies, Adversarial Attacks, and Defensive Strategies
Multi-Timescale Control and Communications with Deep Reinforcement Learning -- Part I: Communication-Aware Vehicle Control
What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization
Threshold-Based Algorithms for an Online Rolling Horizon Framework Under Uncertainty -- With an Application to Energy Management
LABCAT: Locally adaptive Bayesian optimization using principal component-aligned trust regions
On the Communication Complexity of Decentralized Bilevel Optimization
Make me an Offer: Forward and Reverse Auctioning Problems in the Tourism Industry
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
Appearance Codes using Joint Embedding Learning of Multiple Modalities
Establishing Dynamic Secure Sessions for ECQV Implicit Certificates in Embedded Systems
Weight Norm Control
Controlling Grid-Connected Inverters under Time-Varying Voltage Constraints
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment
Cryogenic quasi-static embedded DRAM for energy-efficient compute-in-memory applications
A Framework on Complex Matrix Derivatives with Special Structure Constraints for Wireless Systems
Multi-stage optimisation towards transformation pathways for municipal energy systems
Asymptotic CRB Analysis of Random RIS-Assisted Large-Scale Localization Systems
Deep Equilibrium Diffusion Restoration with Parallel Sampling
ART-Owen Scrambling
GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Configuring an heterogeneous smartgrid network: complexity and approximations for tree topologies
Operator Learning for Continuous Spatial-Temporal Model with A Hybrid Optimization Scheme
Improving Real Estate Appraisal with POI Integration and Areal Embedding
Simultaneous Robot-World and Hand-Eye Calibration
Zero redundancy distributed learning with differential privacy
Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions
Certification of Distributional Individual Fairness
Adaptive Training Distributions with Scalable Online Bilevel Optimization
Rate-Independent Gradient Crystal Plasticity Theory -- Robust Algorithmic Formulations based on Incremental Energy Minimization
Keyword: adam
Weight Norm Control
Optimal Hyperparameter $ε$ for Adaptive Stochastic Optimizers through Gradient Histograms
Keyword: gradient
Token-level Adaptation of LoRA Adapters for Downstream Task Generalization
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions
The Hidden Linear Structure in Score-Based Models and its Application
User-Centric Interactive AI for Distributed Diffusion Model-based AI-Generated Content
A Novel Perspective Process Simulation Framework Based on Automatic Differentiation
TimeSQL: Improving Multivariate Time Series Forecasting with Multi-Scale Patching and Smooth Quadratic Loss
On the Communication Complexity of Decentralized Bilevel Optimization
Optimal Hyperparameter $ε$ for Adaptive Stochastic Optimizers through Gradient Histograms
Multilevel Picard approximations overcome the curse of dimensionality in the numerical approximation of general semilinear PDEs with gradient-dependent nonlinearities
Deep Equilibrium Diffusion Restoration with Parallel Sampling
Sparse Low-rank Adaptation of Pre-trained Language Models
Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning
Operator Learning for Continuous Spatial-Temporal Model with A Hybrid Optimization Scheme
AMES: A Differentiable Embedding Space Selection Framework for Latent Graph Inference
Adaptive Training Distributions with Scalable Online Bilevel Optimization
Rate-Independent Gradient Crystal Plasticity Theory -- Robust Algorithmic Formulations based on Incremental Energy Minimization
Keyword: super-resolution
There is no result