New submissions for Thu, 23 Nov 23

Keyword: sgd

Meticulously Selecting 1% of the Dataset for Pre-training! Generating Differentially Private Images Data with Semantics Query

Authors: Kecen Li, Chen Gong, Zhixiang Li, Yuzhong Zhao, Xinwen Hou, Tianhao Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.12850
Pdf link: https://arxiv.org/pdf/2311.12850
Abstract Differential Privacy (DP) image data synthesis leverages the DP technique to generate synthetic data that replaces the sensitive data, allowing organizations to share and utilize synthetic images without privacy concerns. Previous methods incorporate the advanced techniques of generative models and pre-training on a public dataset to produce exceptional DP image data, but suffer from problems of unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which meticulously selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. Then, this function assists in querying the semantic distribution of the sensitive dataset, facilitating the selection of data from the public dataset with analogous semantics for pre-training. Finally, we pre-train an image generative model using the selected data and then fine-tune this model on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, while achieving superior synthetic performance and conserving more computational resources. On average, PRIVIMAGE achieves 30.1% lower FID and 12.6% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.
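
The DP-SGD fine-tuning step mentioned in the abstract can be illustrated with a minimal sketch. This is the generic DP-SGD update (clip each per-example gradient, sum, add Gaussian noise, average), not the PRIVIMAGE implementation; the function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for illustration, and privacy accounting is omitted.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One generic DP-SGD update on a flat parameter list (illustrative only)."""
    dim = len(params)
    summed = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale the gradient down so its L2 norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            summed[i] += grad[i] * scale
    batch_size = len(per_example_grads)
    new_params = []
    for i in range(dim):
        # Independent Gaussian noise per coordinate, std = noise_multiplier * clip_norm.
        noise = rng.gauss(0.0, noise_multiplier * clip_norm)
        new_params.append(params[i] - lr * (summed[i] + noise) / batch_size)
    return new_params
```

Because noise is drawn independently for every coordinate, the total injected noise grows with the number of parameters, which is consistent with the abstract's point that a lightly parameterized model sees less gradient noise.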

Keyword: optimization

A general Framework for Utilizing Metaheuristic Optimization for Sustainable Unrelated Parallel Machine Scheduling: A concise overview
Authors: Absalom E. Ezugwu
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.12802
Pdf link: https://arxiv.org/pdf/2311.12802
Abstract Sustainable development has emerged as a global priority, and industries are increasingly striving to align their operations with sustainable practices. Parallel machine scheduling (PMS) is a critical aspect of production planning that directly impacts resource utilization and operational efficiency. In this paper, we investigate the application of metaheuristic optimization algorithms to address the unrelated parallel machine scheduling problem (UPMSP) through the lens of sustainable development goals (SDGs). The primary objective of this study is to explore how metaheuristic optimization algorithms can contribute to achieving sustainable development goals in the context of UPMSP. We examine a range of metaheuristic algorithms, including genetic algorithms, particle swarm optimization, ant colony optimization, and more, and assess their effectiveness in optimizing the scheduling problem. The algorithms are evaluated based on their ability to improve resource utilization, minimize energy consumption, reduce environmental impact, and promote socially responsible production practices. To conduct a comprehensive analysis, we consider UPMSP instances that incorporate sustainability-related constraints and objectives.
Reducing the Environmental Impact of Wireless Communication via Probabilistic Machine Learning
Authors: A. Ryo Koblitz, Lorenzo Maggi, Matthew Andrews
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.12807
Pdf link: https://arxiv.org/pdf/2311.12807
Abstract Machine learning methods are increasingly adopted in communications problems, particularly those arising in next generation wireless settings. Though seen as a key climate mitigation and societal adaptation enabler, communications related energy consumption is high and is expected to grow in future networks in spite of anticipated efficiency gains in 6G due to exponential communications traffic growth. To make meaningful climate mitigation impact in the communications sector, a mindset shift away from maximizing throughput at all cost and towards prioritizing energy efficiency is needed. Moreover, this must be adopted in both existing (without incurring further embodied carbon costs through equipment replacement) and future network infrastructure, given the long development time of mobile generations. To that end, we present summaries of two such problems, from both current and next generation network specifications, where probabilistic inference methods were used to great effect: using Bayesian parameter tuning we are able to safely reduce the energy consumption of existing hardware on a live communications network by $11\%$ whilst maintaining operator specified performance envelopes; through spatiotemporal Gaussian process surrogate modeling we reduce the overhead in a next generation hybrid beamforming system by over $60\%$, greatly improving the networks' ability to target highly mobile users such as autonomous vehicles. The Bayesian paradigm is itself helpful in terms of energy usage, since training a Bayesian optimization model can require much less computation than, say, training a deep neural network.
Proposing an intelligent mesh smoothing method with graph neural networks
Authors: Zhichao Wang, Xinhai Chen, Junjun Yan, Jie Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.12815
Pdf link: https://arxiv.org/pdf/2311.12815
Abstract In CFD, mesh smoothing methods are commonly utilized to refine the mesh quality to achieve high-precision numerical simulations. Specifically, optimization-based smoothing is used for high-quality mesh smoothing, but it incurs significant computational overhead. Pioneering works improve its smoothing efficiency by adopting supervised learning to learn smoothing methods from high-quality meshes. However, they have difficulty smoothing mesh nodes with varying degrees and also need data augmentation to address the node input sequence problem. Additionally, the required labeled high-quality meshes further limit the applicability of these methods. In this paper, we present GMSNet, a lightweight neural network model for intelligent mesh smoothing. GMSNet adopts graph neural networks to extract features of the node's neighbors and output the optimal node position. During smoothing, we also introduce a fault-tolerance mechanism to prevent GMSNet from generating negative volume elements. With a lightweight model, GMSNet can effectively smooth mesh nodes with varying degrees and remain unaffected by the order of input data. A novel loss function, MetricLoss, is also developed to eliminate the need for high-quality meshes, which provides a stable and rapid convergence during training. We compare GMSNet with commonly used mesh smoothing methods on two-dimensional triangle meshes. The experimental results show that GMSNet achieves outstanding mesh smoothing performance with 5% of the model parameters of the previous model, and is 8.62 times faster than optimization-based smoothing.
A PSO Based Method to Generate Actionable Counterfactuals for High Dimensional Data
Authors: Shashank Shekhar, Asif Salim, Adesh Bansode, Vivaswan Jinturkar, Anirudha Nayak
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Methodology (stat.ME)
Arxiv link: https://arxiv.org/abs/2311.12825
Pdf link: https://arxiv.org/pdf/2311.12825
Abstract Counterfactual explanations (CFE) are methods that explain a machine learning model by giving an alternate class prediction of a data point with some minimal changes in its features. They help users identify the data attributes that caused an undesirable prediction like a loan or credit card rejection. We describe an efficient and actionable counterfactual (CF) generation method based on particle swarm optimization (PSO). We propose a simple objective function for the optimization of the instance-centric CF generation problem. The PSO brings in a lot of flexibility in terms of carrying out multi-objective optimization in large dimensions, capability for multiple CF generation, and setting box constraints or immutability of data attributes. An algorithm is proposed that incorporates these features and enables greater control over the proximity and sparsity properties of the generated CFs. The proposed algorithm is evaluated with a set of actionability metrics on real-world datasets, and the results were superior to those of state-of-the-art methods.
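
As a rough illustration of how PSO accommodates box constraints and immutable attributes, the following is a minimal sketch of a standard global-best PSO with position clamping. It is not the authors' algorithm or objective; `pso_minimize` and its parameters are hypothetical, the inertia and acceleration values are common textbook defaults, and an attribute is treated as immutable simply by giving it a zero-width bound.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over a box; bounds is a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda k: best_val[k])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (best_pos[k][d] - pos[k][d])
                             + c2 * r2 * (gbest_pos[d] - pos[k][d]))
                lo, hi = bounds[d]
                # Clamp to the box; a zero-width box keeps the attribute fixed.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < best_val[k]:
                best_val[k], best_pos[k] = val, pos[k][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[k][:]
    return gbest_pos, gbest_val
```

For counterfactual search, `objective` would typically combine a term for reaching the desired class with proximity and sparsity penalties; that choice is left open here because the abstract does not specify it.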
Nature Inspired Evolutionary Swarm Optimizers for Biomedical Image and Signal Processing -- A Systematic Review
Authors: Subhrangshu Adhikary
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.12830
Pdf link: https://arxiv.org/pdf/2311.12830
Abstract The challenge of finding a global optimum in a solution search space with limited resources and higher accuracy has given rise to several optimization algorithms. Generally, the gradient-based optimizers converge to the global solution very accurately, but they often require a large number of iterations to find the solution. Researchers took inspiration from different natural phenomena and behaviours of many living organisms to develop algorithms that can solve optimization problems much quicker with high accuracy. These algorithms are called nature-inspired meta-heuristic optimization algorithms. These can be used for denoising signals, updating weights in a deep neural network, and many other cases. In the current literature, there are no systematic reviews available that have discussed the applications of nature-inspired algorithms in biomedical signal processing. The paper fills that gap by discussing the applications of such algorithms in biomedical signal processing and also provides an updated survey of the application of these algorithms in biomedical image processing. The paper reviews the 28 latest relevant peer-reviewed articles and 26 nature-inspired algorithms and segregates them into thoroughly explored, lesser explored and unexplored categories, intending to help readers understand the reliability and exploration stage of each of these algorithms.
Enhancing Robotic Manipulation: Harnessing the Power of Multi-Task Reinforcement Learning and Single Life Reinforcement Learning in Meta-World
Authors: Ghadi Nehme, Ishan Sabane, Tejas Y. Deo
Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)
Arxiv link: https://arxiv.org/abs/2311.12854
Pdf link: https://arxiv.org/pdf/2311.12854
Abstract At present, robots typically require extensive training to successfully accomplish a single task. However, to truly enhance their usefulness in real-world scenarios, robots should possess the capability to perform multiple tasks effectively. To address this need, various multi-task reinforcement learning (RL) algorithms have been developed, including multi-task proximal policy optimization (PPO), multi-task trust region policy optimization (TRPO), and multi-task soft-actor critic (SAC). Nevertheless, these algorithms demonstrate optimal performance only when operating within an environment or observation space that exhibits a similar distribution. In reality, such conditions are often not the norm, as robots may encounter scenarios or observations that differ from those on which they were trained. Addressing this challenge, algorithms like Q-Weighted Adversarial Learning (QWALE) attempt to tackle the issue by training the base algorithm (generating prior data) solely for a particular task, rendering it unsuitable for generalization across tasks. So, the aim of this research project is to enable a robotic arm to successfully execute seven distinct tasks within the Meta World environment. To achieve this, a multi-task soft actor-critic (MT-SAC) is employed to train the robotic arm. Subsequently, the trained model will serve as a source of prior data for the single-life RL algorithm. The effectiveness of this MT-QWALE algorithm will be assessed by conducting tests on various target positions (novel positions). In the end, a comparison is provided between the trained MT-SAC and the MT-QWALE algorithm where the MT-QWALE performs better. An ablation study demonstrates that MT-QWALE successfully completes tasks with a slightly larger number of steps even after hiding the final goal position.
An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes
Authors: Kai Katsumata, Duc Minh Vo, Hideki Nakayama
Subjects: Graphics (cs.GR)
Arxiv link: https://arxiv.org/abs/2311.12897
Pdf link: https://arxiv.org/pdf/2311.12897
Abstract In novel view synthesis of scenes from multiple input views, 3D Gaussian splatting emerges as a viable alternative to existing radiance field approaches, delivering great visual quality and real-time rendering. While successful in static scenes, the present advancement of 3D Gaussian representation, however, faces challenges in dynamic scenes in terms of memory consumption and the need for numerous observations per time step, due to the onus of storing 3D Gaussian parameters per time step. In this study, we present an efficient 3D Gaussian representation tailored for dynamic scenes in which we define positions and rotations as functions of time while leaving other time-invariant properties of the static 3D Gaussian unchanged. Notably, our representation reduces memory usage, which is consistent regardless of the input sequence length. Additionally, it mitigates the risk of overfitting observed frames by accounting for temporal changes. The optimization of our Gaussian representation based on image and flow reconstruction results in a powerful framework for dynamic scene view synthesis in both monocular and multi-view cases. We obtain the highest rendering speed of $118$ frames per second (FPS) at a resolution of $1352 \times 1014$ with a single GPU, showing the practical usability and effectiveness of our proposed method in dynamic scene rendering scenarios.
Diffusion Model Alignment Using Direct Preference Optimization
Authors: Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.12908
Pdf link: https://arxiv.org/pdf/2311.12908
Abstract Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has not been widely explored in text-to-image diffusion models; the best existing approach is to fine-tune a pretrained model using carefully curated high quality images and captions to improve visual appeal and text alignment. We propose Diffusion-DPO, a method to align diffusion models to human preferences by directly optimizing on human comparison data. Diffusion-DPO is adapted from the recently developed Direct Preference Optimization (DPO), a simpler alternative to RLHF which directly optimizes a policy that best satisfies human preferences under a classification objective. We re-formulate DPO to account for a diffusion model notion of likelihood, utilizing the evidence lower bound to derive a differentiable objective. Using the Pick-a-Pic dataset of 851K crowdsourced pairwise preferences, we fine-tune the base model of the state-of-the-art Stable Diffusion XL (SDXL)-1.0 model with Diffusion-DPO. Our fine-tuned base model significantly outperforms both base SDXL-1.0 and the larger SDXL-1.0 model consisting of an additional refinement model in human evaluation, improving visual appeal and prompt alignment. We also develop a variant that uses AI feedback and has comparable performance to training on human preferences, opening the door for scaling of diffusion model alignment methods.
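
For readers unfamiliar with the "classification objective" the abstract attributes to DPO, here is a minimal sketch of the original DPO loss for a single preference pair. It is the language-model form, not the Diffusion-DPO objective (which the abstract says is re-derived via the evidence lower bound); the function name `dpo_loss`, its argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """-log sigmoid(beta * (preferred log-ratio - rejected log-ratio)).

    logp_*     : log-likelihoods under the model being fine-tuned
    ref_logp_* : log-likelihoods under the frozen reference model
    _w / _l    : preferred ("winner") and rejected ("loser") samples
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Numerically stable evaluation of -log(sigmoid(margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls below log 2 once the fine-tuned model favours the preferred sample more strongly than the reference model does, which is what makes it a binary classification loss on preference pairs.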
Q-Seg: Quantum Annealing-based Unsupervised Image Segmentation
Authors: Supreeth Mysore Venkatesh, Antonio Macaluso, Marlon Nuske, Matthias Klusch, Andreas Dengel
Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
Arxiv link: https://arxiv.org/abs/2311.12912
Pdf link: https://arxiv.org/pdf/2311.12912
Abstract In this study, we present Q-Seg, a novel unsupervised image segmentation method based on quantum annealing, tailored for existing quantum hardware. We formulate the pixel-wise segmentation problem, which assimilates spectral and spatial information of the image, as a graph-cut optimization task. Our method efficiently leverages the interconnected qubit topology of the D-Wave Advantage device, offering superior scalability over existing quantum approaches and outperforming state-of-the-art classical methods. Our empirical evaluations on synthetic datasets reveal that Q-Seg offers better runtime performance against the classical optimizer Gurobi. Furthermore, we evaluate our method on segmentation of Earth Observation images, an area of application where the amount of labeled data is usually very limited. In this case, Q-Seg demonstrates near-optimal results in flood mapping detection with respect to classical supervised state-of-the-art machine learning methods. Also, Q-Seg provides enhanced segmentation for forest coverage compared to existing annotated masks. Thus, Q-Seg emerges as a viable alternative for real-world applications using available quantum hardware, particularly in scenarios where the lack of labeled data and computational runtime are critical.
SD-NAE: Generating Natural Adversarial Examples with Stable Diffusion
Authors: Yueqian Lin, Jingyang Zhang, Yiran Chen, Hai Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.12981
Pdf link: https://arxiv.org/pdf/2311.12981
Abstract Robustly evaluating deep learning image classifiers is challenging due to some limitations of standard datasets. Natural Adversarial Examples (NAEs), arising naturally from the environment and capable of deceiving classifiers, are instrumental in identifying vulnerabilities in trained models. Existing works collect such NAEs by filtering from a huge set of real images, a process that is passive and lacks control. In this work, we propose to actively synthesize NAEs with the state-of-the-art Stable Diffusion. Specifically, our method formulates a controlled optimization process, where we perturb the token embedding that corresponds to a specified class to synthesize NAEs. The generation is guided by the gradient of loss from the target classifier so that the created image closely mimics the ground-truth class yet fools the classifier. Named SD-NAE (Stable Diffusion for Natural Adversarial Examples), our innovative method is effective in producing valid and useful NAEs, which is demonstrated through a meticulously designed experiment. Our work thereby provides a valuable method for obtaining challenging evaluation data, which in turn can potentially advance the development of more robust deep learning models. Code is available at https://github.com/linyueqian/SD-NAE.
Fast and Interpretable Mortality Risk Scores for Critical Care Patients
Authors: Chloe Qinyu Zhu, Muhang Tian, Lesia Semenova, Jiachang Liu, Jack Xu, Joseph Scarpa, Cynthia Rudin
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)
Arxiv link: https://arxiv.org/abs/2311.13015
Pdf link: https://arxiv.org/pdf/2311.13015
Abstract Prediction of mortality in intensive care unit (ICU) patients is an important task in critical care medicine. Prior work in creating mortality risk models falls into two major categories: domain-expert-created scoring systems, and black box machine learning (ML) models. Both of these have disadvantages: black box models are unacceptable for use in hospitals, whereas manual creation of models (including hand-tuning of logistic regression parameters) relies on humans to perform high-dimensional constrained optimization, which leads to a loss in performance. In this work, we bridge the gap between accurate black box models and hand-tuned interpretable models. We build on modern interpretable ML techniques to design accurate and interpretable mortality risk scores. We leverage the largest existing public ICU monitoring datasets, namely the MIMIC III and eICU datasets. By evaluating risk across medical centers, we are able to study generalization across domains. In order to customize our risk score models, we develop a new algorithm, GroupFasterRisk, which has several important benefits: (1) it uses hard sparsity constraint, allowing users to directly control the number of features; (2) it incorporates group sparsity to allow more cohesive models; (3) it allows for monotonicity correction on models for including domain knowledge; (4) it produces many equally-good models at once, which allows domain experts to choose among them. GroupFasterRisk creates its risk scores within hours, even on the large datasets we study here. GroupFasterRisk's risk scores perform better than risk scores currently used in hospitals, and have similar prediction performance to black box ML models (despite being much sparser). Because GroupFasterRisk produces a variety of risk scores and handles constraints, it allows design flexibility, which is the key enabler of practical and trustworthy model creation.
Multi-fidelity Bayesian Optimization in Engineering Design
Authors: Bach Do, Ruda Zhang
Subjects: Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13050
Pdf link: https://arxiv.org/pdf/2311.13050
Abstract Residing at the intersection of multi-fidelity optimization (MFO) and Bayesian optimization (BO), MF BO has found a niche in solving expensive engineering design optimization problems, thanks to its advantages in incorporating physical and mathematical understandings of the problems, saving resources, addressing the exploitation-exploration trade-off, considering uncertainty, and processing parallel computing. The increasing number of works dedicated to MF BO suggests the need for a comprehensive review of this advanced optimization technique. In this paper, we survey recent developments of two essential ingredients of MF BO: Gaussian process (GP) based MF surrogates and acquisition functions. We first categorize the existing MF modeling methods and MFO strategies to locate MF BO in a large family of surrogate-based optimization and MFO algorithms. We then exploit the common properties shared between the methods from each ingredient of MF BO to describe important GP-based MF surrogate models and review various acquisition functions. By doing so, we expect to provide a structured understanding of MF BO. Finally, we attempt to reveal important aspects that require further research for applications of MF BO in solving intricate yet important design optimization problems, including constrained optimization, high-dimensional optimization, optimization under uncertainty, and multi-objective optimization.
Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization
Authors: James Kotary, Vincenzo Di Vito, Jacob Christopher, Pascal Van Hentenryck, Ferdinando Fioretto
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.13087
Pdf link: https://arxiv.org/pdf/2311.13087
Abstract Many real-world decision processes are modeled by optimization problems whose defining parameters are unknown and must be inferred from observable data. The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving. Recent works show that decision quality can be improved in this setting by solving and differentiating the optimization problem in the training loop, enabling end-to-end training with loss functions defined directly on the resulting decisions. However, this approach can be inefficient and requires handcrafted, problem-specific rules for backpropagation through the optimization step. This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by predictive models. The approach is generic, and based on an adaptation of the Learning-to-Optimize paradigm, from which a rich variety of existing techniques can be employed. Experimental evaluations show the ability of several Learning-to-Optimize methods to provide efficient, accurate, and flexible solutions to an array of challenging Predict-Then-Optimize problems.
AC Power Flow Informed Parameter Learning for DC Power Flow Network Equivalents
Authors: Babak Taheri, Daniel K. Molzahn
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2311.13104
Pdf link: https://arxiv.org/pdf/2311.13104
Abstract This paper presents an algorithm to optimize the parameters of power systems equivalents to enhance the accuracy of the DC power flow approximation in reduced networks. Based on a zonal division of the network, the algorithm produces a reduced power system equivalent that captures inter-zonal flows with aggregated buses and equivalent transmission lines. The algorithm refines coefficient and bias parameters for the DC power flow model of the reduced network, aiming to minimize discrepancies between inter-zonal flows in DC and AC power flow results. Using optimization methods like BFGS, L-BFGS, and TNC in an offline training phase, these parameters boost the accuracy of online DC power flow computations. In contrast to existing network equivalencing methods, the proposed algorithm optimizes accuracy over a specified range of operation as opposed to only considering a single nominal point. Numerical tests demonstrate substantial accuracy improvements over traditional equivalencing and approximation methods.
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?
Authors: Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13110
Pdf link: https://arxiv.org/pdf/2311.13110
Abstract In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces. The goodness of such a representation can be evaluated by a principled measure, called sparse rate reduction, that simultaneously maximizes the intrinsic information gain and extrinsic sparsity of the learned representation. From this perspective, popular deep network architectures, including transformers, can be viewed as realizing iterative schemes to optimize this measure. Particularly, we derive a transformer block from alternating optimization on parts of this objective: the multi-head self-attention operator compresses the representation by implementing an approximate gradient descent step on the coding rate of the features, and the subsequent multi-layer perceptron sparsifies the features. This leads to a family of white-box transformer-like deep network architectures, named CRATE, which are mathematically fully interpretable. We show, by way of a novel connection between denoising and compression, that the inverse to the aforementioned compressive encoding can be realized by the same class of CRATE architectures. Thus, the so-derived white-box architectures are universal to both encoders and decoders. Experiments show that these networks, despite their simplicity, indeed learn to compress and sparsify representations of large-scale real-world image and text datasets, and achieve performance very close to highly engineered transformer-based models: ViT, MAE, DINO, BERT, and GPT2. We believe the proposed computational framework demonstrates great potential in bridging the gap between theory and practice of deep learning, from a unified perspective of data compression. Code is available at: https://ma-lab-berkeley.github.io/CRATE .
Toward Robust Imperceptible Perturbation against Unauthorized Text-to-image Diffusion-based Synthesis
Authors: Yixin Liu, Chenrui Fan, Yutong Dai, Xun Chen, Pan Zhou, Lichao Sun
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2311.13127
Pdf link: https://arxiv.org/pdf/2311.13127
Abstract Text-to-image diffusion models allow seamless generation of personalized images from scant reference photos. Yet, these tools, in the wrong hands, can fabricate misleading or harmful content, endangering individuals. To address this problem, existing poisoning-based approaches perturb user images in an imperceptible way to render them "unlearnable" from malicious uses. We identify two limitations of these defending approaches: i) sub-optimal due to the hand-crafted heuristics for solving the intractable bilevel optimization and ii) lack of robustness against simple data transformations like Gaussian filtering. To solve these challenges, we propose MetaCloak, which solves the bi-level poisoning problem with a meta-learning framework with an additional transformation sampling process to craft transferable and robust perturbation. Specifically, we employ a pool of surrogate diffusion models to craft transferable and model-agnostic perturbation. Furthermore, by incorporating an additional transformation process, we design a simple denoising-error maximization loss that is sufficient for causing transformation-robust semantic distortion and degradation in a personalized generation. Extensive experiments on the VGGFace2 and CelebA-HQ datasets show that MetaCloak outperforms existing approaches. Notably, MetaCloak can successfully fool online training services like Replicate, in a black-box manner, demonstrating the effectiveness of MetaCloak in real-world scenarios. Our code is available at https://github.com/liuyixin-louis/MetaCloak.
Joint Distributed Precoding and Beamforming for RIS-aided Cell-Free Massive MIMO Systems
Authors: Peng Zhang, Jiayi Zhang, Huahua Xiao, Xiaodan Zhang, Derrick Wing Kwan Ng, Bo Ai
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2311.13139
Pdf link: https://arxiv.org/pdf/2311.13139
Abstract The amalgamation of cell-free networks and reconfigurable intelligent surface (RIS) has become a prospective technique for future sixth-generation wireless communication systems. In this paper, we focus on the precoding and beamforming design for a downlink RIS-aided cell-free network. The design is formulated as a non-convex optimization problem by jointly optimizing the combining vector, active precoding, and passive RIS beamforming for minimizing the weighted sum of users' mean square error. A novel joint distributed precoding and beamforming framework is proposed to decentralize the alternating optimization method for acquiring a suboptimal solution to the design problem. Finally, numerical results validate the effectiveness of the proposed distributed precoding and beamforming framework, showing its low-complexity and improved scalability compared with the centralized method.
Optimal Transport with Cyclic Symmetry
Authors: Authors: Shoichiro Takeda, Yasunori Akagi, Naoki Marumo, Kenta Niwa
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.13147
Pdf link: https://arxiv.org/pdf/2311.13147
Abstract We propose novel fast algorithms for optimal transport (OT) utilizing a cyclic symmetry structure of input data. Such OT with cyclic symmetry appears universally in various real-world examples: image processing, urban planning, and graph processing. Our main idea is to reduce OT to a small optimization problem that has significantly fewer variables by utilizing cyclic symmetry and various optimization techniques. On the basis of this reduction, our algorithms solve the small optimization problem instead of the original OT. As a result, our algorithms obtain the optimal solution and the objective function value of the original OT faster than solving the original OT directly. In this paper, our focus is on two crucial OT formulations: the linear programming OT (LOT) and the strongly convex-regularized OT, which includes the well-known entropy-regularized OT (EROT). Experiments show the effectiveness of our algorithms for LOT and EROT in synthetic/real-world data that has a strict/approximate cyclic symmetry structure. Through theoretical and experimental results, this paper successfully introduces the concept of symmetry into the OT research field for the first time.
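For readers unfamiliar with EROT, the following is a minimal sketch of the standard Sinkhorn iteration for entropy-regularized OT. It is a generic baseline, not the symmetry-exploiting algorithm proposed in the paper; the function name `sinkhorn`, the regularization value, and the toy circulant cost matrix are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.1, n_iters=500):
    """Generic Sinkhorn iterations for entropy-regularized OT.

    a, b : source and target marginals (1-D arrays that each sum to 1)
    cost : pairwise cost matrix of shape (len(a), len(b))
    reg  : entropic regularization strength (assumed value)
    """
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Toy input with cyclic symmetry: points on a ring, cost = ring distance.
n = 4
a = np.full(n, 1.0 / n)
b = np.full(n, 1.0 / n)
idx = np.arange(n)
diff = np.abs(idx[:, None] - idx[None, :])
cost = np.minimum(diff, n - diff).astype(float)   # circulant cost matrix
plan = sinkhorn(a, b, cost)
```

In this toy case the resulting plan is itself circulant, which hints at the kind of redundancy a symmetry-aware solver could exploit.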
Enhancing Microgrid Resilience with Green Hydrogen Storage
Authors: Authors: Shreshtha Dhankar, Cong Chen, Lang Tong
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2311.13149
Pdf link: https://arxiv.org/pdf/2311.13149
Abstract We consider the problem of hydrogen storage integration in microgrids to improve the electricity supply resilience. Nonlinear effects from electrochemical models of electrolyzers and fuel cells for hydrogen storage are considered, making scheduling under the nonlinear model intractable and the conventional linear approximation infeasible. A piecewise linear model approximation with feasibility projection is proposed, resulting in a computationally efficient model predictive control for hydrogen storage operation. Several resilience performance measures, such as loss-of-load, duration-of-outage, and system cost, are used in performance evaluation. Simulations for the proposed optimization demonstrated a 13%-48% reduction in duration-of-outage, a 6.4%-21.7% reduction in system cost, and a 95% reduction in loss-of-load for critical loads compared to the scheduling algorithm involving linear model approximation. The performance gap of the proposed optimization to the benchmark involving the accurate nonlinear electrochemical model is less than 1% in most metrics.
Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow
Authors: Authors: Yinuo Ren, Tesi Xiao, Tanmay Gangwani, Anshuka Rangi, Holakou Rahmanian, Lexing Ying, Subhajit Sanyal
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13159
Pdf link: https://arxiv.org/pdf/2311.13159
Abstract Multi-objective optimization (MOO) aims to optimize multiple, possibly conflicting objectives with widespread applications. We introduce a novel interacting particle method for MOO inspired by molecular dynamics simulations. Our approach combines overdamped Langevin and birth-death dynamics, incorporating a "dominance potential" to steer particles toward global Pareto optimality. In contrast to previous methods, our method is able to relocate dominated particles, making it particularly adept at managing Pareto fronts of complicated geometries. Our method is also theoretically grounded as a Wasserstein-Fisher-Rao gradient flow with convergence guarantees. Extensive experiments confirm that our approach outperforms state-of-the-art methods on challenging synthetic and real-world datasets.
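As background for the notion of dominated particles, here is a small sketch of the basic Pareto-dominance relation for minimization problems. It does not implement the paper's "dominance potential" or its particle dynamics; the helper names `dominates` and `non_dominated_mask` and the sample points are hypothetical.

```python
import numpy as np

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    f_a, f_b = np.asarray(f_a), np.asarray(f_b)
    return bool(np.all(f_a <= f_b) and np.any(f_a < f_b))

def non_dominated_mask(F):
    """Boolean mask marking the rows of F that no other row dominates."""
    F = np.asarray(F)
    n = len(F)
    return np.array([
        not any(dominates(F[j], F[i]) for j in range(n) if j != i)
        for i in range(n)
    ])

# Four candidate solutions with two objectives each (illustrative values).
F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
mask = non_dominated_mask(F)   # the point [3, 3] is dominated by [2, 2]
```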
Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing
Authors: Authors: Xingyu Chen, Xinyu Zhang, Qiyue Xia, Xinmin Fang, Chris Xiaoxuan Lu, Zhengxiong Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13182
Pdf link: https://arxiv.org/pdf/2311.13182
Abstract Millimeter wave (mmWave) sensing is an emerging technology with applications in 3D object characterization and environment mapping. However, realizing precise 3D reconstruction from sparse mmWave signals remains challenging. Existing methods rely on data-driven learning, constrained by dataset availability and difficulty in generalization. We propose DiffSBR, a differentiable framework for mmWave-based 3D reconstruction. DiffSBR incorporates a differentiable ray tracing engine to simulate radar point clouds from virtual 3D models. A gradient-based optimizer refines the model parameters to minimize the discrepancy between simulated and real point clouds. Experiments using various radar hardware validate DiffSBR's capability for fine-grained 3D reconstruction, even for novel objects unseen by the radar previously. By integrating physics-based simulation with gradient optimization, DiffSBR transcends the limitations of data-driven approaches and pioneers a new paradigm for mmWave sensing.
Optimal trajectory planning meets network-level routing: Integrated control framework for emerging mobility systems
Authors: Authors: Heeseung Bang, Andreas A. Malikopoulos
Subjects: Systems and Control (eess.SY)
Arxiv link: https://arxiv.org/abs/2311.13193
Pdf link: https://arxiv.org/pdf/2311.13193
Abstract In this paper, we introduce a hierarchical decision-making framework for emerging mobility systems. Despite numerous studies focusing on optimizing vehicle flow, practical feasibility has often been overlooked. To address this gap, we present a route-recovery method and energy-optimal trajectory planning tailored for connected and automated vehicles (CAVs) to ensure the realization of optimal flow. Our approach identifies the optimal vehicle flow to minimize total travel time while considering consistent mobility demands in urban settings. We deploy a heuristic route-recovery algorithm that assigns routes to CAVs and departure/arrival time at each road segment. Furthermore, we propose an efficient coordination method that rapidly solves constrained optimization problems by flexibly piecing together unconstrained energy-optimal trajectories. The proposed method has the potential to effectively generate optimal vehicle flow, contributing to the reduction of travel time and energy consumption in urban areas.
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Authors: Authors: Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13231
Pdf link: https://arxiv.org/pdf/2311.13231
Abstract Using reinforcement learning with human feedback (RLHF) has shown significant promise in fine-tuning diffusion models. Previous methods start by training a reward model that aligns with human preferences, then leverage RL techniques to fine-tune the underlying models. However, crafting an efficient reward model demands extensive datasets, optimal architecture, and manual hyperparameter tuning, making the process both time and cost-intensive. The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the necessity for a reward model. However, the extensive GPU memory requirement of the diffusion model's denoising process hinders the direct application of the DPO method. To address this issue, we introduce the Direct Preference for Denoising Diffusion Policy Optimization (D3PO) method to directly fine-tune diffusion models. The theoretical analysis demonstrates that although D3PO omits training a reward model, it effectively functions as the optimal reward model trained using human feedback data to guide the learning process. This approach requires no training of a reward model, proving to be more direct, cost-effective, and minimizing computational overhead. In experiments, our method uses the relative scale of objectives as a proxy for human preference, delivering comparable results to methods using ground-truth rewards. Moreover, D3PO demonstrates the ability to reduce image distortion rates and generate safer images, overcoming challenges lacking robust reward models.
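For context, the sketch below shows the generic DPO loss for a single preference pair, as it is usually stated for language models. It is not the D3PO objective itself, whose per-step formulation for the denoising process is not given in the abstract; the function name, argument names, and the default `beta` are assumptions.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    logp_w, logp_l         : log-probabilities of the preferred and rejected
                             samples under the model being fine-tuned
    ref_logp_w, ref_logp_l : the same quantities under a frozen reference model
    beta                   : strength of the implicit KL penalty (assumed value)
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)), written in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# When the model equals the reference, the margin is zero and the loss is log(2).
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the preferred sample's likelihood relative to the reference lowers the loss.
improved = dpo_loss(-0.5, -1.5, -1.0, -1.0)
```

No reward model appears anywhere in this loss: the preference signal enters only through which sample is labeled as preferred.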
Understanding Cost Dynamics of Serverless Computing: An Empirical Study
Authors: Authors: Muhammad Hamza, Muhammad Azeem Akbar, Rafael Capilla
Subjects: Software Engineering (cs.SE)
Arxiv link: https://arxiv.org/abs/2311.13242
Pdf link: https://arxiv.org/pdf/2311.13242
Abstract The advent of serverless computing has revolutionized the landscape of cloud computing, offering a new paradigm that enables developers to focus solely on their applications rather than managing and provisioning the underlying infrastructure. These applications involve integrating individual functions into a cohesive workflow for complex tasks. The pay-per-use model and nontransparent reporting by cloud providers make it difficult to estimate serverless costs, impeding informed business decisions. Existing research studies on serverless computing focus on performance optimization and state management, both from empirical and technical perspectives. However, the state-of-the-art shows a lack of empirical investigations on the understanding of the cost dynamics of serverless computing over traditional cloud computing. Therefore, this study delves into how organizations anticipate the costs of adopting serverless. It also aims to comprehend workload suitability and identify best practices for cost optimization of serverless applications. To this end, we conducted a qualitative (interviews) study with 15 experts from 8 companies involved in the migration and development of serverless systems. The findings revealed that, while serverless computing is highly suitable for unpredictable workloads, it may not be cost-effective for certain high-scale applications. The study also introduces a taxonomy for comparing the cost of adopting serverless versus traditional cloud.
Hard Label Black Box Node Injection Attack on Graph Neural Networks
Authors: Authors: Yu Zhou, Zihao Dong, Guofeng Zhang, Jingchen Tang
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2311.13244
Pdf link: https://arxiv.org/pdf/2311.13244
Abstract While graph neural networks have achieved state-of-the-art performance in many real-world tasks including graph classification and node classification, recent works have demonstrated they are also extremely vulnerable to adversarial attacks. Most previous works have focused on attacking node classification networks under impractical white-box scenarios. In this work, we propose a non-targeted Hard Label Black Box Node Injection Attack on Graph Neural Networks, which, to the best of our knowledge, is the first of its kind. Under this setting, more real-world tasks can be studied because our attack assumes no prior knowledge about (1) the model architecture of the GNN we are attacking; (2) the model's gradients; (3) the output logits of the target GNN model. Our attack is based on an existing edge perturbation attack, from which we restrict the optimization process to formulate a node injection attack. In this work, we evaluate the performance of the attack using three datasets: COIL-DEL, IMDB-BINARY, and NCI1.
Towards Hetero-Client Federated Multi-Task Learning
Authors: Authors: Yuxiang Lu, Suizhi Huang, Yuwen Yang, Shalayiding Sirejiding, Yue Ding, Hongtao Lu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.13250
Pdf link: https://arxiv.org/pdf/2311.13250
Abstract Federated Learning (FL) enables joint training across distributed clients using their local data privately. Federated Multi-Task Learning (FMTL) builds on FL to handle multiple tasks, assuming model congruity that identical model architecture is deployed in each client. To relax this assumption and thus extend real-world applicability, we introduce a novel problem setting, Hetero-Client Federated Multi-Task Learning (HC-FMTL), to accommodate diverse task setups. The main challenge of HC-FMTL is the model incongruity issue that invalidates conventional aggregation methods. It also escalates the difficulties in accurate model aggregation to deal with data and task heterogeneity inherent in FMTL. To address these challenges, we propose the FedHCA$^2$ framework, which allows for federated training of personalized models by modeling relationships among heterogeneous clients. Drawing on our theoretical insights into the difference between multi-task and federated optimization, we propose the Hyper Conflict-Averse Aggregation scheme to mitigate conflicts during encoder updates. Additionally, inspired by task interaction in MTL, the Hyper Cross Attention Aggregation scheme uses layer-wise cross attention to enhance decoder interactions while alleviating model incongruity. Moreover, we employ learnable Hyper Aggregation Weights for each client to customize personalized parameter updates. Extensive experiments demonstrate the superior performance of FedHCA$^2$ in various HC-FMTL scenarios compared to representative methods. Our code will be made publicly available.
Probabilistic Inference in Reinforcement Learning Done Right
Authors: Authors: Jean Tarbouriech, Tor Lattimore, Brendan O'Donoghue
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.13294
Pdf link: https://arxiv.org/pdf/2311.13294
Abstract A popular perspective in Reinforcement learning (RL) casts the problem as probabilistic inference on a graphical model of the Markov decision process (MDP). The core object of study is the probability of each state-action pair being visited under the optimal policy. Previous approaches to approximate this quantity can be arbitrarily poor, leading to algorithms that do not implement genuine statistical inference and consequently do not perform well in challenging problems. In this work, we undertake a rigorous Bayesian treatment of the posterior probability of state-action optimality and clarify how it flows through the MDP. We first reveal that this quantity can indeed be used to generate a policy that explores efficiently, as measured by regret. Unfortunately, computing it is intractable, so we derive a new variational Bayesian approximation yielding a tractable convex optimization problem and establish that the resulting policy also explores efficiently. We call our approach VAPOR and show that it has strong connections to Thompson sampling, K-learning, and maximum entropy exploration. We conclude with some experiments demonstrating the performance advantage of a deep RL version of VAPOR.
AA-DL: AoI-Aware Deep Learning Approach for D2D-Assisted Industrial IoT
Authors: Authors: Hossam Farag, Mohamed Ragab, Cedomir Stefanovic
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Arxiv link: https://arxiv.org/abs/2311.13325
Pdf link: https://arxiv.org/pdf/2311.13325
Abstract In real-time Industrial Internet of Things (IIoT), e.g., monitoring and control scenarios, the freshness of data is crucial to maintain the system functionality and stability. In this paper, we propose an AoI-Aware Deep Learning (AA-DL) approach to minimize the Peak Age of Information (PAoI) in D2D-assisted IIoT networks. In particular, we analyze the success probability and the average PAoI via stochastic geometry, and formulate an optimization problem with the objective of finding the optimal scheduling policy that minimizes PAoI. In order to solve the non-convex scheduling problem, we develop a Neural Network (NN) structure that exploits the Geographic Location Information (GLI) along with feedback stages to perform unsupervised learning over randomly deployed networks. Our motivation is based on the observation that in various transmission contexts, the wireless channel intensity is mainly influenced by distance-dependent path loss, which could be calculated using the GLI of each link. The performance of the AA-DL method is evaluated via numerical results that demonstrate the effectiveness of our proposed method in improving the PAoI performance compared to a recent benchmark while maintaining lower complexity than the conventional iterative optimization method.
Trace-enabled Timing Model Synthesis for ROS2-based Autonomous Applications
Authors: Authors: Hazem Abaza, Debayan Roy, Shiqing Fan, Selma Saidi, Antonios Motakis
Subjects: Operating Systems (cs.OS)
Arxiv link: https://arxiv.org/abs/2311.13333
Pdf link: https://arxiv.org/pdf/2311.13333
Abstract Autonomous applications are typically developed over Robot Operating System 2.0 (ROS2) even in time-critical systems like automotive. Recent years have seen increased interest in developing model-based timing analysis and schedule optimization approaches for ROS2-based applications. To complement these approaches, we propose a tracing and measurement framework to obtain timing models of ROS2-based applications. It offers a tracer based on extended Berkeley Packet Filter that probes different functions in ROS2 middleware and reads their arguments or return values to reason about the data flow in applications. It combines event traces from ROS2 and the operating system to generate a directed acyclic graph showing ROS2 callbacks, precedence relations between them, and their timing attributes. While being compatible with existing analyses, we also show how to model (i) message synchronization, e.g., in sensor fusion, and (ii) service requests from multiple clients, e.g., in motion planning. Considering that, in real-world scenarios, the application code might be confidential and formal models are unavailable, our framework still enables the application of existing analysis and optimization techniques.
REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints
Authors: Authors: Francesco Corti, Balz Maag, Joachim Schauer, Ulrich Pferschy, Olga Saukh
Subjects: Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.13349
Pdf link: https://arxiv.org/pdf/2311.13349
Abstract Deep models deployed on edge devices frequently encounter resource variability, which arises from fluctuating energy levels, timing constraints, or prioritization of other critical tasks within the system. State-of-the-art machine learning pipelines generate resource-agnostic models that cannot adapt at runtime. In this work we introduce Resource-Efficient Deep Subnetworks (REDS) to tackle model adaptation to variable resources. In contrast to the state-of-the-art, REDS use structured sparsity constructively by exploiting permutation invariance of neurons, which allows for hardware-specific optimizations. Specifically, REDS achieve computational efficiency by (1) skipping sequential computational blocks identified by a novel iterative knapsack optimizer, and (2) leveraging simple math to re-arrange the order of operations in the REDS computational graph to take advantage of the data cache. REDS support conventional deep networks frequently deployed on the edge and provide computational benefits even for small and simple networks. We evaluate REDS on six benchmark architectures trained on the Google Speech Commands, FMNIST and CIFAR10 datasets, and test on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence for REDS' outstanding performance in terms of submodels' test set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40$\mu$s, utilizing a 2-layer fully-connected network on Arduino Nano 33 BLE Sense.
Numerical Approximation of Optimal Convex Shapes in $\mathbb{R}^3$
Authors: Authors: Sören Bartels (1), Hedwig Keller (1), Gerd Wachsmuth (2) ((1) University Freiburg, (2) BTU Cottbus)
Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)
Arxiv link: https://arxiv.org/abs/2311.13386
Pdf link: https://arxiv.org/pdf/2311.13386
Abstract In the optimization of convex domains under a PDE constraint numerical difficulties arise in the approximation of convex domains in $\mathbb{R}^3$. Previous research used a restriction to rotationally symmetric domains to reduce shape optimization problems to a two-dimensional setting. In the current research, two approaches for the approximation in $\mathbb{R}^3$ are considered. First, a notion of discrete convexity allows for a nearly convex approximation with polyhedral domains. An alternative approach is based on the recent observation that higher order finite elements can approximate convex functions conformally. As a second approach these results are used to approximate optimal convex domains with isoparametric convex domains. The proposed algorithms were tested on shape optimization problems constrained by a Poisson equation and both algorithms achieved similar results.
Conflict Management in the Near-RT-RIC of Open RAN: A Game Theoretic Approach
Authors: Authors: Abdul Wadud, Fatemeh Golpayegani, Nima Afraz
Subjects: Networking and Internet Architecture (cs.NI); Computer Science and Game Theory (cs.GT)
Arxiv link: https://arxiv.org/abs/2311.13389
Pdf link: https://arxiv.org/pdf/2311.13389
Abstract Open Radio Access Network (RAN) was introduced recently to incorporate intelligence and openness into the upcoming generation of RAN. Open RAN offers standardized interfaces and the capacity to accommodate network applications from external vendors through extensible applications (xApps), which enhance network management flexibility. The Near-Real-Time Radio Intelligent Controller (Near-RT-RIC) employs specialized and intelligent xApps for achieving time-critical optimization objectives, but conflicts may arise due to different vendors' xApps modifying the same parameters or indirectly affecting each others' performance. A standardized Conflict Management System (CMS) is absent in most of the popular Open RAN architectures including the most prominent O-RAN Alliance architecture. To address this, we propose a CMS with independent controllers for conflict detection and mitigation between xApps in the Near-RT-RIC. We utilize cooperative bargain game theory, including Nash Social Welfare Function (NSWF) and the Equal Gains (EG) solution, to find optimal configurations for conflicting parameters. Experimental results demonstrate the effectiveness of the proposed Conflict Management Controller (CMC) in balancing conflicting parameters and mitigating adverse impacts in the Near-RT-RIC on a theoretical example scenario.
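To illustrate the Nash Social Welfare Function mentioned above, the sketch below picks the value of a single shared parameter that maximizes the product of two xApps' utility gains over their disagreement point. The linear utility functions, the parameter range, and all names are hypothetical; the paper's actual utilities and controller design are not described in the abstract.

```python
import numpy as np

def nash_social_welfare(utilities, disagreement):
    """Product of utility gains over the disagreement point (0 if any gain <= 0)."""
    gains = np.asarray(utilities, dtype=float) - np.asarray(disagreement, dtype=float)
    if np.any(gains <= 0):
        return 0.0
    return float(np.prod(gains))

# Hypothetical conflict: two xApps want to set the same parameter x in [0, 10].
def utility_a(x):
    return x            # xApp A benefits from a high value (assumed)

def utility_b(x):
    return 10.0 - x     # xApp B benefits from a low value (assumed)

candidates = np.linspace(0.0, 10.0, 101)
disagreement = (0.0, 0.0)   # utilities if no agreement is reached (assumed)
scores = [nash_social_welfare((utility_a(x), utility_b(x)), disagreement)
          for x in candidates]
best_x = float(candidates[int(np.argmax(scores))])
```

With these symmetric toy utilities the product x(10 - x) peaks at the midpoint, so both xApps receive equal gains there.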
Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images
Authors: Authors: Jaeyoung Chung, Jeongtaek Oh, Kyoung Mu Lee
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Arxiv link: https://arxiv.org/abs/2311.13398
Pdf link: https://arxiv.org/pdf/2311.13398
Abstract In this paper, we present a method to optimize Gaussian splatting with a limited number of images while avoiding overfitting. Representing a 3D scene by combining numerous Gaussian splats has yielded outstanding visual quality. However, it tends to overfit the training views when only a small number of images are available. To address this issue, we introduce a dense depth map as a geometry guide to mitigate overfitting. We obtain the depth map using a pre-trained monocular depth estimation model and align its scale and offset using sparse COLMAP feature points. The adjusted depth aids in the color-based optimization of 3D Gaussian splatting, mitigating floating artifacts and ensuring adherence to geometric constraints. We verify the proposed method on the NeRF-LLFF dataset with varying small numbers of images. Our approach demonstrates robust geometry compared to the original method that relies solely on images.
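One common way to align a monocular depth map to sparse metric depths is a least-squares fit of a global scale and offset; the sketch below shows that idea on synthetic data. Whether the paper uses exactly this fit is an assumption, and the function name `align_depth` and the toy arrays are illustrative.

```python
import numpy as np

def align_depth(pred_depth, sparse_depth, mask):
    """Fit scale s and offset t so that s * pred + t matches sparse depths.

    pred_depth   : dense depth map from a monocular estimator (arbitrary scale)
    sparse_depth : array of the same shape holding reference depths
    mask         : boolean array, True where a reference depth is available
    """
    x = pred_depth[mask]
    y = sparse_depth[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)        # columns: [pred, 1]
    (scale, offset), *_ = np.linalg.lstsq(A, y, rcond=None)
    return scale * pred_depth + offset, scale, offset

# Synthetic check: the "true" depth is an affine transform of the prediction.
rng = np.random.default_rng(0)
pred = rng.uniform(1.0, 5.0, size=(8, 8))
target = 2.0 * pred + 0.5
mask = np.zeros_like(pred, dtype=bool)
mask[::3, ::3] = True                                  # only 9 sparse points are known
aligned, scale, offset = align_depth(pred, target, mask)
```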
The Tempered Hilbert Simplex Distance and Its Application To Non-linear Embeddings of TEMs
Authors: Authors: Ehsan Amid, Frank Nielsen, Richard Nock, Manfred K. Warmuth
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13459
Pdf link: https://arxiv.org/pdf/2311.13459
Abstract Tempered Exponential Measures (TEMs) are a parametric generalization of the exponential family of distributions maximizing the tempered entropy function among positive measures subject to a probability normalization of their power densities. Calculus on TEMs relies on a deformed algebra of arithmetic operators induced by the deformed logarithms used to define the tempered entropy. In this work, we introduce three different parameterizations of finite discrete TEMs via Legendre functions of the negative tempered entropy function. In particular, we establish an isometry between such parameterizations in terms of a generalization of the Hilbert log cross-ratio simplex distance to a tempered Hilbert co-simplex distance. Similar to the Hilbert geometry, the tempered Hilbert distance is characterized as a $t$-symmetrization of the oriented tempered Funk distance. We motivate our construction by introducing the notion of $t$-lengths of smooth curves in a tautological Finsler manifold. We then demonstrate the properties of our generalized structure in different settings and numerically examine the quality of its differentiable approximations for optimization in machine learning settings.
Multi-Objective Bayesian Optimization with Active Preference Learning
Authors: Authors: Ryota Ozaki, Kazuki Ishikawa, Youhei Kanzaki, Shinya Suzuki, Shion Takeno, Ichiro Takeuchi, Masayuki Karasuyama
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13460
Pdf link: https://arxiv.org/pdf/2311.13460
Abstract Many real-world black-box optimization problems need to optimize multiple criteria simultaneously. However, in a multi-objective optimization (MOO) problem, identifying the whole Pareto front incurs a prohibitive search cost, while in many practical scenarios, the decision maker (DM) only needs a specific solution among the set of the Pareto optimal solutions. We propose a Bayesian optimization (BO) approach to identifying the most preferred solution in MOO with expensive objective functions, in which a Bayesian preference model of the DM is adaptively estimated in an interactive manner based on two types of supervision called the pairwise preference and improvement request. To explore the most preferred solution, we define an acquisition function in which the uncertainty in both the objective functions and the DM preference is incorporated. Further, to minimize the interaction cost with the DM, we also propose an active learning strategy for the preference estimation. We empirically demonstrate the effectiveness of our proposed method through benchmark function optimization and hyper-parameter optimization problems for machine learning models.
Large-scale Package Deliveries with Unmanned Aerial Vehicles using Collective Learning
Authors: Authors: Arun Narayanan, Evangelos Pournaras, Pedro H. J. Nardelli
Subjects: Multiagent Systems (cs.MA)
Arxiv link: https://arxiv.org/abs/2311.13489
Pdf link: https://arxiv.org/pdf/2311.13489
Abstract Unmanned aerial vehicles (UAVs) have significant practical advantages for delivering packages, and many logistics companies have begun deploying UAVs for commercial package deliveries. To deliver packages quickly and cost-effectively, the routes taken by UAVs from depots to customers must be optimized. This route optimization problem, a type of capacitated vehicle routing problem, has recently attracted considerable research interest. However, few papers have dealt with large-scale deliveries, where the number of customers exceeds 1000. We present an innovative, practical package delivery model wherein multiple UAVs deliver multiple packages to customers who are compensated for late deliveries. Further, we propose an innovative methodology that combines a new plan-generation algorithm with a collective-learning heuristic to quickly determine cost-effective paths of UAVs even for large-scale deliveries of up to 10000 customers. Specialized settings are applied to a collective-learning heuristic, the Iterative Economic Planning and Optimized Selections (I-EPOS), in order to coordinate collective actions of the UAVs. To demonstrate our methodology, we applied our highly flexible approach to a depot in Heathrow Airport, London. We show that a coordinated approach, in which the UAVs collectively determine their flight paths, leads to lower operational costs than an uncoordinated approach. Further, the coordinated approach enables large-scale package deliveries.
Energy and Time-Aware Inference Offloading for DNN-based Applications in LEO Satellites
Authors: Authors: Yijie Chen, Qiyang Zhang, Yiran Zhang, Xiao Ma, Ao Zhou
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
Arxiv link: https://arxiv.org/abs/2311.13509
Pdf link: https://arxiv.org/pdf/2311.13509
Abstract In recent years, Low Earth Orbit (LEO) satellites have witnessed rapid development, with inference based on Deep Neural Network (DNN) models emerging as the prevailing technology for remote sensing satellite image recognition. However, the substantial computation capability and energy demands of DNN models, coupled with the instability of the satellite-ground link, pose significant challenges, burdening satellites with limited power intake and hindering the timely completion of tasks. Existing approaches, such as transmitting all images to the ground for processing or executing DNN models on the satellite, are unable to effectively address this issue. By exploiting the internal hierarchical structure of DNNs and treating each layer as an independent subtask, we propose a satellite-ground collaborative computation partial offloading approach to address this challenge. We formulate the problem of minimizing the inference task execution time and onboard energy consumption through offloading as an integer linear programming (ILP) model. The complexity in solving the problem arises from the combinatorial explosion in the discrete solution space. To address this, we have designed an improved optimization algorithm based on branch and bound. Simulation results illustrate that, compared to the existing approaches, our algorithm improves performance by 10%-18%.
Hybrid Whale-Mud-Ring Optimization for Precise Color Skin Cancer Image Segmentation
Authors: Authors: Amir Hamza, Badis Lekouaghet, Yassine Himeur
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
Arxiv link: https://arxiv.org/abs/2311.13512
Pdf link: https://arxiv.org/pdf/2311.13512
Abstract Timely identification and treatment of rapidly progressing skin cancers can significantly contribute to the preservation of patients' health and well-being. Dermoscopy, a dependable and accessible tool, plays a pivotal role in the initial stages of skin cancer detection. Consequently, the effective processing of digital dermoscopy images holds significant importance in elevating the accuracy of skin cancer diagnoses. Multilevel thresholding is a key tool in medical imaging that extracts objects within the image to facilitate its analysis. In this paper, an enhanced version of the Mud Ring Algorithm hybridized with the Whale Optimization Algorithm, named WMRA, is proposed. The proposed approach utilizes bubble-net attack and mud ring strategy to overcome stagnation in local optima and obtain optimal thresholds. The experimental results show that WMRA is powerful against a cluster of recent methods in terms of fitness, Peak Signal to Noise Ratio (PSNR), and Mean Square Error (MSE).
Leveraging CNNs and Ensemble Learning for Automated Disaster Image Classification
Authors: Authors: Archit Rathod, Veer Pariawala, Mokshit Surana, Kumkum Saxena
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.13531
Pdf link: https://arxiv.org/pdf/2311.13531
Abstract Natural disasters act as a serious threat globally, requiring effective and efficient disaster management and recovery. This paper focuses on classifying natural disaster images using Convolutional Neural Networks (CNNs). Multiple CNN architectures were built and trained on a dataset containing images of earthquakes, floods, wildfires, and volcanoes. A stacked CNN ensemble approach proved to be the most effective, achieving 95% accuracy and an F1 score going up to 0.96 for individual classes. Tuning hyperparameters of individual models for optimization was critical to maximize the models' performance. The stacking of CNNs with XGBoost acting as the meta-model utilizes the strengths of the CNN and ResNet models to improve the overall accuracy of the classification. Results obtained from the models illustrated the potency of CNN-based models for automated disaster image classification. This lays the foundation for expanding these techniques to build robust systems for disaster response, damage assessment, and recovery management.
Combinatorial Optimization with Policy Adaptation using Latent Space Search
Authors: Authors: Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.13569
Pdf link: https://arxiv.org/pdf/2311.13569
Abstract Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial solvers as the go-to solution. Current approaches emphasize pre-training heuristics that construct solutions but often rely on search procedures with limited variance, such as stochastically sampling numerous solutions from a single policy or employing computationally expensive fine-tuning of the policy on individual problem instances. Building on the intuition that performant search at inference time should be anticipated during pre-training, we propose COMPASS, a novel RL approach that parameterizes a distribution of diverse and specialized policies conditioned on a continuous latent space. We evaluate COMPASS across three canonical problems - Travelling Salesman, Capacitated Vehicle Routing, and Job-Shop Scheduling - and demonstrate that our search strategy (i) outperforms state-of-the-art approaches on 11 standard benchmarking tasks and (ii) generalizes better, surpassing all other approaches on a set of 18 procedurally transformed instance distributions.
On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates
Authors: Authors: Stefano Bruno, Ying Zhang, Dong-Young Lim, Ömer Deniz Akyildiz, Sotirios Sabanis
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Probability (math.PR); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13584
Pdf link: https://arxiv.org/pdf/2311.13584
Abstract We provide full theoretical guarantees for the convergence behaviour of diffusion-based generative models under the assumption of strongly logconcave data distributions while our approximating class of functions used for score estimation is made of Lipschitz continuous functions. We demonstrate via a motivating example, sampling from a Gaussian distribution with unknown mean, the powerfulness of our approach. In this case, explicit estimates are provided for the associated optimization problem, i.e. score approximation, while these are combined with the corresponding sampling estimates. As a result, we obtain the best known upper bound estimates in terms of key quantities of interest, such as the dimension and rates of convergence, for the Wasserstein-2 distance between the data distribution (Gaussian with unknown mean) and our sampling algorithm. Beyond the motivating example and in order to allow for the use of a diverse range of stochastic optimizers, we present our results using an $L^2$-accurate score estimation assumption, which crucially is formed under an expectation with respect to the stochastic optimizer and our novel auxiliary process that uses only known information. This approach yields the best known convergence rate for our sampling algorithm.
A Survey of Serverless Machine Learning Model Inference
Authors: Authors: Kamil Kojs
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.13587
Pdf link: https://arxiv.org/pdf/2311.13587
Abstract Recent developments in Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products. This widespread adoption of AI requires significant efforts in deploying these models in production environments. When hosting machine learning models for real-time predictions, it is important to meet defined Service Level Objectives (SLOs), ensuring reliability, minimal downtime, and optimizing operational costs of the underlying infrastructure. Large machine learning models often demand GPU resources for efficient inference to meet SLOs. In the context of these trends, there is growing interest in hosting AI models in a serverless architecture while still providing GPU access for inference tasks. This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems. By providing a novel taxonomy and summarizing recent trends, we hope that this survey could shed light on new optimization perspectives and motivate novel works in large-scale deep learning serving systems.
Keyword: adam

There is no result

Keyword: gradient

Nature Inspired Evolutionary Swarm Optimizers for Biomedical Image and Signal Processing -- A Systematic Review
Authors: Authors: Subhrangshu Adhikary
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.12830
Pdf link: https://arxiv.org/pdf/2311.12830
Abstract The challenge of finding a global optimum in a solution search space with limited resources and higher accuracy has given rise to several optimization algorithms. Generally, the gradient-based optimizers converge to the global solution very accurately, but they often require a large number of iterations to find the solution. Researchers took inspiration from different natural phenomena and behaviours of many living organisms to develop algorithms that can solve optimization problems much quicker with high accuracy. These algorithms are called nature-inspired meta-heuristic optimization algorithms. These can be used for denoising signals, updating weights in a deep neural network, and many other cases. In the state-of-the-art, there are no systematic reviews available that have discussed the applications of nature-inspired algorithms on biomedical signal processing. The paper solves that gap by discussing the applications of such algorithms in biomedical signal processing and also provides an updated survey of the application of these algorithms in biomedical image processing. The paper reviews 28 latest peer-reviewed relevant articles and 26 nature-inspired algorithms and segregates them into thoroughly explored, lesser explored and unexplored categories intending to help readers understand the reliability and exploration stage of each of these algorithms.
Meticulously Selecting 1% of the Dataset for Pre-training! Generating Differentially Private Images Data with Semantics Query
Authors: Authors: Kecen Li, Chen Gong, Zhixiang Li, Yuzhong Zhao, Xinwen Hou, Tianhao Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Arxiv link: https://arxiv.org/abs/2311.12850
Pdf link: https://arxiv.org/pdf/2311.12850
Abstract Differential Privacy (DP) image data synthesis leverages the DP technique to generate synthetic data to replace the sensitive data, allowing organizations to share and utilize synthetic images without privacy concerns. Previous methods incorporate the advanced techniques of generative models and pre-training on a public dataset to produce exceptional DP image data, but suffer from problems of unstable training and massive computational resource demands. This paper proposes a novel DP image synthesis method, termed PRIVIMAGE, which meticulously selects pre-training data, promoting the efficient creation of DP datasets with high fidelity and utility. PRIVIMAGE first establishes a semantic query function using a public dataset. Then, this function assists in querying the semantic distribution of the sensitive dataset, facilitating the selection of data from the public dataset with analogous semantics for pre-training. Finally, we pre-train an image generative model using the selected data and then fine-tune this model on the sensitive dataset using Differentially Private Stochastic Gradient Descent (DP-SGD). PRIVIMAGE allows us to train a lightly parameterized generative model, reducing the noise in the gradient during DP-SGD training and enhancing training stability. Extensive experiments demonstrate that PRIVIMAGE uses only 1% of the public dataset for pre-training and 7.6% of the parameters in the generative model compared to the state-of-the-art method, while achieving superior synthetic performance and conserving more computational resources. On average, PRIVIMAGE achieves 30.1% lower FID and 12.6% higher Classification Accuracy than the state-of-the-art method. The replication package and datasets can be accessed online.
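For readers unfamiliar with DP-SGD, the following is a minimal sketch of a single update in its standard form: clip each per-example gradient, sum, add Gaussian noise, then average. It is a generic illustration, not the PRIVIMAGE training code; the function name, argument names, and the plain-list representation of parameters are assumptions, and no privacy accounting is performed.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update on a flat parameter list."""
    dim = len(params)
    summed = [0.0] * dim
    for grad in per_example_grads:
        # Rescale so that each example's gradient has L2 norm at most clip_norm.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for j in range(dim):
            summed[j] += grad[j] * scale
    batch = len(per_example_grads)
    # Independent Gaussian noise per coordinate, scaled to the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]
```

Because noise is added independently to every coordinate, its expected squared norm is `dim * sigma**2`, which grows with the number of parameters. That is consistent with the abstract's remark that a lightly parameterized model reduces gradient noise.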
Neural-Integrated Meshfree (NIM) Method: A differentiable programming-based hybrid solver for computational mechanics
Authors: Authors: Honghui Du, QiZhi He
Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE)
Arxiv link: https://arxiv.org/abs/2311.12915
Pdf link: https://arxiv.org/pdf/2311.12915
Abstract We present the neural-integrated meshfree (NIM) method, a differentiable programming-based hybrid meshfree approach within the field of computational mechanics. NIM seamlessly integrates traditional physics-based meshfree discretization techniques with deep learning architectures. It employs a hybrid approximation scheme, NeuroPU, to effectively represent the solution by combining continuous DNN representations with partition of unity (PU) basis functions associated with the underlying spatial discretization. This neural-numerical hybridization not only enhances the solution representation through functional space decomposition but also reduces both the size of DNN model and the need for spatial gradient computations based on automatic differentiation, leading to a significant improvement in training efficiency. Under the NIM framework, we propose two truly meshfree solvers: the strong form-based NIM (S-NIM) and the local variational form-based NIM (V-NIM). In the S-NIM solver, the strong-form governing equation is directly considered in the loss function, while the V-NIM solver employs a local Petrov-Galerkin approach that allows the construction of variational residuals based on arbitrary overlapping subdomains. This ensures both the satisfaction of underlying physics and the preservation of meshfree property. We perform extensive numerical experiments on both stationary and transient benchmark problems to assess the effectiveness of the proposed NIM methods in terms of accuracy, scalability, generalizability, and convergence properties. Moreover, comparative analysis with other physics-informed machine learning methods demonstrates that NIM, especially V-NIM, significantly enhances both accuracy and efficiency in end-to-end predictive capabilities.
SD-NAE: Generating Natural Adversarial Examples with Stable Diffusion
Authors: Authors: Yueqian Lin, Jingyang Zhang, Yiran Chen, Hai Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.12981
Pdf link: https://arxiv.org/pdf/2311.12981
Abstract Robustly evaluating deep learning image classifiers is challenging due to some limitations of standard datasets. Natural Adversarial Examples (NAEs), arising naturally from the environment and capable of deceiving classifiers, are instrumental in identifying vulnerabilities in trained models. Existing works collect such NAEs by filtering from a huge set of real images, a process that is passive and lacks control. In this work, we propose to actively synthesize NAEs with the state-of-the-art Stable Diffusion. Specifically, our method formulates a controlled optimization process, where we perturb the token embedding that corresponds to a specified class to synthesize NAEs. The generation is guided by the gradient of loss from the target classifier so that the created image closely mimics the ground-truth class yet fools the classifier. Named SD-NAE (Stable Diffusion for Natural Adversarial Examples), our innovative method is effective in producing valid and useful NAEs, which is demonstrated through a meticulously designed experiment. Our work thereby provides a valuable method for obtaining challenging evaluation data, which in turn can potentially advance the development of more robust deep learning models. Code is available at https://github.com/linyueqian/SD-NAE.
CovarNav: Machine Unlearning via Model Inversion and Covariance Navigation
Authors: Authors: Ali Abbasi, Chayne Thrash, Elaheh Akbari, Daniel Zhang, Soheil Kolouri
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2311.12999
Pdf link: https://arxiv.org/pdf/2311.12999
Abstract The rapid progress of AI, combined with its unprecedented public adoption and the propensity of large neural networks to memorize training data, has given rise to significant data privacy concerns. To address these concerns, machine unlearning has emerged as an essential technique to selectively remove the influence of specific training data points on trained models. In this paper, we approach the machine unlearning problem through the lens of continual learning. Given a trained model and a subset of training data designated to be forgotten (i.e., the "forget set"), we introduce a three-step process, named CovarNav, to facilitate this forgetting. Firstly, we derive a proxy for the model's training data using a model inversion attack. Secondly, we mislabel the forget set by selecting the most probable class that deviates from the actual ground truth. Lastly, we deploy a gradient projection method to minimize the cross-entropy loss on the modified forget set (i.e., learn incorrect labels for this set) while preventing forgetting of the inverted samples. We rigorously evaluate CovarNav on the CIFAR-10 and Vggface2 datasets, comparing our results with recent benchmarks in the field and demonstrating the efficacy of our proposed approach.
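The abstract does not spell out how its gradient projection is constructed, so the sketch below shows only the generic idea used in continual learning: remove from an update the components that lie along directions to be protected. How CovarNav chooses those directions is not reproduced; the `protected_directions` input and all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: turn the given vectors into an orthonormal basis of their span."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors that are (numerically) linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def project_out(grad, protected_directions):
    """Return grad with its components inside span(protected_directions) removed."""
    g = list(grad)
    for b in orthonormalize(protected_directions):
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g
```

A step along the projected gradient is orthogonal to every protected direction, so to first order it leaves quantities that depend only on those directions unchanged.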
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?
Authors: Authors: Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13110
Pdf link: https://arxiv.org/pdf/2311.13110
Abstract In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces. The goodness of such a representation can be evaluated by a principled measure, called sparse rate reduction, that simultaneously maximizes the intrinsic information gain and extrinsic sparsity of the learned representation. From this perspective, popular deep network architectures, including transformers, can be viewed as realizing iterative schemes to optimize this measure. Particularly, we derive a transformer block from alternating optimization on parts of this objective: the multi-head self-attention operator compresses the representation by implementing an approximate gradient descent step on the coding rate of the features, and the subsequent multi-layer perceptron sparsifies the features. This leads to a family of white-box transformer-like deep network architectures, named CRATE, which are mathematically fully interpretable. We show, by way of a novel connection between denoising and compression, that the inverse to the aforementioned compressive encoding can be realized by the same class of CRATE architectures. Thus, the so-derived white-box architectures are universal to both encoders and decoders. Experiments show that these networks, despite their simplicity, indeed learn to compress and sparsify representations of large-scale real-world image and text datasets, and achieve performance very close to highly engineered transformer-based models: ViT, MAE, DINO, BERT, and GPT2. We believe the proposed computational framework demonstrates great potential in bridging the gap between theory and practice of deep learning, from a unified perspective of data compression. Code is available at: https://ma-lab-berkeley.github.io/CRATE .
Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements
Authors: Authors: Alejandro Rodriguez Perez, Pablo Rivas
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2311.13118
Pdf link: https://arxiv.org/pdf/2311.13118
Abstract This project tackles the pressing issue of human trafficking in online C2C marketplaces through advanced Natural Language Processing (NLP) techniques. We introduce a novel methodology for generating pseudo-labeled datasets with minimal supervision, serving as a rich resource for training state-of-the-art NLP models. Focusing on tasks like Human Trafficking Risk Prediction (HTRP) and Organized Activity Detection (OAD), we employ cutting-edge Transformer models for analysis. A key contribution is the implementation of an interpretability framework using Integrated Gradients, providing explainable insights crucial for law enforcement. This work not only fills a critical gap in the literature but also offers a scalable, machine learning-driven approach to combat human exploitation online. It serves as a foundation for future research and practical applications, emphasizing the role of machine learning in addressing complex social issues.
Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow
Authors: Authors: Yinuo Ren, Tesi Xiao, Tanmay Gangwani, Anshuka Rangi, Holakou Rahmanian, Lexing Ying, Subhajit Sanyal
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13159
Pdf link: https://arxiv.org/pdf/2311.13159
Abstract Multi-objective optimization (MOO) aims to optimize multiple, possibly conflicting objectives with widespread applications. We introduce a novel interacting particle method for MOO inspired by molecular dynamics simulations. Our approach combines overdamped Langevin and birth-death dynamics, incorporating a "dominance potential" to steer particles toward global Pareto optimality. In contrast to previous methods, our method is able to relocate dominated particles, making it particularly adept at managing Pareto fronts of complicated geometries. Our method is also theoretically grounded as a Wasserstein-Fisher-Rao gradient flow with convergence guarantees. Extensive experiments confirm that our approach outperforms state-of-the-art methods on challenging synthetic and real-world datasets.
SecureCut: Federated Gradient Boosting Decision Trees with Efficient Machine Unlearning
Authors: Authors: Jian Zhang, Bowen Li, Jie Li, Chentao Wu
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Arxiv link: https://arxiv.org/abs/2311.13174
Pdf link: https://arxiv.org/pdf/2311.13174
Abstract In response to legislation mandating companies to honor the right to be forgotten by erasing user data, it has become imperative to enable data removal in Vertical Federated Learning (VFL) where multiple parties provide private features for model training. In VFL, data removal, i.e., machine unlearning, often requires removing specific features across all samples under privacy guarantee in federated learning. To address this challenge, we propose SecureCut, a novel Gradient Boosting Decision Tree (GBDT) framework that effectively enables both instance unlearning and feature unlearning without the need for retraining from scratch. Leveraging a robust GBDT structure, we enable effective data deletion while reducing degradation of model performance. Extensive experimental results on popular datasets demonstrate that our method achieves superior model utility and forgetfulness compared to state-of-the-art methods. To our best knowledge, this is the first work that investigates machine unlearning in VFL scenarios.
Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing
Authors: Authors: Xingyu Chen, Xinyu Zhang, Qiyue Xia, Xinmin Fang, Chris Xiaoxuan Lu, Zhengxiong Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13182
Pdf link: https://arxiv.org/pdf/2311.13182
Abstract Millimeter wave (mmWave) sensing is an emerging technology with applications in 3D object characterization and environment mapping. However, realizing precise 3D reconstruction from sparse mmWave signals remains challenging. Existing methods rely on data-driven learning, constrained by dataset availability and difficulty in generalization. We propose DiffSBR, a differentiable framework for mmWave-based 3D reconstruction. DiffSBR incorporates a differentiable ray tracing engine to simulate radar point clouds from virtual 3D models. A gradient-based optimizer refines the model parameters to minimize the discrepancy between simulated and real point clouds. Experiments using various radar hardware validate DiffSBR's capability for fine-grained 3D reconstruction, even for novel objects unseen by the radar previously. By integrating physics-based simulation with gradient optimization, DiffSBR transcends the limitations of data-driven approaches and pioneers a new paradigm for mmWave sensing.
Test-time Adaptive Vision-and-Language Navigation
Authors: Authors: Junyu Gao, Xuan Yao, Changsheng Xu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13209
Pdf link: https://arxiv.org/pdf/2311.13209
Abstract Vision-and-Language Navigation (VLN) has witnessed significant advancements in recent years, largely attributed to meticulously curated datasets and proficiently trained models. Nevertheless, when tested in diverse environments, the trained models inevitably encounter significant shifts in data distribution, highlighting that relying solely on pre-trained and fixed navigation models is insufficient. To enhance models' generalization ability, test-time adaptation (TTA) demonstrates significant potential in the computer vision field by leveraging unlabeled test samples for model updates. However, simply applying existing TTA methods to the VLN task cannot well handle the adaptability-stability dilemma of VLN models, i.e., frequent updates can result in drastic changes in model parameters, while occasional updates can make the models ill-equipped to handle dynamically changing environments. Therefore, we propose a Fast-Slow Test-Time Adaptation (FSTTA) approach for VLN by performing decomposition-accumulation analysis for both gradients and parameters in a unified framework. Specifically, in the fast update phase, gradients generated during the recent multi-step navigation process are decomposed into components with varying levels of consistency. Then, these components are adaptively accumulated to pinpoint a concordant direction for fast model adaptation. In the slow update phase, historically recorded parameters are gathered, and a similar decomposition-accumulation analysis is conducted to revert the model to a stable state. Extensive experiments show that our method obtains impressive performance gains on four popular benchmarks.
Asymptotically compatible energy and dissipation law of the nonuniform L2-$1_σ$ scheme for time fractional Allen-Cahn model
Authors: Authors: Hong-lin Liao, Xiaohan Zhu, Hong Sun
Subjects: Numerical Analysis (math.NA)
Arxiv link: https://arxiv.org/abs/2311.13216
Pdf link: https://arxiv.org/pdf/2311.13216
Abstract We build an asymptotically compatible energy of the variable-step L2-$1_{\sigma}$ scheme for the time-fractional Allen-Cahn model with the Caputo's fractional derivative of order $\alpha\in(0,1)$, under a weak step-ratio constraint $\tau_k/\tau_{k-1}\geq r_{\star}(\alpha)$ for $k\ge2$, where $\tau_k$ is the $k$-th time-step size and $r_{\star}(\alpha)\in(0.3865,0.4037)$ for $\alpha\in(0,1)$. It provides a positive answer to the open problem in [J. Comput. Phys., 414:109473], and, to the best of our knowledge, it is the first second-order nonuniform time-stepping scheme to preserve both the maximum bound principle and the energy dissipation law of time-fractional Allen-Cahn model. The compatible discrete energy is constructed via a novel discrete gradient structure of the second-order L2-$1_{\sigma}$ formula by a local-nonlocal splitting technique. It splits the discrete fractional derivative into two parts: one is a local term analogue to the trapezoid rule of the first derivative and the other is a nonlocal summation analogue to the L1 formula of Caputo derivative. Numerical examples with an adaptive time-stepping strategy are provided to show the effectiveness of our scheme and the asymptotic properties of the associated modified energy.
Hard Label Black Box Node Injection Attack on Graph Neural Networks
Authors: Authors: Yu Zhou, Zihao Dong, Guofeng Zhang, Jingchen Tang
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Social and Information Networks (cs.SI)
Arxiv link: https://arxiv.org/abs/2311.13244
Pdf link: https://arxiv.org/pdf/2311.13244
Abstract While graph neural networks have achieved state-of-the-art performances in many real-world tasks including graph classification and node classification, recent works have demonstrated they are also extremely vulnerable to adversarial attacks. Most previous works have focused on attacking node classification networks under impractical white-box scenarios. In this work, we will propose a non-targeted Hard Label Black Box Node Injection Attack on Graph Neural Networks, which to the best of our knowledge, is the first of its kind. Under this setting, more real world tasks can be studied because our attack assumes no prior knowledge about (1): the model architecture of the GNN we are attacking; (2): the model's gradients; (3): the output logits of the target GNN model. Our attack is based on an existing edge perturbation attack, from which we restrict the optimization process to formulate a node injection attack. In the work, we will evaluate the performance of the attack using three datasets, COIL-DEL, IMDB-BINARY, and NCI1.
Hierarchical Matrix Factorization for Interpretable Collaborative Filtering
Authors: Authors: Kai Sugahara, Kazushi Okamoto
Subjects: Information Retrieval (cs.IR)
Arxiv link: https://arxiv.org/abs/2311.13277
Pdf link: https://arxiv.org/pdf/2311.13277
Abstract Matrix factorization (MF) is a simple collaborative filtering technique that achieves superior recommendation accuracy by decomposing the user-item rating matrix into user and item latent matrices. This approach relies on learning from user-item interactions, which may not effectively capture the underlying shared dependencies between users or items. Therefore, there is scope to explicitly capture shared dependencies to further improve recommendation accuracy and the interpretability of learning results by summarizing user-item interactions. Based on these insights, we propose "Hierarchical Matrix Factorization" (HMF), which incorporates clustering concepts to capture the hierarchy, where leaf nodes and other nodes correspond to users/items and clusters, respectively. Central to our approach, called hierarchical embeddings, is the additional decomposition of the user and item latent matrices (embeddings) into probabilistic connection matrices, which link the hierarchy, and a root cluster latent matrix. Thus, each node is represented by the weighted average of the embeddings of its parent clusters. The embeddings are differentiable, allowing simultaneous learning of interactions and clustering using a single gradient descent method. Furthermore, the obtained cluster-specific interactions naturally summarize user-item interactions and provide interpretability. Experimental results on rating and ranking predictions demonstrated the competitiveness of HMF over vanilla and hierarchical MF methods, especially its robustness in sparse interactions. Additionally, it was confirmed that the clustering integration of HMF has the potential for faster learning convergence and mitigation of overfitting compared to MF, and also provides interpretability through a cluster-centered case study.
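The statement that each node is a weighted average of its parent clusters' embeddings can be sketched for a single hierarchy level as follows. This is an illustrative reading of the abstract, not the HMF implementation: the softmax used to make each row of the connection matrix a probability vector is an assumption, and the function names are hypothetical.

```python
import math


def softmax(row):
    """Turn unconstrained scores into non-negative weights that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_embeddings(connection_logits, cluster_embeddings):
    """Embed each node as the weighted average of the cluster embeddings.

    connection_logits: one row of scores per node, one score per cluster.
    cluster_embeddings: one latent vector per cluster.
    """
    dim = len(cluster_embeddings[0])
    out = []
    for logits in connection_logits:
        weights = softmax(logits)
        out.append([
            sum(w * c[d] for w, c in zip(weights, cluster_embeddings))
            for d in range(dim)
        ])
    return out
```

Both the connection scores and the cluster vectors enter through differentiable operations, which is what would allow a single gradient descent procedure to learn clustering and interactions together, as the abstract describes.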
Differentially Private Non-Convex Optimization under the KL Condition with Optimal Rates
Authors: Authors: Michael Menart, Enayat Ullah, Raman Arora, Raef Bassily, Cristóbal Guzmán
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Optimization and Control (math.OC); Machine Learning (stat.ML)
Arxiv link: https://arxiv.org/abs/2311.13447
Pdf link: https://arxiv.org/pdf/2311.13447
Abstract We study private empirical risk minimization (ERM) problem for losses satisfying the $(\gamma,\kappa)$-Kurdyka-Łojasiewicz (KL) condition. The Polyak-Łojasiewicz (PL) condition is a special case of this condition when $\kappa=2$. Specifically, we study this problem under the constraint of $\rho$ zero-concentrated differential privacy (zCDP). When $\kappa\in[1,2]$ and the loss function is Lipschitz and smooth over a sufficiently large region, we provide a new algorithm based on variance reduced gradient descent that achieves the rate $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^\kappa\big)$ on the excess empirical risk, where $n$ is the dataset size and $d$ is the dimension. We further show that this rate is nearly optimal. When $\kappa \geq 2$ and the loss is instead Lipschitz and weakly convex, we show it is possible to achieve the rate $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^\kappa\big)$ with a private implementation of the proximal point method. When the KL parameters are unknown, we provide a novel modification and analysis of the noisy gradient descent algorithm and show that this algorithm achieves a rate of $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^{\frac{2\kappa}{4-\kappa}}\big)$ adaptively, which is nearly optimal when $\kappa = 2$. We further show that, without assuming the KL condition, the same gradient descent algorithm can achieve fast convergence to a stationary point when the gradient stays sufficiently large during the run of the algorithm. Specifically, we show that this algorithm can approximate stationary points of Lipschitz, smooth (and possibly nonconvex) objectives with rate as fast as $\tilde{O}\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)$ and never worse than $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^{1/2}\big)$. The latter rate matches the best known rate for methods that do not rely on variance reduction.
Keyword: super-resolution

Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution
Authors: Yuxuan Zhou, Liangcai Gao, Zhi Tang, Baole Wei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Arxiv link: https://arxiv.org/abs/2311.13317
Pdf link: https://arxiv.org/pdf/2311.13317
Abstract Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images, consequently elevating recognition accuracy in Scene Text Recognition (STR). Previous methods predominantly employ discriminative Convolutional Neural Networks (CNNs) augmented with diverse forms of text guidance to address this issue. Nevertheless, they remain deficient when confronted with severely blurred images, due to their insufficient generation capability when little structural or semantic information can be extracted from original images. Therefore, we introduce RGDiffSR, a Recognition-Guided Diffusion model for scene text image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios. Moreover, we propose a Recognition-Guided Denoising Network, which guides the diffusion model toward generating LR-consistent results through succinct semantic guidance. Experiments on the TextZoom dataset demonstrate the superiority of RGDiffSR over prior state-of-the-art methods in both text recognition accuracy and image fidelity.
