Abstract
Multi-server queueing systems are widely used models for job scheduling in machine learning, wireless networks, and crowdsourcing. This paper considers a system with multiple servers and multiple types of jobs. The system maintains a separate queue for each job type. In each time slot, each available server picks a job from a queue and then serves the job until it is complete. The arrival rates of the queues and the mean service times are unknown and even nonstationary. We propose the MaxWeight with discounted upper confidence bound (UCB) algorithm, which simultaneously learns the statistics and schedules jobs to servers. We prove that the proposed algorithm can stabilize the queues when the arrival rates are strictly within the service capacity region. Specifically, we prove that the queue lengths are bounded in the mean under the assumption that the mean service times change relatively slowly over time and the arrival rates are bounded away from the boundary of the capacity region by a constant whose value depends on the discount factor used in the discounted UCB. Simulation results confirm that the proposed algorithm can stabilize the queues and that it outperforms MaxWeight with empirical mean and MaxWeight with discounted empirical mean. The proposed algorithm is also better than MaxWeight with UCB in the nonstationary setting.
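As a rough illustration of the idea in this abstract (not the paper's exact algorithm), the sketch below combines a discounted estimate with an optimism bonus and a MaxWeight-style choice. The class name DiscountedUCB, the function maxweight_choice, the bonus formula, and the convention that an observation is 1 when a job completes in a slot and 0 otherwise are all assumptions made for this example.

```python
import math


class DiscountedUCB:
    """Discounted estimate of a service rate in [0, 1] plus an optimism bonus.

    Hypothetical sketch: the bonus below is one common choice, not
    necessarily the one analysed in the paper.
    """

    def __init__(self, discount=0.99, bonus_scale=1.0):
        self.discount = discount          # must lie in [0, 1)
        self.bonus_scale = bonus_scale
        self.weighted_sum = 0.0
        self.weighted_count = 0.0

    def step(self, observation=None):
        # Old data loses weight every slot, whether or not a sample arrived.
        self.weighted_sum *= self.discount
        self.weighted_count *= self.discount
        if observation is not None:
            self.weighted_sum += observation
            self.weighted_count += 1.0

    def mean(self):
        if self.weighted_count == 0.0:
            return 0.0
        return self.weighted_sum / self.weighted_count

    def ucb(self):
        # With (almost) no data, be fully optimistic.
        if self.weighted_count < 1e-12:
            return 1.0
        window = 1.0 / (1.0 - self.discount)  # effective memory length
        bonus = self.bonus_scale * math.sqrt(math.log(window) / self.weighted_count)
        return min(1.0, self.mean() + bonus)


def maxweight_choice(queue_lengths, estimators):
    """Index of the queue maximising queue length times optimistic service rate."""
    weights = [q * est.ucb() for q, est in zip(queue_lengths, estimators)]
    return max(range(len(weights)), key=weights.__getitem__)
```

A server would call maxweight_choice once per slot in which it is free, then feed the outcome back through step. Because the discount shrinks old samples geometrically, the estimate can follow service rates that drift over time, which a plain empirical mean cannot do.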
Keyword: scaling
A 0.6V$-$1.8V Compact Temperature Sensor with 0.24°C Resolution, $\pm$1.4°C Inaccuracy and 1.06nJ per Conversion
Authors: Benjamin Zambrano, Esteban Garzón, Sebastiano Strangio, Giuseppe Iannaccone, Marco Lanuzza
Abstract
This paper presents a fully-integrated CMOS temperature sensor for densely-distributed thermal monitoring in systems on chip supporting dynamic voltage and frequency scaling. The sensor front-end exploits a sub-threshold PMOS-based circuit to convert the local temperature into two biasing currents. These are then used to define two oscillation frequencies, whose ratio is proportional to absolute-temperature. Finally, the sensor back-end translates such frequency ratio into the digital temperature code. Thanks to its low-complexity architecture, the proposed design achieves a very compact footprint along with low-power consumption and high accuracy in a wide temperature range. Moreover, thanks to a simple embedded line regulation mechanism, our sensor supports voltage-scalability. The design was prototyped in a 180nm CMOS technology with a 0°C to 100°C temperature detection range, a very wide supply voltage operating range from 0.6V up to 1.8V and very small silicon area occupation of just 0.021 mm$^2$. Experimental measurements performed on 20 test chips have shown very competitive figures of merit, including a resolution of 0.24°C, an inaccuracy of $\pm$1.4°C, a sampling rate of about 1.5kHz and an energy per conversion of 1.06nJ at 30°C.
Local Optimization Often is Ill-conditioned in Genetic Programming for Symbolic Regression
Authors: Gabriel Kronberger
Subjects: Neural and Evolutionary Computing (cs.NE)
Abstract
Gradient-based local optimization has been shown to improve results of genetic programming (GP) for symbolic regression. Several state-of-the-art GP implementations use iterative nonlinear least squares (NLS) algorithms such as the Levenberg-Marquardt algorithm for local optimization. The effectiveness of NLS algorithms depends on appropriate scaling and conditioning of the optimization problem. This has so far been ignored in symbolic regression and GP literature. In this study we use a singular value decomposition of NLS Jacobian matrices to determine the numeric rank and the condition number. We perform experiments with a GP implementation and six different benchmark datasets. Our results show that rank-deficient and ill-conditioned Jacobian matrices occur frequently and for all datasets. The issue is less extreme when restricting GP tree size and when using many non-linear functions in the function set.
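To make the rank and condition-number diagnostic concrete, here is a small dependency-free sketch. The toy model f(x; theta) = theta[0] * theta[1] * x + theta[2], the helper names, and the relative tolerance are assumptions chosen for illustration; in practice a library SVD routine such as numpy.linalg.svd would normally replace the hand-written Jacobi iteration.

```python
import math


def singular_values(J, sweeps=30, tol=1e-12):
    """Singular values of J (a list of rows) via one-sided Jacobi rotations."""
    A = [list(col) for col in zip(*J)]  # work on the columns of J
    n = len(A)
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in A[p])
                beta = sum(x * x for x in A[q])
                gamma = sum(x * y for x, y in zip(A[p], A[q]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                A[p], A[q] = (
                    [c * x - s * y for x, y in zip(A[p], A[q])],
                    [s * x + c * y for x, y in zip(A[p], A[q])],
                )
        if not rotated:
            break
    # After convergence the column norms are the singular values.
    return sorted((math.sqrt(sum(x * x for x in col)) for col in A), reverse=True)


def rank_and_condition(J, rel_tol=1e-10):
    """Numeric rank (relative threshold) and 2-norm condition number of J."""
    sv = singular_values(J)
    largest, smallest = sv[0], sv[-1]
    rank = sum(1 for s in sv if s > rel_tol * largest)
    cond = math.inf if smallest == 0.0 else largest / smallest
    return rank, cond


def jacobian(theta, xs):
    """Jacobian of the toy model f(x; theta) = theta[0] * theta[1] * x + theta[2]."""
    a, b, _ = theta
    return [[b * x, a * x, 1.0] for x in xs]
```

In the toy model the two multiplied parameters are redundant, so the first two Jacobian columns are proportional and rank_and_condition(jacobian([2.0, 3.0, 1.0], [0.0, 1.0, 2.0, 3.0])) reports a numeric rank of 2 for 3 parameters together with a very large condition number. Redundant parameters of this kind are one plausible source of the rank deficiency the abstract describes.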
Normalization effects on deep neural networks
Abstract
We study the effect of normalization on the layers of deep neural networks of feed-forward type. A given layer $i$ with $N_i$ hidden units is allowed to be normalized by $1/N_i^{\gamma_i}$ with $\gamma_i \in [1/2,1]$ and we study the effect of the choice of the $\gamma_i$ on the statistical behavior of the neural network's output (such as variance) as well as on the test accuracy on the MNIST data set. We find that in terms of variance of the neural network's output and test accuracy the best choice is to choose the $\gamma_i$'s to be equal to one, which is the mean-field scaling. We also find that this is particularly true for the outer layer, in that the neural network's behavior is more sensitive in the scaling of the outer layer as opposed to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion for the neural network's output. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning rate hyperparameters. Such a choice guarantees that the neural network behaves in a statistically robust way as the $N_i$ grow to infinity.
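The effect of the scaling exponent on output variance can be seen in a much simpler setting than the paper's. The sketch below assumes a single hidden layer, a scalar input, tanh activations, and standard normal weights at initialization; none of these choices come from the abstract, and the function names are hypothetical.

```python
import math
import random


def scaled_output(x, hidden_weights, output_weights, gamma):
    """Output of a one-hidden-layer network normalized by 1 / N**gamma."""
    n = len(hidden_weights)
    total = sum(c * math.tanh(w * x) for w, c in zip(hidden_weights, output_weights))
    return total / n ** gamma


def output_variance_at_init(n_hidden, gamma, x=1.0, trials=500, seed=0):
    """Sample variance of the output over random initializations."""
    rng = random.Random(seed)  # fixed seed so the runs are reproducible
    outputs = []
    for _ in range(trials):
        w = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
        c = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
        outputs.append(scaled_output(x, w, c, gamma))
    mean = sum(outputs) / trials
    return sum((o - mean) ** 2 for o in outputs) / (trials - 1)
```

Comparing output_variance_at_init(100, 0.5) with output_variance_at_init(100, 1.0) shows the variance at initialization shrinking by a factor of N = 100 when moving from gamma = 1/2 to the mean-field value gamma = 1. This only illustrates the variance at initialization; the paper's conclusions about test accuracy and learning rates rest on its asymptotic analysis, not on this toy.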
Semi-Centralised Multi-Agent Reinforcement Learning with Policy-Embedded Training
Authors: Taher Jafferjee, Juliusz Ziomek, Tianpei Yang, Zipeng Dai, Jianhong Wang, Matthew Taylor, Kun Shao, Jun Wang, David Mguni
Subjects: Multiagent Systems (cs.MA); Machine Learning (cs.LG)
Abstract
Centralised training (CT) is the basis for many popular multi-agent reinforcement learning (MARL) methods because it allows agents to quickly learn high-performing policies. However, CT relies on agents learning from one-off observations of other agents' actions at a given state. Because MARL agents explore and update their policies during training, these observations often provide poor predictions about other agents' behaviour and the expected return for a given action. CT methods therefore suffer from high variance and error-prone estimates, harming learning. CT methods also suffer from explosive growth in complexity due to the reliance on global observations, unless strong factorisation restrictions are imposed (e.g., monotonic reward functions for QMIX). We address these challenges with a new semi-centralised MARL framework that performs policy-embedded training and decentralised execution. Our method, policy embedded reinforcement learning algorithm (PERLA), is an enhancement tool for Actor-Critic MARL algorithms that leverages a novel parameter sharing protocol and policy embedding method to maintain estimates that account for other agents' behaviour. Our theory proves PERLA dramatically reduces the variance in value estimates. Unlike various CT methods, PERLA, which seamlessly adopts MARL algorithms, scales easily with the number of agents without the need for restrictive factorisation assumptions. We demonstrate PERLA's superior empirical performance and efficient scaling in benchmark environments including StarCraft Micromanagement II and Multi-agent Mujoco.
Keyword: calibration
Tweak: Towards Portable Deep Learning Models for Domain-Agnostic LoRa Device Authentication
Authors: Jared Gaskin, Bechir Hamdaoui, Weng-Keen Wong
Abstract
Deep learning based device fingerprinting has emerged as a key method of identifying and authenticating devices solely via their captured RF transmissions. Conventional approaches are not portable to different domains in that if a model is trained on data from one domain, it will not perform well on data from a different but related domain. Examples of such domains include the receiver hardware used for collecting the data, the day/time on which data was captured, and the protocol configuration of devices. This work proposes Tweak, a technique that, using metric learning and a calibration process, enables a model trained with data from one domain to perform well on data from another domain. This process is accomplished with only a small amount of training data from the target domain and without changing the weights of the model, which makes the technique computationally lightweight and thus suitable for resource-limited IoT networks. This work evaluates the effectiveness of Tweak vis-a-vis its ability to identify IoT devices using a testbed of real LoRa-enabled devices under various scenarios. The results of this evaluation show that Tweak is viable and especially useful for networks with limited computational resources and applications with time-sensitive missions.
Keyword: out of distribution detection
There is no result
Keyword: out-of-distribution detection
There is no result
Keyword: expected calibration error
There is no result
Keyword: overconfident
There is no result
Keyword: overconfidence
There is no result
Keyword: confidence
MaxWeight With Discounted UCB: A Provably Stable Scheduling Policy for Nonstationary Multi-Server Systems With Unknown Statistics
Keyword: scaling
A 0.6V$-$1.8V Compact Temperature Sensor with 0.24°C Resolution, $\pm$1.4°C Inaccuracy and 1.06nJ per Conversion
Local Optimization Often is Ill-conditioned in Genetic Programming for Symbolic Regression
Normalization effects on deep neural networks
Semi-Centralised Multi-Agent Reinforcement Learning with Policy-Embedded Training
Keyword: calibration
Tweak: Towards Portable Deep Learning Models for Domain-Agnostic LoRa Device Authentication