Note that this used to be called genrandom, until I found there was already a fairly useless standard Linux utility called genrandom. So djenrandom became the name.
This program generates random data with known controlled statistical properties. Its primary reason for existing is to provide test data for calibrating and validating random number testing algorithms.
It implements a number of models, selected with the -m option:
Pure : Uniform random data.
SUMS : Step Update Metastable Source. This models a step update metastable entropy source of the type used in Intel CPUs.
Biased : This model allows the probability of a 1 or 0 to be controlled.
Correlated : This model allows the serial correlation coefficient to be controlled.
Normal : This model generates Normal (or Gaussian) distributed data and outputs as floating point values.
SinBias : This model has a sinusoidally varying bias.
Markov 2 Parameter : This implements a two state model. States 1 and 0, which output 1 and 0 respectively. Two parameters give the probability of transitioning from 1 to 0 and from 0 to 1. This model allows both bias and serial correlation to be modelled in the same data series.
Markov Sigmoid : This generates bits by walking along a finite Markov chain with transition probabilities set according to a chosen sigmoid curve. Moving left generates 0, moving right generates 1. This enables both bias and serial correlation to be modelled in the same data series.
File : This reads data from a file and re-outputs it.
This program generates random data in 1KiByte blocks. The number of blocks is controlled by the -k option.
Usage: djrandom [-bsvhn] [-x
[-r <right_stepsize>] [--stepnoise=<noise on step>] [--bias=<bias>]
[--correlation=<correlation>] [--mean=<normal mean>] [--variance=<normal variance>]
[--pcg_state_size=<16|32|64>] [--pcg_generator=<LCG|MCG>] [--pcg_of=<XSH_RS|XSH|RR>]
[--sinbias_offset=<0.0 to 1.0>] [--sinbias_amplitude=<0.0 to 1.0>] [--sinbias_period=<samples per cycle>]
[--p10=<probability of 10 transition>] [--p01=<probability of 01 transition>]
[--states=<integer of number of states in the markov chain>]
[--sigmoid=<flat|linear|sums|logistic|tanh|atan|gudermann|erf|algebraic>]
[--min_range=<float less than max_range>][--max_range=<float greater than min_range>]
[-o <output_filename>] [-j <j filename>] [-i <input filename>] [-f <hex|binary|01>]
[-J <json_filename>] [-Y <yaml_filename>]
[--bpb=<binary bits per byte>]
[-k <1K_Blocks>] [-w [1..256]]
[-D <deterministic seed string>]
Generate random bits with configurable non-uniformities. Author: David Johnston, dj@deadhat.com
-m, --model=<pure(default)|sums|biased|correlated|lcg|pcg|xorshift|normal|file> Select random source model
Step Update Metastable Source model (-m sums) Options
-l, --left=
Biased model (-m biased) Options
--bias=
Correlated model (-m correlated) Options
--correlation=
Sinusoidally Varying Bias model (-m sinbias) Options
--sinbias_amplitude=<0.0 to 1.0> Amplitude of the variation of the bias between 0.0 and 1.0. Only for sinbias model
--sinbias_offset=<0.0 to 1.0> Midpoint Offset of the varying bias between 0.0 and 1.0. Only for sinbias model
--sinbias_period=
Two Parameter Markov model (-m markov_2_param) Options
--fast Use a fast version of the generator.
and one set of:
--p10=<0.0 to 1.0> The probability of a 1 following a 0, default 0.5
--p01=<0.0 to 1.0> The probability of a 0 following a 1, default 0.5
or
--bias=<0.0 to 1.0> The ones probability, default 0.5
--correlation=<-1.0 to 1.0> The serial correlation coefficient, default 0.0
or
--entropy=<0.0 to 1.0> The per bit entropy, default 1.0
--bitwidth=<3 to 64> The number of bits per symbol
Sigmoid Markov model (-m markov_sigmoid) Options
--states=
Normal model (-m normal) Options
--mean=
Linear Congruential Generator model (-m lcg) Options
--lcg_a=
Permuted Congruential Generator model (-m pcg) Options
--pcg_state_size=
XorShift model (-m xorshift) Options
--xorshift_size=[state size of xorshift] 32 or 128
General Options
-x, --xor=
File Options
-o
Output Format Options
-b, --binary output in raw binary format
--bpb Number of bits per byte to output in binary output mode. Default 8.
-w, --width=[1...256] Bytes per line of output
The most important option of all
-h, --help print this help and exit
Pure : The data produced from the Pure model is indistinguishable from uniform random bits where each bit is independent and has a 50% probability of being 1. It is generated from a variant of a CTR_DRBG with a couple of extra AES stages thrown in for fun.
SUMS : Step Update Metastable Source. This models a dual differential feedback cross coupled latch, as used in the Intel DRNG Entropy Source that feeds the RdRand and RdSeed instructions. It has a control variable t, which moves left or right based on evaluating a probability of moving away from the center. The curve is defined as P = 0.5 exp(-0.5 t*t). This is computed with floating point arithmetic. The model has options to vary the left and right step sizes and to add noise to the step sizes.
Biased : This generates bits according to a given probability (bias) that the bit is 1.
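As a rough illustration of the step update idea, here is a minimal Python sketch. It is not the program's implementation. The function name, the default step sizes, the starting point t = 0 and the mapping of a right step to a 1 bit are all assumptions made for this example, and the step noise option is left out.

```python
import math
import random


def sums_bits(n, left_step=0.1, right_step=0.1, seed=None):
    """Sketch of a step-update walk driven by P = 0.5 * exp(-0.5 * t * t)."""
    rng = random.Random(seed)
    t = 0.0  # control variable, assumed to start at the center
    bits = []
    for _ in range(n):
        # Probability of stepping away from the center at the current t.
        p_away = 0.5 * math.exp(-0.5 * t * t)
        move_away = rng.random() < p_away
        # "Away" is to the right when t >= 0 and to the left when t < 0.
        go_right = move_away if t >= 0.0 else not move_away
        if go_right:
            t += right_step
            bits.append(1)  # assumption: a right step emits 1
        else:
            t -= left_step
            bits.append(0)  # assumption: a left step emits 0
    return bits
```

Because the probability of moving away from the center falls as |t| grows, the walk keeps being pulled back, which is what makes successive bits dependent on each other.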
Correlated : This model generates data with 50% bias and the given serial correlation coefficient. The probability of a bit being the same as the previous bit is computed from the SCC. P(a=b) = (1+scc)/2. This relationship only holds for unbiased bits.
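The relationship P(a=b) = (1+scc)/2 can be tried out with a short Python sketch. The function names and the seed parameter are made up for this example and are not part of djenrandom.

```python
import random


def correlated_bits(n, scc, seed=None):
    """Unbiased bits where each bit repeats the previous one with P = (1 + scc) / 2."""
    rng = random.Random(seed)
    p_same = (1.0 + scc) / 2.0
    prev = rng.getrandbits(1)
    bits = [prev]
    for _ in range(n - 1):
        if rng.random() >= p_same:
            prev ^= 1  # flip with probability 1 - p_same
        bits.append(prev)
    return bits


def serial_correlation(bits):
    """Lag-1 serial correlation coefficient estimated from the data."""
    n = len(bits)
    mean = sum(bits) / n
    var = sum((b - mean) ** 2 for b in bits) / n
    cov = sum((bits[i] - mean) * (bits[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return cov / var
```

Calling `serial_correlation(correlated_bits(100000, 0.3, seed=1))` should give a value close to 0.3, with the usual sampling error.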
Normal : This model generates Normal (or Gaussian) distributed data and outputs as floating point values. The algorithm to compute normal variates uses the Marsaglia Polar Method.
SinBias : This model has a sinusoidally varying bias. This is one of the models used by NIST in evaluating the SP800-90B Non-IID entropy lower bound tests. The frequency and amplitude of the sinusoid can be controlled.
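For reference, the Marsaglia polar method can be sketched in a few lines of Python. This is a textbook version, not the program's code. The function name and arguments are illustrative, with mean and variance mirroring the --mean and --variance options.

```python
import math
import random


def normal_pair(rng, mean=0.0, variance=1.0):
    """Return two independent normal variates using the Marsaglia polar method."""
    while True:
        # Pick a point uniformly in the square [-1, 1) x [-1, 1).
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        # Keep it only if it lies strictly inside the unit circle (and s != 0).
        if 0.0 < s < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)
    sd = math.sqrt(variance)
    return mean + sd * u * factor, mean + sd * v * factor
```

Each accepted point yields two variates, and about 21% of candidate points are rejected for falling outside the circle.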
Markov 2 Parameter : This implements a two state model. States 1 and 0, which output 1 and 0 respectively. Two parameters give the probability of transitioning from 1 to 0 and from 0 to 1. This model allows both bias and serial correlation to be modelled in the same data series. A three way relationship exists between the P01,P10 Markov parameters, the SCC and mean of the generated data, and the entropy of the generated data, allowing data with known SCC, mean and entropy to be generated. This is useful for testing entropy estimation algorithms. Any one of the following can be given: the transition parameters, the SCC and mean, or the entropy. If the entropy is given, then there is an infinite set of P01,P10 pairs that generate that entropy level. They exist on a closed curve on the P01,P10 plane. The program will pick one at random.
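The link between the transition parameters and the bias and SCC follows from the standard two state Markov chain results, sketched below in Python. The sketch takes p01 to be the probability of moving from state 0 to state 1 and p10 the probability of moving from state 1 to state 0. That convention, and all the function names, are assumptions for this example. The entropy relationship is not covered here.

```python
import random


def bias_scc_from_transitions(p01, p10):
    """Stationary ones-probability and lag-1 SCC of a two state chain."""
    bias = p01 / (p01 + p10)  # stationary P(bit = 1)
    scc = 1.0 - p01 - p10     # serial correlation coefficient
    return bias, scc


def transitions_from_bias_scc(bias, scc):
    """Inverse mapping; raises if the requested pair is not achievable."""
    p01 = bias * (1.0 - scc)
    p10 = (1.0 - bias) * (1.0 - scc)
    if not (0.0 <= p01 <= 1.0 and 0.0 <= p10 <= 1.0):
        raise ValueError("no two state chain has this bias and SCC")
    return p01, p10


def markov_bits(n, p01, p10, seed=None):
    """Generate n bits by walking the two state chain."""
    rng = random.Random(seed)
    state = rng.getrandbits(1)
    bits = []
    for _ in range(n):
        bits.append(state)
        flip_probability = p10 if state == 1 else p01
        if rng.random() < flip_probability:
            state ^= 1
    return bits
```

For example, a bias of 0.7 with an SCC of 0.2 maps to p01 = 0.56 and p10 = 0.24 under this convention. Not every combination is reachable: a bias of 0.7 with an SCC of -0.5 would need p01 above 1.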
Markov Sigmoid : This generates bits by walking along a finite Markov chain with transition probabilities set according to a chosen sigmoid curve. Moving left generates 0, moving right generates 1. This is similar to the SUMS model, except it is a finite state model, not a floating point model. By choosing a range and curve appropriately, it is easy to model the feedback curve in a feedback controlled entropy source. A paper by Rachael Parker (DOI: 10.1109/IVSW.2017.8031540) includes a proof that the occupancy of the Markov states follows a normal distribution for any sigmoid. From this the average min entropy of a group of bits from the source can be computed from the weighted average of the min entropies of bits from individual states. The curves are:
Flat : P(move_left | x) = 0.5
Linear : P(move_left | x) = x
algebraic : P(move_left | x) = x/sqrt(1.0+(x*x))
atan : P(move_left | x) = arctan(x)
tanh : P(move_left | x) = hyperbolic_tangent(x)
erf : P(move_left | x) = erf(x)
gudermann : P(move_left | x) = 2.0*arctan(hyperbolic_tangent(x/2))
logistic : P(move_left | x) = 1.0/(1.0+exp(-x))
The range gives the bounds on the chosen curve. The algorithm scales the vertical position of the curve
to vary between -1 and +1, so that the curve intersects the 0.5 region.
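To make the chain walk concrete, here is a minimal Python sketch using the logistic curve. It is only an illustration under several assumptions: the states are spaced evenly over the range, the walk starts in the middle state, a move off either end leaves the state unchanged, and the rescaling step is skipped because the logistic curve already lies between 0 and 1. The function name and default values are invented for the example.

```python
import math
import random


def markov_sigmoid_bits(n, states=31, min_range=-3.0, max_range=3.0, seed=None):
    """Walk a finite chain with P(move_left | x) = 1 / (1 + exp(-x))."""
    rng = random.Random(seed)
    # Position x of each state, evenly spaced over the chosen range.
    step = (max_range - min_range) / (states - 1)
    xs = [min_range + i * step for i in range(states)]
    # Probability of moving left from each state.
    p_left = [1.0 / (1.0 + math.exp(-x)) for x in xs]
    state = states // 2  # assumption: start in the middle state
    bits = []
    for _ in range(n):
        if rng.random() < p_left[state]:
            bits.append(0)  # moving left generates 0
            state = max(state - 1, 0)  # assumption: clamp at the left end
        else:
            bits.append(1)  # moving right generates 1
            state = min(state + 1, states - 1)  # assumption: clamp at the right end
    return bits
```

With a range that is symmetric about zero the output is close to unbiased. Shifting the range to one side moves the resting point of the walk, which is one way to see how the range and curve shape the feedback behaviour.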
File : This reads a given file and outputs the data. This is useful for format conversion. It provides flexible input and output formats.