Closed secworks closed 3 weeks ago
Fixed format, CI fails on hashes - which it should at this point.
This implementation yields:
➜ ~ ent ram_out_xorwow.bin
Entropy = 7.998632 bits per byte.
Optimum compression would reduce the size
of this 131072 byte file by 0 percent.
Chi square distribution for 131072 samples is 248.17, and randomly
would exceed this value 60.85 percent of the times.
Arithmetic mean value of data bytes is 127.2708 (127.5 = random).
Monte Carlo value for Pi is 3.135179675 (error 0.20 percent).
Serial correlation coefficient is 0.000223 (totally uncorrelated = 0.0).
Startup time is not noticebly slower than original. Needs to be measured somehow.
ent on our current implementation gives:
Entropy = 7.954691 bits per byte.
Optimum compression would reduce the size
of this 131072 byte file by 0 percent.
Chi square distribution for 131072 samples is 8146.58, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 127.5535 (127.5 = random).
Monte Carlo value for Pi is 3.132249943 (error 0.30 percent).
Serial correlation coefficient is 0.009923 (totally uncorrelated = 0.0).
NOTE: that the app is still present in this .bin-files and are obviously not random. The first 0x10C bytes should be cropped away.
Updated where I cropped the first 272 bytes (which should be the app).
➜ ~ ent ram_out_xorwow_crop.bin
Entropy = 7.998641 bits per byte.
Optimum compression would reduce the size
of this 130800 byte file by 0 percent.
Chi square distribution for 130800 samples is 246.23, and randomly
would exceed this value 64.18 percent of the times.
Arithmetic mean value of data bytes is 127.3576 (127.5 = random).
Monte Carlo value for Pi is 3.133211009 (error 0.27 percent).
Serial correlation coefficient is -0.000621 (totally uncorrelated = 0.0).
➜ ~ ent ram_out_orig_crop.bin
Entropy = 7.954721 bits per byte.
Optimum compression would reduce the size
of this 130800 byte file by 0 percent.
Chi square distribution for 130800 samples is 8124.03, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 127.6415 (127.5 = random).
Monte Carlo value for Pi is 3.135779817 (error 0.19 percent).
Serial correlation coefficient is 0.009079 (totally uncorrelated = 0.0).
Marginally better than with the app.
I propose we move forward on this and try to get it merged.
But what tests do we need to perform in order to be confident enough about the implementation? We have some initial results above, using ent where we could see an improvment in shi square distribution. However, maybe we should:
What do you think?
I've done som more tests to compare the C implementation with the Go model
First 10 values from C implementation:
addr: 0x0ab510c6, data: 0x10052812
addr: 0x00502e9f, data: 0x15e160a2
addr: 0x093237ad, data: 0x0752b749
addr: 0x0a75c99d, data: 0x66cc4d05
addr: 0x0ec0680d, data: 0x02ffccc4
addr: 0x0f36b45c, data: 0x97811f88
addr: 0x014c48d1, data: 0x3a78f3a1
addr: 0x05d5b3c9, data: 0xa594f538
addr: 0x096aa20e, data: 0x5d4efc72
addr: 0x0b86b00a, data: 0xdadf5f43
First 10 values from the Go model:
addr: 0x0ab510c6, data: 0x10052812
addr: 0x00502e9f, data: 0x15e160a2
addr: 0x093237ad, data: 0x0752b749
addr: 0x0a75c99d, data: 0x66cc4d05
addr: 0x0ec0680d, data: 0x02ffccc4
addr: 0x0f36b45c, data: 0x97811f88
addr: 0x014c48d1, data: 0x3a78f3a1
addr: 0x05d5b3c9, data: 0xa594f538
addr: 0x096aa20e, data: 0x5d4efc72
addr: 0x0b86b00a, data: 0xdadf5f43
I then generated 10M values from each implementation and compared the generated data files:
sha256sum big_c.txt
e3d4d2e5b7e64aca612e09ef831fcab91f1cad7cf66d1086ae931da900c2ef85 big_c.txt
shasum256 big_go.txt
e3d4d2e5b7e64aca612e09ef831fcab91f1cad7cf66d1086ae931da900c2ef85 big_go.txt
Generated 1 GW32 binary data from the C-implementation and tested with ENT:
Entropy = 8.000000 bits per byte.
Optimum compression would reduce the size
of this 4000000000 byte file by 0 percent.
Chi square distribution for 4000000000 samples is 215.37, and randomly
would exceed this value 96.60 percent of the times.
Arithmetic mean value of data bytes is 127.5001 (127.5 = random).
Monte Carlo value for Pi is 3.141636609 (error 0.00 percent).
Serial correlation coefficient is 0.000031 (totally uncorrelated = 0.0).
Results from testing with PractRand:
RNG_test using PractRand version 0.94
RNG = RNG_stdin8, seed = unknown
test set = core, folding = standard (8 bit)
rng=RNG_stdin8, seed=unknown
length= 512 megabytes (2^29 bytes), time= 3.9 seconds
Test Name Raw Processed Evaluation
BCFN(2+0,13-1,T) R= +3.2 p = 0.097 normal
BCFN(2+1,13-1,T) R= +18.4 p = 2.4e-9 VERY SUSPICIOUS
BCFN(2+2,13-2,T) R= +5.9 p = 0.011 normalish
BCFN(2+3,13-2,T) R= +3.1 p = 0.102 normal
BCFN(2+4,13-3,T) R= -0.5 p = 0.562 normal
BCFN(2+5,13-3,T) R= -1.8 p = 0.763 normal
BCFN(2+6,13-4,T) R= +1.3 p = 0.279 normal
BCFN(2+7,13-5,T) R= -2.4 p = 0.836 normal
BCFN(2+8,13-5,T) R= -0.7 p = 0.592 normal
BCFN(2+9,13-6,T) R= -0.7 p = 0.590 normal
BCFN(2+10,13-6,T) R= -0.0 p = 0.470 normal
BCFN(2+11,13-7,T) R= +0.3 p = 0.399 normal
BCFN(2+12,13-8,T) R= +1.5 p = 0.218 normal
BCFN(2+13,13-8,T) R= -0.4 p = 0.502 normal
BCFN(2+14,13-9,T) R= -2.4 p = 0.911 normal
BCFN(2+15,13-9,T) R= +1.6 p = 0.191 normal
DC6-9x1Bytes-1 R= +2679 p = 4e-1572 FAIL !!!!!!!!
Gap-16:A R= +3.3 p = 0.023 normal
Gap-16:B R= +1.3 p = 0.182 normal
FPF-14+6/16:(0,14-0) R= -3.7 p =1-4.7e-3 normal
FPF-14+6/16:(1,14-0) R= -2.0 p = 0.925 normal
FPF-14+6/16:(2,14-0) R= -4.8 p =1-3.7e-4 normal
FPF-14+6/16:(3,14-0) R= -3.8 p =1-3.3e-3 normal
FPF-14+6/16:(4,14-0) R= -4.6 p =1-5.8e-4 normal
FPF-14+6/16:(5,14-1) R= -1.8 p = 0.905 normal
FPF-14+6/16:(6,14-2) R= -0.9 p = 0.737 normal
FPF-14+6/16:(7,14-2) R= -3.7 p =1-4.0e-3 normal
FPF-14+6/16:(8,14-3) R= -1.8 p = 0.898 normal
FPF-14+6/16:(9,14-4) R= -1.5 p = 0.855 normal
FPF-14+6/16:(10,14-5) R= +0.2 p = 0.445 normal
FPF-14+6/16:(11,14-5) R= -1.9 p = 0.916 normal
FPF-14+6/16:(12,14-6) R= +0.0 p = 0.491 normal
FPF-14+6/16:(13,14-7) R= -0.6 p = 0.660 normal
FPF-14+6/16:(14,14-8) R= +1.1 p = 0.217 normal
FPF-14+6/16:(15,14-8) R= +0.2 p = 0.432 normal
FPF-14+6/16:(16,14-9) R= -0.9 p = 0.720 normal
FPF-14+6/16:(17,14-10) R= +0.6 p = 0.313 normal
FPF-14+6/16:(18,14-11) R= -0.8 p = 0.679 normal
FPF-14+6/16:(19,14-11) R= -2.1 p = 0.980 normal
FPF-14+6/16:all R= -9.7 p =1-4.6e-9 VERY SUSPICIOUS
FPF-14+6/16:cross R= -0.8 p = 0.785 normal
BRank(12):128(4) R= +2684 p~= 1e-1428 FAIL !!!!!!!!
BRank(12):256(4) R= +6301 p~= 5e-3352 FAIL !!!!!!!!
BRank(12):384(1) R= +4889 p~= 1e-1472 FAIL !!!!!!!!
BRank(12):512(2) R= +9364 p~= 5e-2820 FAIL !!!!!!!!
BRank(12):768(1) R= +9883 p~= 3e-2976 FAIL !!!!!!!!
BRank(12):1K(2) R=+18528 p~= 1e-5578 FAIL !!!!!!!!
BRank(12):1536(1) R=+19226 p~= 1e-5788 FAIL !!!!!!!!
BRank(12):2K(1) R=+25361 p~= 1e-7635 FAIL !!!!!!!!
mod3n(5):(0,9-0) R= +0.8 p = 0.349 normal
mod3n(5):(1,9-1) R= +3.9 p = 0.028 normal
mod3n(5):(2,9-1) R= +1.6 p = 0.216 normal
mod3n(5):(3,9-2) R= -1.2 p = 0.728 normal
mod3n(5):(4,9-2) R= -4.7 p =1-7.1e-3 normal
mod3n(5):(5,9-3) R= -3.1 p = 0.948 normal
mod3n(5):(6,9-3) R= +0.3 p = 0.434 normal
mod3n(5):(7,9-4) R= +1.4 p = 0.235 normal
mod3n(5):(8,9-4) R= -1.5 p = 0.777 normal
mod3n(5):(9,9-5) R= +3.7 p = 0.041 normal
mod3n(5):(10,9-5) R= +0.6 p = 0.355 normal
mod3n(5):(11,9-6) R= -1.5 p = 0.784 normal
mod3n(5):(12,9-6) R= +0.6 p = 0.316 normal
mod3n(5):(13,9-6) R= -1.6 p = 0.810 normal
mod3n(5):(14,9-6) R= -0.1 p = 0.466 normal
mod3n(5):(15,9-6) R= -1.7 p = 0.819 normal
TMFn(2+0):wl R= -3.0 p~= 0.8 normal
TMFn(2+1):wl R= +1.1 p~= 0.3 normal
TMFn(2+2):wl R= +1.1 p~= 0.3 normal
TMFn(2+3):wl R= -1.8 p~= 0.7 normal
TMFn(2+4):wl R= +0.5 p~= 0.4 normal
[Low1/8]BCFN(2+0,13-3,T) R= +0.9 p = 0.339 normal
[Low1/8]BCFN(2+1,13-3,T) R= -4.4 p = 0.970 normal
[Low1/8]BCFN(2+2,13-4,T) R= +1.5 p = 0.261 normal
[Low1/8]BCFN(2+3,13-4,T) R= -2.5 p = 0.851 normal
[Low1/8]BCFN(2+4,13-5,T) R= -1.9 p = 0.779 normal
[Low1/8]BCFN(2+5,13-5,T) R= -1.1 p = 0.657 normal
[Low1/8]BCFN(2+6,13-6,T) R= -0.4 p = 0.527 normal
[Low1/8]BCFN(2+7,13-6,T) R= -3.9 p = 0.968 normal
[Low1/8]BCFN(2+8,13-7,T) R= -2.0 p = 0.799 normal
[Low1/8]BCFN(2+9,13-8,T) R= +2.1 p = 0.162 normal
[Low1/8]BCFN(2+10,13-8,T) R= +3.1 p = 0.097 normal
[Low1/8]BCFN(2+11,13-9,T) R= +1.2 p = 0.238 normal
[Low1/8]BCFN(2+12,13-9,T) R= -0.7 p = 0.553 normal
[Low1/8]DC6-9x1Bytes-1 R= +4.2 p = 0.039 normalish
[Low1/8]Gap-16:A R= +0.1 p = 0.636 normal
[Low1/8]Gap-16:B R= -1.6 p = 0.865 normal
[Low1/8]FPF-14+6/16:(0,14-0) R= +3.0 p = 0.018 normal
[Low1/8]BCFN(2+10,13-8,T) R= +3.1 p = 0.097 normal
[Low1/8]BCFN(2+11,13-9,T) R= +1.2 p = 0.238 normal
[Low1/8]BCFN(2+12,13-9,T) R= -0.7 p = 0.553 normal
[Low1/8]DC6-9x1Bytes-1 R= +4.2 p = 0.039 normalish
[Low1/8]Gap-16:A R= +0.1 p = 0.636 normal
[Low1/8]Gap-16:B R= -1.6 p = 0.865 normal
[Low1/8]FPF-14+6/16:(0,14-0) R= +3.0 p = 0.018 normal
[Low1/8]FPF-14+6/16:(1,14-0) R= -2.4 p = 0.958 normal
[Low1/8]FPF-14+6/16:(2,14-1) R= +2.0 p = 0.087 normal
[Low1/8]FPF-14+6/16:(3,14-2) R= -2.0 p = 0.926 normal
[Low1/8]FPF-14+6/16:(4,14-2) R= -0.5 p = 0.642 normal
[Low1/8]FPF-14+6/16:(5,14-3) R= +0.2 p = 0.431 normal
[Low1/8]FPF-14+6/16:(6,14-4) R= +3.8 p = 5.2e-3 normal
[Low1/8]FPF-14+6/16:(7,14-5) R= +3.0 p = 0.020 normal
[Low1/8]FPF-14+6/16:(8,14-5) R= -0.3 p = 0.576 normal
[Low1/8]FPF-14+6/16:(9,14-6) R= +0.7 p = 0.306 normal
[Low1/8]FPF-14+6/16:(10,14-7) R= -0.6 p = 0.666 normal
[Low1/8]FPF-14+6/16:(11,14-8) R= -0.1 p = 0.495 normal
[Low1/8]FPF-14+6/16:(12,14-8) R= +2.8 p = 0.035 normal
[Low1/8]FPF-14+6/16:(13,14-9) R= -0.0 p = 0.472 normal
[Low1/8]FPF-14+6/16:(14,14-10) R= -0.5 p = 0.603 normal
[Low1/8]FPF-14+6/16:(15,14-11) R= +0.2 p = 0.383 normal
[Low1/8]FPF-14+6/16:(16,14-11) R= -1.9 p = 0.964 normal
[Low1/8]FPF-14+6/16:all R= +1.4 p = 0.178 normal
[Low1/8]FPF-14+6/16:cross R= +0.5 p = 0.291 normal
[Low1/8]BRank(12):128(4) R= -1.4 p~= 0.890 normal
[Low1/8]BRank(12):256(2) R= -0.2 p~= 0.554 normal
[Low1/8]BRank(12):384(1) R= +0.4 p~= 0.366 normal
[Low1/8]BRank(12):512(2) R= +0.8 p~= 0.293 normal
[Low1/8]BRank(12):768(1) R= +0.4 p~= 0.366 normal
[Low1/8]BRank(12):1K(1) R= +1.8 p~= 0.146 normal
[Low1/8]mod3n(5):(0,9-2) R= -3.6 p = 0.967 normal
[Low1/8]mod3n(5):(1,9-2) R= -1.6 p = 0.787 normal
[Low1/8]mod3n(5):(2,9-3) R= +1.9 p = 0.169 normal
[Low1/8]mod3n(5):(3,9-3) R= -0.6 p = 0.615 normal
[Low1/8]mod3n(5):(4,9-4) R= -4.0 p = 0.987 normal
[Low1/8]mod3n(5):(5,9-4) R= -1.2 p = 0.714 normal
[Low1/8]mod3n(5):(6,9-5) R= +1.7 p = 0.185 normal
[Low1/8]mod3n(5):(7,9-5) R= +0.1 p = 0.444 normal
[Low1/8]mod3n(5):(8,9-6) R= -1.1 p = 0.701 normal
[Low1/8]mod3n(5):(9,9-6) R= +1.3 p = 0.213 normal
[Low1/8]mod3n(5):(10,9-6) R= +1.4 p = 0.205 normal
[Low1/8]mod3n(5):(11,9-6) R= -2.4 p = 0.936 normal
[Low1/8]mod3n(5):(12,9-6) R= -1.1 p = 0.695 normal
[Low1/8]TMFn(2+0):wl R= +0.7 p~= 0.4 normal
[Low1/8]TMFn(2+1):wl R= -1.9 p~= 0.7 normal
Actually a bit better than I expected from a simple PRNG.
Great results! I'm not so familiar with the output from PractRand, so we can discuss that on Monday. Otherwise I'm happy and ready to approve.
The only remaining thing I can think of is to measure the affect on the start up time. I don't think there is a difference that one would notice, but it is good to have it quantified. I can do that task.
Some info on interpreting PractRand test result: https://pracrand.sourceforge.net/Tests_overview.txt
Failures on short-ranged tests, especially DC6 parameterizations, are
typical of small chaotic PRNGs with insufficient mixing / avalanche.
Failures on FPF parameterizations can have similar meaning. However,
both of those find bias in a wide variety of PRNGs, sometimes for
non-obvious reasons.
BCFN failures where the first parameter listed is high (say, 5 or
more) typically reflect a cyclic buffer being traversed, where
patterns in the content of that buffer show up at specific spacings.
On the other hand, BCFN failures where the first parameter listed is
low are typically similar in meaning to a failure on a DC6
parameterization (see the previous paragraph). Note that when I
refer to the value of the first BCFN parameter, I'm refering to the
evaluated value listed in the output, meaning that for a test result
listed under "BCFN(2+3,13-1)" the first parameter would be 5 (2+3).
Failures on BRank suggest that in some way the PRNG output, or at
least part of it, was extremely linear, producable strictly by xoring
bits of previous PRNG output. This is classically seen on LFSRs /
GFSRs, but sometimes it shows up on PRNGs that are *supposed* to be
chaotic, but actually have some component that isn't really.
Long discussion with the author of PractRand about interpretation of the BCFN results. https://sourceforge.net/p/pracrand/discussion/366935/thread/0dd1fc3e/
Some info on interpreting PractRand test result: https://pracrand.sourceforge.net/Tests_overview.txt
...
Long discussion with the author of PractRand about interpretation of the BCFN results. https://sourceforge.net/p/pracrand/discussion/366935/thread/0dd1fc3e/
The fails we get seems to be explainable, which is comforting.
I have measured the difference between the new RAM fill, compared to the current.
Current 55 ms. xorwow: 133 ms.
So we have an increase of 78 ms on the start up. More than I expected, what do we think about that?
Also measured
// Set RAM address and data scrambling values
*ram_rand = rnd_word();
*ram_scramble = rnd_word();
which takes 30 ms.
Edit: Added the measurement for RAM address and data scrambling.
Some info on interpreting PractRand test result: https://pracrand.sourceforge.net/Tests_overview.txt ... Long discussion with the author of PractRand about interpretation of the BCFN results. https://sourceforge.net/p/pracrand/discussion/366935/thread/0dd1fc3e/
The fails we get seems to be explainable, which is comforting.
I have measured the difference between the new RAM fill, compared to the current.
Current 55 ms. xorwow: 133 ms.
So we have an increase of 78 ms on the start up. More than I expected, what do we think about that?
More than I expected too. But we also do additional RNG-reads that could be removed to offset this delay. My ESP sense tells me that we would save 30 ms.
Yes. An approach would be to remove the first two RNG-reads to set the RAM and address scrambling, which would save us 30 ms.
The addition of xorwow would then land on an increase of 48 ms. Which is OK, I think.
Completed measurements, in summary: Current implementation: 55ms. xorwow: 132 ms xorshift: 114 ms 1 word read from TRNG (with full wait from another TRNG read): 15 ms.
Which also accomodates most of the difference between xorwow and xorshift, since it xorwow makes two reads.
This PR improve initial RAM randomization by adding the xorwow variant of Marasaglia xorshift PRNG as data generator.
Closes #162