dhardy / rand

http://doc.rust-lang.org/rand

NewSeeded: rename new → try_new, add new_with_fallback #21

Closed dhardy closed 6 years ago

dhardy commented 6 years ago

Rename OsRng::new → try_new too for consistency. Add a dependency on the time crate. (Note: chrono would be better to use, but it has no real equivalent to precise_time_ns().)

@pitdicker, can you review?

pitdicker commented 6 years ago

Yes, thank you for asking!

As a test I printed 100 numbers generated with ClockRng, see this gist. It seems to me only the low 20 bits are useful (about 1 second), or even less than that. And then they are only useful once. Also the difference between two times is not constant, but pretty predictable. But taking the difference once should just about be enough to supply the remaining 12 bits for a u32. Something like this:

    let time = precise_time_ns();
    let diff = precise_time_ns() - time;
    let int = (time as u32) ^ ((diff as u32) << 20);

A very small PRNG seeded with these 32 bits should give more 'random' results than taking any more from the system clock.

So while the system clock should be good enough for one u32, I don't think having an RNG based on it is workable.

When using the system clock as a fallback, what do you think of the following: Initialize a 32-bit 'PCG RXS M XS (LCG)' with it, and use that to initialize the target RNG?

PCG is not as difficult as it seems. With the constants written out a step takes just 4 lines of code. I have the start of something here (different version though).
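
Written out, one step of such a generator looks roughly like this (a sketch only; the type name is made up, and the constants are the standard 32-bit PCG ones):

    use time::precise_time_ns;

    /// Minimal 32-bit 'PCG RXS M XS' seeded once from the system clock
    /// (sketch; type name is not from this PR).
    pub struct ClockSeededPcg32 {
        state: u32,
    }

    impl ClockSeededPcg32 {
        pub fn new() -> ClockSeededPcg32 {
            ClockSeededPcg32 { state: precise_time_ns() as u32 }
        }

        pub fn next_u32(&mut self) -> u32 {
            let old = self.state;
            // LCG step
            self.state = old.wrapping_mul(747796405).wrapping_add(2891336453);
            // RXS M XS output permutation
            let word = ((old >> ((old >> 28) + 4)) ^ old).wrapping_mul(277803737);
            (word >> 22) ^ word
        }
    }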

pitdicker commented 6 years ago

Can't we use the standard library instead of the time crate? It seems to provide everything we need with SystemTime::now(), duration_since(UNIX_EPOCH) and subsec_nanos().
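
A sketch of that, assuming only the nanosecond fraction is needed (error handling here is just a placeholder):

    use std::time::{SystemTime, UNIX_EPOCH};

    /// Nanosecond fraction of the current time: a std-only stand-in for
    /// `precise_time_ns()`.
    fn clock_nanos() -> u32 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system time before UNIX epoch")
            .subsec_nanos()
    }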

dhardy commented 6 years ago

@pitdicker That sounds much smarter than the code I wrote. Assembling roughly a u32 of entropy and expanding with a small RNG sounds reasonable (better than zero-padding or repeating, and probably not much worse than calling subsec_nanos() repeatedly). On the other hand, generating a fresh u32 via subsec_nanos() or similar for each 4 bytes of the PRNG seed may still get more entropy overall.

The lower precision part of the time (up to a few days, e.g. 10^6 sec) still provides some varying numbers, but is much more predictable. Whether or not it counts as "entropy" depends on the application (i.e. if you know this was calculated in response to a web query in the last second there is significantly less entropy than if it happened when the server was last rebooted — not that we're going to recommend anyone use system time as entropy for cryptography of course). But using a high-precision u32 repeatedly, instead of alternating with the low-precision part like my ClockRng does, only needs to gain another 20 bits or so to come out ahead; with a 256-bit seed that's four extra u32s, so even at just 5 bits per call (and per your gist it looks like we have at least that) the repeated readings win.
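
A rough sketch of that idea, assuming a 256-bit seed and the std-based reading mentioned above (the helper is hypothetical, not from this PR):

    use std::time::{SystemTime, UNIX_EPOCH};

    /// Fill a 256-bit seed with one fresh clock reading per 32-bit word
    /// instead of zero-padding or repeating a single reading (sketch).
    fn seed_from_clock() -> [u32; 8] {
        let nanos = || {
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .unwrap()
                .subsec_nanos()
        };
        let mut seed = [0u32; 8];
        for word in seed.iter_mut() {
            *word = nanos(); // a few bits of fresh entropy per call
        }
        seed
    }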

If we wanted to make the fallback as secure as possible, we could require the use of some RNG wrapper, like Reseeding but combining new entropy with current state, with exponential or quadratic back-off on seeding frequency. There's a couple of tradeoffs though: the requirement for an RNG wrapper, reseeding complexity, and susceptibility to entropy-depletion attacks. So not worth it I think.

pitdicker commented 6 years ago

I agree that making more calls to subsec_nanos() gives a bit more entropy. But I am not really comfortable with it.

Even with very basic statistics the numbers don't look very random. If I sort by the diff2 column in the gist, 0 occurs 8 times, and the numbers between -6 and 5 occur about 3 times each on average. Only because there are 13 outliers (probably due to scheduling) in diff1 does it look somewhat random.

In https://github.com/rust-lang/rfcs/pull/722 there has been some discussion about reseeding. The linked article https://eprint.iacr.org/2013/338.pdf seems to be the state of the art of analysis around reseeding. It is way too difficult for me, at least without spending days to read it. The takeaway seems to be that extracting entropy and integrating it into an RNG's state is hard.

I don't trust myself (or maybe us) to come up with a good way to really extract more than 32 bits of entropy from the system time. With repeated runs I suspect there are a lot of details that start to matter: actual precision of the hardware timer (in the cpu?), rounding in the kernel, variance of the time between runs...

For a fall-back I would go with something simple.

pitdicker commented 6 years ago

The lower precision part of the time (up to a few days, e.g. 10^6 sec) still provides some varying numbers, but is much more predictable. Whether or not it counts as "entropy" depends on the application

If I understand you right, this means we could construct a u64? If we only needed one random number, I agree. But if someone were to initialize multiple RNGs, they would not be as different as when each of them was based on a u32 extracted from the system time.

dhardy commented 6 years ago

If I understand you right, this means we could construct a u64?

Constructing a u64 is easy; how much of it is unpredictable is another matter. Regarding an attack, it depends on how precisely the attacker knows when the number was drawn; regarding two different seeds being different in the same program, it depends how closely they were made.
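
For example, a sketch of such a u64 using only std (the split between the two halves is an assumption):

    use std::time::{SystemTime, UNIX_EPOCH};

    /// Low-precision seconds in the high half, high-precision nanoseconds in
    /// the low half. Easy to build; how much of it is unpredictable is the
    /// real question.
    fn clock_u64() -> u64 {
        let d = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
        (d.as_secs() << 32) | d.subsec_nanos() as u64
    }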

I don't trust myself (or maybe us) to come up with a good way to really extract more than 32 bits of entropy from the system time.

Given enough time, and mixed scheduling (not making all the "time" readings in a tight loop), it should be possible to extract an unlimited amount of entropy. But not doing stuff in a tight loop is really going to help, so I think making only one or two readings each time next_u32 is called is much better than only reading from the timer when the ClockRng is constructed.

Perhaps best would be to combine the two approaches: construct a simple PRNG, and each time next_u32 is called read the time once or twice and mix it into the PRNG state (possibly with variable shift?).

I still wouldn't trust this approach to withstand a serious attack, but for anything else (including low-resource attacks) it should be fine. Makes me wonder, if we can extract a little entropy from the system timer at a very low cost, maybe we should mix that into the output of OsRng, just in case the system happens to have a broken or compromised entropy source?

dhardy commented 6 years ago

I did a little more work to try to estimate available entropy. The good news is that output of the high-resolution counter is well distributed, so there isn't too much rounding going on (on my test system at least). The bad news is that diffs between samples are not at all well distributed, but there may still be 2-4 bits of entropy per call.

That example is simply calling the timer in a tight loop and doing very little work per cycle; if the timer is called less frequently or with allocs or yields between calls there could be far more entropy available.

pitdicker commented 6 years ago

Wow, you are serious about estimating the entropy of the system clock!

That example is simply calling the timer in a tight loop and doing very little work per cycle; if the timer is called less frequently or with allocs or yields between calls there could be far more entropy available.

I also saw that the difference is much less predictable for the first two rounds of the loop.

What would you suggest after testing? Constructing a single u32/u64 or to really have a ClockRng?

dhardy commented 6 years ago

That's true, the difference between the first two samples may be much greater. I initially had differences greater than 1ms, but realised it was probably caused by a HashMap alloc, so switched to pre-allocated array buckets (ignoring large diffs, which I haven't really looked at).

I'm thinking a hybrid approach may be best.

dhardy commented 6 years ago

@pitdicker maybe you'd like to review my latest version? I'm not sure how to estimate entropy, but I think it should be reasonably good (well, still not close to the 64-bits of state I suppose).

It's a tad slower than PCG however...

dhardy commented 6 years ago

FYI

test init_clock0        ... bench:       2,126 ns/iter (+/- 31)
test init_clock2        ... bench:      59,307 ns/iter (+/- 728)
test init_clock12       ... bench:     344,871 ns/iter (+/- 6,067)
test init_clock20       ... bench:     573,214 ns/iter (+/- 8,513)
test init_clock32       ... bench:     915,750 ns/iter (+/- 8,663)

dhardy commented 6 years ago

See last commit for another approach which is simpler but should be stronger (though slower).

pitdicker commented 6 years ago

To test the statistical quality I have run a few variants through PractRand.

https://github.com/dhardy/rand/pull/21/commits/d6f8ebf72d8e93816445de521916c8263b5d6a52 fails after 256 MiB, with only 3 tests not failing.
https://github.com/dhardy/rand/pull/21/commits/a306a69650c926fcb2e0c8ae6554d08754b9801d fails after 256 MiB, with only 19 tests not failing.
https://github.com/dhardy/rand/pull/21/commits/64f1f63a36a8666387e9c2895ff32607fb56a11c using ClockRng::new(32) fails after 128 MiB, with only 52 tests not failing.
https://github.com/dhardy/rand/pull/21/commits/e40a030a5d3e6cc2332f20cee5a688e9e3ffa8de using StrongClockRng fails after 16 MiB (surprisingly low).

If I take https://github.com/dhardy/rand/pull/21/commits/a306a69650c926fcb2e0c8ae6554d08754b9801d and apply the MurmurHash finalizer on it, it fails after 256 MiB, with 65 tests not failing.

    fn next_u32(&mut self) -> u32 {
        let mut state = get_time() as u32;

        state = (state ^ (state >> 16)).wrapping_mul(0x85ebca6b);
        state = (state ^ (state >> 13)).wrapping_mul(0xc2b2ae35);
        state ^ (state >> 16);

        state as u32
    }

So the experiments to extract more randomness from the system clock work against us :-(.

But a hybrid approach works great! Combining a hash of the clock, as a source of some extra entropy, with PCG (messy implementation):

    #[derive(Debug)]
    pub struct ClockRng {
        state: u32,
    }

    impl ClockRng {
        /// Create a `ClockRng` (very low cost)
        pub fn new() -> ClockRng {
            let mut state = precise_time_ns() as u32;

            state = (state ^ (state >> 16)).wrapping_mul(0x85ebca6b);
            state = (state ^ (state >> 13)).wrapping_mul(0xc2b2ae35);
            state ^ (state >> 16);

            ClockRng { state: state }
        }
    }

    impl Rng for ClockRng {
        fn next_u32(&mut self) -> u32 {
            // Input 1: low 32 bits of the current time in nanoseconds, hashed with
            // the MurmurHash finalizer.
            let mut time = precise_time_ns() as u32;

            time = (time ^ (time >> 16)).wrapping_mul(0x85ebca6b);
            time = (time ^ (time >> 13)).wrapping_mul(0xc2b2ae35);
            time ^ (time >> 16);

            // Input 2: PCG 32 RXS M XS
            let mut state = self.state;
            self.state = state * 747796405 + 2891336453;
            state = ((state >> ((state >> 28) + 4)) ^ state) * 277803737;
            state = (state >> 22) ^ state;

            time ^ state
        }
        /* other methods */
    }

A PractRand run is now at 32 GiB without anything to report.
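
(For reference, a generator can be fed into PractRand roughly like the sketch below; this harness is an assumption, not the exact setup used, and it inlines the clock-plus-MurmurHash-finalizer idea so it is self-contained. Pipe it as cargo run --release | ./RNG_test stdin32.)

    use std::io::{self, Write};
    use std::time::{SystemTime, UNIX_EPOCH};

    // A clock reading passed through the MurmurHash finalizer, similar to the
    // variant above (sketch only).
    fn next_u32() -> u32 {
        let mut x = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .subsec_nanos();
        x = (x ^ (x >> 16)).wrapping_mul(0x85ebca6b);
        x = (x ^ (x >> 13)).wrapping_mul(0xc2b2ae35);
        x ^ (x >> 16)
    }

    fn main() -> io::Result<()> {
        // Write raw 32-bit output to stdout so PractRand can read it from stdin.
        let stdout = io::stdout();
        let mut out = stdout.lock();
        loop {
            let mut buf = [0u8; 4096];
            for chunk in buf.chunks_exact_mut(4) {
                chunk.copy_from_slice(&next_u32().to_le_bytes());
            }
            out.write_all(&buf)?;
        }
    }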

dhardy commented 6 years ago

@pitdicker PractRand is designed to measure bias, not entropy. While bias in our output isn't a good thing, it's much less important than entropy. Remember, a fully deterministic PRNG can pass PractRand (not forever maybe, but to more data than you could possibly run). PractRand may still be useful for trying to remove bias I guess.

I am surprised that 64f1f63 does worse than the previous commit, and that a306a69 does no mixing at all yet gets one of the best results. What happens if you run the same test a couple more times?

The other funny thing is StrongClockRng doing so badly. I expected some bias, but it does worse than a single call to precise_time_ns()? It could be that the many calls to the time function in a tight loop interact badly, but given that they are combined after bit-shift I doubt it.

dhardy commented 6 years ago

Directly measuring entropy via the Shannon method is possible, so I did that (roughly 4-5.5 bits per call, but only the last 2 bits have high entropy).
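
A minimal sketch of that kind of estimate (the sample count and the restriction to the low byte of diffs are assumptions, not the exact test used):

    use std::time::{SystemTime, UNIX_EPOCH};

    // Estimate Shannon entropy of the low 8 bits of diffs between consecutive
    // timer readings (sketch only).
    fn main() {
        const SAMPLES: usize = 100_000;
        let nanos = || {
            SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .unwrap()
                .subsec_nanos()
        };
        let mut counts = [0u32; 256];
        let mut prev = nanos();
        for _ in 0..SAMPLES {
            let t = nanos();
            counts[(t.wrapping_sub(prev) & 0xFF) as usize] += 1;
            prev = t;
        }
        // H = -sum p_i * log2(p_i), in bits per call
        let entropy: f64 = counts
            .iter()
            .filter(|&&c| c > 0)
            .map(|&c| {
                let p = c as f64 / SAMPLES as f64;
                -p * p.log2()
            })
            .sum();
        println!("~{:.2} bits of entropy per call (low 8 bits of diffs)", entropy);
    }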

So if we mix the result well we should get a good quality generator, right? Let's try that...

dhardy commented 6 years ago

@pitdicker if you care to, please test the new StrongClockRng; hopefully it's much less biased (although I'm not that familiar with mixing).

I'm considering removing ClockRng and replacing it with StrongClockRng; although the current ClockRng is faster it would be faster still to seed a PRNG like ChaCha from StrongClockRng if much output is required, and stronger too.

pitdicker commented 6 years ago

I am surprised that 64f1f63 does worse than the previous commit, and that a306a69 does no mixing at all yet gets one of the best results. What happens if you run the same test a couple more times?

Yes, it surprises me too. But I only had time to run the test, not to figure out the cause... PractRand has very reliable tests. I have run them several times but the results stay the same.

if you care to, please test the new StrongClockRng; hopefully it's much less biased (although I'm not that familiar with mixing).

No problem. It is better, but not much, sorry:

RNG_test using PractRand version 0.93
RNG = RNG_stdin32, seed = 0x18026a4
test set = normal, folding = standard (32 bit)

rng=RNG_stdin32, seed=0x18026a4
length= 32 megabytes (2^25 bytes), time= 2.2 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-4,T)                  R=+63388  p = 0           FAIL !!!!!!!!  
  BCFN(2+1,13-4,T)                  R=+63272  p = 0           FAIL !!!!!!!!  
  BCFN(2+2,13-5,T)                  R=+78762  p = 0           FAIL !!!!!!!!  
  BCFN(2+3,13-5,T)                  R=+75308  p = 0           FAIL !!!!!!!!  
  BCFN(2+4,13-5,T)                  R=+71051  p = 0           FAIL !!!!!!!!  
  BCFN(2+5,13-6,T)                  R=+80227  p = 0           FAIL !!!!!!!!  
  BCFN(2+6,13-6,T)                  R=+66259  p = 0           FAIL !!!!!!!!  
  BCFN(2+7,13-7,T)                  R=+57279  p = 0           FAIL !!!!!!!!  
  BCFN(2+8,13-8,T)                  R=+40052  p = 0           FAIL !!!!!!!!  
  BCFN(2+9,13-8,T)                  R=+20541  p =  5e-5214    FAIL !!!!!!!!  
  BCFN(2+10,13-9,T)                 R=+11783  p =  1e-2648    FAIL !!!!!!!!  
  BCFN(2+11,13-9,T)                 R= +5870  p =  4e-1320    FAIL !!!!!!!!  
  DC6-9x1Bytes-1                    R= +6709  p =  2e-4517    FAIL !!!!!!!!  
  Gap-16:A                          R= +2938  p =  1e-2409    FAIL !!!!!!!!  
  Gap-16:B                          R=+16748  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:(1,14-1)              R=+113.3  p =  3.7e-100   FAIL !!!!!     
  FPF-14+6/16:(2,14-2)              R= +47.5  p =  2.4e-41    FAIL !!!       
  FPF-14+6/16:(3,14-2)              R= +45.8  p =  7.4e-40    FAIL !!!       
  FPF-14+6/16:(4,14-3)              R= +68.4  p =  1.0e-59    FAIL !!!!      
  FPF-14+6/16:(5,14-4)              R=  +7.5  p =  4.5e-6   unusual          
  FPF-14+6/16:(7,14-5)              R=+516.8  p =  2.4e-428   FAIL !!!!!!!   
  FPF-14+6/16:(8,14-6)              R=+357.4  p =  1.5e-273   FAIL !!!!!!    
  FPF-14+6/16:(9,14-7)              R=+230.7  p =  2.1e-183   FAIL !!!!!!    
  FPF-14+6/16:(10,14-8)             R=+166.1  p =  1.5e-119   FAIL !!!!!     
  FPF-14+6/16:(11,14-8)             R= +88.3  p =  1.4e-63    FAIL !!!!      
  FPF-14+6/16:(12,14-9)             R= +59.3  p =  1.2e-37    FAIL !!!       
  FPF-14+6/16:(13,14-10)            R= +42.0  p =  6.2e-23    FAIL !!        
  FPF-14+6/16:(14,14-11)            R= +26.1  p =  3.3e-12   VERY SUSPICIOUS 
  FPF-14+6/16:(15,14-11)            R= +18.3  p =  8.1e-9   very suspicious  
  FPF-14+6/16:all                   R=+22915  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:all2                  R=+96772  p = 0           FAIL !!!!!!!!  
  FPF-14+6/16:cross                 R=+732039 p = 0           FAIL !!!!!!!!  
  BRank(12):128(4)                  R= +34.5  p~=  3.7e-19    FAIL !         
  BRank(12):256(2)                  R= +97.5  p~=  2.2e-30    FAIL !!!       
  BRank(12):384(1)                  R=+152.8  p~=  5.1e-47    FAIL !!!       
  BRank(12):512(2)                  R=+337.8  p~=  1.0e-102   FAIL !!!!!     
  BRank(12):768(1)                  R=+411.1  p~=  8.8e-125   FAIL !!!!!     
  BRank(12):1K(1)                   R=+583.3  p~=  1.2e-176   FAIL !!!!!!    
  [Low8/32]BCFN(2+0,13-5,T)         R=+316293 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+1,13-5,T)         R=+263569 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+2,13-6,T)         R=+260166 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+3,13-6,T)         R=+188056 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+4,13-6,T)         R=+112303 p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+5,13-7,T)         R=+70944  p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+6,13-8,T)         R=+41820  p = 0           FAIL !!!!!!!!  
  [Low8/32]BCFN(2+7,13-8,T)         R=+20745  p =  7e-5266    FAIL !!!!!!!!  
  [Low8/32]BCFN(2+8,13-9,T)         R=+11858  p =  1e-2665    FAIL !!!!!!!!  
  [Low8/32]BCFN(2+9,13-9,T)         R= +5896  p =  5e-1326    FAIL !!!!!!!!  
  [Low8/32]DC6-9x1Bytes-1           R=+50897  p = 0           FAIL !!!!!!!!  
  [Low8/32]Gap-16:A                 R=+68886  p = 0           FAIL !!!!!!!!  
  [Low8/32]Gap-16:B                 R=+444386 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(1,14-2)     R=+86776  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(2,14-3)     R=+63292  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(3,14-4)     R=+42429  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(4,14-5)     R=+30154  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(5,14-5)     R=+21577  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(6,14-6)     R=+14558  p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(7,14-7)     R=+10546  p =  7e-8393    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(9,14-8)     R= +4640  p =  1e-3338    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(10,14-9)    R=+388.6  p =  5.5e-245   FAIL !!!!!!    
  [Low8/32]FPF-14+6/16:(11,14-10)   R=+10150  p =  2e-5400    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(12,14-11)   R= +4119  p =  5e-1796    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:(13,14-11)   R= +5558  p =  3e-2423    FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:all          R=+185222 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:all2         R=+3842559587 p = 0           FAIL !!!!!!!!  
  [Low8/32]FPF-14+6/16:cross        R=+1040458 p = 0           FAIL !!!!!!!!  
  [Low8/32]BRank(12):128(2)         R=+337.8  p~=  1.0e-102   FAIL !!!!!     
  [Low8/32]BRank(12):256(2)         R=+824.9  p~=  2.3e-249   FAIL !!!!!!    
  [Low8/32]BRank(12):384(1)         R=+927.7  p~=  2.6e-280   FAIL !!!!!!    
  [Low8/32]BRank(12):512(1)         R= +1272  p~=  5.4e-384   FAIL !!!!!!!   
  [Low1/32]BCFN(2+0,13-7,T)         R=+334440 p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+1,13-7,T)         R=+158068 p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+2,13-8,T)         R=+90156  p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+3,13-8,T)         R=+43780  p = 0           FAIL !!!!!!!!  
  [Low1/32]BCFN(2+4,13-8,T)         R=+21431  p =  5e-5440    FAIL !!!!!!!!  
  [Low1/32]BCFN(2+5,13-9,T)         R=+12134  p =  1e-2727    FAIL !!!!!!!!  
  [Low1/32]BCFN(2+6,13-9,T)         R= +5993  p =  7e-1348    FAIL !!!!!!!!  
  [Low1/32]DC6-9x1Bytes-1           R=+157070 p = 0           FAIL !!!!!!!!  
  [Low1/32]Gap-16:A                 R=+96472  p = 0           FAIL !!!!!!!!  
  [Low1/32]Gap-16:B                 R=+279673 p = 0           FAIL !!!!!!!!  
  [Low1/32]FPF-14+6/16:cross        R=+1853012 p = 0           FAIL !!!!!!!!  
  [Low1/32]BRank(12):128(2)         R= +3748  p~=  3e-1129    FAIL !!!!!!!!  
  [Low1/32]BRank(12):256(1)         R= +5405  p~=  3e-1628    FAIL !!!!!!!!  
  ...and 2 test result(s) without anomalies

A few small experiments from my side fared much worse though.

pitdicker commented 6 years ago

Directly measuring entropy via the Shannon method is possible, so I did that (roughly 4-5.5 bits per call, but only the last 2 bits have high entropy).

Good work! About 4-5 bits matches my feeling. This paper (which I have not yet read fully) explains on pages 6 and 7 that Shannon entropy is the best case, and we can realistically expect to be able to extract a little less.

But not doing stuff in a tight loop is really going to help, so I think making only one or two readings each time next_u32 is called is much better than only reading from the timer when the ClockRng is constructed.

Perhaps best would be to combine the two approaches: construct a simple PRNG, and each time next_u32 is called read the time once or twice and mix it into the PRNG state (possibly with variable shift?).

Yes, this is better than my proposal to seed a small PRNG once with the system clock. When speaking in terms of entropy, every round then adds a little, and every little bit helps.

I still wouldn't trust this approach to withstand a serious attack, but for anything else (including low-resource attacks) it should be fine. Makes me wonder, if we can extract a little entropy from the system timer at a very low cost, maybe we should mix that into the output of OsRng, just in case the system happens to have a broken or compromised entropy source?

OsRng already uses the system timer as an input source, and is very conservative. Mixing it with the system timer will probably even make it worse. I have read serious advice against trying to improve on it, but can't remember where.

PractRand is designed to measure bias, not entropy. While bias in our output isn't a good thing, it's much less important than entropy.

I am really learning as I go. You are right about PractRand. But I expect a good method to extract entropy and mix it to produce practically unbiased results. So it is useful as an indication.

I do believe being unbiased is a very important property, especially when seeding other RNGs.

I'm thinking a hybrid approach may be best.

What did you think about https://github.com/dhardy/rand/pull/21#issuecomment-339228397? It only adds about 4 bits of entropy each round as you calculated. It is statistically good, I canceled PractRand after 1 TiB of no anomalies.

I think we either have to go with a simple solution, or do good research. Wikipedia has, among others, the suggestion to use a cryptographic hash function to extract entropy.

dhardy commented 6 years ago

What did you think about #21 (comment)?

Uh, sorry but there are a few issues with that code.

    state ^ (state >> 16);
    time ^ (time >> 16);

You have two lines of code which don't do anything.

self.state = state * 747796405 + 2891336453;

Time never affects the state, it only affects the output with a simple XOR. So basically you have a simple LCG like PCG with some extra tweaking of the output; you're never accumulating entropy.

Further, you never try to assemble much entropy. I still think ChaChaRng::seed_from(&mut StrongClockRng::new()) would be much stronger because even if there is some bias in the seed it should still be close to the maximum 256 bits of entropy in theory (if I understand the issue correctly, which I admit I may not).

@burdges I wonder if you are able to help us out here?

pitdicker commented 6 years ago

Does this sound right to you?

My previous thought was to set up a small RNG with the result from the first round. Then for each next_u32 use the result of that RNG combined with about 4 new bits of entropy. There is no accumulation of entropy; every little bit is returned directly by the next_u32 call it was collected in.

You want to go with something stronger, and that makes sense. That would mean mixing the new values from get_time into the state of the PCG RNG, as you do, is better. Also using 64 bits of internal state is better, as that can hold 64 bits of entropy. I wonder why its results are worse; maybe it is because of the modification of the output function.

In principle I like the idea behind StrongClockRng. What do you think of this? Collect about 8 rounds of get_time for each next_u32 (~32 bits of entropy), and after each round combine the reading as the PCG increment. Finally, apply the PCG output function to get a 32-bit number (roughly as sketched below).

Hopefully it can give both the good entropy we want, and be unbiased.
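
A rough sketch of that proposal, reusing the constants from the code above (an assumption, untested):

    use time::precise_time_ns;

    // Each fresh timer reading becomes the LCG increment for one step; the PCG
    // output function is applied once at the end.
    fn next_u32_from_clock() -> u32 {
        let mut state = precise_time_ns() as u32;
        for _ in 0..8 {
            let inc = (precise_time_ns() as u32) | 1; // LCG increment must be odd
            state = state.wrapping_mul(747796405).wrapping_add(inc);
        }
        let word = ((state >> ((state >> 28) + 4)) ^ state).wrapping_mul(277803737);
        (word >> 22) ^ word
    }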

dhardy commented 6 years ago

@pitdicker you can try that variant on ClockRng; it should have more entropy with 8 rounds of get_time but in terms of bias I don't know if it would be any different. I'm surprised that my modification of PCG didn't work well since one is supposed to be able to use any odd number as increment. Maybe it would be better with a fixed increment, and then do XOR or something to combine the time entropy into the state?
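
Roughly like this, perhaps (a sketch of the fixed-increment variant; the struct name is made up):

    use time::precise_time_ns;

    pub struct MixedClockRng {
        state: u32,
    }

    impl MixedClockRng {
        pub fn new() -> MixedClockRng {
            MixedClockRng { state: precise_time_ns() as u32 }
        }

        pub fn next_u32(&mut self) -> u32 {
            // XOR a fresh timer reading into the state, then do an ordinary
            // PCG step with a fixed increment.
            let t = precise_time_ns() as u32;
            self.state = (self.state ^ t)
                .wrapping_mul(747796405)
                .wrapping_add(2891336453);
            // Standard RXS M XS output permutation.
            let word = ((self.state >> ((self.state >> 28) + 4)) ^ self.state)
                .wrapping_mul(277803737);
            (word >> 22) ^ word
        }
    }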

Overall though I still don't think bias in the crypto keys should be a problem aside from the reduction in entropy, and I'm not sure how to measure that (I would expect it to be small, but don't know). I still think something like StrongClockRng would be a good choice for generating keys, because in theory you can get close to the maximum entropy that the key can hold — unless there's some major weakness in the time counters (I certainly don't trust it enough to recommend for cryptography). The bias in the key shouldn't matter because CSPRNGs must prevent calculation of the key from the output, and if a poor key resulted in poor (biased) output, that would already give a massive hint as to what the key is. For non-crypto PRNGs, well, some have poor output with certain keys, but I doubt it would be a significant problem for most applications.

pitdicker commented 6 years ago

I'm surprised that my modification of PCG didn't work well since one is supposed to be able to use any odd number as increment.

That part was a good idea, and it works well. I only did a quick test during breakfast, where I replaced the output function. I think the problem is with the output function, which is different from normal PCG.

pitdicker commented 6 years ago

I still feel uneasy about coming up with something ourselves. What do you think about creating an implementation based on this Jitter RNG?

It also uses the system clock as its source, and has excellent documentation and analysis. It would be quite some work though; there are about 1000 lines of C code. The license is BSD, or alternatively GPL.

pitdicker commented 6 years ago

Apparently that is also the method Linux uses to extract entropy from the timer.

dhardy commented 6 years ago

Nice find; that does sound like what we want. At this level of complexity it might also make sense to use an external crate, or to start with an external implementation and consider importing into rand later.

But this is getting a bit beyond the scope of the rand refactor so I'd rather give it less focus (still happy to do code reviews etc.). Do you want to try implementing this @pitdicker? If not I'll get it mentioned in the next "impl period" newsletter or something; it should be a relatively easy project for someone else to get into. (Of course one could instead just wrap the C library in an external crate.)

pitdicker commented 6 years ago

But this is getting a bit beyond the scope of the rand refactor so I'd rather give it less focus (still happy to do code reviews etc.). Do you want to try implementing this @pitdicker?

Currently I have implementations of Xorshift* and PCG mostly finished; I am polishing them a bit and waiting on the discussion around initialization. Also OsRng needs a good look. Diving into Jitter RNG would be a nice project though... Can you get it mentioned in the "impl period" newsletter? If there is no one else, I would like to work on it.

How to proceed with this PR? Remove StrongClockRng, make ClockRng private for now and use it as a fallback?

dhardy commented 6 years ago

How to proceed for now — not sure; see #23.

pitdicker commented 6 years ago

On closer look it seems quite doable to implement jitterentropy-library. I will give it a try.

dhardy commented 6 years ago

JitterRng has been merged now and #54 covers usage as a fallback. That finally makes this PR obsolete.