lemire / SwiftWyhash

Fast random number generator in pure Swift
Apache License 2.0
23 stars 2 forks source link

Magic Constants #6

Closed palimondo closed 5 years ago

palimondo commented 5 years ago

I see the implementation here matches the wyhash64 from your blog.

I was searching the net for the original author, as your article does not mention him. I think it is Wang Yi and the original code is here, if I'm correct: https://github.com/wangyi-fudan/wyhash

Looking at his original code, I've noticed that the magic constants in your implementation do not match those in original.

It looks like the original 3 constants (_wyp02) in order of appearance are:

The current implementation here is:

    public mutating func next() -> UInt64 {
        self.state &+= 0x60bee2bee120fc15
        let t1 = self.state.multipliedFullWidth(by: 0xa3b195354a39b70d)
        let m1 = t1.high ^ t1.low
        let t2 = m1.multipliedFullWidth(by: 0x1b03738712fad5c9)
        let m2 = t2.high ^ t2.low
        return m2
    }

Is this not an issue for the quality of the output? I'm not sure that using different constants the claim of passing the BigCrush is correct...

lemire commented 5 years ago

@palimondo

  1. Yes, I did not name him in this blog post and the reason is quite simple... he reached out to me with a pseudonym and I was uncertain about his name. I was also uncertain about whether he was the inventor, it took me some time to determine credit (wyhash is closely related to prior approaches). But we can ping him... @wangyi-fudan

  2. I will add his name both to the README file and to my blog post.

  3. I verified that the code, with the constant I am using now, passes Big Crush. I think Wang changed the constant and kept working on the code afterward. We can certainly match the current version.

lemire commented 5 years ago

Credit issue should be resolved now.

palimondo commented 5 years ago

Also… I see the code from @wangyi-fudan contains wyrand, that is even simpler than wyhash64.

wyhash64(uint64_t A, uint64_t B){ return _wymum(_wymum(A^_wyp0, B^_wyp1), _wyp2); }
…
wyrand(uint64_t *seed){ *seed+=_wyp0; return _wymum(*seed^_wyp1,*seed); }

Do I understand correctly that your wyhash64 seems to be a hybrid of these two?

lemire commented 5 years ago

I made no hybrid. I have no designer credit in any of this. I just took the original code that he had back when he first reached out to me. I tested it. That is all.

I am pretty sure that any difference you see are the results of further work on his side of things.

Of course, we can update accordingly.

lemire commented 5 years ago

(I did tweak his code somewhat, e.g., he was using things like unsigned long long and stuff... but I kept the C logic identical.)

palimondo commented 5 years ago

I see. BTW I also found this C# implementation, if anybody is interested: https://github.com/cocowalla/wyhash-dotnet

lemire commented 5 years ago

Here are the constants, March 21st...

https://github.com/wangyi-fudan/wyhash/commit/a9f8f38eb0832aacaa983d99ccc8e7b0c55e8ccb

 const  unsigned long long  _wyp0=0x60bee2bee120fc15ull,    _wyp1=0xa3b195354a39b70dull,    _wyp2=0x1b03738712fad5c9ull;

Here are the constants after March 21st...

const   unsigned long long  _wyp0=0xa0761d6478bd642full,    _wyp1=0xe7037ed1a0b428dbull,    _wyp2=0x8ebc6af09c88c6e3ull;

So the constants changed March 21st...

wangyi-fudan commented 5 years ago

dear all: i come here as i received your emails. the magic constant changed after i developed a more powerful statistic test https://github.com/rurban/smhasher/issues/65 This test breaks several hashes, and wyhash's statistics is not very good. So I redesigned the primers with following criterion: each primer is random sampled. each prime should have exactly 32 1bit and 32 0bit. the xor of each pair of primes should have 32 1bit and 32 0 bit. So these primer are exactly balanced.

palimondo commented 5 years ago

@lemire:

Of course, we can update accordingly.

Updating to match the latest stand of wyhash brings up issues with naming and the implementation.

Naming

My thinking in PR #3 was:

I see you refer to this generator as wyhash64 in your article. It makes sense to me to rename the implementation to Wyhash64. This makes it follow the conventional naming like SplitMix64 or Xoshiro128.

I see that back in March 13th, the key operation was called wyhashmix64 (later renamed to _wymum) and PRNG was called wyrng_u64, so I can see why you called it wyhash64 in the blog post. That codebase was still in flux: at the time of your original publication it was called wyrng (hence the name WyRng in the #C and Rust ports, I guess?) ... But now I see few problems with this:

Based on that, can we rename the Swift struct one last time to WyRand?

Implementation

Updating to latest stand is not only about using the new constants, the algorithm was also simplified: the latest version uses just a single MUM operation instead of two, as was the case of your original publication, and it mixes in the prime before multiplication. This should be faster than the old version. I think rewritten to match your style it now looks like this:

static inline uint64_t wyrand_stateless(uint64_t *seed) {
  *seed += UINT64_C(0xa0761d6478bd642f);
  __uint128_t tmp;
  uint64_t m1 = *seed ^ UINT64_C(0xe7037ed1a0b428db);
  tmp = (__uint128_t)*seed * m1;
  uint64_t m2 = (tmp >> 64) ^ tmp;
  return m2;
}

Could you please re-verify this still passes the BigCrunch? (EDIT: I've prepared a PR in testingRNG for you, but I don't have the capacity to run that.)

Given that you've updated your blog post with the mention of this project, you might want to also make it reflect this latest wyhash development?

The Swift implementation that passes the same test* as the C# port looks like this: * only first generated value, see https://github.com/cocowalla/wyhash-dotnet/issues/1 for discussion of how to correctly modify the generator's state.

public struct WyRand: RandomNumberGenerator {
    private var state : UInt64

    init(seed : UInt64) {
        state = seed
    }

    public mutating func next() -> UInt64 {
        state &+= 0xa0761d6478bd642f
        let mul = state.multipliedFullWidth(by: state ^ 0xe7037ed1a0b428db)
        return mul.high ^ mul.low
    }
}

I would update this project to match, if you give me green light on the passing BigCrush.

lemire commented 5 years ago

@palimondo It passes Big Crunch, but I have not done an in-depth analysis (which takes time and manual labor). I'll need more time to update my sites with the information... However, we can move forward, I think.

Flaws are always a possibility, with any generator.

palimondo commented 5 years ago

Did you get a chance to bench it on ARM yet? IIRC your follow up post mentioned the old implementation did not fare that well there due to instruction set differences?

lemire commented 5 years ago

@palimondo

Why don't we include a benchmark in Swift? There we can run it on an ARM server and know. I'll open an issue.

palimondo commented 5 years ago

Closing this, since PR #7 has been merged to master.