Improve uniform `ShortByteString`

lehins commented 3 years ago

random-1.2.0 contained a shortcut where architecture independent generation of bytes relied on bytestring's builder functionality, which forced us to generate ShortByteString as pinned. This PR fixes that technical debt.

This is a non-breaking change.

Bodigrim commented 3 years ago

Any idea how to test it on BE arch?

lehins commented 3 years ago

Any idea how to test it on BE arch?

Besides spinning up an AMD server from some cloud provider, like Hetzner for instance and running tests there, I have no other ideas.

Bodigrim commented 3 years ago

Which AMD machines are big-endian? I thought all of them are little-endian, as well as modern ARM.

Bodigrim commented 3 years ago

@juhp I recall you raising s390-related issues at GHC bug tracker. Do you possibly have an access to a big-endian machine to test this patch?

lehins commented 3 years ago

Oh look at that. AMD is LE too. In that case I have no clue how to get hold of BE hardware :D

juhp commented 3 years ago

I can probably do a test build in the Fedora buildsystem.

You can see our latest release build here for example.

So you just want me to build this change? Get the testsuite to run looks rather tricky.

juhp commented 3 years ago

Here is a scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=74908079 (using ghc-8.10.5 and LTS 18 packages basically).

lehins commented 3 years ago

@juhp Thank you for your help. Unfortunately it is the random:spec test suite that needs to be run in order to confirm that big/little endian compatibility works as expected.

juhp commented 3 years ago

Okay, I feared as much, perhaps I can get temporary access to a Fedora s390x instance... Otherwise in the worst case we will find out later I guess ;-)

juhp commented 3 years ago

~~Good news I ran the testsuite on Fedora 34 s390x and it passed:~~

lehins commented 3 years ago

@juhp Awesome!!! Thank you very much for verifying this PR!!!

juhp commented 3 years ago

Sorry please wait - that was the wrong log... rechecking now

juhp commented 3 years ago

I ran the testsuite on Fedora 34 s390x now correctly in your branch and I am afraid there was 1 test failure:

    genByteString/ShortByteString consistency:  FAIL
      test/Spec.hs:118:
      expected: [78,232,117,189]
       but got: [189,117,232,78]

(Sorry for the false confirmation earlier)

lehins commented 3 years ago

@juhp Dammit, that is unfortunate. Thank you for rechecking it!

Bodigrim commented 3 years ago

@lehins actually, what is our goal here? To produce the same random numbers from the same seed both on LE and BE platforms? Why it does not suffice just to produce some random numbers, but not necessarily the same?

lehins commented 3 years ago

To produce the same random numbers from the same seed both on LE and BE platforms?

yes

Why it does not suffice just to produce some random numbers, but not necessarily the same?

I don't follow.

lehins commented 3 years ago

@Bodigrim Not random numbers, but sequence of random bytes

lehins commented 3 years ago

@Bodigrim I'll describe what is going on in a little more detail:

In order to to generate a ByteString (or a ShortByteString) we could do something like genByteStringM n g = pack <$> replicateM n (uniformM g)

However this would generate 64bits for every byte that will be used, which is extremely wasteful and inefficient.

What we do instead is generate one Word64 at a time and write into a mutable buffer until we fill it up. Writing it in BE/LE agnostic manner will ensure that generated ByteString will be the same for all architectures for the same generator.

There is also an extra issue at the end of a ByteString as well, since we often will have a tail that is smaller than Word64 (when mod n 8 /= 0) we need to write the first few bytes into the end of the ByteString in the same manner across architectures as well.

So the failing test in this https://github.com/haskell/random/pull/116#issuecomment-911600622 depicts that there is a problem in the logic (or in my assumptions of how it works) somewhere and we will get bytes in different order on BE vs LE machines.

Now, all I need is to figure out how can I get hands on BE machine so I can experiment with this, I can't be constantly bugging Jens to verify if a change works or not. I suspect the problem was there prior to this PR as well, except the test was not present until now and if anyone would ever run random on a BE machine random bytes would be still ... random, so this problem is not very well pronounced, nevertheless it is still there.

curiousleo commented 3 years ago

Now, all I need is to figure out how can I get hands on BE machine so I can experiment with this, I can't be constantly bugging Jens to verify if a change works or not.

It looks like this project lets you run an emulated s390x Ubuntu with QEMU + Docker:

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
$ docker run --rm -t s390x/ubuntu uname -m
s390x

https://github.com/multiarch/qemu-user-static#getting-started

That might help.

lehins commented 3 years ago

Halleluja! I am not going crazy! @curiousleo enormous thank you for this suggestion with docker. It was a bit painful to get it to work, anything complicated like cabal or stack will not work on that docker, because it uses up memory like crazy.

So it appears my sanity is fine and GHC is not reporting ByteOrder correctly: https://gitlab.haskell.org/ghc/ghc/-/issues/20338

@Bodigrim thank you for suggesting to avoid CPP otherwise we would not have found that bug, but now I need to bring the CPP back.

@juhp thank you again for helping debug this. If you don't mind I'll ask you again in a little bit to run the test suite one last time, just to be sure. For now I'll need to bring back the CPP approach first.

Bodigrim commented 3 years ago

GHC is not reporting ByteOrder correctly

Oh, that's pretty big. Thanks for debugging it.

curiousleo commented 3 years ago

Damn, nice find @lehins.

Bodigrim commented 3 years ago

It was a bit painful to get it to work, anything complicated like cabal or stack will not work on that docker, because it uses up memory like crazy.

I was able to use cabal -j1. It eats up to 16Gb RAM and is slow as hell, literally hours and hours to build dependencies, but succeeds.

lehins commented 3 years ago

:smile: As I said, painful indeed

haskell / random

Improve uniform `ShortByteString` #116