Closed sergesku closed 2 years ago
I would be happy to accept a PR.
generate
results in non-random results entirely:
λ generate ((,,) <$> pickAny @Int <*> pickAny @Int <*> pickAny @Int)
(8630743491499986047,8630743491499986047,8630743491499986047)
Looking at the source, the rng isn't threaded through:
pickAny :: forall a m. (Monad m, Random a) => FakeT m a
pickAny = FakeT (\settings -> let (x, _) = random (getRandomGen settings) in pure x)
So every call to pickAny
is always going to result in the same a
for deterministic mode.
generate results in non-random results entirely:
By default, the values generated are deterministic.
I think explicity documenting it in Haddock would be a good thing: https://hackage.haskell.org/package/fakedata-1.0.2/docs/Faker.html#v:generate
Should that be what "deterministic" means here though? Fake data that is always the same doesn't seem useful. I don't see why anyone would ever use generate
as-is. Even in my example, why would I want all my pickAny
s in my applicative composition to be the same? I'd want them to be different - but the same every time I ran with deterministic mode set to "on".
It would still be "deterministic" if it uses the given StdGen
(which is based on a seed). But it would be deterministically random (i.e. it generates the same random value for the given seed) instead of not random at all.
So I think the seed should be threaded through (making it an IORef
makes this possible).
IIRC, It was made like that for perf reasons. Constructing individual record types didn't need StdGen
to be changed as the fields within it were different.
I gave this a much better look today. Using applicative instance doesn't change the underlying stdgen (I think I did it because if not it will break applicative laws, but if someone can prove otherwise - I would be happy to accept a PR in the form of quickcheck-classes)
I will see if I can add tests for all the properties using quickcheck-classes, but my OSS time is limited these days.
So how to produce a different set of numbers using pickAny:
Prelude Faker Faker.Combinators> generateNonDeterministic $ sequence [pickAny @Int, pickAny @Int, pickAny @Int]
[-7259296580907384170,3667397267649137371,4885350613124653216]
Prelude Faker Faker.Combinators> generateNonDeterministic $ sequence [pickAny @Int, pickAny @Int, pickAny @Int]
[-4133709365272433231,-4842076987059227700,6130434414512083199]
And I would write your second version of code like this to produce the intended effect:
lst1 <- generateNonDeterministic $ replicateM 3 pickAny
lst2 <- generateNonDeterministic $ replicateM 3 pickAny
lst3 <- generateNonDeterministic $ replicateM 3 pickAny
Or it can be made even more shorter:
generateNonDeterministic $ sequence [ replicateM 3 pickAny, replicateM 3 pickAny, replicateM 3 pickAny]
This method will not work to create absolutely different values.
generateNonDeterministic $ sequence [ replicateM 3 pickAny, replicateM 3 pickAny, replicateM 3 pickAny] :: IO [[Int]]
[ [-6647855527820622346,-1896711671625956513,-8882849823799143032]
, [-1896711671625956513,-8882849823799143032,-4477016760604416600]
, [-8882849823799143032,-4477016760604416600,6599046923786404784]]
FakeT is only really a reader internally, so it can't replicate the full power of state.
Ah, right. In that case these are the two options:
lst1 <- generateNonDeterministic $ replicateM 3 pickAny
lst2 <- generateNonDeterministic $ replicateM 3 pickAny
lst3 <- generateNonDeterministic $ replicateM 3 pickAny
Or in this specific case:
generateNonDeterministic $ replicateM 12 pickAny
I think this is a good solution for lists: https://github.com/fakedata-haskell/fakedata/pull/47
A more general solution to thread a modified StdGen
through a <*>
would be very sexy though.
How about the following compromise?
Have a separate version of apply: variedApply
... It would be slower but produce varied results. We could also flip the angle brackets 90 degrees to have a corresponding infix: v*v
Then we save on perf when it's not needed and don't break any laws
As for the other combinators, we could have a separate twin module: Combinators.Varied
A more general solution to thread a modified StdGen through a <*> would be very sexy though.
I would be happy to take a PR for it with also additional test cases that application laws won't break: https://hackage.haskell.org/package/quickcheck-classes-0.6.5.0/docs/Test-QuickCheck-Classes.html#v:applicativeLaws
I have release a new version in the hackage with improved docs. Also added a section on README that might be helpful: https://github.com/fakedata-haskell/fakedata#deterministic-vs-non-deterministic-values
We also have a new function named generateNonDeterminsicWithFixedSeed
in the newer releaese: https://github.com/fakedata-haskell/fakedata/blob/master/ChangeLog.md#103
With that I'm closing the issue. But free feel to re-open if there are any more questions on this.
As a result of the specific control of the internal
StdGen
, duplicate values can be generated in some cases