@edoarad Thanks! Can you elaborate on what you find weird about this?
Sure. So, ideally, x would only be sampled once each time and squared, so we'd get only positive numbers. An alternative implementation (mistaken, imo) would sample independently from two copies, in which case we'd get a symmetric distribution around 0. Neither seems to be the case.
> ideally, x would only be sampled once each time and squared
That doesn't work in Python because `x` is the distribution object, so each distribution object would get sampled independently. I don't think there's any easy way for the sampler to know that both `x` should be the same when sampled. However, the solution is pretty easy -- if you do `sq.sample(x ** 2, n=1000)`, that should get you the desired behavior.
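For concreteness, a minimal sketch of the two call patterns (based on the behavior described in this thread; `x * x` composes the distribution with itself, while `x ** 2` squares a single draw):

```python
import squigglepy as sq

x = sq.norm(0, 1)  # a distribution object (interpreted as a 90% CI from 0 to 1)

# Each occurrence of x in the composed distribution is drawn independently.
independent_product = sq.sample(x * x, n=1000)

# One draw per sample, then squared, so every result is non-negative.
squared = sq.sample(x ** 2, n=1000)
```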
> An alternative implementation (mistaken, imo) would sample independently from two copies, in which case we'd get a symmetric dist around 0.
That's not the case with squigglepy because `sq.norm(0, 1)` by default is a 90% CI between 0 and 1, so you get a mean of 0.5. The mean of your `x * x` distribution is thus 0.25, which appears to be the case based on your observed sampling. If you want to define the interval via mean + sd, you need to do that explicitly by defining `x = sq.norm(mean=0, sd=1)`.
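To illustrate the two parameterizations side by side (a sketch, assuming `sq.sample(..., n=...)` returns a sequence of draws):

```python
import numpy as np
import squigglepy as sq

ci_version = sq.norm(0, 1)          # 90% CI from 0 to 1, so the mean is about 0.5
sd_version = sq.norm(mean=0, sd=1)  # standard normal: mean 0, sd 1

print(np.mean(sq.sample(ci_version, n=10_000)))  # roughly 0.5
print(np.mean(sq.sample(sd_version, n=10_000)))  # roughly 0
```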
So I don't see any bad behavior here, just a confusion between what squigglepy does and what you seem to think it does or want it to do.
Yep, my mistake on the normal dist parameters.
I've recently talked with Ozzie about this, and I am now pretty sure that the correct behavior should be to sample only once for both occurrences of `x`. It could resemble your work on correlated variables, I guess, but there are also simpler implementations I can think of for "correlation 1".
Thanks! I don't see a clear way to make the interface work the way you suggest, but we will look into this for our correlated variables interface.
In the meantime, another thing you could do is define the sample via a function:

```python
def model():
    x = ~sq.norm(0, 1)  # Use `~` to explicitly sample in advance
    return x * x  # Both x will have the same value now

sq.sample(model, n=1000)  # But each time the model runs, the x will be different
```
I don't think this is working
@edoarad You have to explicitly sample the x first: `x = ~sq.norm(-1, 1)`, not `x = sq.norm(-1, 1)`. Otherwise the distribution objects get passed and you're back to the original problem.
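A minimal illustration of the difference (a sketch; the exact return types are my assumption about how `~` behaves):

```python
import squigglepy as sq

dist = sq.norm(-1, 1)   # still a distribution object; arithmetic on it composes distributions
draw = ~sq.norm(-1, 1)  # `~` takes one concrete draw, so draw * draw reuses that single value
```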
Seems like I just assumed something silly about `~` and didn't read the comment or try to use your code 🤦♂️ It works well, just as you say.
Okay, in `samplers.py` you have `_squigglepy_internal_sample_caches`. I'm not sure what exactly it's intended for, but you could use it to implement this feature. It's using `str(dist)` as keys, but if instead you use `id(dist)` it should work well. Then, you only need to refresh the cache each iteration (a rough sketch is at the end of this comment).
(I thought about adding a dict argument to `sample`, but there's no need to have it duplicated on the stack. I expect it not to hurt performance too much, though.)

(Also, I didn't think at all about how this is influenced by multiprocessing.)
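A rough, self-contained sketch of the idea with stand-in names (this is not the actual `samplers.py` code; `Norm`, `draw`, and `run_model` here are hypothetical):

```python
import random

class Norm:
    """Stand-in for a distribution object; squigglepy's real classes differ."""
    def __init__(self, mean, sd):
        self.mean, self.sd = mean, sd

_cache = {}  # per-iteration sample cache

def draw(dist):
    key = id(dist)  # key by object identity rather than str(dist)
    if key not in _cache:
        _cache[key] = random.gauss(dist.mean, dist.sd)
    return _cache[key]

def run_model(model, n):
    out = []
    for _ in range(n):
        _cache.clear()  # refresh the cache each iteration
        out.append(model())
    return out

x = Norm(mean=0, sd=1)
samples = run_model(lambda: draw(x) * draw(x), n=1000)  # each value is one draw squared, so non-negative
```

Keying by identity also gives "correlation 1" for free whenever the same object appears more than once in a model, while distinct objects with identical parameters stay independent.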