quantified-uncertainty / squiggle

An estimation language
https://squiggle-language.com
MIT License

Operations on very small or large numbers silently round to 0 and 1 #834

Open NunoSempere opened 2 years ago

NunoSempere commented 2 years ago

Description:

When operating on very small or very large numbers, the results are rounded to exactly 0 or exactly 1 without warning.

Context

I am replicating this paper: https://arxiv.org/pdf/1806.02404.pdf. For this, I am working with:

rate_of_life_formation = lognormal(1, 50) // yes, (1, 50) is correct
fraction_of_planets_with_life = 1-exp(-rate_of_life_formation)

I am then multiplying fraction_of_planets_with_life by various other factors. These are sometimes as large as fraction_of_planets_with_life is small. But because fraction_of_planets_with_life has been rounded to 0, the multiplication doesn't go through, that is, the product stays at 0.

Steps to reproduce (simplified example)

x = 10^(-1000)
y = 10^(300)
x*y
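The same behavior can be reproduced in any language that uses IEEE 754 doubles. Below is a minimal Python sketch of the simplified example; Python is used only for illustration and is not how Squiggle evaluates the expression. Note that the mathematically correct product, 10^(-700), is itself below the smallest positive double (about 4.9e-324), so it could not be represented even if x had survived.

```python
import sys

# 10^(-1000) is far below the smallest positive double,
# so the literal underflows to exactly 0.0 with no warning.
x = 1e-1000
y = 1e300

# Multiplying by a large factor cannot recover the lost magnitude.
product = x * y

print(x)                   # 0.0
print(product)             # 0.0
print(sys.float_info.min)  # smallest positive normal double, for comparison
```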

Expected behavior:

Either:

I briefly looked into this, and JS doesn't seem to have good libraries for arbitrary-precision floats. But I haven't looked into this all that much.

What I got instead:

Outputs were too small because some factors had been silently rounded off to 0.

berekuk commented 2 years ago

This is hard to do without tradeoffs. IEEE 754 floats are optimized at the hardware level, and switching to any custom bignum implementation would have severe performance consequences.

We could eventually introduce a custom bignum type in Squiggle (also, separate integer type and decimal type would be nice to have), but that won't happen any time soon.
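To illustrate what a software number type buys, here is a small Python sketch using the standard library's decimal module. It is only an analogy for a possible bignum type, not a proposal for how Squiggle would implement one. Arithmetic on Decimal values is done in software rather than by the floating-point hardware, which is where the performance cost comes from.

```python
from decimal import Decimal

# Decimal has a much wider exponent range than a double,
# so 10^(-1000) is stored exactly instead of underflowing to zero.
x = Decimal("1e-1000")
y = Decimal("1e300")

product = x * y

print(product)  # 1E-700
```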

mlochbaum commented 2 years ago

There is a standard expm1 function giving exp minus 1, so with that you could write 1-exp(-rate_of_life_formation) as -expm1(-rate_of_life_formation) and avoid rounding to 0. In general, log odds-ratio representation gives high precision around 0 and 1, but is tricky to compute with. It may be better to store one of p, log(p), or log(1-p) based on the value of p (the logarithms are very close, perhaps equal to log odds-ratio on their domains). I don't think there's much value in added digits: the right probability space makes things work with ordinary numbers of digits but when working in the wrong one there will never be enough.
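A short Python sketch of these two ideas, using math.expm1 from the standard library. The value of rate is a hypothetical number chosen for illustration, and the log-space line reuses the simplified figures from the issue description.

```python
import math

# Hypothetical tiny rate, chosen only for illustration.
rate = 1e-20

# exp(-1e-20) rounds to exactly 1.0 in double precision, so the
# subtraction cancels everything and the fraction becomes 0.0.
naive = 1 - math.exp(-rate)

# expm1(t) computes exp(t) - 1 without that cancellation.
stable = -math.expm1(-rate)

# Since 1 - p = exp(-rate), log(1 - p) is exactly -rate.
log_one_minus_p = -rate

# Products of extreme factors become sums in log space (base 10 here):
# 10^(-1000) * 10^(300) has log10 equal to -1000 + 300.
log10_product = -1000 + 300

print(naive)          # 0.0
print(stable)         # close to 1e-20
print(log10_product)  # -700
```

The log-space value stays an ordinary-sized number even though the quantity it represents is far outside the range of a double.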

NunoSempere commented 10 months ago

I previously thought that this issue meant that one couldn't replicate the Dissolving the Fermi paradox paper in Squiggle. But I was wrong; the numerical issues don't matter that much: https://squigglehub.org/models/NunoSempere/are-we-alone.

This is because, sure, you might have some cases where (simplified):

fraction_of_planets_with_life = 10^(-500) // very low, rounded to zero
number_of_planets = 10^(1000) // larger than fraction_of_planets_with_life

But this fraction is not enough to change the qualitative conclusions of the paper. In most cases where you do have other planets with intelligent life, you have a lot of them because fraction_of_planets_with_life is not extremely close to zero.

Could be a good idea to close this issue.