r-lib / rlang

Low-level API for programming with R
https://rlang.r-lib.org
Other
500 stars 137 forks source link

fix `is_integerish()` lower bound for large negative doubles #1530

Closed sorhawell closed 1 year ago

sorhawell commented 1 year ago

Hello wonderful people :)

bugfix

Currently rlang has chosen 2^52 as the upper bound for integerish doubles. However the effective lower bound is somewhere around -2^63-2^10 which is much lower than -2^52 which was likely the intended bound. This PR adds the lower bound RLANG_MIN_DOUBLE_INT = -2^52 and append some test statements. If not adopting fix rlang::is_integerish(-2^63)will return TRUE although -2^63 is not unambiguously represented as a binary64 floating point.

Bonus topic ( I will repost as issue if no obvious explanation):

I don't understand why the bound should not to allow representation of all the integers {-2^53+1, ..., -1, 0, 1, ..., 2^53-1} . The double has 53 significant bits and the sign bit. Wikipedia says -2^53 to 2^53 could be represented exactly(footnote*). Are there any other reasons?

footnote: This is true, however that is unpractical unless guaranteed the floating bound is bounded to the same range. Better to choose the range which is represented unambiguously by binary64/float64. As -2^53-1 and 2^53+1 has the same float representation as -2^53 and 2^53 respectively, that makes the -2^53 and 2^53 ambiguous represented, hence the [2^53+1; 2^53-1] bound.

sorhawell commented 1 year ago

I don't think, the Ubuntu 18.4 devel fail was due to this PR. If so, I will revise.

lionel- commented 1 year ago

Thanks!

why the bound should not to allow representation of all the integers {-2^53+1, ..., -1, 0, 1, ..., 2^53-1}

A comment says I've chosen the same bounds as the length of long vectors in R. We could extend it to the range you mention, but I'd prefer not to make any changes here unless there's a practical reason to.

lionel- commented 1 year ago

Thanks for the patch!

You'll need to reset your repo or your main branch as I've force-pushed there. For this reason it's usually better to send updates from a disposable branch.