r-devel / r-project-sprint-2023

Material for the R project sprint
https://contributor.r-project.org/r-project-sprint-2023/

Large Integers (pt 1, integer size) #71

Open gmbecker opened 1 year ago

gmbecker commented 1 year ago

Endorsed by Luke

R integers are currently 32-bit. It would be useful to raise that limit (to 53 bits, 64 bits, or unlimited, i.e. bignum).
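To make the three candidate limits concrete, here is a rough sketch in Python (not R, and not part of any proposal). The helper names `fits_int32` and `exact_in_double` are made up for illustration. The 53-bit option corresponds to the range of whole numbers that a double can already hold exactly.

```python
import struct

# Limits under discussion (constant names are illustrative only).
INT32_MAX = 2**31 - 1      # 2147483647, the value of .Machine$integer.max in R
DOUBLE_EXACT_MAX = 2**53   # every whole number up to here is exact in a double
INT64_MAX = 2**63 - 1

def fits_int32(n):
    """True if n can be stored in a signed 32-bit slot."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False

def exact_in_double(n):
    """True if the whole number n survives a round trip through a double."""
    return int(float(n)) == n

print(fits_int32(INT32_MAX))                  # within the current limit
print(fits_int32(INT32_MAX + 1))              # one past the current limit
print(exact_in_double(DOUBLE_EXACT_MAX))      # still exact
print(exact_in_double(DOUBLE_EXACT_MAX + 1))  # first whole number a double cannot hold
```

Note that R additionally reserves the most negative 32-bit value for NA_integer_, which this sketch does not model.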

Note (gb speaking): @ltierney has historically been opposed to the proliferation of visible-from-R numeric types beyond numeric (i.e., double) and the somewhat hidden integer (currently int32). The reason, as I understood it at the time, is that requiring end users to worry about what kind of number they have would be a significantly negative outcome: it would raise the bar to entry and reduce ease of use for non-engineering types. This is in direct contrast to more "bare metal" style computational scripting languages such as Julia. I haven't had a chance to get clarity from him directly on this, so AFAIK this is unlikely to be a simple question of creating a new R-level type of numeric vector called, e.g., longinteger or unlimitedinteger (or something shorter and more reasonable than that). He may have changed his mind about this; if so, ignore everything written above in favor of his current thoughts.

Also note that making all integers 64 bits regardless of the size of their content will come with a significant amount of overhead: all integers (including, e.g., the values returned by length, grep, and, in some cases, sum) would take up twice as much memory and thus also be significantly more expensive to allocate. Careful benchmarking of this at the very least, and possibly mitigation strategies for it (e.g., ALTREP?), are likely to be needed.
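As a back-of-the-envelope illustration of the memory point, the following Python sketch compares the raw payload of a vector at 4 versus 8 bytes per element. `payload_bytes` is a hypothetical helper, and the calculation deliberately ignores R's per-vector header and any allocator behaviour, so it is no substitute for the benchmarking mentioned above.

```python
import struct

def payload_bytes(n_elements, fmt):
    """Raw data size in bytes for n_elements stored with the given struct format."""
    return n_elements * struct.calcsize(fmt)

n = 10_000_000  # arbitrary example length

bytes_32 = payload_bytes(n, "<i")  # 4 bytes per element (standard-size int32)
bytes_64 = payload_bytes(n, "<q")  # 8 bytes per element (standard-size int64)

print(bytes_32, bytes_64, bytes_64 / bytes_32)
```

The ratio is exactly 2 for the payload alone. How that translates into allocation cost in practice would depend on R's memory manager, which this sketch does not attempt to model.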