Because this package exists to handle data sets that are too large to fit in memory, most files it needs to open will be a few gigabytes in size. For struct types with only a few fields in particular, this means a single file may hold a few billion records. This is a problem given how widely signed 32-bit integers are used in the code base: they cannot represent array indices or file sizes above 2^31 - 1, which is about 2 billion elements (2 gigabytes).
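To make the arithmetic concrete, here is a minimal sketch of how quickly a record count outgrows a signed 32-bit index. It is written in Python purely for illustration and is not taken from the package; the 8 GiB file size, the record sizes, and the helper names `record_count` and `fits_in_int32` are all assumptions.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def record_count(file_size_bytes, record_size_bytes):
    """Number of whole records in a file of the given size."""
    return file_size_bytes // record_size_bytes


def fits_in_int32(n):
    """True if n can be used as a non-negative signed 32-bit index."""
    return 0 <= n <= INT32_MAX


# Hypothetical 8 GiB file, read with three hypothetical record sizes (bytes).
file_size = 8 * 2**30
for record_size in (1, 4, 16):
    n = record_count(file_size, record_size)
    print(record_size, n, fits_in_int32(n))
```

Under these assumptions, only the 16-byte records leave a count that still fits in a 32-bit index; the smaller the struct, the sooner the limit is reached.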
Drop support for versions of R before 3.0.0 and make use of the long vector support introduced in that version. Since a double can exactly represent every value of a 54-bit signed integer (or a 53-bit unsigned integer), our code should ideally support files as large as 2^53 bytes, about 9 petabytes, even if the operating system imposes lower limits of its own. This means that we should also be able to index about 9 quadrillion int8, uint8, char, or uchar values in memory.
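The 2^53 bound can be checked directly. The sketch below uses Python, whose float is an IEEE 754 double on common platforms, the same format as R's numeric type; it is an illustration only, and the names `MAX_EXACT` and `survives_double` are assumptions rather than anything in the package.

```python
# Largest n such that every integer in [0, n] is exactly representable
# as an IEEE 754 double (53 bits of significand precision).
MAX_EXACT = 2**53


def survives_double(n):
    """True if integer n converts to a double and back unchanged."""
    return int(float(n)) == n


print(MAX_EXACT)                        # about 9 quadrillion
print(survives_double(MAX_EXACT - 1))   # exactly representable
print(survives_double(MAX_EXACT))       # exactly representable
print(survives_double(MAX_EXACT + 1))   # rounds to a neighbouring value
```

Past 2^53, consecutive integers start to collapse onto the same double, so an index or byte offset stored as a double is only trustworthy up to that value.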