Kevin-Jin / mmap

Forked from https://r-forge.r-project.org/scm/?group_id=648

Improve support for large files #19

Closed Kevin-Jin closed 7 years ago

Kevin-Jin commented 7 years ago

Because this package is meant for data that does not fit in memory, most files it opens will be a few gigabytes in size. For struct types with few fields in particular, that can mean a few billion records per file. This is a problem given how widely signed 32-bit integers are used in the code base: they can only represent array indices and file sizes up to 2^31 - 1, which is about 2 billion (2 gigabytes).
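To make the limit concrete, here is a minimal arithmetic sketch. It is not code from the package; the helper names and the 4 GiB file size are hypothetical and only illustrate when a byte offset or record count stops fitting in a signed 32-bit integer.

```python
# Illustration only: helper names and the example file size are assumptions,
# not part of the mmap package.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # 2,147,483,647


def fits_in_int32(n: int) -> bool:
    """Return True if n is representable as a signed 32-bit integer."""
    return INT32_MIN <= n <= INT32_MAX


def record_count(file_size_bytes: int, record_size_bytes: int) -> int:
    """Number of whole fixed-size records in a file of the given size."""
    return file_size_bytes // record_size_bytes


if __name__ == "__main__":
    file_size = 4 * 1024**3  # hypothetical 4 GiB file

    # The byte offset of the end of the file already overflows int32.
    print(fits_in_int32(file_size))

    # With 1-byte records (int8, uint8, char) the record index overflows too.
    print(fits_in_int32(record_count(file_size, 1)))

    # With 8-byte records the count fits, but byte offsets still do not.
    print(fits_in_int32(record_count(file_size, 8)))
```

The smaller the record, the sooner the record index itself overflows, which is why struct types with few fields are the worst case.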

Drop support for versions of R before 3.0.0 and make use of the long vector enhancements introduced in that version. Since a double can represent all the values of a 54-bit signed integer (53-bit unsigned integer), ideally our code should support files as large as 9 petabytes even if the operating system has its own limitations. This means that we should also be able to index 9 quadrillion int8, uint8, char, or uchar values in memory.
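The 9 petabyte figure follows from the 53-bit significand of an IEEE 754 double: every integer up to 2^53 round-trips exactly, and 2^53 + 1 is the first one that does not. The sketch below only checks that arithmetic using Python floats, which are doubles; the helper name is hypothetical and nothing here is taken from the package or from R.

```python
# Illustration only: checks the double-precision arithmetic behind the
# "9 petabytes" figure. The helper name is an assumption.
MAX_EXACT = 2**53  # 9,007,199,254,740,992


def round_trips_through_double(n: int) -> bool:
    """Return True if converting n to a double and back gives n again."""
    return int(float(n)) == n


if __name__ == "__main__":
    # 2**53 is stored exactly.
    print(round_trips_through_double(MAX_EXACT))

    # 2**53 + 1 is the first integer that is not: it rounds to 2**53,
    # so two distinct indices would collide.
    print(round_trips_through_double(MAX_EXACT + 1))
    print(float(MAX_EXACT) == float(MAX_EXACT + 1))

    # Expressed as a byte count, 2**53 is about 9 petabytes (exactly 8 PiB).
    print(MAX_EXACT / 10**15)
    print(MAX_EXACT // 2**50)
```

Beyond 2^53 adjacent integers share a double representation, so an index stored as a double can no longer address individual bytes.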