PRL-PRG / UFOs

User Fault Objects: making vectors lazy and forgetful.

UFOs: larger-than-memory vectors for R

User Fault Objects (UFOs) is a framework for implementing custom larger-than-memory objects using a feature of the Linux kernel called userfaultfd.

Modus operandi

The UFO framework makes it possible to create special UFO vectors. These vectors are indistinguishable from plain old R vectors: they are also contiguous areas of memory, and they also consist of an ordinary R vector header, length information, and an array of elements. The difference is that they reside in virtual memory that is not initially backed by actual memory. When an element of the vector is accessed, an area of this virtual memory is accessed. This causes the operating system to detect a fault and to inform the UFO framework about where the fault occurred. The framework then allocates a chunk, a relatively small block of actual memory, and populates it using a programmer-defined population function. This creates the element of the vector that is being accessed, as well as a prudent number of elements ahead of it. From then on, this block of memory and these elements can be accessed as an ordinary R vector.

When more faults occur, more chunks are brought in. When the memory taken up by the materialized chunks grows too large, the UFO framework starts reclaiming them, preventing the R session from running out of memory.
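The mechanism described above can be pictured with a toy model in shell. This is only an illustrative sketch, not how the UFO framework is implemented: it does not use userfaultfd, chunk files in a directory stand in for materialized memory, and the names `populate` and `ufo_get`, the values of `CHUNK_SIZE` and `MAX_CHUNKS`, and the oldest-first reclamation policy are all assumptions made for the example.

```shell
#!/bin/sh
# Toy model only: chunk files in a directory stand in for materialized memory.
# CHUNK_SIZE, MAX_CHUNKS and all function names are assumptions for illustration.
CHUNK_SIZE=4     # elements populated per "fault"
MAX_CHUNKS=2     # resident chunks allowed before reclaiming starts

# Hypothetical population function: element i of the vector has value 2*i.
# Usage: populate START END  (prints elements START..END-1, one per line)
populate() {
    i=$1
    while [ "$i" -lt "$2" ]; do
        echo $((2 * i))
        i=$((i + 1))
    done
}

# Print element INDEX (0-based) of the lazy vector stored under DIR.
# Usage: ufo_get DIR INDEX
ufo_get() {
    dir=$1
    index=$2
    chunk=$((index / CHUNK_SIZE))
    if [ ! -f "$dir/chunk.$chunk" ]; then
        # "Fault": the chunk is not materialized yet, so populate it.
        start=$((chunk * CHUNK_SIZE))
        populate "$start" $((start + CHUNK_SIZE)) > "$dir/chunk.$chunk"
        echo "$chunk" >> "$dir/order"
        # Reclaim the oldest chunk once too many are resident.
        count=$(($(wc -l < "$dir/order")))
        if [ "$count" -gt "$MAX_CHUNKS" ]; then
            oldest=$(head -n 1 "$dir/order")
            rm -f "$dir/chunk.$oldest"
            tail -n +2 "$dir/order" > "$dir/order.tmp"
            mv "$dir/order.tmp" "$dir/order"
        fi
    fi
    # Read the requested element from the (now materialized) chunk.
    sed -n "$((index % CHUNK_SIZE + 1))p" "$dir/chunk.$chunk"
}
```

For example, after `d=$(mktemp -d)`, calling `ufo_get "$d" 5` materializes the chunk holding elements 4 to 7 and prints `10`. Accessing elements in two further chunks then reclaims that first chunk, and a later access to element 5 simply populates it again.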

Warning: UFOs are under active development. Some bugs are to be expected, and some features are not yet fully implemented.

Repository map

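This repository contains four R packages: ufos (the UFO framework), ufoseq (an example/tutorial implementation: sequences), ufovectors (an example implementation: file-backed vectors and matrices), and ufoaltrep (an ALTREP implementation of file-backed vectors and matrices).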

Prerequisites

Check if your operating system restricts who can call userfaultfd:

cat /proc/sys/vm/unprivileged_userfaultfd

A value of 0 means that only privileged users can call userfaultfd, so UFOs will work only for privileged users. To allow unprivileged users to call userfaultfd, run the following as root:

sysctl -w vm.unprivileged_userfaultfd=1
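The check and the fix can be combined into a small helper. The following is a minimal sketch, not part of UFOs: the function name `check_unprivileged_userfaultfd` and its messages are hypothetical, and the optional argument that overrides the knob path exists only so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Hypothetical helper (not part of UFOs): report whether unprivileged
# processes may call userfaultfd. An alternative knob path can be passed
# as the first argument, which is handy for trying the function out.
check_unprivileged_userfaultfd() {
    knob="${1:-/proc/sys/vm/unprivileged_userfaultfd}"
    if [ ! -r "$knob" ]; then
        # The kernel may not expose this setting at all.
        echo "unknown: cannot read $knob"
        return 2
    fi
    value="$(cat "$knob")"
    if [ "$value" = "1" ]; then
        echo "ok: unprivileged userfaultfd is allowed"
        return 0
    fi
    echo "restricted: only privileged users may call userfaultfd"
    return 1
}
```

Calling `check_unprivileged_userfaultfd` with no arguments inspects the real kernel setting. A non-zero return status indicates that the sysctl command above is probably needed.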

Installation

git clone https://github.com/PRL-PRG/UFOs.git
git clone https://github.com/PRL-PRG/viewports.git
R CMD INSTALL viewports                       ## Dependency for UFOs/ufovectors
R CMD INSTALL UFOs/ufos                       ## UFO framework
R CMD INSTALL UFOs/ufoseq                     ## example/tutorial implementation: sequences
R CMD INSTALL UFOs/ufovectors                 ## example implementation: file-backed vectors and matrices
R CMD INSTALL UFOs/ufoaltrep                  ## ALTREP implementation of file-backed vectors and matrices

Usage

For usage information, see the vignettes of the individual packages.

System requirements

Troubleshooting

syscall/userfaultfd: Operation not permitted
error initializing User-Fault file descriptor: Invalid argument
Error: package or namespace load failed for ‘ufos’:
 .onLoad failed in loadNamespace() for 'ufos', details:
  call: .initialize()
  error: Error initializing the UFO framework (-1)

The user has insufficient privileges to execute a userfaultfd system call.

A likely culprit is the global sysctl knob vm.unprivileged_userfaultfd, which controls whether unprivileged users are allowed to call userfaultfd. If /proc/sys/vm/unprivileged_userfaultfd is 0, run the following as root:

sysctl -w vm.unprivileged_userfaultfd=1
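A value set with `sysctl -w` lasts only until the next reboot. One way to make it persistent is a drop-in file under /etc/sysctl.d. The sketch below assumes a system that reads that directory; the function name `persist_unprivileged_userfaultfd` and the file name `99-userfaultfd.conf` are hypothetical choices for illustration, and the optional directory argument exists only so the function can be tried without root.

```shell
#!/bin/sh
# Hypothetical helper (not part of UFOs): write a sysctl drop-in file that
# re-enables unprivileged userfaultfd at boot. The target directory defaults
# to /etc/sysctl.d and can be overridden via the first argument.
persist_unprivileged_userfaultfd() {
    conf_dir="${1:-/etc/sysctl.d}"
    printf 'vm.unprivileged_userfaultfd = 1\n' > "$conf_dir/99-userfaultfd.conf"
}
```

Run as root, `persist_unprivileged_userfaultfd` followed by `sysctl --system` writes the file and reloads all system sysctl configuration files.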