r-wasm / webr

The statistical language R compiled to WebAssembly via Emscripten, for use in web browsers and Node.
https://docs.r-wasm.org/webr/latest/

Is the Emscripten filesystem the virtual filesystem? #428

Closed JosiahParry closed 3 months ago

JosiahParry commented 4 months ago

The webR docs refer specifically to the "Emscripten virtual filesystem."

Is this synonymous with the browser's FileSystem API?

If not, is it possible to connect webR to the FileSystem API? The situation I am trying to address is installing R packages only once across multiple sessions. Ideally, I could download R packages to the FileSystem API, and a webR instance could then load packages from there in later sessions.

georgestagg commented 4 months ago

No, the "emscripten virtual filesystem" is described here: https://emscripten.org/docs/api_reference/Filesystem-API.html

Note that for technical reasons (related to how we are blocking for the R console in the JS worker thread), webR only partially exposes the Emscripten filesystem API. For example, the IDBFS filesystem is not currently available under webR. We expect this to be improved once Emscripten's WasmFS is ready (see below).

> If not, is it possible to connect webR to the FileSystem API?

It may be possible to hook the two together with JS glue code, but it will be difficult due to the technical reasons mentioned above. Communication with the worker thread will also need to be dealt with. There is currently no "out of the box" way to connect the two systems.


Emscripten is currently working on re-implementing its Filesystem API with a much-improved system named WasmFS that aims to unify and simplify the entire process of loading files. Once that is ready, I expect that webR will also be updated to switch to Emscripten's new WasmFS and the entire process of loading files into webR will become much easier to handle.


In the meantime, from what you describe, I think the easiest way to handle this currently is the following procedure:

1) Extract the R packages you are interested in into a single directory, to act as an R library.

2) Build an Emscripten filesystem image from this directory, using Emscripten's file_packager tool. This will give you two image data files: library.data and library.js.metadata.

3) Load the image data files into JS in some way. In the example below, the JS fetch() API has been used to load the files over the network. Perhaps the resulting images could also be stored in and accessed through the browser's FileSystem API.

4) Once the image data has been loaded into memory in the JS session, it can be used for multiple webR sessions. Start a webR session and mount the image data into a virtual filesystem directory using webR's FS API:

// `data` and `metadata` are the fetch() responses for library.data and
// library.js.metadata, respectively.
const options = {
  packages: [{
    blob: await data.blob(), // library.data as a JS blob, or a JS File object
    metadata: await metadata.json(), // The contents of library.js.metadata, parsed as JSON
  }],
};
await webR.FS.mount("WORKERFS", options, '/library');

5) The R packages will now be available in the virtual filesystem under /library. Add that directory to your .libPaths() so that R can find them.
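Steps 3 to 5 can be sketched together as a single helper. This is a minimal sketch, not a tested recipe: the function name mountLibrary and its parameters baseUrl, mountpoint, and fetchFn are made up for illustration, and it assumes the webR instance also exposes FS.mkdir() and evalRVoid(), that the image files are served from the same base URL, and that the mount directory does not exist yet.

```javascript
// Sketch only: `mountLibrary`, `baseUrl`, `mountpoint`, and `fetchFn` are
// illustrative names. `webR` is assumed to be an initialised webR instance.
async function mountLibrary(webR, baseUrl, mountpoint = "/library", fetchFn = fetch) {
  // Step 3: load both image files over the network.
  const [data, metadata] = await Promise.all([
    fetchFn(`${baseUrl}/library.data`),
    fetchFn(`${baseUrl}/library.js.metadata`),
  ]);
  if (!data.ok || !metadata.ok) {
    throw new Error("Could not download the filesystem image");
  }

  // Step 4: mount the image into the virtual filesystem using WORKERFS.
  const options = {
    packages: [{
      blob: await data.blob(),
      metadata: await metadata.json(),
    }],
  };
  await webR.FS.mkdir(mountpoint); // assumes the directory does not exist yet
  await webR.FS.mount("WORKERFS", options, mountpoint);

  // Step 5: put the mounted directory at the front of R's library search path.
  const rCode = `.libPaths(c(${JSON.stringify(mountpoint)}, .libPaths()))`;
  await webR.evalRVoid(rCode);
  return rCode;
}
```

Because the downloaded responses are only read inside the helper, a page that starts several webR sessions would probably want to keep the blob and the parsed metadata around and reuse them rather than fetching again.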

Note that webR.FS.mount() here is really a re-exposure of Emscripten's WORKERFS virtual filesystem API, so more details about the values that options can take can be found in the Emscripten documentation at https://emscripten.org/docs/api_reference/Filesystem-API.html#FS.mount.
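For the "download once, reuse in later sessions" part of the question, one possible approach is to keep the image files in browser storage and only download them when no saved copy exists. The sketch below is an assumption-laden illustration rather than anything webR provides: getCachedBlob and download are hypothetical names, and root is expected to be a FileSystemDirectoryHandle, for example the origin private file system returned by navigator.storage.getDirectory(). Browser support for createWritable() varies, so treat this as a starting point only.

```javascript
// Sketch only: `getCachedBlob` and `download` are hypothetical names.
// `root` is a FileSystemDirectoryHandle, e.g. from
// `await navigator.storage.getDirectory()` in a supporting browser.
async function getCachedBlob(root, name, download) {
  try {
    // Reuse the copy saved by an earlier session, if there is one.
    const existing = await root.getFileHandle(name);
    return await existing.getFile();
  } catch (err) {
    // Anything other than "file not found" is a real error.
    if (err.name !== "NotFoundError") throw err;
  }

  // First visit: download the file and save it for next time.
  const blob = await download(name);
  const handle = await root.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  await writable.write(blob);
  await writable.close();
  return blob;
}
```

The object returned for a cached file is a JS File, which the mount options above accept in place of a blob. The metadata file could be handled the same way and then parsed with JSON.parse(await file.text()).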