r-wasm / jupyterlite-webr-kernel

An R kernel for JupyterLite, based on webR.
https://r-wasm.github.io/jupyterlite-webr-kernel/
MIT License

Update for webr v0.4.x & expose `baseUrl`/`repoUrl` in configuration #8

Open psychemedia opened 2 months ago

psychemedia commented 2 months ago

Hi

Are there any plans to upgrade the kernel to webr 0.4.x?

I'm trying to build a custom distribution that requires a custom built webr with a different built in webr repo path. Trying to build a v0.3.x webr dropin (albeit v0.3.3 rather than 0.3.0) is failing for me (requires py 3.11; build fails with broken cflags?) whereas I had success building latest webr build.

georgestagg commented 2 months ago

Yes. I do plan to upgrade the kernel to webR 0.4.x in the near future, but it is low priority right now as other needs are currently taking my attention.

I'm trying to build a custom distribution that requires a custom built webr with a different built in webr repo path.

For the moment, does using a pre-built version of webR and setting the following option at start up serve your needs?

options(webr_pkg_repos = "http://127.0.0.1:9003/repo")

webr dropin (albeit v0.3.3 rather than 0.3.0) is failing for me (requires py 3.11; build fails with broken cflags?)

Interesting! It looks like the build fails because -lxml2 is included twice in the build flags. The Emscripten linker does not like that; each library should be specified only once in the flags. If you can reproduce that problem when building the latest version of webR, you should open an issue on the main webR repo, as that should not happen.

psychemedia commented 2 months ago

The package load via options() works a treat, but there are various things loaded from https://webr.r-wasm.org/v0.3.0/ that are presumably baked in via a path set in the webr build that sets BASE_URL in $HOME/.webr-config.mk.

I'm aiming for a completely self-contained version of JupyterLite that can be published from a single trusted domain.

Re: the build, I managed to build the latest version (v0.4.x) OK. Should I be able to just drop that in here, or are there breaking changes from 0.3.x to 0.4.x that will have an impact here?

georgestagg commented 2 months ago

Gotcha, you might also be able to override the base URL and repo URL in webr_kernel.ts in this repo, rather than rebuilding webR, in the following way:

Change this:

https://github.com/r-wasm/jupyterlite-webr-kernel/blob/3c22ad52714acc6262ad7d7568a85128712a7da1/src/webr_kernel.ts#L22-L26

To something like this:

    this.#webRConsole = new Console({
      stdout: (line: string) => console.log(line),
      stderr: (line: string) => console.error(line),
      prompt: (prompt: string) => this.inputRequest({ prompt, password: false }),
    },{
      baseUrl: "https://some-other-base-url/subdir/",
      repoUrl: "https://some-other-repo-url/subdir/",
      REnv: {
        R_HOME: '/usr/lib/R',
        FONTCONFIG_PATH: '/etc/fonts',
        R_ENABLE_JIT: '0',
    });

Ideally we would make this available to the kernel user as a parameter without needing to hack the source like this.

For this to work, the assets served from the base URL will need to match the version of webR installed in node. package.json says "webr": "^0.3.0", so you'll need the assets from this package: https://github.com/r-wasm/webr/releases/download/v0.3.0/webr-0.3.0.zip

psychemedia commented 2 months ago

Ah, thanks v. much, will try that.

psychemedia commented 2 months ago

That seems to throw a ts error when I build it, as if new Console({..}, {...}); is not expected; should baseUrl etc. be in the same object as prompt?

UPDATE: my eyes are shot today! Missing } to close REnv.
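
With the closing brace of REnv restored, the second argument to Console can also be assembled in a small helper, which is one way the base URL and repo URL might be passed in as parameters rather than hard-coded. This is only a sketch: KernelWebROptions and buildWebROptions are hypothetical names, and the type is a local stand-in covering just the fields used above, not webR's own options type.

```typescript
// Hypothetical local type: mirrors only the option fields used in this thread,
// so the sketch compiles without importing webR.
interface KernelWebROptions {
  baseUrl?: string;
  repoUrl?: string;
  REnv: Record<string, string>;
}

// Build the second argument for `new Console(callbacks, options)`.
// `overrides` is an assumed parameter, not something the kernel exposes today.
function buildWebROptions(
  overrides: { baseUrl?: string; repoUrl?: string } = {}
): KernelWebROptions {
  const options: KernelWebROptions = {
    REnv: {
      R_HOME: '/usr/lib/R',
      FONTCONFIG_PATH: '/etc/fonts',
      R_ENABLE_JIT: '0',
    }, // this brace closes REnv
  };
  // Only set the URLs when they are given, so the defaults apply otherwise.
  if (overrides.baseUrl) options.baseUrl = overrides.baseUrl;
  if (overrides.repoUrl) options.repoUrl = overrides.repoUrl;
  return options;
}
```

The result would be passed as the second argument, for example new Console({ ...callbacks }, buildWebROptions({ baseUrl: "https://some-other-base-url/subdir/" })).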

psychemedia commented 2 months ago

Re: strategies for making this user configurable, jupyterlite offers three file configuration options; the jupyter_lite_config.json file "for build time configuration, typically when running jupyter lite build" could be a sensible place from which to do that?

georgestagg commented 2 months ago

Yes, jupyter_lite_config.json does look like the right place. When we get to updating to webR 0.4.x I'll take a look at exposing these options in that configuration file as part of the upgrade.

psychemedia commented 2 months ago

I am serving custom-built packages using GitHub Pages and note that package data files are compressed by the builder (eg M348_1.3.4.data.gz) but that the web request is for the uncompressed version (eg M348_1.3.4.data). Is that fixable via a setting anywhere?

Also in passing, I note the JupyterLite builder built to ./DESCRIPTION, ./share.js.data and ./share.js.metadata, but the customised local jupyterlite repo is looking for ./vfs/usr/lib/R/library/translations/DESCRIPTION and, for the latter two files, on paths such as ./vfs/usr/lib/R/share.js.metadata?

georgestagg commented 2 months ago

Ah, yes, webR < 0.4.1 does not support compressed filesystem images. So the issue will be solved once we upgrade webR here.
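
As a rough illustration of that cut-off, a build script could decide whether to compress images from the webR version it targets. The function names below are hypothetical, and the only fact taken from this thread is that versions below 0.4.1 cannot read compressed filesystem images.

```typescript
// Parse "0.4.1" or "v0.4.1" into numeric parts; non-numeric parts count as 0.
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, '')
    .split('.')
    .map((part) => parseInt(part, 10) || 0);
}

// Negative if a < b, zero if equal, positive if a > b (first three parts only).
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// True when the targeted webR version is at least 0.4.1.
function supportsCompressedImages(webrVersion: string): boolean {
  return compareVersions(webrVersion, '0.4.1') >= 0;
}
```

For the kernel's current "^0.3.0" dependency, supportsCompressedImages("0.3.0") is false, so the images would need to be built uncompressed.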

Are you using r-wasm/actions? If so you should be able to set compress: false to disable the .gz compression.

jobs:
  deploy-cran-repo:
    uses: r-wasm/actions/.github/workflows/deploy-cran-repo.yml@v2
    with:
      compress: false

If you are using the rwasm R package directly, you should be able to pass the compress argument when you build the package(s):

rwasm::add_pkg("cli", compress = FALSE)

psychemedia commented 1 month ago

In passing, to allow me to read CSV files that are shipped with a distribution as content into local browser storage, and which are exposed via the JupyterLite URL on the /files/ path, I note we can pass in a variable to the kernel that identifies the JupyterLite URL path:

function getFormattedUrl(url: string = window.location.href): string {
  try {
    const urlObj = new URL(url);
    // Get protocol (includes the trailing ':')
    const protocol = urlObj.protocol;
    // Get domain (hostname includes subdomains)
    const domain = urlObj.hostname;
    // Get port if it exists
    const port = urlObj.port ? `:${urlObj.port}` : '';

    // Get path and remove any file names (like index.html)
    let path = urlObj.pathname;
    path = path.replace(/\/[^\/]+\.[^\/]+$/, '/');

    // Strip the /lab path element
    path = path.replace(/\/lab\/?$/, '/');

    // Ensure path ends with trailing slash
    if (!path.endsWith('/')) {
      path += '/';
    }

    return `${protocol}//${domain}${port}${path}`;
  } catch (e) {
    // Return empty string or throw error based on your needs
    return '';
  }
}

async setupEnvironment(): Promise<void> {
  // ...
  // Try to set the path
  const currentUrl = getFormattedUrl();
  await this.webR.evalRVoid(`JUPYTERLITE_PATH <- "${currentUrl}"`);
  // ...
}
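
Because the URL is interpolated straight into R source, a value containing a double quote or backslash would break the assignment. One possible guard is to quote the value first, as sketched below. rStringLiteral and assignmentCode are hypothetical helpers, and the sketch assumes that JSON-style escapes (\", \\, \n, \uXXXX) are acceptable to R's parser for the kinds of URL involved here.

```typescript
// Quote a JavaScript string as a double-quoted literal.
// Assumption: JSON escaping is close enough to R's string syntax for URLs.
function rStringLiteral(value: string): string {
  return JSON.stringify(value);
}

// Build the R source for `name <- "value"`.
function assignmentCode(name: string, value: string): string {
  return `${name} <- ${rStringLiteral(value)}`;
}

const code = assignmentCode('JUPYTERLITE_PATH', 'https://example.org/lite/');
// code is: JUPYTERLITE_PATH <- "https://example.org/lite/"
```

The resulting string could then be passed to evalRVoid in place of the template literal.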

read.csv() can then be monkey patched to accept a file name (eg filename = "mydata.csv") and generate a URL that it can read from, eg https://ouseful-testing.github.io/jupyterlite-webr-kernel/files/mydata.csv, built in R as paste0(JUPYTERLITE_PATH, "files/", filename).
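
The same URL construction can be sketched on the TypeScript side, for instance to check what the kernel would hand to R. jupyterLiteFileUrl is a hypothetical name. Unlike paste0(), this version also tolerates a base path without a trailing slash.

```typescript
// Join the JupyterLite base path and a file name under the /files/ path.
function jupyterLiteFileUrl(basePath: string, filename: string): string {
  // Ensure a trailing slash so the relative part is appended to the base,
  // not substituted for its last path segment.
  const base = basePath.endsWith('/') ? basePath : `${basePath}/`;
  return new URL(`files/${filename}`, base).href;
}

const url = jupyterLiteFileUrl(
  'https://ouseful-testing.github.io/jupyterlite-webr-kernel/',
  'mydata.csv'
);
// url is: https://ouseful-testing.github.io/jupyterlite-webr-kernel/files/mydata.csv
```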