cloudflare / workers-rs

Write Cloudflare Workers in 100% Rust via WebAssembly
Apache License 2.0

Workers Sites support #54

Closed SeokminHong closed 1 month ago

SeokminHong commented 3 years ago

Hi 👋

I tried to re-implement Workers Sites in Rust, but I got stuck on KV access.

As far as I know, the kv-asset-handler package uses the __STATIC_CONTENT_MANIFEST variable that wrangler generates in the global context. Am I right?

If so, could you provide a way to access the manifest, or a Rust version of the getAssetFromKV function?

nilslice commented 3 years ago

I don't have a solution in mind for this, but it is a good question.
The team is generally recommending the use of Cloudflare Pages instead of Worker Sites, but I understand the desire to use it.

I'm not super familiar with the asset manifest, or with Workers Sites, but will start tracking this request and see if there is something we will officially support.

For now, the code in the Wrangler codebase (https://github.com/cloudflare/wrangler) may point you in the right direction as to reading/deconstructing the asset manifest for use in a Rust KvAssetHandler.

Dav1dde commented 2 years ago

You can access static files through the __STATIC_CONTENT KV namespace:

#[event(fetch)]
pub async fn main(req: worker::Request, env: worker::Env) -> worker::Result<worker::Response> {
    let kv = worker::kv::KvStore::from_this(&env, "__STATIC_CONTENT")?;

    let index = kv.get("index.html").text().await?.expect("index html");
    worker::Response::from_html(index)
}

There is still a lot missing, such as caching and MIME types, but you can follow the JS implementation of getAssetFromKV to add that yourself. For example, I have a small helper like this:

use worker::kv::KvStore;
use worker::{Request, Response};

pub async fn serve_asset(req: Request, store: KvStore) -> worker::Result<Response> {
    let path = req.path();
    let path = path.trim_start_matches('/');
    let value = match store.get(path).bytes().await? {
        Some(value) => value,
        None => return Response::error("Not Found", 404),
    };
    let mut response = Response::from_bytes(value)?;
    response
        .headers_mut()
        .set("Content-Type", get_mime(path).unwrap_or("text/plain"))?;
    Ok(response)
}

fn get_mime(path: &str) -> Option<&'static str> {
    // No extension means no known MIME type.
    let (_, ext) = path.rsplit_once('.')?;

    let ct = match ext {
        "html" => "text/html",
        "css" => "text/css",
        "js" => "text/javascript",
        "json" => "application/json",
        "png" => "image/png",
        "jpg" => "image/jpeg",
        "jpeg" => "image/jpeg",
        "ico" => "image/x-icon",
        "wasm" => "application/wasm",
        _ => return None,
    };

    Some(ct)
}

nilslice commented 2 years ago

Thank you for the explanation here, @Dav1dde!

One minor note: you should be able to use the kv method on Env to access a KV namespace. So instead of:

let kv = worker::kv::KvStore::from_this(&env, "__STATIC_CONTENT")?;

you can do:

let kv = env.kv("__STATIC_CONTENT")?;

The caveat is that you're forced to use the worker::kv::KvStore type from the version we have pinned in workers-rs dependencies and you might need another version. If so, please let me know :)

Dav1dde commented 2 years ago

Thanks, I was looking for something like env.kv("__STATIC_CONTENT") after moving away from the router, and I ended up just copying the router's implementation. This is a lot nicer!

The 0.5 version of the workers-kv crate has lots of improvements, and the master branch has already been updated to it. Luckily cargo makes referencing a git revision really easy, but a new release would help here.

Improvements like:

Dav1dde commented 2 years ago

@nilslice Now that I am actually trying to deploy this on Cloudflare with wrangler, I am running into the issue that the files are hashed, and I don't seem to have access to the manifest, __STATIC_CONTENT_MANIFEST. How do I read the manifest?

Dav1dde commented 2 years ago

I have a solution now, but I wish it wasn't necessary:

Have a post-processing script:

cat <<EOF > build/worker/assets.mjs
import manifestJSON from '__STATIC_CONTENT_MANIFEST'
const assetManifest = JSON.parse(manifestJSON)

export function get_asset(name) {
    return assetManifest[name];
}
EOF

Then you can access it in Rust:

use std::borrow::Cow;
use wasm_bindgen::prelude::*;

#[wasm_bindgen(raw_module = "./assets.mjs")]
extern "C" {
    fn get_asset(name: &str) -> Option<String>;
}

pub fn resolve(name: &str) -> Cow<'_, str> {
    match get_asset(name) {
        Some(name) => Cow::Owned(name),
        None => Cow::Borrowed(name),
    }
}

SeokminHong commented 2 years ago

Thanks to @Dav1dde, I also found a solution that doesn't use JavaScript directly.

use std::{borrow::Cow, collections::HashMap};
use wasm_bindgen::prelude::*;

#[wasm_bindgen(module = "__STATIC_CONTENT_MANIFEST")]
extern "C" {
    #[wasm_bindgen(js_name = "default")]
    static MANIFEST: String;
}

pub fn resolve(name: &str) -> Cow<'_, str> {
    match serde_json::from_str::<HashMap<&str, &str>>(&MANIFEST)
        .ok()
        .and_then(|m| m.get(name).map(|v| v.to_string()))
    {
        Some(val) => Cow::Owned(val),
        None => Cow::Borrowed(name),
    }
}

SeokminHong commented 2 years ago

This solution wouldn't work with the latest worker-build because of the swc-bundler. SWC bundler tries to resolve import {default as default0} from "__STATIC_CONTENT_MANIFEST", and it will fail during bundling.

Fortunately, the version of worker-build that uses SWC hasn't been published yet, but I'll try to find a better way.
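
For reference, the fallback logic in resolve can be tried without a Worker runtime. This is only a sketch: resolve_key is a hypothetical name, and it assumes the manifest has already been parsed into a HashMap<String, String> (for example with serde_json, as in the snippet earlier in the thread).

```rust
use std::collections::HashMap;

/// Hypothetical helper: return the hashed KV key for `name` if the
/// manifest has an entry for it, otherwise fall back to `name` itself.
/// The fallback covers the case where the manifest is empty or the
/// asset was uploaded under its plain name.
pub fn resolve_key<'a>(manifest: &'a HashMap<String, String>, name: &'a str) -> &'a str {
    manifest.get(name).map(String::as_str).unwrap_or(name)
}
```

Borrowing from the map avoids allocating a new String on every request, and parsing the manifest once up front avoids re-parsing the JSON on each lookup.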

allsey87 commented 2 years ago

@nilslice there seem to be a couple of good solutions proposed above. Would you be open to accepting a PR for either @Dav1dde's or @SeokminHong's solution?

> The team is generally recommending the use of Cloudflare Pages instead of Worker Sites

Perhaps I am missing something, but I came to the conclusion today that Cloudflare Pages is not at all compatible with Rust/WebAssembly. It seems to me that functions do not support WebAssembly and even if one were to try to use the legacy worker support via the _worker.js file, this won't include any of the files that it imports (e.g., the WebAssembly module)?

I guess it would be possible to encode the entire module as base64 and inline it into a _worker.js file, but that feels like a very roundabout workflow....

nilslice commented 2 years ago

Hi @allsey87 - I'm not at Cloudflare anymore and don't have a good line of sight into the priorities here, nor the ability to merge anything.

Maybe @zebp could provide some feedback though.

allsey87 commented 2 years ago

This is my complete solution, based on the answers above:

// asset.rs
use once_cell::sync::Lazy;
use std::collections::HashMap;
use worker::*;
use worker::wasm_bindgen::prelude::*;

#[wasm_bindgen(module = "__STATIC_CONTENT_MANIFEST")]
extern "C" {
    #[wasm_bindgen(js_name = "default")]
    static MANIFEST: String;
}

static MANIFEST_MAP: Lazy<HashMap<&str, &str>> = Lazy::new(|| {
    serde_json::from_str::<HashMap<&str, &str>>(&MANIFEST)
        .unwrap_or_default()
});

pub async fn serve(context: RouteContext<()>) -> worker::Result<Response> {
    let assets = context.kv("__STATIC_CONTENT")?;
    let asset = context.param("asset")
        .map(String::as_str)
        .unwrap_or("index.html");
    /* if we are using miniflare (or wrangler with --local), MANIFEST_MAP is empty and we just
       fetch the requested name of the asset from the KV store, otherwise, MANIFEST_MAP
       provides the hashed name of the asset */
    let path = MANIFEST_MAP.get(asset).unwrap_or(&asset);
    match assets.get(path).bytes().await? {
        Some(value) => {
            let mut response = Response::from_bytes(value)?;
            response.headers_mut()
                .set("Content-Type", path.rsplit_once(".")
                    .map_or_else(|| "text/plain", |(_, ext)| match ext {
                        "html" => "text/html",
                        "css" => "text/css",
                        "js" => "text/javascript",
                        "json" => "application/json",
                        "png" => "image/png",
                        "jpg" => "image/jpeg",
                        "jpeg" => "image/jpeg",
                        "ico" => "image/x-icon",
                        "wasm" => "application/wasm",
                        _ => "text/plain",
                    })
                )
                .map(|_| response)
        }
        None => Response::error("Not Found", 404),
    }
}

For my router, I then can just write:

// lib.rs
use worker::*;
mod utils;
mod asset;

#[event(fetch)]
pub async fn main(req: Request, env: Env, _: worker::Context) -> Result<Response> {
    utils::set_panic_hook();
    Router::new()
        .get_async("/", |_, context| asset::serve(context)) // for index.html
        .get_async("/:asset", |_, context| asset::serve(context))
        .run(req, env).await
}

> This solution wouldn't work with the latest worker-build because of the swc-bundler. SWC bundler tries to resolve import {default as default0} from "__STATIC_CONTENT_MANIFEST", and it will fail during bundling.

This issue wasn't relevant to me, since I don't use worker-build or swc-bundler.

SeokminHong commented 2 years ago

@allsey87 That's right. And I also made a commit to handle the latest worker-build for my own use: https://github.com/cloudflare/workers-rs/commit/5c7051daf53e4b93667568c34ea8cafa2550f669

At that time, the cache API hadn't been merged, so I stopped working on the KV asset handler.

allsey87 commented 2 years ago

@SeokminHong not really the place to ask, but what pattern did you use for matching assets in sub-directories? I am finding that .get_async("/:asset", |_, context| asset::serve(context)) doesn't match on /images/some_image.png.

andyredhead commented 2 years ago

You may not need the leading "/" on "/images/some_image.png" - perhaps just "images/some_image.png".

I haven't tried accessing assets in a cloudflare workers site from rust/wasm yet (was just browsing about to see if anyone else has done it already) but I have done it from JavaScript, where not including the leading slash in an asset path worked ok.
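
A minimal sketch of that normalization, assuming the KV keys are stored without a leading slash. asset_key is a hypothetical helper, not part of workers-rs; it reuses the trim_start_matches call and the index.html default from the earlier comments. It only covers the key lookup, not the router pattern.

```rust
/// Hypothetical helper: turn a request path into a KV asset key.
/// Leading slashes are dropped, and an empty path (a request for "/")
/// falls back to "index.html".
pub fn asset_key(path: &str) -> &str {
    let key = path.trim_start_matches('/');
    if key.is_empty() {
        "index.html"
    } else {
        key
    }
}
```

With this, a request for /images/some_image.png would look up images/some_image.png, provided the whole path reaches the handler.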

armfazh commented 1 year ago

In #308 I propose a function that translates asset names. For example, favicon.ico is mangled to favicon.<HASH>.ico.
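
To illustrate the naming pattern only: the sketch below inserts a hash before the final extension. hashed_name is a hypothetical function, not the one proposed in #308 and not wrangler's actual algorithm; in particular, how names without an extension are handled is an assumption. In practice the mapping should be read from the manifest rather than recomputed.

```rust
/// Hypothetical illustration of the `name.<HASH>.ext` pattern.
/// The hash is inserted before the last extension of the file name.
/// Assumption: a name without an extension gets the hash appended.
pub fn hashed_name(name: &str, hash: &str) -> String {
    // Only look for the extension in the final path segment,
    // so a dot in a directory name is left alone.
    let split_at = name.rfind('/').map_or(0, |i| i + 1);
    let (dir, file) = name.split_at(split_at);
    match file.rsplit_once('.') {
        Some((stem, ext)) => format!("{dir}{stem}.{hash}.{ext}"),
        None => format!("{dir}{file}.{hash}"),
    }
}
```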

lemmih commented 2 months ago

I'm saddened to see Sites being deprecated. Just like @allsey87, I cannot see how Pages can replace Workers.

BrandonDyer64 commented 2 months ago

@lemmih I'm trying to figure that out as well. I have a full-stack Rust application that runs on Workers but also has some static assets (logo, favicon, language files). The Cloudflare Pages migration guide tells me to merely "remove the Workers application and any associated wrangler.toml configuration files or build output", which is one of the most ridiculous things I've ever heard.

Theoretically, it's possible to run WASM in a Pages function, but there is no documentation on how to do that or what is even possible, all the while being told "Do not use Workers Sites for new projects."

kflansburg commented 1 month ago

With the introduction of Workers Assets, I'm going to close this.

allsey87 commented 1 month ago

@kflansburg could you perhaps leave a link here to the change log, PR, or a summary of Worker Assets?

lemmih commented 1 month ago

https://developers.cloudflare.com/workers/static-assets/

Static Assets are perfect for my use case. Thanks!

kflansburg commented 1 month ago

> @kflansburg could you perhaps leave a link here to the change log, PR, or a summary of Worker Assets?

This is a new platform feature, there were no changes to workers-rs. Docs: https://developers.cloudflare.com/workers/static-assets/

BrandonDyer64 commented 1 month ago

@allsey87 They have an example of how to use it in conjunction with a full-stack leptos app: https://github.com/cloudflare/workers-rs/tree/main/templates/leptos. Even if your particular use case isn't leptos, it might be helpful.

BrandonDyer64 commented 1 month ago

@kflansburg In regards to this:

> there were no changes to workers-rs

What about access to env.ASSETS?

kflansburg commented 1 month ago

> @kflansburg In regards to this:
>
> > there were no changes to workers-rs
>
> What about access to env.ASSETS?

Ah, I haven't added that yet. It's only needed if you want to access static assets from your Worker. If you just want us to serve the assets without modification, that is done without invoking your Worker at all. I created https://github.com/cloudflare/workers-rs/issues/644