penumbra-zone / web

Apache License 2.0

Move block processing into wasm #843

Closed grod220 closed 5 days ago

grod220 commented 5 months ago

Context

js class -> json -> | wasm boundary | -> rust struct

To do

Given that wasm is quite performant, the idea is to move block processing into wasm entirely. This avoids the back-and-forth serialization overhead. It may even simplify the logic if it only exists in one domain.

Consideration

We are querying from TypeScript HERE. By default, connectRPC takes the bytes from the query response and parses them. So our current flow actually looks like this:

bytes -> js class -> json -> | wasm boundary | -> rust struct
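A rough sketch of why the JSON hop in this flow costs something, assuming the JSON step follows the standard protobuf JSON mapping (where `bytes` fields become base64 strings). The field name `commitment` and the helpers `toBase64`, `viaJson`, and `viaBytes` are made up for illustration and are not taken from the codebase:

```javascript
// Base64-encode bytes, as the protobuf JSON mapping does for `bytes` fields.
function toBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// JSON hop: bytes -> base64 text -> JSON string -> UTF-8 bytes for wasm memory.
// The Rust side then has to parse the JSON and decode the base64 again.
function viaJson(fieldBytes) {
  const json = JSON.stringify({ commitment: toBase64(fieldBytes) });
  return new TextEncoder().encode(json);
}

// Direct hop: the same bytes are handed across unchanged.
function viaBytes(fieldBytes) {
  return fieldBytes;
}

const fieldBytes = new Uint8Array(32).fill(7); // hypothetical 32-byte field
const jsonBytes = viaJson(fieldBytes);
const rawBytes = viaBytes(fieldBytes);
console.log(`via JSON: ${jsonBytes.length} bytes, raw: ${rawBytes.length} bytes`);
```

This only measures size for a single field; the real overhead also includes building the JS class and parsing on the Rust side, which would need profiling to quantify.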

We should attempt to simplify this to:

bytes -> | wasm boundary | -> rust struct

This would mean copying over the parts of the code that let us skip parsing and pass the raw Uint8Array on instead. Suggestions: https://bufbuild.slack.com/archives/CRZ680FUH/p1711072213138009

From bufbuild team:

function toHex(uint8Array) {
    return uint8Array.length === 0 ? '' : '0x' + Array.from(uint8Array).map(i => ('0' + i.toString(16)).slice(-2)).join(', 0x');
}
fetch("https://demo.connectrpc.com/connectrpc.eliza.v1.ElizaService/Say", {
    method: 'POST',
    headers: {
        "Content-Type": "application/grpc-web",
    },
    body: new Uint8Array(5), // Valid zero-length request message in gRPC Web
}).then(async (response) => {
    const header = new Uint8Array(5);
    let headerPos = 0;
    let payload = new Uint8Array(0);
    let payloadRemaining = 0;
    for await (const chunk of response.body) {
        for (let pos = 0; pos < chunk.length;) {
            if (headerPos < 5) {
                while(headerPos < 5 && pos < chunk.length) {
                    header[headerPos++] = chunk[pos++];
                }
                if (headerPos === 5) {
                    payloadRemaining = new DataView(header.buffer).getUint32(1, false);
                }
                console.log(`Header read pos: ${headerPos}`);
            }
            if (headerPos === 5) { // header complete; also handles zero-length payloads, which would otherwise stall the loop
                const chunkLen = Math.min(chunk.length - pos, payloadRemaining);
                const newMessage = new Uint8Array(payload.length + chunkLen);
                newMessage.set(payload);
                newMessage.set(chunk.subarray(pos, pos + chunkLen), payload.length);
                payload = newMessage;
                pos += chunkLen;
                payloadRemaining -= chunkLen;
                if (payloadRemaining === 0) {
                    // Got an entire message.
                    if (header[0]&0x80) {
                        // Trailers frame
                        console.log(`gRPC Web Trailer: flags=${header[0]} payload:\n${new TextDecoder().decode(payload)}`);
                    } else {
                        // Message
                        console.log(`gRPC Web Message: flags=${header[0]} payload=${toHex(payload)}`);
                    }
                    headerPos = 0;
                    payload = new Uint8Array();
                }
                console.log(`Remaining message bytes: ${payloadRemaining}`);
            }
        }
    }
});
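For our purposes, the loop above could be reshaped into a function that returns the raw message payloads instead of logging them, so each payload can be handed across the wasm boundary unparsed. The following is a minimal sketch under that assumption; `splitGrpcWebFrames` and `encodeFrame` are hypothetical names, and the input is a plain iterable of chunks rather than a live response body so it can be run in isolation:

```javascript
// Split a gRPC-Web body into frames without decoding the protobuf payloads.
// `chunks` is any iterable of Uint8Array; frames may straddle chunk boundaries.
function splitGrpcWebFrames(chunks) {
  const frames = [];
  const header = new Uint8Array(5); // 1 flag byte + 4-byte big-endian length
  let headerPos = 0;
  let payload = null; // allocated once the length is known
  let payloadPos = 0;

  for (const chunk of chunks) {
    let pos = 0;
    while (pos < chunk.length) {
      if (headerPos < 5) {
        header[headerPos++] = chunk[pos++];
        if (headerPos < 5) continue; // header still incomplete
        const length = new DataView(header.buffer).getUint32(1, false);
        payload = new Uint8Array(length);
        payloadPos = 0;
      } else {
        // Copy as much of the payload as this chunk provides.
        const n = Math.min(chunk.length - pos, payload.length - payloadPos);
        payload.set(chunk.subarray(pos, pos + n), payloadPos);
        pos += n;
        payloadPos += n;
      }
      if (payloadPos === payload.length) {
        // Entire frame received (this also covers zero-length payloads).
        frames.push({ trailer: (header[0] & 0x80) !== 0, payload });
        headerPos = 0;
        payload = null;
      }
    }
  }
  return frames;
}

// Hypothetical helper to build a frame for the usage example below.
function encodeFrame(flags, payload) {
  const out = new Uint8Array(5 + payload.length);
  out[0] = flags;
  new DataView(out.buffer).setUint32(1, payload.length, false);
  out.set(payload, 5);
  return out;
}

// Usage: a message, an empty message, and a trailer, delivered in 3-byte chunks.
const body = new Uint8Array([
  ...encodeFrame(0x00, new Uint8Array([1, 2, 3, 4])),
  ...encodeFrame(0x00, new Uint8Array(0)),
  ...encodeFrame(0x80, new TextEncoder().encode('grpc-status: 0\r\n')),
]);
const chunks = [];
for (let i = 0; i < body.length; i += 3) chunks.push(body.subarray(i, i + 3));
const frames = splitGrpcWebFrames(chunks);
console.log(frames.map((f) => `${f.trailer ? 'trailer' : 'message'}:${f.payload.length}`).join(' '));
```

Allocating the payload once per frame avoids re-copying the accumulated bytes on every chunk. The non-trailer payloads are the untouched protobuf bytes that would be passed to wasm.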

Though, we may be able to skip typescript fetching entirely and simplify this to:

bytes -> rust struct

This would only be possible if we queried for blocks directly from Rust using a wasm-compatible gRPC client such as tonic-web-wasm-client.


grod220 commented 5 months ago

if we query for blocks directly from Rust using a wasm-compatible grpc client

This is possible in theory, but the rabbit hole may be too deep for now (?). We need the rpc feature enabled to get the query clients needed for tonic-web-wasm-client. We also need the proto-compiler's tonic-build dep to exclude the tonic transport impls, and the protos to be re-compiled. Something like:

tonic-build = { version = "0.10.0", default-features = false, features = ["cleanup-markdown"] }
# then run ./deployments/scripts/protobuf-codegen 

And then finally, we need the penumbra-proto package to be wasm-compatible:

cargo build --lib --release --target wasm32-unknown-unknown

At the moment, we are greeted by an assortment of compile issues relating to required "js" features and exclusion of randomness libs.

TalDerei commented 5 days ago

can we close this as not planned?

turbocrime commented 5 days ago

the underlying motivation is raw performance, but the primary goal has been identified as parallelization. in fact the compute-heavy tasks of the block processor are already performed in wasm.

we've experimented with the serde techniques in the issue and found them to be unsuitable. profiling has provided new insight. at best this issue needs refinement, but i think ultimately it does not represent current understanding of the objectives.

given the present consensus and planning effort, to be issued in the next few days, i'm closing this issue.