antvis / g-device-api

A Device API references WebGPU implementations
https://observablehq.com/@antv/g-device-api
13 stars 3 forks source link

Error handling and panic in observablehq notebook #165

Closed jmatsushita closed 7 months ago

jmatsushita commented 9 months ago

Hi,

Sorry for opening the issue directly on github, but I cannot see this repo in the dropdown menu in https://antv-issue-helper.surge.sh

image

I created a reproducer in ObservableHQ for a panic when using a specific shader: https://observablehq.com/d/bece57b91ebd5a40

I'm using a compute-toys notebook fork which pulls in the latest version @antv/g-device-api@1.6.4

The setup works for the origami shader, however when using shader code that compiles and runs fine in compute.toys (see https://compute.toys/view/1079) I can get the following stack trace in the console:

glsl_wgsl_compiler_bg.wasm:0x23257e Uncaught (in promise) RuntimeError: unreachable
    at __rust_start_panic (glsl_wgsl_compiler_bg.wasm:0x23257e)
    at rust_panic (glsl_wgsl_compiler_bg.wasm:0x2318d1)
    at std::panicking::rust_panic_with_hook::hdb857edf4e61fe11 (glsl_wgsl_compiler_bg.wasm:0x216e3e)
    at std::panicking::begin_panic_handler::{{closure}}::hef75e0b11b6899ae (glsl_wgsl_compiler_bg.wasm:0x21dd09)
    at std::sys_common::backtrace::__rust_end_short_backtrace::h2c4c1ce3acc2944e (glsl_wgsl_compiler_bg.wasm:0x2321dc)
    at rust_begin_unwind (glsl_wgsl_compiler_bg.wasm:0x228274)
    at core::panicking::panic_fmt::hf20f9922c43dc05c (glsl_wgsl_compiler_bg.wasm:0x22ca20)
    at core::result::unwrap_failed::h92d50af5668640d2 (glsl_wgsl_compiler_bg.wasm:0x21f486)
    at wgslcomposer_wgsl_compile (glsl_wgsl_compiler_bg.wasm:0x180ba4)
    at Rr.wgsl_compile (glsl_wgsl_compiler.js:211:12)
$__rust_start_panic    @    glsl_wgsl_compiler_bg.wasm:0x23257e
$rust_panic    @    glsl_wgsl_compiler_bg.wasm:0x2318d1
$std::panicking::rust_panic_with_hook::hdb857edf4e61fe11    @    glsl_wgsl_compiler_bg.wasm:0x216e3e
$std::panicking::begin_panic_handler::{{closure}}::hef75e0b11b6899ae    @    glsl_wgsl_compiler_bg.wasm:0x21dd09
$std::sys_common::backtrace::__rust_end_short_backtrace::h2c4c1ce3acc2944e    @    glsl_wgsl_compiler_bg.wasm:0x2321dc
$rust_begin_unwind    @    glsl_wgsl_compiler_bg.wasm:0x228274
$core::panicking::panic_fmt::hf20f9922c43dc05c    @    glsl_wgsl_compiler_bg.wasm:0x22ca20
$core::result::unwrap_failed::h92d50af5668640d2    @    glsl_wgsl_compiler_bg.wasm:0x21f486
$wgslcomposer_wgsl_compile    @    glsl_wgsl_compiler_bg.wasm:0x180ba4
wgsl_compile    @    glsl_wgsl_compiler.js:211
(anonymous)    @    compute-toys.js?v=4&…7b91ebd5a40@314:485
(anonymous)    @    compute-toys.js?v=4&…7b91ebd5a40@314:478
eval    @    observablehq-17:43
await in eval (async)        
eval    @    observablehq-295:4
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
ea    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
value    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
Pr    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
value    @    worker-488e6463.js:2
define    @    compute-toys.js?v=4&…7b91ebd5a40@314:698
value    @    worker-488e6463.js:2
(anonymous)    @    worker-488e6463.js:2
Promise.then (async)        
Ea    @    worker-488e6463.js:2
define    @    worker-488e6463.js:2
rs    @    worker-488e6463.js:2
(anonymous)

Additional context

Chrome on MacOS Version 122.0.6261.94 (Official Build) (arm64)

I have noticed the following in regard to errors and error handling in general:

Using a trick such as:

var ignore = time.elapsed / time.elapsed; // to avoid rewriting run without the time bindings.
var uv = (fragCoord * 2. - resolution.xy) / resolution.y * ignore;

Works to bypass that problem and avoid having to reinitialise each cell with a different set of bindings.

The fix is easy enough, but the error handling and lack of error messages makes it difficult to use.

let a = hash1(p+float2(0,0)); 
// or let a = hash1(p+vec2(0.,0.)); 

Thank you very much for the great library, I'm really looking forward to using it more in observablehq and I hope my remarks are helpful to improve the error handling experience!

Cheers,

Jun

xiaoiver commented 8 months ago

Just try to change const to let:

// before
const tmax = 2000.0;

// after
let tmax = 2000.0;

Or declare it in module scope:

const tmax = 2000.0;

fn main_image() {}

I will check if it's a bug relative to naga, since according to WGSL Spec, a const-declaration can be declared in function scope: https://www.w3.org/TR/WGSL/#const-decls

jmatsushita commented 8 months ago

Thank you. Indeed the change you suggested fixes the error:

// after
let tmax = 2000.0;

However, should this be a panic (sometimes just swallowed or displaying RuntimeError: unreachable) would it be possible to have a more granular error message to help troubleshoot the shader next time?

xiaoiver commented 8 months ago

You're right, I should throw the compiling error message from naga.

xiaoiver commented 8 months ago

I update both naga & naga-oil to the latest version and throw error message from Rust side.

// before
naga = { version = "0.14.1", features = ["glsl-in", "wgsl-in", "wgsl-out"] }
naga_oil = "0.11.0"

// after
naga = { version = "0.19.2", features = ["glsl-in", "wgsl-in", "wgsl-out"] }
naga_oil = "0.13.0"

Now the error message can be shown correctly. For example, in the tmax case, naga-oil will complain the following message:

make_naga_module Composer error: expected assignment or increment/decrement, found 'tmax'

But we cannot use alias for now: https://github.com/bevyengine/naga_oil/issues/79 If we try to prepend alias int = i32; to our shader chunk:

Composable module identifiers must not require substitution according to naga writeback rules: `int`

So we have to do some alias work when borrowing shaders from compute-toys. Here is the complete alias map used in compute-toys:

alias int = i32;
alias uint = u32;
alias float = f32;
alias int2 = vec2<i32>;
alias int3 = vec3<i32>;
alias int4 = vec4<i32>;
alias uint2 = vec2<u32>;
alias uint3 = vec3<u32>;
alias uint4 = vec4<u32>;
alias float2 = vec2<f32>;
alias float3 = vec3<f32>;
alias float4 = vec4<f32>;
alias bool2 = vec2<bool>;
alias bool3 = vec3<bool>;
alias bool4 = vec4<bool>;
alias float2x2 = mat2x2<f32>;
alias float2x3 = mat2x3<f32>;
alias float2x4 = mat2x4<f32>;
alias float3x2 = mat3x2<f32>;
alias float3x3 = mat3x3<f32>;
alias float3x4 = mat3x4<f32>;
alias float4x2 = mat4x2<f32>;
alias float4x3 = mat4x3<f32>;
alias float4x4 = mat4x4<f32>;

Or use predeclares instead, eg.

xiaoiver commented 8 months ago

Just use the latest @antv/g-device-api@1.6.8 & WASM:

import { WebGPUDeviceContribution } from '@antv/g-device-api';

const deviceContribution = new WebGPUDeviceContribution({
  shaderCompilerPath: '/glsl_wgsl_compiler_bg.wasm',
  // From CDN
  // shaderCompilerPath: 'https://unpkg.com/@antv/g-device-api@1.6.8/rust/pkg/glsl_wgsl_compiler_bg.wasm',
});

Calls begin/endFrame() at the beginning and end of computePass:

device.beginFrame();

const computePass = device.createComputePass();
computePass.setPipeline(computePipeline);
computePass.setBindings(bindings);
computePass.dispatchWorkgroups(1);
device.submitPass(computePass);

device.endFrame();