denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License

WebGPU compute shader does not seem to work when using the float16 type #23125

Open csAugust opened 7 months ago

csAugust commented 7 months ago

Version:

deno 1.42.0 (release, x86_64-pc-windows-msvc) v8 12.3.219.9 typescript 5.4.3

Problem

I tried to run the WebGPU demo program (https://github.com/denoland/webgpu-examples/blob/main/hello-compute/mod.ts) with the command deno run --unstable-webgpu --allow-read --allow-write mod.ts. The original code works well, but when I try to use float16 in the compute shader, the output is all zeros ([ "0", "0", "0", "0" ]) with no warning or error message. The only modifications I made were:

const device = await adapter?.requestDevice({
  "requiredFeatures": ['shader-f16'],
});
...
const shaderCode =  `
  enable f16;
  @group(0)
...

The output is the same whether I use Uint32Array, Float32Array, or Float16Array (from https://deno.land/x/float16@v3.8.6/src/index.mjs). In Chrome Canary, however, the same code works.

Could anyone please tell me whether Deno currently lacks support for f16 in WebGPU, or whether something else is going wrong? Thanks a lot.
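Because the run fails silently, one way to look for a hidden validation error is to wrap the suspicious calls in a WebGPU error scope. The sketch below is an illustration under stated assumptions, not part of the original report: `withValidationScope` and `ErrorScopeDevice` are hypothetical names introduced here, and only `pushErrorScope` and `popErrorScope` are standard WebGPU device methods. Whether Deno 1.42.0 reports this particular shader failure through an error scope has not been verified.

```typescript
// Minimal structural type so the helper can be exercised without a GPU.
// A real GPUDevice is expected to satisfy it (both methods are standard WebGPU).
interface ErrorScopeDevice {
  pushErrorScope(filter: "validation" | "out-of-memory" | "internal"): void;
  popErrorScope(): Promise<{ message: string } | null>;
}

// Hypothetical helper: runs `fn` inside a validation error scope and returns
// its result together with the captured error message (null if none).
async function withValidationScope<T>(
  device: ErrorScopeDevice,
  fn: () => T,
): Promise<{ result: T; error: string | null }> {
  device.pushErrorScope("validation");
  const result = fn();
  const err = await device.popErrorScope();
  return { result, error: err ? err.message : null };
}

// Possible usage with the demo (assumes `device` and `shaderCode` exist):
// const { result: shaderModule, error } = await withValidationScope(
//   device,
//   () => device.createShaderModule({ code: shaderCode }),
// );
// if (error) console.error("shader validation failed:", error);
```

A non-null `error` would point at the shader or pipeline call that was rejected, which may narrow down why the output buffer stays at zero.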

csAugust commented 7 months ago
// import { createBufferInit } from "../utils.ts";
import * as float16 from "https://deno.land/x/float16@v3.8.6/mod.ts";

function createBufferInit(
  device,
  descriptor,
) {
  const contents = new Uint8Array(descriptor.contents);
  console.log(contents);

  // Round the size up to a multiple of 4 bytes, with a minimum of 4.
  const alignMask = 4 - 1;
  const paddedSize = Math.max(
    (contents.byteLength + alignMask) & ~alignMask,
    4,
  );

  const buffer = device.createBuffer({
    label: descriptor.label,
    usage: descriptor.usage,
    mappedAtCreation: true,
    size: paddedSize,
  });
  // Copy the initial contents into the mapped buffer, then hand it to the GPU.
  const data = new Uint8Array(buffer.getMappedRange());
  data.set(contents);
  buffer.unmap();
  return buffer;
}
const OVERFLOW = 0xffffffff;

// Get some numbers from the command line, or use the default 1, 4, 3, 295.
// const Type = Uint32Array
const Type = Float32Array
// const Type = float16.Float16Array
let numbers: InstanceType<typeof Type>;
if (Deno.args.length > 0) {
  numbers = new Type(Deno.args.map((a) => parseInt(a)));
} else {
  numbers = new Type([1, 4, 3, 295]);
}

const adapter = await navigator.gpu.requestAdapter();
const required_features = [];
if (adapter?.features.has('shader-f16')) {
  required_features.push('shader-f16')
} else {
  alert('need a browser that supports fp16')
}
const device = await adapter?.requestDevice({
  "requiredFeatures": required_features,
});

if (!device) {
  console.error("no suitable adapter found");
  Deno.exit(1);
}

const shaderCode = /* wgsl */ `
  enable f16;
  @group(0)
  @binding(0)
  var<storage, read_write > v_indices: array<f32>; // this is used as both input and output for convenience
  fn collatz_iterations(n_base: f32) -> f32{
    return n_base + f32(1);
  }

  @compute
  @workgroup_size(1)
  fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    v_indices[global_id.x] = collatz_iterations(v_indices[global_id.x]);
  }
`;

const shaderModule = device.createShaderModule({
  code: shaderCode,
});

const stagingBuffer = device.createBuffer({
  size: numbers.byteLength,
  usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
});

const storageBuffer = createBufferInit(device, {
  label: "Storage Buffer",
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_DST |
    GPUBufferUsage.COPY_SRC,
  contents: numbers.buffer,
});

const computePipeline = device.createComputePipeline({
  layout: "auto",
  compute: {
    module: shaderModule,
    entryPoint: "main",
  },
});

const bindGroupLayout = computePipeline.getBindGroupLayout(0);
const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [
    {
      binding: 0,
      resource: {
        buffer: storageBuffer,
      },
    },
  ],
});

const encoder = device.createCommandEncoder();

const computePass = encoder.beginComputePass();
computePass.setPipeline(computePipeline);
computePass.setBindGroup(0, bindGroup);
computePass.insertDebugMarker("compute collatz iterations");
computePass.dispatchWorkgroups(numbers.length);
computePass.end();

encoder.copyBufferToBuffer(
  storageBuffer,
  0,
  stagingBuffer,
  0,
  numbers.byteLength,
);

device.queue.submit([encoder.finish()]);

await stagingBuffer.mapAsync(GPUMapMode.READ);
const arrayBufferData = stagingBuffer.getMappedRange();
const uintData = new Type(arrayBufferData);
const checkedData = Array.from(uintData).map((n) => {
  if (n === OVERFLOW) {
    return "OVERFLOW";
  } else {
    return n.toString();
  }
});
console.log(checkedData);
stagingBuffer.unmap();

This is a simplified version that I tried to run using deno run --unstable-webgpu --allow-read --allow-write mod.ts. When I comment out the enable f16; line, it works.
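The padding rule in createBufferInit above matters more once 2-byte f16 elements are involved, since an odd number of f16 values is not a multiple of 4 bytes. The following is a small worked sketch of that same rounding rule; `paddedByteLength` is a hypothetical name introduced here for illustration and does not appear in the original demo.

```typescript
// Same rounding rule as createBufferInit: round up to a multiple of `align`
// (assumed to be a power of two), with a minimum of `align` bytes.
function paddedByteLength(byteLength: number, align = 4): number {
  const alignMask = align - 1;
  return Math.max((byteLength + alignMask) & ~alignMask, align);
}

// Four f32 values occupy 16 bytes, which is already aligned.
const f32Bytes = paddedByteLength(new Float32Array([1, 4, 3, 295]).byteLength);
// Four f16 values occupy 8 bytes, which is also aligned.
const f16Bytes = paddedByteLength(4 * 2);
// Three f16 values occupy 6 bytes and are padded up to 8.
const oddF16Bytes = paddedByteLength(3 * 2);

console.log(f32Bytes, f16Bytes, oddF16Bytes); // 16 8 8
```

With the default four input numbers the f16 case needs no padding, so buffer alignment alone is unlikely to explain the all-zero output.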

Hajime-san commented 5 months ago

I got the following error message when I ran it.

% deno --version                                              
deno 1.43.3 (release, aarch64-apple-darwin)
v8 12.4.254.13
typescript 5.4.5

% deno run --unstable-webgpu --allow-read --allow-write mod.ts
Device::create_shader_module error: 
Shader '' parsing error: expected global item ('struct', 'const', 'var', 'alias', ';', 'fn') or the end of the file, found 'enable'
  ┌─ wgsl:2:3
  │
2 │   enable f16;
  │   ^^^^^^ expected global item ('struct', 'const', 'var', 'alias', ';', 'fn') or the end of the file

It seems that we'll have to wait until wgpu supports it.

https://github.com/gfx-rs/wgpu/issues/5476
https://github.com/gfx-rs/wgpu/issues/4384
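Until the parser accepts the directive, a possible stopgap is to emit enable f16; only when f16 is actually wanted and to fall back to f32 otherwise. The sketch below is an assumption-laden illustration: `buildShader` is a hypothetical helper, the shader body is the "+1" kernel from the report rewritten with a configurable scalar type, and the f32 fallback only avoids the parse error; it does not provide f16 support.

```typescript
// Hypothetical helper: builds the "+1" compute shader for either f16 or f32.
// The `enable f16;` directive is emitted only when f16 is requested.
function buildShader(useF16: boolean): string {
  const scalar = useF16 ? "f16" : "f32";
  const directive = useF16 ? "enable f16;\n" : "";
  return `${directive}
@group(0) @binding(0)
var<storage, read_write> v_indices: array<${scalar}>;

@compute @workgroup_size(1)
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
  v_indices[global_id.x] = v_indices[global_id.x] + ${scalar}(1);
}
`;
}

// Possible usage (assumes `device` exists). On a runtime whose WGSL parser
// rejects `enable`, pass `false` regardless of the reported features.
// const shaderCode = buildShader(device.features.has("shader-f16"));
```

The host-side typed array has to match the chosen scalar type, so the f32 fallback pairs with Float32Array and the f16 variant with a Float16Array implementation.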

FL33TW00D commented 5 months ago

https://github.com/gfx-rs/wgpu/pull/5701