Integrate TFJS kernels into TVM

gyagp commented 1 year ago

Some kernels in TFJS may further improve the performance of TVM for the time being and Intel may provide them.

tqchen commented 1 year ago

One item that would be helpful is a nodejs program that can be invoked like this:

tfjs_shader.js --create-shader conv2d_mm

which outputs the shader, then we can likely take over from there and start some initial integrations

tqchen commented 1 year ago

Follow-up comment: one thing to note is that the kernel can be shape dependent. So it could be more helpful to also pass in the input shape spec, as in the command below; that way the tfjs side can select the related kernels.

tfjs_shader.js --create-shader conv2d_mm --shapes "[224,224,3], [32, 32, 4, 4]"
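A minimal sketch of how such a command line could be parsed on the nodejs side. The names `parseShapes` and `parseArgs` are hypothetical helpers, not part of tfjs, and the `--shapes` format is assumed to be exactly the comma-separated list of bracketed shapes shown above.

```javascript
// Hypothetical helper: parse a --shapes value such as
// "[224,224,3], [32, 32, 4, 4]" into an array of integer shapes.
function parseShapes(text) {
  // Wrapping in brackets turns the comma-separated list into valid JSON.
  const shapes = JSON.parse(`[${text}]`);
  for (const shape of shapes) {
    const ok = Array.isArray(shape) &&
        shape.every((d) => Number.isInteger(d) && d > 0);
    if (!ok) {
      throw new Error(`Invalid shape: ${JSON.stringify(shape)}`);
    }
  }
  return shapes;
}

// Hypothetical helper: read "--name value" pairs from an argv-style array.
function parseArgs(argv) {
  const args = {};
  for (let i = 0; i < argv.length - 1; i++) {
    if (argv[i].startsWith('--')) {
      args[argv[i].slice(2)] = argv[i + 1];
      i++;  // Skip the value that was just consumed.
    }
  }
  return args;
}
```

In a real script, `parseArgs(process.argv.slice(2))` would supply the kernel name under `create-shader` and the raw shape string under `shapes`, which `parseShapes` then validates before any shader lookup happens.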

tqchen commented 1 year ago

@qjia7 please let me know if the additional shape config would be sufficient for shader dumping

qjia7 commented 1 year ago

@tqchen We are preparing PRs to dump shaders. The plan is to add a string flag (the dumped kernel name) to tell the backend which kernel to dump. The first PR is here.

tqchen commented 1 year ago

Thank you @qjia7 ! This seems to be a great step.

It would be super nice to avoid creating the tfjs tensor and pass in the shape spec directly; that would enable quite a natural integration, as in the command shown above.

axinging commented 1 year ago

@tqchen, for the webgpu backend, shader printing is now behind the flag WEBGPU_PRINT_SHADER (https://github.com/tensorflow/tfjs/pull/7523). Here are examples.

Print shader in non-model mode

Open the page below with URLs like index.html?WEBGPU_PRINT_SHADER=all, index.html?WEBGPU_PRINT_SHADER=binary, or index.html?WEBGPU_PRINT_SHADER=binary,depth:

async function testWebGPUPrintShader() {
  tf.env().set('WEBGPU_CPU_FORWARD', false);
  await tf.setBackend('webgpu');
  await tf.ready();
  const re = getURLState(location.search);
  tf.env().set('WEBGPU_PRINT_SHADER', re);
  console.log(tf.env().get('WEBGPU_PRINT_SHADER'));
  // depthwise, matches 'depth'.
  {
    const fSize = 2;
    const pad = 'valid';
    const stride = 1;
    const chMul = 1;
    const inDepth = 1;

    const x = tf.tensor4d(
        [
          0.230664, 0.987388, 0.0685208, 0.419224, 0.887861, 0.731641,
          0.0741907, 0.409265, 0.351377
        ],
        [1, 3, 3, inDepth]);
    const w = tf.tensor4d(
        [0.303873, 0.229223, 0.144333, 0.803373],
        [fSize, fSize, inDepth, chMul],
    );

    const result = tf.depthwiseConv2d(x, w, stride, pad);
  }

  // add(sub,mul), matches 'binary'(Full binary list: https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-webgpu/src/binary_op_util.ts).
  {
    const a = tf.tensor2d([1, 2], [1, 2]);
    const b = tf.tensor2d([1, 2], [1, 2]);
    const c = tf.add(a, b);
  }

  // maxPool, matches 'pool'.
  {
    const x = tf.tensor3d([1, 2, 3, 4, 5, 6, 7, 9, 8], [3, 3, 1]);

    const result = tf.maxPool(x, 2, 1, 0);
  }
}

function getURLState(url) {
  let params = new URLSearchParams(url);
  const keys = [...params.keys()];
  if (keys.length === 0) return '';
  let printShaderString = '';
  if (params.has('WEBGPU_PRINT_SHADER')) {
    printShaderString = params.get('WEBGPU_PRINT_SHADER');
  }
  return printShaderString;
}
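The comments in the example above say that 'depth' matches depthwise conv, 'binary' matches add/sub/mul, and 'pool' matches maxPool. A rough sketch of that kind of filter follows. It assumes the flag is a comma-separated list of case-insensitive substrings and that 'all' selects everything; `shouldPrintShader` and the program names passed to it are hypothetical and may differ from the actual tfjs implementation.

```javascript
// Assumption: the flag is a comma-separated list of substrings and 'all'
// selects every shader. This mirrors the examples above, not necessarily
// the exact matching logic inside tfjs.
function shouldPrintShader(flagValue, programName) {
  if (!flagValue) return false;          // Empty flag: print nothing.
  if (flagValue === 'all') return true;  // 'all': print every shader.
  const name = programName.toLowerCase();
  return flagValue.split(',')
      .map((item) => item.trim().toLowerCase())
      .filter((item) => item.length > 0)
      .some((item) => name.includes(item));
}
```

Under these assumptions, a flag value of `binary,depth` would select a program named `DepthwiseConv2D` but not one named `MaxPool`.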

Print shader in model mode

If you want to try this on a model, you can put this and this under tfjs\e2e\benchmarks\local-benchmark. Set up a web server, then open a URL like:

https://127.0.0.1:8080/tfjs//e2e/benchmarks/local-benchmark/index_model.html?WEBGPU_PRINT_SHADER=binary

tqchen commented 1 year ago

Thank you! Is it possible to install tfjs as a nodejs dependency and print the shaders using nodejs? That would allow native integration with python packages that leverage this.

axinging commented 1 year ago

@tqchen I will look into how to make this work on node and will update when there is any progress.

tqchen commented 1 year ago

cc @Hzfengsy

axinging commented 1 year ago

@tqchen, if you want to quickly try printing shaders on webgpu-nodejs, I drafted a document here: https://github.com/axinging/webgpu-node/tree/main/tfjsmodel-on-external-node

Please note: some webgpu APIs are currently not fully supported in dawn, so this can only be used for dumping shaders; the prediction results are invalid. BTW, I will see if there is any opportunity to upstream these changes and make the usage simpler.

tqchen commented 1 year ago

Actually we don't need to see the prediction result. Instead, it would be great to simply get the shaders without running the prediction or even calling the webgpu api, since we are on the compilation and packaging side.

axinging commented 1 year ago

Hi, @tqchen @Hzfengsy, I drafted a design doc about dump shader here: https://github.com/webatintel/tvm-web/blob/main/TFJS%20WebGPU%20dump%20shader%20design.md So could you please help to review and clarify your detailed request?

tqchen commented 1 year ago

Thank you @axinging , what we want is the ability to get the WGSL shader code without executing it. So effectively a lookup feature.

My understanding is that most of the execution contains two parts of logic (that may be coupled together):

S0: generate the shader code based on the workload and return the shader string
S1: compile/cache and run the shader code to get the final result

Let me use the following code to show the intent of the logic


interface InputSpec {
   shapes: Array<Array<number>>;
};

// Get the shader string based on key and input shapes (in spec)
//
function getShader(key: string, spec: InputSpec) : string {
   if (spec.shapes[0] match some pattern) {
       return shader0;
   } else {
     ....
   }
}

function matmul(input: Tensor, w: Tensor) {
   const shader = getShader("matmul", {shapes: [input.shape, w.shape]});
   const output = allocOutput(...);
   // abstract code for compile
   const pipeline = compile(shader);
   ...
   submit(pipeline, inputs, output);
}

What we need is the ability to directly call getShader(key: string, spec: InputSpec) : string by passing in the spec. Note that the definition of the input spec can change depending on the implementation.

Being able to call this function programmatically from nodejs, like node tfjsmodel --get-shader conv2d "[[3, 224, 224], [3,3]]", will enable us to pattern match conv2d and automatically replace our implementation with the kernel
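To make the request concrete, here is a minimal sketch of such a lookup-only entry point. Everything in it is hypothetical: the `registry`, `registerShader`, and `main` names are illustrative, the single registered conv2d entry returns a stub WGSL string rather than a real tfjs kernel, and the shape argument is assumed to be JSON as in the command above. The point is only that the S0 step can run without touching any WebGPU API.

```javascript
// Hypothetical registry: kernel key -> ordered list of {matches, build}.
const registry = new Map();

function registerShader(key, matches, build) {
  if (!registry.has(key)) registry.set(key, []);
  registry.get(key).push({matches, build});
}

// Pure lookup: returns shader text (or null) without calling any GPU API.
function getShader(key, spec) {
  const candidates = registry.get(key) || [];
  const hit = candidates.find((c) => c.matches(spec));
  return hit ? hit.build(spec) : null;
}

// Placeholder entry; the WGSL body is a stub, not a real tfjs kernel.
registerShader(
    'conv2d',
    (spec) => spec.shapes.length === 2 && spec.shapes[0].length === 3,
    (spec) => `// conv2d for input ${spec.shapes[0].join('x')}\n` +
        '@compute @workgroup_size(64)\nfn main() {}\n');

// CLI-style entry point: argv = ['--get-shader', key, shapesJson].
function main(argv) {
  const [flag, key, shapesJson] = argv;
  if (flag !== '--get-shader') throw new Error('expected --get-shader');
  return getShader(key, {shapes: JSON.parse(shapesJson)});
}
```

A nodejs wrapper could call `main(process.argv.slice(2))` and write the result to stdout, so a python-side tool can capture the shader text and treat a null result as "no kernel available for this shape".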

pyu10055 commented 1 year ago

@tqchen is there a way to incorporate the TFJS backend into the TVM runtime instead of relying on the AOT shader copy?

tqchen commented 1 year ago

Thanks @pyu10055 , the main goal is to be able to do development through a python env and recompose the solutions. This would be an orthogonal path to integrating with the tvmjs backend runtime and using tfjs as a graph exec provider, which I think would also be valuable.
