One item that would be helpful is to provide a Node.js program, e.g.
tfjs_shader.js --create-shader conv2d_mm
which outputs the shader; then we can likely take over from there and start some initial integration.
Follow-up comment: one thing to note is that the kernel can be shape-dependent. So what would be helpful instead is something like the following, where we pass in the input shape spec as well; that way the tfjs side can get the related kernels:
tfjs_shader.js --create-shader conv2d_mm --shapes "[224,224,3], [32, 32, 4, 4]"
@qjia7 please let me know if the additional shape config would be sufficient for shader dumping.
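To make the request concrete, here is a minimal sketch of what such a CLI entry point could look like. tfjs_shader.js and getShader are hypothetical names from this thread, not existing tfjs APIs; the real shape-aware lookup would need to be exposed by the webgpu backend, and the --shapes argument is assumed here to be a JSON array of shape arrays.

// Hypothetical CLI sketch:
//   node tfjs_shader.js --create-shader conv2d_mm --shapes "[[224,224,3],[32,32,4,4]]"

// Placeholder for the shape-aware lookup the backend would need to expose.
function getShader(key: string, spec: {shapes: number[][]}): string {
  throw new Error(`shader lookup for '${key}' is not wired up yet`);
}

const args = process.argv.slice(2);
const keyIdx = args.indexOf('--create-shader');
if (keyIdx === -1 || keyIdx + 1 >= args.length) {
  console.error('usage: node tfjs_shader.js --create-shader <kernel> [--shapes "<json>"]');
  process.exit(1);
}
const kernel = args[keyIdx + 1];
const shapesIdx = args.indexOf('--shapes');
const shapes = shapesIdx === -1 ? [] : JSON.parse(args[shapesIdx + 1]);
console.log(getShader(kernel, {shapes}));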
@tqchen We are preparing PRs to dump shaders. The plan is to add a string flag (the dumped kernel name) to tell the backend which kernel to dump. The first PR is here.
Thank you @qjia7! This seems to be a great step.
It would be super nice to avoid the creation of the tfjs tensor and directly pass in the shape spec; that would enable quite natural integration, as in the command shown above.
@tqchen, for the webgpu backend, shader printing is now behind a flag, WEBGPU_PRINT_SHADER (https://github.com/tensorflow/tfjs/pull/7523). Here are examples.
Open a page running the code below with URLs like index.html?WEBGPU_PRINT_SHADER=all, index.html?WEBGPU_PRINT_SHADER=binary, or index.html?WEBGPU_PRINT_SHADER=binary,depth:
async function testWebGPUPrintShader() {
  tf.env().set('WEBGPU_CPU_FORWARD', false);
  await tf.setBackend('webgpu');
  await tf.ready();
  // Read the WEBGPU_PRINT_SHADER value from the page URL.
  const re = getURLState(location.search);
  tf.env().set('WEBGPU_PRINT_SHADER', re);
  console.log(tf.env().get('WEBGPU_PRINT_SHADER'));

  // depthwiseConv2d, matches 'depth'.
  {
    const fSize = 2;
    const pad = 'valid';
    const stride = 1;
    const chMul = 1;
    const inDepth = 1;
    const x = tf.tensor4d(
        [
          0.230664, 0.987388, 0.0685208, 0.419224, 0.887861, 0.731641,
          0.0741907, 0.409265, 0.351377
        ],
        [1, 3, 3, inDepth]);
    const w = tf.tensor4d(
        [0.303873, 0.229223, 0.144333, 0.803373],
        [fSize, fSize, inDepth, chMul]);
    const result = tf.depthwiseConv2d(x, w, stride, pad);
  }

  // add (also sub, mul, ...), matches 'binary'. Full binary list:
  // https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-webgpu/src/binary_op_util.ts
  {
    const a = tf.tensor2d([1, 2], [1, 2]);
    const b = tf.tensor2d([1, 2], [1, 2]);
    const c = tf.add(a, b);
  }

  // maxPool, matches 'pool'.
  {
    const x = tf.tensor3d([1, 2, 3, 4, 5, 6, 7, 9, 8], [3, 3, 1]);
    const result = tf.maxPool(x, 2, 1, 0);
  }
}

function getURLState(url) {
  const params = new URLSearchParams(url);
  const keys = [...params.keys()];
  if (keys.length === 0) return '';
  let printShaderString = '';
  if (params.has('WEBGPU_PRINT_SHADER')) {
    printShaderString = params.get('WEBGPU_PRINT_SHADER');
  }
  return printShaderString;
}
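To run the snippet above as a standalone page, the demo function just needs to be invoked once the script loads; a minimal entry point:

// Kick off the demo; any errors surface on the console.
testWebGPUPrintShader().catch(console.error);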
If you want to try this on a model, you can put this and this under tfjs\e2e\benchmarks\local-benchmark. Set up a web server, then open a URL like:
https://127.0.0.1:8080/tfjs/e2e/benchmarks/local-benchmark/index_model.html?WEBGPU_PRINT_SHADER=binary
Thank you! Is it possible to install tfjs as a Node.js dependency and print the shaders using Node.js? That would allow native integration with Python packages that leverage this.
@tqchen I will look into how to make this work on Node. Will update when there is any progress.
cc @Hzfengsy
@tqchen, if you want a quick try at printing shaders with WebGPU on Node.js, I drafted a document here: https://github.com/axinging/webgpu-node/tree/main/tfjsmodel-on-external-node
Please note: currently some WebGPU APIs are not fully supported in Dawn, so this can only be used to dump shaders; the predict results are invalid. BTW, I will see if there is any opportunity to upstream these changes and make the usage simpler.
Actually, we don't need to see the prediction result. Instead, it would be great to simply get the shaders without running the prediction, or even the WebGPU API, since we are on the compilation and packaging side.
Hi @tqchen @Hzfengsy, I drafted a design doc about shader dumping here: https://github.com/webatintel/tvm-web/blob/main/TFJS%20WebGPU%20dump%20shader%20design.md Could you please help review it and clarify your detailed requirements?
Thank you @axinging, what we want is the ability to get the WGSL shader code without executing it. So effectively a lookup feature.
My understanding is that most of the execution contains two parts of logic (which may be coupled together).
Let me use the following code to show the intent of the logic:
interface InputSpec {
  shapes: Array<Array<number>>;
}

// Get the shader string based on the kernel key and input shapes (in spec).
function getShader(key: string, spec: InputSpec): string {
  if (/* spec.shapes[0] matches some pattern */) {
    return shader0;
  } else {
    // ...
  }
}

function matmul(input: Tensor, w: Tensor) {
  const shader = getShader('matmul', {shapes: [input.shape, w.shape]});
  const output = allocOutput(...);
  // Abstract code for compile.
  const pipeline = compile(shader);
  // ...
  submit(pipeline, inputs, output);
}
What we need is the ability to directly call getShader(key: string, spec: InputSpec): string
by passing in the spec. Note that the definition of the input spec can change depending on the implementation.
Being able to call this function programmatically from Node.js, like node tfjsmodel --get-shader conv2d "[[3, 224, 224], [3,3]]",
will enable us to pattern-match conv2d and automatically replace our impl with this kernel.
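As one possible reading of that request, the sketch below shows a shape-dependent lookup plus the CLI glue. The kernel key, the shape heuristic, and the WGSL placeholders are all illustrative assumptions, not tfjs's actual dispatch logic.

// Placeholder WGSL strings standing in for the real generated shaders.
const CONV2D_MM_WGSL = '/* conv2d_mm WGSL ... */';
const CONV2D_NAIVE_WGSL = '/* naive conv2d WGSL ... */';

interface InputSpec {
  shapes: Array<Array<number>>;
}

// Illustrative dispatch: pick a shader variant from the key and shapes alone,
// without allocating tensors or touching the WebGPU API.
function getShader(key: string, spec: InputSpec): string {
  if (key === 'conv2d') {
    const [inputShape] = spec.shapes;  // assumed [channels, height, width]
    // Hypothetical heuristic: use the matmul-based variant for large inputs.
    return inputShape[1] * inputShape[2] >= 128 * 128 ? CONV2D_MM_WGSL :
                                                        CONV2D_NAIVE_WGSL;
  }
  throw new Error(`no shader registered for '${key}'`);
}

// Glue for: node tfjsmodel --get-shader conv2d "[[3, 224, 224], [3,3]]"
const [, , flag, kernelKey, shapesJson] = process.argv;
if (flag === '--get-shader') {
  console.log(getShader(kernelKey, {shapes: JSON.parse(shapesJson)}));
}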
@tqchen is there a way to incorporate the TFJS backend into the TVM runtime, instead of relying on the AOT shader copy?
Thanks @pyu10055. The main goal is that we would like to be able to do development through a Python env and recompose the solutions. This would be an orthogonal path from the tvmjs runtime integration that uses TFJS as a graph exec provider, which I think would also be valuable.
Some kernels in TFJS may further improve the performance of TVM for the time being, and Intel may provide them.