Closed jadsongmatos closed 4 years ago
You are uploading and downloading arrays of virtually no size, so of course the CPU is faster. Also, the work inside the kernel is sequential, so you are essentially using 1 thread on the GPU. You need to find a way to spread that operation across many threads to take advantage of the GPU. There is overhead to using the GPU, especially when you aren't giving it any throughput to work with.
How do I use more threads?
The output size, {output: [x, y, z]}, determines the number of threads used. Each combination of x, y, and z represents one thread.
The output is an array which is either 1d, 2d or 3d. Each element of this output array is computed by one thread on the GPU and all these threads compute the output in parallel.
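To make the output-size rule concrete, here is a minimal plain-JavaScript sketch that emulates it on the CPU. It does not use GPU.js at all: `emulateKernel` and `addKernel` are hypothetical names made up for illustration, and the loops run sequentially where a GPU would run the calls in parallel. It only shows that the kernel function is invoked once per output element, each time with its own `this.thread` coordinates.

```javascript
// CPU-only emulation, NOT the GPU.js API: calls kernelFn once per output
// element, the way a GPU would launch one thread per element.
// `emulateKernel` is a hypothetical helper introduced for this sketch.
function emulateKernel(kernelFn, output, ...args) {
  // Missing dimensions default to 1, so [4] behaves like [4, 1, 1].
  const [xSize, ySize = 1, zSize = 1] = output;
  const results = [];
  let threadCount = 0;
  for (let z = 0; z < zSize; z++) {
    for (let y = 0; y < ySize; y++) {
      for (let x = 0; x < xSize; x++) {
        // Each (x, y, z) combination stands for one GPU thread.
        const context = { thread: { x, y, z } };
        results.push(kernelFn.apply(context, args));
        threadCount++;
      }
    }
  }
  return { results, threadCount };
}

// Each "thread" adds the pair of elements at its own x index.
function addKernel(a, b) {
  return a[this.thread.x] + b[this.thread.x];
}

// output: [1] gives a single thread, as in the benchmark in this thread.
const single = emulateKernel(addKernel, [1], [1], [1]);

// output: [4] gives four threads, one per element.
const many = emulateKernel(addKernel, [4], [1, 2, 3, 4], [10, 20, 30, 40]);

// output: [3, 2] gives 3 * 2 = 6 threads.
const grid = emulateKernel(function () {
  return this.thread.x + this.thread.y;
}, [3, 2]);

console.log(single.threadCount, many.threadCount, grid.threadCount);
```

With an output of [1], the whole 33554432-step loop from the question lands on a single thread. Splitting the work so that each output element needs only a small, independent computation is what lets a larger output size spread it across many threads.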
I'm running a very basic test of a loop on the CPU versus the GPU, but when I run the code on the GPU it is 30% slower.
GPU
const { GPU } = require('../src');
const { performance } = require('perf_hooks');

const gpu = new GPU({ mode: 'gpu' });

// Timing starts before kernel creation, so it includes compile and transfer time.
var t0 = performance.now();

function kernelFunction(a, b) {
  var sum = 0;
  // The loop runs until sum reaches the target; i is only a counter.
  for (var i = 0; sum < 33554432; i++) {
    sum += a[this.thread.y] + b[this.thread.x];
  }
  return sum;
}

// output: [1] means the kernel runs on a single thread.
const kernel = gpu.createKernel(kernelFunction, { output: [1] });
const result = kernel([1], [1]);

var t1 = performance.now();
console.log("Call to doSomething took " + (t1 - t0) + " milliseconds.");
console.log(result[0]);
CPU
const { performance } = require('perf_hooks');

var t0 = performance.now();

function cpu(a, b) {
  var sum = 0;
  // Same loop as the GPU kernel, with plain numbers instead of arrays.
  for (var i = 0; sum < 33554432; i++) {
    sum += a + b;
  }
  return sum;
}

const result = cpu(1, 1);

var t1 = performance.now();
console.log("Call to doSomething took " + (t1 - t0) + " milliseconds.");
console.log(result);