Open LeviPesin opened 1 year ago
You are talking about this line, right?
Do you mind finding out what part of this line is slow? Is it buffer.getMappedRange()
?
This is potentially something that should be reported to the WebGPU implementers.
The slow part is .set( array )
-- the array that is used in that example consists of 4*1024^2 elements and copying it is so slow. That's why I'm not sure whether it can be fixed at all...
Maybe we can at least skip the array copy if the array is uninitialized and zero-filled (as it is in many cases)?
as it is in many cases
I think this is an edge case since normally the buffer data (position
, normals
etc.) are already defined when the geometry is used for rendering the first time.
Can you please verify if the performance gets better with this approach:
_createBuffer( attribute, usage ) {
const array = attribute.array;
const size = array.byteLength + ( ( 4 - ( array.byteLength % 4 ) ) % 4 ); // ensure 4 byte alignment, see #20441
const buffer = this.device.createBuffer( {
size,
usage: usage | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST
} );
this._writeBuffer( buffer, attribute );
attribute.onUploadCallback();
return {
version: attribute.version,
buffer,
readBuffer: null,
usage
};
}
No, I'm afraid it only gets worse -- CPU time taken increases to about 85% of total CPU time in WebGPURenderer.compute()
, and the real time (measured by console.time
/console.timeEnd
) increases from 50-100 ms to 300-600 ms...
By the way, it seems that such large times (for both approaches) happen only about one third of tries -- mostly times are about 0.1 ms... This is really strange, because there does not seem to be a cache?
I don't know what the browser internally does. But maybe the WebGPU Matrix chatroom or the Chromium bug tracker would be good places for finding this out.
@Mugen87 -- I don't really think this is a browser issue... I will check in Firefox both approaches, but I don't think copying array would be significantly faster there (but maybe the second approach will).
I've marked this as a browser issue since I don't think we can solve it here.
Thought about this issue again and I think we should update the code in WebGPUAttributes
to the version of https://github.com/mrdoob/three.js/issues/25295#issuecomment-1386776337.
It seems using getMappedRange()
does only make sense if you don't have an array buffer in your app yet. Meaning you request one, create a typed array with it and then write your data into it.
In our case however data are already defined in buffer attributes. So it should be more performant to not create a new array and perform a copy but just use GPUQueue.writeBuffer()
. The slow .set( array )
part is then gone.
When I last tested the approach in https://github.com/mrdoob/three.js/issues/25295#issuecomment-1386776337 it was much slower than the current code.
I'm looking into playing around with WGPU and was wondering what the performance implications of this issue are. From what I can tell, it looks like the slowdown would be triggered whenever any attributes or their usage mode is updated, but there would be no slowdown if the attributes are initialized with data and not changed afterwards. Is that an accurate assessment?
I think that creation is very slow (this is what the issue is about), but update maybe is even slower (because the second approach in https://github.com/mrdoob/three.js/issues/25295#issuecomment-1386776337 appeared to be slower than the current).
Sorry, my question was a bit misleading. I meant more in the context of WebGPU in Three.js as a whole. Digging around a bit more, it seems like this would affect almost everything involving attributes (i.e. almost everything)? I'll probably hold off a bit until the ecosystem settles 😅
I'm not super well versed in Three.js internals, but this does make me wonder: If copying the array is what's actually taking most of the time, is this something that doesn't have to be done for WebGL? Otherwise it doesn't seem like there should be a difference. On the other hand, if the version that doesn't copy the array is actually slower, that's a bit of a conundrum.
Maybe one line of investigation could be creating benchmarks for WebGPU vs. WebGL for this specific context and comparing step by step to see where the differences appear? Not sure if the two are too different for this to be possible, but could be worth a shot.
Either way, thank you for your contributions 😊
Description
WebGPUAttributes._createBuffer()
is extremely slow because of line WebGPUAttributes.js:128 (copying an array) -- according to profile in https://github.com/mrdoob/three.js/pull/25273#issuecomment-1384921763 it takes about 70% ofWebGPURenderer.compute()
total CPU time (150 ms out of 210 ms in that case). (Tested on Chrome on Windows 10 but I don't expect that copying an array would be much faster on other browsers.) Not sure if something could be done to mitigate that...Reproduction steps
WebGPURenderer.compute()
WebGPUAttributes._createBuffer()
takes very much timeCode
N/A
Live example
N/A
Screenshots
No response
Version
r149-dev
Device
Desktop
Browser
Chrome
OS
Windows