BradLarson / GPUImage3

GPUImage 3 is a BSD-licensed Swift framework for GPU-accelerated video and image processing using Metal.
BSD 3-Clause "New" or "Revised" License

MTLCommandEncoder Type #6

Open Danny1451 opened 5 years ago

Danny1451 commented 5 years ago

I have some questions about the MTLCommandEncoder type. In BasicOperation, all of the operations, such as color processing and blends, use an MTLRenderCommandEncoder to process the image and pass it to the next ImageConsumer. Why not use an MTLComputeCommandEncoder to do the image processing work? Not every filter needs to render to a view, and MTLComputeCommandEncoder can do data-parallel compute, which may be more efficient. Is there a particular reason for using MTLRenderCommandEncoder?

BradLarson commented 5 years ago

The two reasons we are initially using render operations over compute are a) ease of porting and b) the last time I checked, the render pipeline was faster than compute for equivalent operations. The previous iterations of GPUImage used OpenGL (ES) shaders and by necessity were built around a rendering architecture. That makes it easier to translate those operations into another rendering architecture, this time in Metal, instead of reworking them for compute right off the bat.
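For contrast with the render path, here is a rough sketch of what a compute version of a simple per-pixel filter might look like. None of this is GPUImage3 code: the `invertColor` kernel, `SketchError`, `makeInvertPipeline`, and `encodeInvert` are hypothetical names made up for illustration, and the sketch assumes the output texture was created with `.shaderWrite` usage.

```swift
import Metal

// Hypothetical kernel (not from GPUImage3): inverts the RGB channels of each pixel.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;

kernel void invertColor(texture2d<half, access::read>  input  [[texture(0)]],
                        texture2d<half, access::write> output [[texture(1)]],
                        uint2 gid [[thread_position_in_grid]])
{
    // Threadgroups are rounded up, so skip threads that fall outside the image.
    if (gid.x >= output.get_width() || gid.y >= output.get_height()) { return; }
    half4 color = input.read(gid);
    output.write(half4(1.0h - color.rgb, color.a), gid);
}
"""

struct SketchError: Error {}

// Compiles the kernel from source and builds a compute pipeline state.
// In real code this would be done once and cached, not per frame.
func makeInvertPipeline(device: MTLDevice) throws -> MTLComputePipelineState {
    let library = try device.makeLibrary(source: kernelSource, options: nil)
    guard let function = library.makeFunction(name: "invertColor") else {
        throw SketchError()
    }
    return try device.makeComputePipelineState(function: function)
}

// Encodes one compute pass that reads `input` and writes `output`.
func encodeInvert(pipeline: MTLComputePipelineState,
                  commandBuffer: MTLCommandBuffer,
                  input: MTLTexture,
                  output: MTLTexture) throws {
    guard let encoder = commandBuffer.makeComputeCommandEncoder() else {
        throw SketchError()
    }
    encoder.setComputePipelineState(pipeline)
    encoder.setTexture(input, index: 0)
    encoder.setTexture(output, index: 1)

    // Pick a threadgroup shape from the pipeline's limits, then round the
    // grid up so the whole image is covered.
    let w = pipeline.threadExecutionWidth
    let h = pipeline.maxTotalThreadsPerThreadgroup / w
    let threadsPerGroup = MTLSize(width: w, height: h, depth: 1)
    let groups = MTLSize(width: (output.width + w - 1) / w,
                         height: (output.height + h - 1) / h,
                         depth: 1)
    encoder.dispatchThreadgroups(groups, threadsPerThreadgroup: threadsPerGroup)
    encoder.endEncoding()
}
```

Compared with a render pass, there is no vertex stage, no quad geometry, and no render pass descriptor; the kernel addresses pixels directly by grid position. Whether that translates into better performance is a separate question that only measurement can answer.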

An old Apple Developer Forum thread talks about performance differences that people observed initially when it comes to compute vs. render operations in Metal. I saw something similar in my initial tests years ago, but I don't know if any of that is still the case today. This hardware was originally built and optimized for rendering, but so much has been done on the compute side over the last few years that none of that may be true anymore.

This is something that we'll clearly be benchmarking as we get things stabilized, along with side-by-side comparisons with Metal Performance Shaders for relevant operations. I'm sure we'll learn quite a bit out of that. If it turns out that compute allows for better performance, we'll rework to target that.

The current simple shaders we have run so quickly on Metal-supporting devices that it's hard to benchmark differences in their performance. You're looking at maybe sub-millisecond timing changes on current hardware, so we'll need to come up with good test cases and run them on a range of hardware, from the oldest devices we can find up to current ones.
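As a minimal sketch of how such sub-millisecond differences might be measured, one option is to read the GPU timestamps that Metal records on each command buffer and take the median over many runs. The helper name `medianGPUTimeMs` and its parameters are hypothetical, not part of GPUImage3, and the availability annotation reflects my understanding of when `gpuStartTime` and `gpuEndTime` were introduced.

```swift
import Metal

// Hypothetical helper: runs `encode` in `runs` separate command buffers and
// returns the median GPU execution time in milliseconds (the upper middle
// sample when `runs` is even). Returns nil if no command buffer could be made.
@available(macOS 10.15, iOS 10.3, tvOS 10.3, *)
func medianGPUTimeMs(queue: MTLCommandQueue,
                     runs: Int,
                     encode: (MTLCommandBuffer) -> Void) -> Double? {
    var samples: [Double] = []
    for _ in 0..<max(runs, 0) {
        guard let commandBuffer = queue.makeCommandBuffer() else { return nil }
        encode(commandBuffer)                // caller encodes a render or compute pass
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()   // timestamps are valid only after completion
        // gpuStartTime / gpuEndTime are in seconds; convert to milliseconds.
        samples.append((commandBuffer.gpuEndTime - commandBuffer.gpuStartTime) * 1000.0)
    }
    guard !samples.isEmpty else { return nil }
    return samples.sorted()[samples.count / 2]
}
```

Calling this once with a closure that encodes the render version of a filter and once with a closure that encodes a compute version, on the same input texture, would give a first side-by-side number. For very cheap shaders, encoding the same pass many times into one command buffer and dividing by the count is one way to lift the measurement above timer noise.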