Open expenses opened 3 years ago
Thank you for filing this! About the Metal backend: the pipeline caching path is the old code we use with SPIRV-Cross. I tried to adjust it for Naga, but it wasn't easy. So for all intents and purposes, consider there to be no implementation on Metal right now (since Naga is the future).
Okay, disregard basically everything that I wrote in the Metal section above, because on macOS 11.0 we can use the poorly-named, poorly-documented `MTLBinaryArchive`, which does pretty much what we want. We still have to write to a file and then read it back, because it takes URLs as parameters instead of raw bytes, but that's acceptable enough.
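To make the file round trip concrete, here is a minimal sketch using only the Rust standard library. The `serialize_to_path` closure is a hypothetical stand-in for whatever Metal call writes the archive to a file URL, and the temp file name is made up for illustration; none of this is taken from the actual branch.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Calls `serialize_to_path` (a hypothetical stand-in for the Metal call that
/// writes a binary archive to a file URL) and returns the bytes it wrote.
fn serialize_via_temp_file<F>(serialize_to_path: F) -> io::Result<Vec<u8>>
where
    F: FnOnce(&Path) -> io::Result<()>,
{
    // Illustrative temp file name; the process id keeps parallel runs apart.
    let path: PathBuf = std::env::temp_dir()
        .join(format!("pipeline-cache-{}.bin", std::process::id()));

    // Write the archive, then read the same file back as raw bytes.
    let result = serialize_to_path(path.as_path()).and_then(|()| fs::read(&path));

    // Best-effort cleanup, whether or not serialization succeeded.
    let _ = fs::remove_file(&path);
    result
}

fn main() -> io::Result<()> {
    // Fake "archive" writer so the sketch runs without Metal.
    let bytes = serialize_via_temp_file(|path| fs::write(path, b"archive bytes"))?;
    println!("read back {} bytes", bytes.len());
    Ok(())
}
```

Loading would presumably be the mirror image: write the cached bytes to a temporary file and hand its URL to the archive.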
I've made a start on this at this branch: https://github.com/gfx-rs/gfx/compare/master...expenses:metal-pipeline-cache
I actually did a small bit of research into the DX12 docs for #2877 last night since it seems to be free now, so I could keep looking into it and see if I get anywhere. I'm not too familiar with gfx-rs though, and I've never done anything with DX12, so I wouldn't rely on me, but I will try 😄
Okay, I've been doing some testing of #3719 using a hacky fork of https://github.com/repi/shadertoy-browser. Basically it loads 8866 SPIR-V fragment shaders and creates a pipeline for each one using a basic vertex shader, then exits.
macOS has a system shader cache at `$(getconf DARWIN_USER_CACHE_DIR)/com.apple.metal`, so that needs to be taken into account when timing this.
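For anyone scripting these timings, a small sketch of resolving that directory from Rust. The helper names are hypothetical, `getconf DARWIN_USER_CACHE_DIR` only works on macOS, and the sketch deliberately only locates the directory rather than wiping it.

```rust
use std::io;
use std::path::PathBuf;
use std::process::Command;

/// Builds the Metal system shader cache path from the raw output of
/// `getconf DARWIN_USER_CACHE_DIR` (which ends with a newline).
fn metal_cache_dir_from(getconf_output: &str) -> PathBuf {
    PathBuf::from(getconf_output.trim_end()).join("com.apple.metal")
}

/// Runs `getconf DARWIN_USER_CACHE_DIR`; only meaningful on macOS.
fn metal_system_cache_dir() -> io::Result<PathBuf> {
    let output = Command::new("getconf")
        .arg("DARWIN_USER_CACHE_DIR")
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "getconf failed"));
    }
    Ok(metal_cache_dir_from(&String::from_utf8_lossy(&output.stdout)))
}

fn main() {
    match metal_system_cache_dir() {
        Ok(dir) => println!("system shader cache: {}", dir.display()),
        Err(err) => eprintln!("could not locate the cache: {}", err),
    }
}
```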
Here are some timings with and without caches:
wiped system cache, no pipeline cache: 683.82s, 659.88s
hot system cache, no pipeline cache: 24.39s, 27.21s
wiped system cache, hot pipeline cache: 442.45s, 451.47s
hot system cache, hot pipeline cache: 25.56s, 26.97s, 28.54s
So it looks like using Binary Archives as a pipeline cache does give an improvement over no cache, but not nearly to the degree that you'd expect! It could be that the Binary Archive isn't set up correctly, but I've tested this with `MTLPipelineOptionFailOnBinaryArchiveMiss` and it takes the same amount of time (450.93s) and successfully compiles all 8866 pipelines.
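Averaging the runs listed above puts numbers on that: with the system cache wiped, the pipeline cache takes the mean from 671.85s to 446.96s (roughly 1.5x faster), while with a hot system cache the means are 25.80s without and about 27.02s with, so no measurable gain. A quick sketch of that arithmetic, using only the timings from this comment:

```rust
/// Arithmetic mean of a set of timings in seconds.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

fn main() {
    // Timings copied from the list above, in seconds.
    let cold_no_pipeline_cache = mean(&[683.82, 659.88]);
    let cold_hot_pipeline_cache = mean(&[442.45, 451.47]);
    let hot_no_pipeline_cache = mean(&[24.39, 27.21]);
    let hot_hot_pipeline_cache = mean(&[25.56, 26.97, 28.54]);

    println!(
        "wiped system cache: {:.2}s -> {:.2}s ({:.2}x)",
        cold_no_pipeline_cache,
        cold_hot_pipeline_cache,
        cold_no_pipeline_cache / cold_hot_pipeline_cache
    );
    println!(
        "hot system cache:   {:.2}s -> {:.2}s",
        hot_no_pipeline_cache, hot_hot_pipeline_cache
    );
}
```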
I'm going to look into a second cache to store SPIR-V -> MSL transformations to see how much that improves things.
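Roughly what I have in mind, as an in-memory sketch only: a map from the SPIR-V words to the translated MSL source, so the translator runs once per distinct shader. `MslCache` and the `translate` closure are hypothetical names; the closure stands in for whichever translator (SPIRV-Cross or Naga) is in use, and persisting the map to disk is left out.

```rust
use std::collections::HashMap;

/// Caches translated MSL source, keyed by the full SPIR-V word stream.
#[derive(Default)]
struct MslCache {
    entries: HashMap<Vec<u32>, String>,
}

impl MslCache {
    /// Returns the cached MSL for `spirv`, calling `translate` only on a miss.
    fn get_or_translate<F>(&mut self, spirv: &[u32], translate: F) -> &str
    where
        F: FnOnce(&[u32]) -> String,
    {
        self.entries
            .entry(spirv.to_vec())
            .or_insert_with(|| translate(spirv))
            .as_str()
    }
}

fn main() {
    let mut cache = MslCache::default();
    // SPIR-V magic number followed by dummy words; not a real shader.
    let spirv = [0x0723_0203u32, 1, 2, 3];
    let mut calls = 0;
    for _ in 0..2 {
        cache.get_or_translate(&spirv, |_| {
            calls += 1;
            String::from("// fake MSL")
        });
    }
    println!("translator ran {} time(s)", calls);
}
```

Keying on the whole word stream avoids hash collisions at the cost of a copy per lookup; a version that persists to disk would probably want a stable hash as the key instead.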
Goal
It would be neat and useful to have an implementation of `get_pipeline_cache_data` on all modern platforms (Vulkan, DX12, Metal). Along with the corresponding code in `create_pipeline_cache`, this would allow pipelines to be cached to disk on all backends, giving a good performance boost when a lot of pipelines are used.
Status
Vulkan
The Vulkan API has this `get_pipeline_cache_data` function built in (as `vkGetPipelineCacheData`).
Metal
Edit: disregard this whole section, see https://github.com/gfx-rs/gfx/issues/3716#issuecomment-813698647.
The Metal backend has a pipeline cache: https://github.com/gfx-rs/gfx/blob/2a93d52661aafcbd6441ea83e739c8ced906cd21/src/backend/metal/src/native.rs#L207-L211
However there is no way to serialize or deserialize it at present.
The key blocker for this is that the `ModuleInfo` struct stores a `metal::Library`: https://github.com/gfx-rs/gfx/blob/2a93d52661aafcbd6441ea83e739c8ced906cd21/src/backend/metal/src/native.rs#L200-L205
While there is no way in the Metal API to serialize a `MTLLibrary` (the underlying type), there is a `serialize` function for `MTLDynamicLibrary` which I believe we could convert into. It serializes directly into a file though, which is pretty gross. Presumably we'd then have to read back from this file.
The other option would be to just store the Metal source code for the shader that has been converted from SPIR-V. This would not give as big a performance improvement though.
MoltenVK
MoltenVK implements a pipeline cache with `MVKPipelineCache`. Similar to what we do with Metal, this stores `MVKShaderLibraryCache`s, which in turn store `MVKShaderLibrary`s. When implementing `getPipelineCacheData`, it writes the Metal source code, similar to what I suggest as an option above.
As an example of this, here's some of the output of a pipeline cache I generated:
DX12
The DirectX 12 backend doesn't have a pipeline cache. However, there is an issue that lays out how one could be created: https://github.com/gfx-rs/gfx/issues/2877, similar to what the Metal backend does.
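To illustrate what the goal would enable once every backend implements both functions, here is a minimal sketch of the load-on-startup, save-on-exit flow. The `PipelineCacheDevice` trait is a hypothetical, simplified stand-in for the two entry points named above (the real gfx-hal signatures may differ, for example in error handling), and `FakeDevice` and the file name exist only so the sketch runs without a GPU.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical, simplified stand-in for the two backend entry points.
trait PipelineCacheDevice {
    type Cache;
    fn create_pipeline_cache(&self, data: Option<&[u8]>) -> Self::Cache;
    fn get_pipeline_cache_data(&self, cache: &Self::Cache) -> Vec<u8>;
}

/// Creates a cache seeded from `path` if the file can be read, empty otherwise.
fn load_cache<D: PipelineCacheDevice>(device: &D, path: &Path) -> D::Cache {
    match fs::read(path) {
        Ok(bytes) => device.create_pipeline_cache(Some(&bytes)),
        Err(_) => device.create_pipeline_cache(None),
    }
}

/// Serializes the cache and writes it to `path`.
fn save_cache<D: PipelineCacheDevice>(
    device: &D,
    cache: &D::Cache,
    path: &Path,
) -> io::Result<()> {
    fs::write(path, device.get_pipeline_cache_data(cache))
}

/// Toy device whose "cache" is just a byte buffer.
struct FakeDevice;

impl PipelineCacheDevice for FakeDevice {
    type Cache = Vec<u8>;
    fn create_pipeline_cache(&self, data: Option<&[u8]>) -> Vec<u8> {
        data.map(|d| d.to_vec()).unwrap_or_default()
    }
    fn get_pipeline_cache_data(&self, cache: &Vec<u8>) -> Vec<u8> {
        cache.clone()
    }
}

fn main() -> io::Result<()> {
    // Illustrative location; a real application would pick its own cache dir.
    let path = std::env::temp_dir().join("pipeline_cache_sketch.bin");
    let device = FakeDevice;
    let mut cache = load_cache(&device, &path);
    cache.extend_from_slice(b"compiled pipeline"); // pretend a pipeline was built
    save_cache(&device, &cache, &path)?;
    println!("saved {} bytes to {}", cache.len(), path.display());
    Ok(())
}
```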