NVIDIAGameWorks / NRI

Low-level abstract render interface
MIT License
217 stars 25 forks source link

Sparse texture support #26

Open vertver opened 9 months ago

vertver commented 9 months ago

So, as an addition to #24 thread, I'll add that D3D11 also supports sparse (or tiled) textures, and all of the D3D12/Vulkan GPUs already support them (more info). So, maybe we don't need an extension for this?

dzhdanNV commented 9 months ago

Despite that all GAPIs expose tiled textures, I would prefer to start with a separate extension. IMO, it would be easier to discuss and progress. There is a big chance that it will have gone into core by the time of the release. IMO, tiled resources are tricky:

While I'm learning specs, I would be happy to see a draft proposal in any form :)

vertver commented 9 months ago

I thought a little longer and looked into the API for working with tiles, and what can I say:

  1. It will take too much effort to unify the work with tile and regular resources (and it will not be possible not to unify them because of API problems, the resource must be accessed as a regular resource). I thought about solving the problem through an additional allocator, but I think a total change of approach to working with resources is needed here

  2. The differences between D3D11 and D3D12 in this case are quite big. Since D3D11 does not directly access GPU memory, it is necessary to create a separate tile pool, which works as a separate memory.

Thus, I concluded that this is probably not worth implementing because it requires overcomplicating the API for virtual mapping.

dzhdanNV commented 9 months ago

Giving up? :) As I said, I will look at the specs and discuss with colleagues. Finding a common denominator (even omitting or limiting D3D11 support) will be a nice achievement.

vertver commented 9 months ago

Giving up? :)

Just saying that it would be hard to implement this without breaking the whole resource API. So if you are okay with it - I can try to think how it can be implemented.

Finding a common denominator (even omitting or limiting D3D11 support) will be a nice achievement.

In D3D12 and Vulkan, it works as a separate resource type (i.e. "reserved" for D3D12), so it can be generalised. Also, it looks like in D3D11 API trying to match D3D12 Reserved Resource but with separate pools to manage memory allocations, so they can easily be combined.

The sad news is that I haven't found any official example for the D3D11 tiled resource, only for the reserved D3D12 resource (but I will refer to D3D12 as the most complete solution for managing resources).

vertver commented 9 months ago

I find it funny though...

D3D11:

  void UpdateTileMappings(
    [in]           ID3D12Resource                        *pResource,
                   UINT                                  NumResourceRegions,
    [in, optional] const D3D12_TILED_RESOURCE_COORDINATE *pResourceRegionStartCoordinates,
    [in, optional] const D3D12_TILE_REGION_SIZE          *pResourceRegionSizes,
    [in, optional] ID3D12Heap                            *pHeap,
                   UINT                                  NumRanges,
    [in, optional] const D3D12_TILE_RANGE_FLAGS          *pRangeFlags,
    [in, optional] const UINT                            *pHeapRangeStartOffsets,
    [in, optional] const UINT                            *pRangeTileCounts,
                   D3D12_TILE_MAPPING_FLAGS              Flags
  );

D3D12

HRESULT UpdateTileMappings(
  [in]           ID3D11Resource                        *pTiledResource,
  [in]           UINT                                  NumTiledResourceRegions,
  [in, optional] const D3D11_TILED_RESOURCE_COORDINATE *pTiledResourceRegionStartCoordinates,
  [in, optional] const D3D11_TILE_REGION_SIZE          *pTiledResourceRegionSizes,
  [in, optional] ID3D11Buffer                          *pTilePool,
  [in]           UINT                                  NumRanges,
  [in, optional] const UINT                            *pRangeFlags,
  [in, optional] const UINT                            *pTilePoolStartOffsets,
  [in, optional] const UINT                            *pRangeTileCounts,
  [in]           UINT                                  Flags
);

No difference at all! I think that we maybe can implement heap emulation for D3D11 via resources, what do you think?

dzhdanNV commented 9 months ago

Heaps are already emulated in D3D11. It's not a real emulation, of course, heaps are dummy objects. So, yes, it's doable.

dzhdanNV commented 9 months ago

Just saying that it would be hard to implement this without breaking the whole resource API. So if you are okay with it - I can try to think how it can be implemented.

Yes, please. Let's stay in touch on this topic.

vertver commented 9 months ago

Can you update NRIFramework and NRISamples to the latest version? Or can I just use their current version?

dzhdanNV commented 9 months ago

I'm starting to update NRI and NRIFramework to the latest (including ReBAR support). ETA is 3-5 h.

dzhdanNV commented 9 months ago

Can you update NRIFramework and NRISamples to the latest version?

Updated.

vertver commented 9 months ago

While working on implementing sparse textures, I got to thinking about memory management for resources. I don't like the fact that you have to first bind a texture to allocate memory, because for sparse textures you have to first create a reserved resource and only then update the regions using the UpdateTilingMappings command queue method. Maybe we should make a more advanced interface for memory management rather than just binding/unbinding logic inside textures? I'm very confused by this

dzhdanNV commented 9 months ago

In VK only this is possible:

This fact dictates the API. No more comments for now because I'm still out of the topic...

vertver commented 9 months ago

So, my additional thoughts on sparse (info for PR #28).

D3D11 does not support allocating tile memory outside of the tile pool, whereas D3D12 and Vulkan allow it. For this reason, I decided to make a separate structure called TilePool which directly translates to a memory heap on D3D12 and a memory pointer on Vulkan. (I'm also thinking of making an additional translation layer for Metal to add support for sparse textures, because MoltenVK also doesn't support sparse resources since Metal has the same behaviour as D3D11).

I added a draft version of the tiling extension that tries to combine all the ways to handle sparse textures (Vulkan is more low-level than D3D12, so I decided to implicitly define all the necessary information for the bind function and simplify the function call). I can answer your questions on this topic here.

References: General: https://github.com/gpuweb/gpuweb/issues/455 Vulkan: https://docs.vulkan.org/spec/latest/chapters/sparsemem.html D3D11: https://learn.microsoft.com/en-us/windows/win32/api/d3d11_2/nf-d3d11_2-id3d11devicecontext2-updatetilemappings#remarks D3D12: https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12commandqueue-updatetilemappings#remarks Metal: https://github.com/KhronosGroup/MoltenVK/issues/1700

dzhdanNV commented 9 months ago

Thank you, Anton. Please, give me more time (I would like to get feedback from colleagues, what will take additional time).

dzhdanNV commented 9 months ago

Thanks for solid baseline. Not looking at implementation details, here are my comments for the interface:

// DZ: TextureUsageBits should be extended to include VK_IMAGE_CREATE_SPARSE_BINDING_BIT 
// DZ: BufferUsageBits should be extended to include VK_BUFFER_CREATE_SPARSE_BINDING_BIT
// Yes, buffers can be sparse too. Do we want to support it?

NRI_NAMESPACE_BEGIN

// DZ: better remove entity and use Memory
// It's a dummy entity for D3D11, so it can be used to store tile buffer natively
NRI_STRUCT(TilePool);

// DZ: "subresource" concept is not exposed in NRI (but appears in HelperInterface). I would be good to clearer expose it
NRI_STRUCT(SubresourceTilingDesc)
{    
    uint32_t widthInTiles; // DZ: uint32_t => Dim_t
    uint16_t heightInTiles; // DZ: uint16_t => Dim_t
    uint16_t depthInTiles; // // DZ: uint16_t => Dim_t
    uint32_t resourceOffsetInTiles;
};

NRI_STRUCT(TileBindDesc) // DZ: TileBindingDesc
{
    // DZ: add "texture" here
    NRI_NAME(TextureRegionDesc) region;
    NRI_NAME(TilePool)* tilePool; // DZ: TilePool => Memory
    uint64_t offsetInTiles;
};

NRI_STRUCT(TextureTilingRequirementsDesc)
{
    NRI_NAME(Dim_t) tileWidth;
    NRI_NAME(Dim_t) tileHeight;
    NRI_NAME(Dim_t) tileDepth;
    NRI_NAME(Mip_t) mipStandardNum;
    NRI_NAME(Mip_t) mipPackedNum;
    uint32_t packedTileNum;
    uint32_t startPackedTileIndex;
};

NRI_STRUCT(TilingInterface)
{
    // DZ: Create/Destroy not needed if we replace "TilePool" with memory
    Result (NRI_CALL* CreateTilePool)(Device& device, uint32_t nodeMask, NRI_NAME(MemoryType) memoryType, uint64_t poolSize, NRI_NAME_REF(TilePool*) tilePool);
    void (NRI_CALL* DestroyTilePool)(Device& device, NRI_NAME_REF(TilePool) tilePool);

    // DZ: is "texture" missing?
    // DZ: "Device" is not needed here
    void (NRI_CALL* GetSubresourceTiling)(Device& device, uint32_t subresource, NRI_NAME_REF(SubresourceTilingDesc) subresourceTiling);
    void (NRI_CALL* GetTextureTilingRequirements)(Device & device, NRI_NAME_REF(TextureTilingRequirementsDesc) memoryRequirements);

    // DZ: better rephraze to achieve "BindTextureMemory"-like behavior: allow multiple textures in one call. "UpdateTileMapping" calls MUST be grouped as much as possible
    // DZ: TileBindingDescs* tileBindingDescs
    // DZ: "Device" is not needed here
    Result (NRI_CALL* BindTextureTiles)(Device& device, NRI_NAME_REF(CommandQueue) commandQueue, NRI_NAME_REF(Texture) texture, const NRI_NAME(TileBindDesc)* binds, uint32_t bindNum);
};

NRI_NAMESPACE_END

Thanks for docs! I'm still not an expert (VK doc is very long), but I believe this functionality will go into core. Also, currently I'm assuming that this API is implementable for all GAPIs (I trust you).

dzhdanNV commented 9 months ago

From the doc:

The operation is similar to ID3D11DeviceContext2::UpdateTileMappings with the one key difference that D3D12 allows a reserved resource to have tiles from multiple heaps.

I'm OK to ignore such weird functionality. Having a single sparse resource using tiles from several memory heaps looks like a bad engine design.

(UPDATE) After consultations I realized that such functionality is very useful. You can have a tiled resource (incredibly large) and just "assign" a new memory chunk to some tiles without a need to recreate everything.

vertver commented 9 months ago

OK, these changes are fine to me, I'll fix them as soon as possible (currently unavailable due to work)