microsoft / graphics-driver-samples

This repository contains graphics driver samples that demonstrate how to write graphics drivers for the Windows platform.

Implement raspberry pi tiled texture #22

Open indygit opened 8 years ago

indygit commented 8 years ago
  1. At resource creation time, load the linear initial data into the GPU memory in tiled format (VC4 spec Section 11)
  2. Make sure the texture sampling continues to work
  3. We can start with the R8G8B8A8 and then move onto other formats.
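
A minimal sketch of step 1 for R8G8B8A8, assuming 32-bit texels are grouped into 4x4-texel (64-byte) microtiles stored in raster order. It only shows the idea of reordering linear data at creation time. It is not the full VC4 layout, which also orders sub-tiles and tiles as described in spec Section 11, and the names MicroTiledIndex and LinearToMicroTiled are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// ASSUMPTION: 4x4-texel microtiles for 32-bit texels (64 bytes per microtile).
constexpr uint32_t kMicroTileDim = 4;
constexpr uint32_t kTexelsPerMicroTile = kMicroTileDim * kMicroTileDim;

// Returns the texel index inside the microtiled buffer for linear texel (x, y).
// Microtiles are laid out in raster order here; real hardware ordering may differ.
inline size_t MicroTiledIndex(uint32_t x, uint32_t y, uint32_t widthInTexels)
{
    assert(widthInTexels % kMicroTileDim == 0);
    const uint32_t tilesPerRow = widthInTexels / kMicroTileDim;
    const size_t tileIndex = size_t(y / kMicroTileDim) * tilesPerRow + (x / kMicroTileDim);
    const uint32_t inTile = (y % kMicroTileDim) * kMicroTileDim + (x % kMicroTileDim);
    return tileIndex * kTexelsPerMicroTile + inTile;
}

// Reorders a linear R8G8B8A8 image (rows srcPitchBytes apart) into microtile order.
inline std::vector<uint32_t> LinearToMicroTiled(const uint8_t* src,
                                                uint32_t width,
                                                uint32_t height,
                                                size_t srcPitchBytes)
{
    assert(width % kMicroTileDim == 0 && height % kMicroTileDim == 0);
    std::vector<uint32_t> dst(size_t(width) * height);
    for (uint32_t y = 0; y < height; ++y) {
        const uint8_t* row = src + size_t(y) * srcPitchBytes;
        for (uint32_t x = 0; x < width; ++x) {
            uint32_t texel;
            std::memcpy(&texel, row + size_t(x) * 4, sizeof(texel));
            dst[MicroTiledIndex(x, y, width)] = texel;
        }
    }
    return dst;
}
```

Filling each texel with its linear index, as a test image with markers would, makes it easy to check where every texel lands after reordering.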
indygit commented 8 years ago

Hi, Marek:

To create textures, render targets, and depth/stencil buffers, the app calls the CreateTexture2D API, and the runtime then calls two DDIs:

  1. DdiCalcPrivateResourceSize: returns the size of RosUmdResource, which stores the driver-side information for the resource.
  2. DdiCreateResource: the runtime allocates the memory for RosUmdResource and then calls DdiCreateResource to actually create the resource. This mainly involves the operations below:
     <1> The UMD calculates the size of the resource and decides its memory layout (linear or tiled).
     <2> The UMD uses pfnAllocateCb to call the graphics kernel (dxgkrnl.sys, dxgmms1.sys) to allocate video memory with the size from <1>.
     <3> The UMD initializes the allocation from <2> with the data provided by the app. One simple way is to map the video memory allocation into UM with pfnLockCb and then fill in the "bits". This step will become more complex once you start working on tiling. Right now this is a simplistic memcpy.
  Vertex/index/constant buffers work similarly, but they go through the CreateBuffer API.
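
As a rough illustration of the linear case in <3>, the sketch below copies app-supplied rows into a CPU-mapped allocation whose row pitch may differ from the source pitch. CopyLinearInitialData and its parameters are hypothetical names, not the sample driver's code, and the destination pointer stands in for whatever pfnLockCb returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies `rows` rows of `rowBytes` bytes each from the
// app's initial data into a mapped linear allocation. The two pitches are the
// byte distances between consecutive rows on each side.
inline void CopyLinearInitialData(uint8_t* dst, size_t dstPitch,
                                  const uint8_t* src, size_t srcPitch,
                                  size_t rowBytes, size_t rows)
{
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);
    for (size_t y = 0; y < rows; ++y) {
        // Only the meaningful bytes of each row are copied; padding is left alone.
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
    }
}
```

When both pitches equal rowBytes this degenerates to a single contiguous copy, which matches the simplistic memcpy described above. A tiled layout would replace the per-row copy with a per-texel or per-tile reorder.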

                                        Thanks, Indy, 1/28/2016

0: kd> kn

Child-SP RetAddr Call Site

00 0014dae0 7568f858 rosumd!RosUmdDeviceDdi::DdiCalcPrivateResourceSize
01 (Inline) -------- d3d11!CDevice::CalcPrivateResourceSize+0x82
02 0014dae0 7568ac12 d3d11!CDevice::GetLayeredChildSize+0x60c
03 (Inline) -------- d3d11!CBridgeImpl<ID3D11LayeredDevice,ID3D11LayeredDevice,CLayeredObject >::GetLayeredChildSize+0x2e
04 (Inline) -------- d3d11!NDXGI::CDevice::GetLayeredChildSize+0x2e
05 (Inline) -------- d3d11!CBridgeImpl<ID3D11LayeredDevice,ID3D11LayeredDevice,CLayeredObject >::GetLayeredChildSize+0x2e
06 0014f2c0 7568e574 d3d11!NOutermost::CDevice::CreateLayeredChild+0xaa
07 (Inline) -------- d3d11!CDevice::CreateAndRecreateLayeredChild+0x30
08 0014f3a8 75696ae8 d3d11!CDevice::CreateTexture2D_Worker+0x4e8
09 0014f6c8 008e9f86 d3d11!CDevice::CreateTexture2D+0x7c
0a 0014f738 00000000 CubeTest!D3DDepthStencilBuffer::D3DDepthStencilBuffer+0xaa

1: kd> kn

Child-SP RetAddr Call Site

00 0014e5d0 7569beb4 rosumd!RosUmdDeviceDdi::DdiCreateResource
01 0014e5d0 7569ba2c d3d11!CResource::CLS::FinalConstruct+0x474
02 (Inline) -------- d3d11!CTexture2D::CLS::FinalConstruct+0x1a
03 0014e880 75691ba2 d3d11!TCLSWrappers::CLSFinalConstructFn+0x24
04 (Inline) -------- d3d11!CLayeredObjectWithCLS::FinalConstruct+0x4e
05 (Inline) -------- d3d11!CLayeredObjectWithCLS::{ctor}+0xcc
06 (Inline) -------- d3d11!CLayeredObjectWithCLS::CreateInstance+0x10e
07 0014e898 756823e6 d3d11!CDevice::CreateLayeredChild+0x6b6
08 (Inline) -------- d3d11!CD3D11LayeredChild<ID3D11DeviceChild,NDXGI::CDevice,64>::FinalConstruct+0x1a
09 0014f170 756828cc d3d11!NDXGI::CDeviceChild<IDXGIResource1,IDXGISwapChainInternal>::FinalConstruct+0x26
0a 0014f1a8 75684cd8 d3d11!NDXGI::CResource::FinalConstruct+0x20
0b (Inline) -------- d3d11!CLayeredObjectNDXGI::CResource::{ctor}+0x6e
0c (Inline) -------- d3d11!CLayeredObjectNDXGI::CResource::CreateInstance+0x6e
0d 0014f1d0 7568ad8a d3d11!NDXGI::CDevice::CreateLayeredChild+0x290
0e (Inline) -------- d3d11!CBridgeImpl<ID3D11LayeredDevice,ID3D11LayeredDevice,CLayeredObject >::CreateLayeredChild+0x28
0f (Inline) -------- d3d11!NOutermost::CDeviceChild::FinalConstruct+0x32
10 (Inline) -------- d3d11!CUseCountedObjectNOutermost::CDeviceChild::{ctor}+0x7e
11 (Inline) -------- d3d11!CUseCountedObjectNOutermost::CDeviceChild::CreateInstance+0x8c
12 0014f290 7568e574 d3d11!NOutermost::CDevice::CreateLayeredChild+0x222
13 (Inline) -------- d3d11!CDevice::CreateAndRecreateLayeredChild+0x30
14 0014f3a8 75696ae8 d3d11!CDevice::CreateTexture2D_Worker+0x4e8
15 0014f6c8 008e9f86 d3d11!CDevice::CreateTexture2D+0x7c
16 0014f738 00000000 CubeTest!D3DDepthStencilBuffer::D3DDepthStencilBuffer+0xaa

indygit commented 8 years ago

From: Indy Zhu
Sent: Friday, January 29, 2016 11:28 AM
To: Marek Kedzierski marekkedzierski@interia.pl; Bart House bhouse@microsoft.com; Hideyuki Nagase hideyukn@microsoft.com; Jordan Rhee jordanrh@microsoft.com; Jeff Wickenheiser jeffwick@microsoft.com
Subject: RE: Would you be interested in working on tiled texture support ?

Hi, Marek:

We are working on the symbols issue. It is important for development, and I will notify you when there is a conclusion.

You have a good plan in general, but I want to start with a smaller scope at the beginning.

Can you please take a look at RosUmdResource::CalculateMemoryLayout()?

At the start of the tiling support work, I would suggest we start with textures that have these characteristics:

  1. m_usage of D3D10_DDI_USAGE_DEFAULT
  2. m_bindFlags has only the D3D10_DDI_BIND_SHADER_RESOURCE bit set

This avoids the complexity of supporting:

  1. Tiled render targets (or depth/stencil buffers). When we have both a tiled RT and a tiled texture, it would be hard to tell whether rendering or sampling went wrong. We should support tiled RTs later. At that time you will need to work on VC4TileRenderingModeConfig::MemoryFormat. I think we can open another “issue” to keep track of this work.
  2. Textures with m_usage of D3D10_DDI_USAGE_DYNAMIC or D3D10_DDI_USAGE_STAGING. Those textures can be mapped/locked using the Map() API to give the app direct CPU access, so we want to keep them linear.
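
The selection rule above could be sketched roughly as follows. The enums are local stand-ins with placeholder values, not the real D3D10_DDI_* definitions, and ChooseLayout is a hypothetical name rather than the actual body of RosUmdResource::CalculateMemoryLayout().

```cpp
#include <cstdint>

// Stand-ins for D3D10_DDI_USAGE_* (placeholder enumerators, not the DDI header).
enum class Usage { Default, Immutable, Dynamic, Staging };

// Stand-ins for D3D10_DDI_BIND_* bits (placeholder values).
enum BindFlag : uint32_t {
    BindShaderResource = 0x1,
    BindRenderTarget   = 0x2,
    BindDepthStencil   = 0x4,
};

enum class HwLayout { Linear, Tiled };

// Tiles only default-usage resources that are bound purely as shader
// resources; everything else (RT, depth/stencil, dynamic, staging) stays linear.
inline HwLayout ChooseLayout(Usage usage, uint32_t bindFlags)
{
    const bool shaderResourceOnly = (bindFlags == BindShaderResource);
    return (usage == Usage::Default && shaderResourceOnly) ? HwLayout::Tiled
                                                           : HwLayout::Linear;
}
```

Comparing the bind flags for equality, rather than testing a single bit, is what keeps a texture that is also a render target on the linear path.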

I think the first place to use the tiling support is initializing the texture with app-supplied initial data. Can you please take a look at this commit: https://github.com/Microsoft/graphics-driver-samples/commit/f5773a0b7123b59788a28d6b1923694d9ef9cf13

I suggest you use Cubetests and its R8G8B8A8 texture to start the tiling work, so that we can easily compare the rendering results of linear and tiled textures.

One of our goals for the VC4 driver is to render D2D content (for the Windows shell/UI). Have you used D2D? Besides the common R8G8B8A8 format, D2D uses DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_A8_UNORM, and DXGI_FORMAT_R8G8_UNORM in its rendering (of text/fonts, for example). After the work for tiled R8G8B8A8 textures is done, can you please make sure we can create and sample from textures of those formats correctly? I suggest you build a new test with Cubetests as the starting point.
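
To show why the narrower formats need separate handling, here is a small sketch of per-format texel sizes and the padded dimensions they imply. The TexFormat enum is a local stand-in for the DXGI formats named above, and the microtile extents (4x4 at 32 bpp, 8x4 at 16 bpp, 8x8 at 8 bpp, 64 bytes each) are an assumption that should be checked against Section 11 of the VC4 spec.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the DXGI formats discussed above (not the DXGI header).
enum class TexFormat { R8G8B8A8_UNORM, R8G8_UNORM, R8_UNORM, A8_UNORM };

inline uint32_t BytesPerTexel(TexFormat f)
{
    switch (f) {
    case TexFormat::R8G8B8A8_UNORM: return 4;
    case TexFormat::R8G8_UNORM:     return 2;
    case TexFormat::R8_UNORM:
    case TexFormat::A8_UNORM:       return 1;
    }
    return 0;
}

struct Extent { uint32_t width; uint32_t height; };

// ASSUMPTION: every microtile is 64 bytes, so its texel extent depends on bpp.
inline Extent MicroTileExtent(TexFormat f)
{
    switch (BytesPerTexel(f)) {
    case 4:  return {4, 4};
    case 2:  return {8, 4};
    case 1:  return {8, 8};
    default: return {0, 0};
    }
}

inline uint32_t AlignUp(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Rounds the texture size up to whole microtiles for the given format.
inline Extent PaddedExtent(TexFormat f, uint32_t width, uint32_t height)
{
    const Extent tile = MicroTileExtent(f);
    assert(tile.width != 0 && tile.height != 0);
    return {AlignUp(width, tile.width), AlignUp(height, tile.height)};
}
```

Under these assumptions a 10x10 texture pads to 12x12 for R8G8B8A8 but to 16x16 for R8 or A8, so a test for the D2D formats should include sizes that are not multiples of 8.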

And then, as you planned, please move on to ResourceCopy().

Jordan is building the Raspberry Pi driver into a full driver, meaning it will drive the display/monitor. For that reason, we have to make the fullscreen render target/primary unlockable, meaning we can no longer map it in user mode. So please keep in mind that at some point we will also need tiling support code in the KMD’s RosKmdRapAdapter::ProcessRenderBuffer() for the ResourceCopy case. That path is used to copy a render target that is unlockable from UM into a staging resource, which can then be Map()-ed by the app.

Please let me know if you have any questions. I hope you can keep the bigger picture of the project in view besides the technical details.

                                             Thanks, Indy, 1/29/2016

From: Marek Kedzierski [mailto:marekkedzierski@interia.pl]
Sent: Thursday, January 28, 2016 3:29 PM
To: Indy Zhu indyz@microsoft.com; Bart House bhouse@microsoft.com; Hideyuki Nagase hideyukn@microsoft.com; Jordan Rhee jordanrh@microsoft.com
Subject: RE: Would you be interested in working on tiled texture support ?

I forgot to add that this is a general sketch. I've started working on the code, so I expect to have more questions soon :-)

Marek

From: "Marek Kedzierski" marekkedzierski@interia.pl
To: "Indy Zhu" indyz@microsoft.com;
Sent: 0:19 Friday 2016-01-29
Subject: RE: Would you be interested in working on tiled texture support ?

Indy,

I swear some of the missing symbols (like those from dxgkrnl) were available on the server about two months ago.

Version command:

Windows 10 Kernel Version 10586 MP (4 procs) Free ARM (NT) Thumb-2
Built by: 10586.63.armfre.th2_release.160104-1513
Machine Name:
Kernel base = 0x81052000 PsLoadedModuleList = 0x8123b8d8
Debug session time: Thu Jan 28 16:50:47.432 2016 (UTC + 1:00)
System Uptime: 0 days 0:53:01.515
Remote KD: KdSrv:Server=@{},Trans=@{COM:Port=.\com3,Baud=921600,Timeout=4000}

Thanks for the stack frames. Yesterday I had already figured out the flow (at least I think so :-)) through intensive debugging.

The general sketch for tiled texture support looks like this:

Please correct the places where I'm wrong, and then I'll put the sketch on GitHub as a comment.

Cheers,

Marek

marekkedzierski commented 8 years ago

Indy, after filling the memory in the right tiled format, how do I inform the HW that it has to deal with RosHwLayout::Tiled?

Cheers,

Marek

hideyukn88 commented 8 years ago

You will need to specify the texture data type in texture config parameters 0 and 1 (TYPE and TYPE4); see pages 42/43 of the hardware spec. Currently it is set to 16 (RGBA32R), which is the linear (raster) format. Once you can swizzle the texture, a non-raster format such as 0 (RGBA8888) or 1 (RGBX8888) can be used for the tiled texture.
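
A small sketch of how the data type values above split across the two fields, assuming TYPE carries the low four bits of the type and TYPE4 carries bit 4, as the field names suggest. The exact bit positions inside config parameters 0 and 1 are not shown here and should be taken from the hardware spec; TypeFields and SplitTextureDataType are hypothetical names.

```cpp
#include <cstdint>

// Texture data type values quoted in this thread.
enum TextureDataType : uint32_t {
    RGBA8888 = 0,   // non-raster (tiled)
    RGBX8888 = 1,   // non-raster (tiled)
    RGBA32R  = 16,  // raster (linear)
};

struct TypeFields {
    uint32_t type;   // ASSUMPTION: low four bits of the data type
    uint32_t type4;  // ASSUMPTION: bit 4 of the data type
};

// Splits a 5-bit texture data type into the TYPE and TYPE4 field values.
inline TypeFields SplitTextureDataType(uint32_t dataType)
{
    TypeFields fields;
    fields.type = dataType & 0xFu;
    fields.type4 = (dataType >> 4) & 0x1u;
    return fields;
}
```

Under that assumption, switching from RGBA32R to RGBA8888 leaves TYPE at 0 and only clears TYPE4, so both fields need to be written when the layout changes.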

marekkedzierski commented 8 years ago

Thanks!

marekkedzierski commented 8 years ago

All, a short status update:

I implemented tiled texture support (m_usage of D3D10_DDI_USAGE_DEFAULT, m_bindFlags with only the D3D10_DDI_BIND_SHADER_RESOURCE bit, RGBA8888) for T-type textures, as Indy suggested. I will create a pull request soon. I used CubeTests and tested with specially prepared 64x64 images with markers, because they really helped me figure out the real layout of the subtiles and tiles. Right now I'm making sure everything works, and I'm moving on to bigger textures.

Cheers,

Marek

indygit commented 8 years ago

Thanks Marek. We are looking forward to your pull request.