narzoul / DDrawCompat

DirectDraw and Direct3D 1-7 compatibility, performance and visual enhancements for Windows Vista, 7, 8, 10 and 11
BSD Zero Clause License

More help about D3DDDI hacking #163

Closed ghotik closed 7 months ago

ghotik commented 1 year ago

Hi again.

In my other request/thread you helped me add W-based fog to a Direct3D version 1 game, the original CD version of "Star Wars: Shadows of the Empire". You suggested hooking D3DDDI through OpenAdapter and GetPrivateDDITable, but as a matter of fact the hook through OpenAdapter was enough for this specific case. Now I'd like to complete the task by also adding the hook for GetPrivateDDITable, but I'd rather not add untested code to the DxWnd project, so this is my plea: do you remember any case where the trick worked through GetPrivateDDITable? If you can name a game or situation where this happens, I'd like to implement and test this path as well.

P.S. About the benefits of using DDrawCompat to fix some Z-ordering problems, I devised a hopefully smart enough plan: I finally got your ddraw.dll compiled on my computer (yes, I had to install VS2022 and the WDK), so now I can try to bypass chunks of your code, commenting out sections in a coarse-to-fine way until I find the part that causes the beneficial effect. If I succeed, I'll let you know where the trick is located; maybe it could be interesting to you as well. In DxWnd it seems that I can make SW-SOTE playable by setting the "Clean Z-buffer @1.0 fix" flag; maybe DDrawCompat does something like that.

Thanks a lot, gho

narzoul commented 1 year ago

GetPrivateDDITable is not game dependent, it's driver dependent. It's used only when D3D9On12 is used instead of a vendor provided D3D9 driver, which is the case with recent Intel GPUs, which don't ship with a D3D9 driver anymore. You can find more info about it here: https://www.vogons.org/viewtopic.php?f=8&t=86912

If you want to force using it on any other GPU, you'll probably want to add an option for that in DxWnd. It's pretty easy to implement and only needs a hook for D3DKMTQueryAdapterInfo: https://github.com/narzoul/DDrawCompat/commit/fb9f28456e4083901043972e5f011516cde649d4

As for the Z-order issue, I prefer git bisect myself to find the commit that fixed or broke something, but then you may need to compile MS Detours also for some of the older commits. The SDK/WDK version differences can be worked around by retargeting the solution to the newer one before each build.

ghotik commented 1 year ago

Thanks, forcing D3D9On12 is pretty straightforward, but it doesn't work on my computer. I followed more or less the same steps as for W-based fog, and the D3DKMTQueryAdapterInfo call in gdi32.dll is hooked with my custom routine (a copy of yours in DDrawCompat). My logs prove that the D3DKMTQueryAdapterInfo wrapper is called and receives meaningful data (as far as binary data can seem reasonable), but no matter what I do in the wrapper the program crashes afterwards, even though the original function works well (it returns NTSTATUS 0). I wonder if the problem could depend on the fact that I don't have D3D9On12 installed on my computer (a Win11 Home edition with an integrated video card), or maybe Microsoft added some anti-hooking protection? It's a little too early to draw conclusions, I'll let you know ...

Edit: my fault, I overdid the cut & paste from the web pages and forgot to add the necessary WINAPI calling convention. Now the programs work again and the log shows that the simulated STATUS_INVALID_PARAMETER error code enables the GetPrivateDDITable branch, as you suggested. Now I have a way to fill in the patching code for this path as well, thanks.
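The forcing hook described above can be sketched roughly as follows. This is a minimal model with stand-in types so that it compiles on its own, not the DDrawCompat or DxWnd code: `MockQueryAdapterInfo`, `hookedQueryAdapterInfo`, `g_forceD3D9On12` and `g_origQueryAdapterInfo` are hypothetical names, and the choice of the user-mode driver name query as the one to fail is an assumption that the thread does not state. The only detail taken from the thread is that the wrapper returns STATUS_INVALID_PARAMETER instead of calling the original. A real hook must also be declared with the same calling convention as the original (WINAPI), which is exactly the mistake reported above.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for NTSTATUS and D3DKMT_QUERYADAPTERINFO (see d3dkmthk.h for the real ones).
using NtStatus = std::int32_t;
constexpr NtStatus kStatusSuccess = 0;
constexpr NtStatus kStatusInvalidParameter = static_cast<NtStatus>(0xC000000D);

struct MockQueryAdapterInfo {
    std::uint32_t type;  // stand-in for the KMTQAITYPE value of the query
};

// ASSUMPTION: the query to fail is the user-mode driver name query.
constexpr std::uint32_t kQueryUmDriverName = 1;

using QueryAdapterInfoFn = NtStatus (*)(const MockQueryAdapterInfo*);

QueryAdapterInfoFn g_origQueryAdapterInfo = nullptr;  // saved original function
bool g_forceD3D9On12 = false;                         // hypothetical user option

// Replacement installed in place of the original query function.
NtStatus hookedQueryAdapterInfo(const MockQueryAdapterInfo* data) {
    assert(g_origQueryAdapterInfo != nullptr);
    if (g_forceD3D9On12 && data != nullptr && data->type == kQueryUmDriverName) {
        // Simulated failure: the caller then falls back to the D3D9On12 path.
        return kStatusInvalidParameter;
    }
    // Every other query is forwarded unchanged.
    return g_origQueryAdapterInfo(data);
}
```

With the option off, the wrapper is a transparent pass-through, so it can stay installed permanently and be toggled per game.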

ghotik commented 1 year ago

Hello!

It seems that I somehow fixed my biggest troubles and now I have more doubts. I hope I'm not annoying you. In any case, I attached the sources of the two modules with the calls (ddHookVideoAdapter and the new ddForceD3D9on12) that enable, respectively, the W-based fog and the D3D9On12 replacement. Here are some questions for you:

1) Since the ddForceD3D9on12 call is currently run inside the ddraw.dll hook, it has no effect on games that don't use ddraw.dll and only use other DirectX versions like D3D8 or D3D9. Is this correct, or should I add the same call in the hook procedures of these other libraries as well?

2) The effect of ddForceD3D9on12 on legacy ddraw games in general is not positive: in the best case it does no harm, in other cases it causes some trouble. But I suppose this is not unexpected.

3) In the DDrawCompat code for the D3DKMTQueryAdapterInfo wrapper there is a switch branch (that I prudently commented out) that seems to enlarge the reported DedicatedVideoMemorySize value. Why is that? Maybe because DDrawCompat in some way forces the use of system memory as a replacement for video memory? Is this something that could become useful for DxWnd as well and, if so, under what circumstances?

dxwnd.from.ddrawcompat.zip

Well, I suppose I have abused your patience more than enough by now. Thanks again, gho

narzoul commented 1 year ago
  1. Since the ddForceD3D9on12 call is currently run inside the ddraw.dll hook, it has no effect on games that don't use ddraw.dll and only use other DirectX versions like D3D8 or D3D9. Is this correct, or should I add the same call in the hook procedures of these other libraries as well?

If the hook is installed as an IAT hook in ddraw.dll only, then of course it will only affect apps that use ddraw. The same kind of IAT hooks can be installed for D3D8 and D3D9 also, if needed. Or you can use something like hot patching of the original function in gdi32.dll, then it will work for all 3 runtimes automatically.

  2. The effect of ddForceD3D9on12 on legacy ddraw games in general is not positive: in the best case it does no harm, in other cases it causes some trouble. But I suppose this is not unexpected.

I have the same experience with it, it's not very stable. Many games have visual glitches or just crash, especially the 3D ones.

  3. In the DDrawCompat code for the D3DKMTQueryAdapterInfo wrapper there is a switch branch (that I prudently commented out) that seems to enlarge the reported DedicatedVideoMemorySize value. Why is that? Maybe because DDrawCompat in some way forces the use of system memory as a replacement for video memory? Is this something that could become useful for DxWnd as well and, if so, under what circumstances?

I think I had issues with my Intel HD 4600 running out of video memory in some cases, especially after adding features like antialiasing and resolution scaling that need a lot of video memory. From what I remember, the driver reports only a small amount of dedicated video memory (maybe up to 128 MB or so), and once it's exhausted, the runtime refuses to create more surfaces in video memory or something. But after reading up on it, I found that Intel uses something called DVMT (Dynamic Video Memory Technology) that can allocate further dedicated video memory from system memory on demand, but DirectDraw was somehow unable to take advantage of this. So I cheated on the reported total dedicated video memory to get around the issue, which seemed to work well enough. I just assumed that this extension would happen from the DedicatedSystemMemorySize, so I simply moved that part into DedicatedVideoMemorySize upfront, which should prevent the runtime from assuming there is no more usable video memory.
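As a rough illustration of the adjustment described above, here is a minimal sketch that moves the reported dedicated system memory into the dedicated video memory figure. `SegmentSizeInfo` is a stand-in that mirrors the field names of D3DKMT_SEGMENTSIZEINFO so the snippet compiles on its own, and `moveDedicatedSystemToVideo` is a hypothetical helper name. In a real hook this would presumably run on the buffer returned by a successful segment size query, which is an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for D3DKMT_SEGMENTSIZEINFO (sizes in bytes).
struct SegmentSizeInfo {
    std::uint64_t DedicatedVideoMemorySize;
    std::uint64_t DedicatedSystemMemorySize;
    std::uint64_t SharedSystemMemorySize;
};

// Report the dedicated system memory as dedicated video memory up front,
// so the runtime does not conclude that video memory is exhausted early.
void moveDedicatedSystemToVideo(SegmentSizeInfo& info) {
    const std::uint64_t total =
        info.DedicatedVideoMemorySize + info.DedicatedSystemMemorySize;
    info.DedicatedVideoMemorySize = total;
    info.DedicatedSystemMemorySize = 0;
    // The combined dedicated amount is unchanged, only its split is.
    assert(info.DedicatedVideoMemorySize + info.DedicatedSystemMemorySize == total);
}
```

The shared system memory figure is left untouched, since the explanation above only mentions the two dedicated fields.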

ghotik commented 1 year ago

Hello. I said that my goal was to find some more clever, stable and powerful tricks that I could "import" from the DDrawCompat project into DxWnd. I spent some time trying to identify a common target, that is, a damned game that readily shows problems on many platforms. I don't know if I succeeded, but so far my best guess is "Star Wars Episode I - Racer". It has some nice features: a RIP is downloadable from old-games.ru, it is compact (167 MB), it doesn't require registry updates and it can be hooked by DxWnd with default options. Most importantly, with default options the textures are rendered as a complete mess in almost all situations.

The game can be fixed with a DxWnd option (the "Direct3D / Clean ZBUFFER @1.0 fix"), but this option is based on periodically clearing the D3DDevice viewport, which seems different from what DDrawCompat does. In any case, this is what happens (at least for me): with DxWnd default options the game is rendered very badly, but if I drop a DDrawCompat 0.4.0 ddraw.dll wrapper into the game folder then everything works, so evidently DDrawCompat has a beneficial effect here.

This could be a good starting point for my wicked idea: cut functionality from DDrawCompat until I find the part that was fixing the problem. Unfortunately, this can't be done with the available DDrawCompat branches, because it seems that 0.4.0 is the only release that fixes the problem; the previous release 0.3.2 doesn't (please remember that I'm still talking about DDrawCompat combined with DxWnd, not alone!). But, of course, a little hint and some explanation from you would ease my work a lot. Sadly, I'm not a real expert in 3D rendering, and several DxWnd tricks, I must admit, were just imported from other projects. Will you help me?

narzoul commented 1 year ago

Please track down the exact commit between v0.3.2 and v0.4.0 that fixed the issue, then we can talk about it. I don't want to spend too much time debugging issues that I have already fixed somehow, when there are plenty of unfixed issues for me to work on as it is.

ghotik commented 1 year ago

After quite a lot of trouble setting up VS2022 (I had to add the d3dumddi.h file manually, like Been Nath), at least I was lucky enough when backtracking through the git history. In "Star Wars Episode I - Racer", commit bb0eadc3 is the last one showing the problem and 13e4d901 is the first one that fixes it. The commit message just says "Added Software Device setting".

narzoul commented 1 year ago

Unlucky, it's one of the bigger commits. I would usually try reverting parts of the commit then until it breaks again. My first guess would be the pfnDepthFill function though, I seem to recall getting errors on this function with the native driver implementation for some reason. The new implementation might work around this. Try commenting out the SET_DEVICE_FUNC(pfnDepthFill); line and putting back the old SET_FLUSH_PRIMITIVES_FUNC(pfnDepthFill); line, and see if that breaks it again.

ghotik commented 1 year ago

Nice shot, you got it on the first try!! This is the latest release 0.4.0 with the replacement in DeviceFuncs.cpp; the picture shows exactly what I would see with DxWnd default options:

        SET_DEVICE_FUNC(pfnCreateResource2);
        //SET_DEVICE_FUNC(pfnDepthFill);
        SET_FLUSH_PRIMITIVES_FUNC(pfnDepthFill);
        SET_DEVICE_FUNC(pfnDestroyDevice);

broken

and this is the 0.4.0 with recovered code:

        SET_DEVICE_FUNC(pfnCreateResource2);
        SET_DEVICE_FUNC(pfnDepthFill);
        //SET_FLUSH_PRIMITIVES_FUNC(pfnDepthFill);
        SET_DEVICE_FUNC(pfnDestroyDevice);

fixed

So it seems that pfnDepthFill did the trick. I think I could reach and hook that callback function, but what is the theory behind the repaired textures? I am eager to verify whether your trick is much more powerful than the several application-level patches added to DxWnd.

ghotik commented 1 year ago

Guess what? This injured and abused copy of your code works like a charm on both "Star Wars Episode I: Racer" and "Star Wars: Shadows of the Empire"! Of course, in my hurry to run some tests and see some results I cut some of the more sophisticated logic that handles more complex cases (see the brutal commenting-out), but your technique is charming. I will need your help to tell what can be ignored because it relates to the GPU (not currently in my scope of work) from what is essential and should be added properly. Thanks, thanks, thanks!!!


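HRESULT WINAPI dxwDepthFill(HANDLE hDevice, const D3DDDIARG_DEPTHFILL* data)
{
    ApiName("dxwDepthFill");
    HRESULT res;
    OutTrace("%s: hDevice=%#x\n", ApiRef, hDevice);
    // Call the original driver entry point first and keep its result.
    res = pDepthFill(hDevice, data);
    //res = 0;
    // Disabled: DDrawCompat resource/format lookups used to decode data->Depth.
    //HANDLE resource = getResource(data->hResource);
    //HANDLE customResource = resource->getCustomResource();
    //int fi = getFormatInfo(resource->getFixedDesc().Format);
    // Follow up with a clear of the currently selected depth buffer.
    D3DDDIARG_CLEAR clear = {};
    clear.Flags = D3DCLEAR_ZBUFFER;
    //clear.FillDepth = getComponentAsFloat(data->Depth, fi.depth);
    clear.FillDepth = 1.0f; // hard-coded instead of converting data->Depth
    // The stencil format check is disabled, so the stencil is always cleared
    // and FillStencil keeps its zero-initialized value.
    //if (0 != fi.stencil.bitCount)
    //{
    clear.Flags |= D3DCLEAR_STENCIL;
    //clear.FillStencil = getComponent(data->Depth, fi.stencil);
    //}
    pClear(hDevice, &clear, 1, &data->DstRect);
    return res;
}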

P.S. After fiddling some more with this code I found that the only part that is relevant for the two "Star Wars" games is the clearing of the ZBUFFER with the 1.0f argument, which I suppose is the D3DDDI alter ego of the DxWnd "Clean ZBUFFER @1.0 fix" flag, but with the undoubted advantage of being written in a single place for every D3D flavor. The clearing of the stencil buffer is not relevant in these cases, and I'd like to ask what this section is supposed to do, and whether you recall some game where it was necessary.

narzoul commented 1 year ago

See "Clearing Depth Buffers" in the DX7 SDK help file (Direct3D Immediate Mode Essentials -> Depth Buffers -> Using Depth Buffers). pfnClear corresponds to the Clear method and pfnDepthFill corresponds to the Blt method. DDrawCompat converts pfnDepthFill to pfnClear because the former doesn't work well. On my Radeon it just returns 0x8876086c (D3DERR_INVALIDCALL). It's probably some driver bug.

There are some important differences between the two: pfnClear works on the currently selected depth buffer (with pfnSetDepthStencil), while pfnDepthFill receives the depth buffer resource handle in the hResource parameter. Hence you should pass hResource first to pfnSetDepthStencil and then restore the previous one when done. So, keep track of what was last passed to pfnSetDepthStencil by the runtime.
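The select, clear and restore sequence described above can be sketched as follows. This is only a model under stated assumptions: `MockDriver` stands in for the real pfnSetDepthStencil and pfnClear entry points, and `DepthFillEmulator` and `ResourceHandle` are hypothetical names introduced for illustration, so the bookkeeping can be read and tested without the WDK headers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using ResourceHandle = std::uintptr_t;  // stand-in for a driver resource HANDLE

// Stand-in for the driver: records the order of calls it receives.
struct MockDriver {
    ResourceHandle boundDepthBuffer = 0;
    std::vector<std::string> calls;

    void setDepthStencil(ResourceHandle h) {
        boundDepthBuffer = h;
        calls.push_back("set " + std::to_string(h));
    }
    void clearDepth(float depth) {
        assert(depth >= 0.0f && depth <= 1.0f);
        // Like pfnClear, this acts on whatever depth buffer is selected now.
        calls.push_back("clear " + std::to_string(boundDepthBuffer));
    }
};

class DepthFillEmulator {
public:
    explicit DepthFillEmulator(MockDriver& driver) : m_driver(driver) {}

    // Call this from the pfnSetDepthStencil wrapper to track the runtime's choice.
    void onSetDepthStencil(ResourceHandle h) {
        m_current = h;
        m_driver.setDepthStencil(h);
    }

    // Emulates a depth fill on `target` by clearing it, then restores the
    // depth buffer that the runtime had selected before.
    void depthFill(ResourceHandle target, float depth) {
        const ResourceHandle previous = m_current;
        if (target != previous) m_driver.setDepthStencil(target);
        m_driver.clearDepth(depth);
        if (target != previous) m_driver.setDepthStencil(previous);
    }

private:
    MockDriver& m_driver;
    ResourceHandle m_current = 0;  // last handle passed by the runtime
};
```

Tracking the handle in the wrapper rather than querying the driver keeps the emulation independent of what the driver exposes.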

The other main difference is that pfnDepthFill receives the depth and stencil value in the native format of the depth buffer, e.g., D3DDDIFMT_D24S8 receives the depth bits in the upper 24 bits and the stencil bits in the lower 8 bits. pfnClear specifies them in separate parameters (or can even leave one of them untouched based on the flags passed to it). Additionally, pfnClear specifies the depth value as a float between 0.0 and 1.0, so some arithmetic conversion is needed between the integer and float value. The integer range uses all available bits, so the max value in this case is 2^24-1 (in general it depends on the number of depth bits in the format). So, in order to convert the depth/stencil values properly, you also need to keep track of the format of the depth buffer, which is specified when creating the resource with either pfnCreateResource or pfnCreateResource2 (depending on WDDM version).
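For the D24S8 case described above, the conversion can be sketched like this. `DepthStencilClear` and `unpackD24S8` are hypothetical names, and the sketch covers only this one layout (depth in the upper 24 bits, stencil in the lower 8); other formats would need their own bit counts and positions.

```cpp
#include <cassert>
#include <cstdint>

// Separate depth and stencil values, as a clear call expects them.
struct DepthStencilClear {
    float depth;            // normalized to the range 0.0 .. 1.0
    std::uint32_t stencil;  // raw stencil bits
};

// Splits a packed D24S8 value into a float depth and an integer stencil.
DepthStencilClear unpackD24S8(std::uint32_t packed) {
    constexpr std::uint32_t kDepthMax = (1u << 24) - 1u;  // 2^24-1
    const std::uint32_t depthBits = packed >> 8;          // upper 24 bits
    const std::uint32_t stencilBits = packed & 0xFFu;     // lower 8 bits
    // Divide in double precision, then narrow to float.
    const float depth = static_cast<float>(
        static_cast<double>(depthBits) / static_cast<double>(kDepthMax));
    assert(depth >= 0.0f && depth <= 1.0f);
    return DepthStencilClear{depth, stencilBits};
}
```

For example, `unpackD24S8(0xFFFFFF00)` yields a depth of 1.0 with a stencil of 0, which matches the hard-coded `FillDepth = 1.0f` that was enough for the two games discussed earlier.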

I think that's about it. The rest is related to other enhancements in DDrawCompat which you probably don't need at this point.

I don't know which games use stencil buffers, especially with pfnDepthFill. I just added it to the code for completeness.