libcg / grvk

Vulkan-based Mantle API implementation
https://en.wikipedia.org/wiki/Mantle_(API)
zlib License

BF4, BF Hardline, DA:I, Plants vs. Zombies: Garden Warfare / Frostbite 3 Engine #10

Closed xatornet closed 2 years ago

xatornet commented 3 years ago

Games status:

Details:

BF4 and Dragon Age Inquisition Error Message (both the same)

Here's BF 4 log:

=== GRVK 0.3.0 ===
I/grInitAndEnumerateGpus: app "Battlefield" (01000000), engine "Frostbite" (00C00000), api 00018000
W/grInitAndEnumerateGpus: unhandled alloc callbacks
W/grGetExtensionSupport: STUB
W/grGetExtensionSupport: STUB

Here's DAI log:

=== GRVK 0.3.0 ===
I/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/grInitAndEnumerateGpus: unhandled alloc callbacks
W/grGetExtensionSupport: STUB
W/grGetExtensionSupport: STUB

libcg commented 3 years ago

Thanks for the report. I haven't tried any games other than Star Swarm. Fortunately, the GR_BORDER_COLOR_PALETTE extension can be implemented using VK_EXT_custom_border_color (spec).

xatornet commented 3 years ago

No, thanks to you for this amazing project. Keep up the good work :-)

Cherser-s commented 3 years ago

Did you try launching BF4 under Wine? It doesn't detect the Mantle API at all, I think because the client tries to find amdmantle64.dll first instead of mantle64.dll. It probably needs a function called IcdInit, which isn't included in the API docs.

xatornet commented 3 years ago

I tried it on Windows 10 without Wine. For games to detect Mantle on an Nvidia GPU, you have to place amdmantle64.dll and mantleaxl64.dll in the executable folder, and then add GRVK's DLLs, in this case mantle64.dll.

You can get those DLLs by downloading the Adrenalin 19.4.3 driver from AMD.
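The file layout described above can be sanity-checked before launching the game. The following is only a rough sketch: `count_missing_dlls` is a hypothetical helper, not part of GRVK, and it assumes the three file names mentioned in this comment are the only ones that matter.

```c
#include <assert.h>
#include <stdio.h>

/* File names taken from the comment above; the check itself is illustrative. */
static const char *const REQUIRED_DLLS[] = {
    "amdmantle64.dll", /* from the AMD driver package */
    "mantleaxl64.dll", /* from the AMD driver package */
    "mantle64.dll",    /* GRVK's replacement DLL */
};

/* Returns how many of the required DLLs cannot be opened inside `dir`. */
int count_missing_dlls(const char *dir)
{
    assert(dir != NULL);

    int missing = 0;
    size_t count = sizeof(REQUIRED_DLLS) / sizeof(REQUIRED_DLLS[0]);

    for (size_t i = 0; i < count; i++) {
        char path[1024];
        int n = snprintf(path, sizeof(path), "%s/%s", dir, REQUIRED_DLLS[i]);

        if (n < 0 || (size_t)n >= sizeof(path)) {
            missing++; /* path did not fit: treat the file as missing */
            continue;
        }

        FILE *file = fopen(path, "rb");
        if (file == NULL) {
            printf("missing: %s\n", path);
            missing++;
        } else {
            fclose(file);
        }
    }

    return missing;
}
```

Calling `count_missing_dlls` with the game's executable folder should return 0 once all three files are in place. It only checks that the files can be opened, not that they are the right versions.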

Cherser-s commented 3 years ago

I hope there will be at least some explanation of function parameters from such libraries (there is a ton of parameters for this function for example), so we could implement these functions to avoid using proprietary libraries.

libcg commented 3 years ago

Tracking these two issues here: https://github.com/libcg/grvk/issues/11 https://github.com/libcg/grvk/issues/12

libcg commented 3 years ago

@xatornet can you restrict this issue to Frostbite games, and create separate issues for UE3 and other engines?

Cherser-s commented 3 years ago

Well, considering it's possible to select Mantle backend in bf4 now, does the game even work with it (at least in-menu)?

libcg commented 3 years ago

It looks like this with some additional hacks. I get a crash after the grWsiWinSetMaxQueuedFrames call for some reason.

T/0000094C/grInitAndEnumerateGpus: 000000006101f3c8 000000006101f3b8 0000000142739998
I/0000094C/grInitAndEnumerateGpus: app "Battlefield" (01000000), engine "Frostbite" (00C00000), api 00018000
W/0000094C/grInitAndEnumerateGpus: unhandled alloc callbacks
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6100 000000006101f3b0 000000006101f3f0
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6100 000000006101e8e8 0000000087ff45c0
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6101 000000006101e8e8 0000000087ff59c0
W/0000094C/grGetExtensionSupport: STUB GR_WSI_WINDOWS
W/0000094C/grGetExtensionSupport: STUB GR_BORDER_COLOR_PALETTE
W/0000094C/grGetExtensionSupport: STUB GR_DMA_QUEUE
W/0000094C/grGetExtensionSupport: STUB GR_ADVANCED_MSAA
W/0000094C/grGetExtensionSupport: STUB GR_TIMER_QUEUE
T/0000094C/grCreateDevice: 0000000000e242a0 000000006101e948 0000000087ff5b00
I/0000094C/grCreateDevice: 1002:7300 "AMD RADV FIJI (ACO)" (Vulkan 1.2.145, driver 21.0.2)
T/0000094C/grWsiWinGetDisplays: 0000000000e28230 000000006101e8f4 000000006101ea00
T/0000094C/grGetObjectInfo: 0000000000e28070 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e280b0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e280f0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28130 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28170 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e281b0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28910 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28950 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grWsiWinGetDisplayModeList: 0000000000e28070 000000006101e8f0 0000000000000000
T/0000094C/grWsiWinGetDisplayModeList: 0000000000e28070 000000006101e8f0 0000000008ee0450
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6102 000000006101e8c8 0000000000000000
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6102 000000006101e8c8 0000000008eb0080
T/0000094C/grGetDeviceQueue: 0000000000e28230 0x1000 0 0000000008e70148
T/0000094C/grGetObjectInfo: 0000000000e28c90 0x206800 000000006101e8c8 0000000008e70160
T/0000094C/grGetDeviceQueue: 0000000000e28230 0x1001 0 0000000008e701e8
T/0000094C/grGetObjectInfo: 0000000000e4bba0 0x206800 000000006101e8c8 0000000008e70200
T/0000094C/grGetMemoryHeapCount: 0000000000e28230 0000000087ff5d38
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 0 0x6200 000000006101e9a0 0000000087ff5d40
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 1 0x6200 000000006101e9a0 0000000087ff5d70
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 2 0x6200 000000006101e9a0 0000000087ff5da0
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 3 0x6200 000000006101e9a0 0000000087ff5dd0
W/0000094C/grWsiWinSetMaxQueuedFrames: STUB
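Traces like the one above are easier to skim once the stubbed calls are filtered out. Below is a minimal sketch of a line parser, assuming the `level/threadId/function: message` layout seen in this log. `LogLine`, `parseLogLine` and `isStub` are hypothetical names, and lines without a thread ID (as in the 0.3.0 logs earlier in this thread) are simply rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char level;            /* first character of the line: T, I, W or E above */
    unsigned int threadId; /* hexadecimal field, e.g. 0000094C */
    char function[64];     /* e.g. "grGetExtensionSupport" */
    const char *message;   /* points into the input line, after the colon */
} LogLine;

/* Parses "L/XXXXXXXX/function: message"; returns false for any other layout. */
bool parseLogLine(const char *line, LogLine *out)
{
    assert(line != NULL && out != NULL);

    int consumed = 0;
    int fields = sscanf(line, "%c/%x/%63[^:]:%n",
                        &out->level, &out->threadId, out->function, &consumed);
    if (fields != 3 || consumed == 0) {
        return false;
    }

    out->message = line + consumed;
    while (*out->message == ' ') {
        out->message++;
    }
    return true;
}

/* True when the message marks a stubbed entry point. */
bool isStub(const LogLine *entry)
{
    return strncmp(entry->message, "STUB", 4) == 0;
}
```

Feeding each line of the log through `parseLogLine` and keeping those where `isStub` is true would list the entry points that are still stubs in this build.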

Cherser-s commented 3 years ago

Interesting, might look into it as well.

I think it's probably due to T/0000094C/grGetObjectInfo: 0000000000e28070 0x206801 000000006101e9a8 000000006101ea88. The client requests object info with a weird info type, 0x206801. It's probably 0x6801, which is the parent device (seems to be a hack), but it's not handled at all. There is also a request with info type GR_WSI_WIN_INFO_TYPE_QUEUE_PROPERTIES for device queues, which is not documented but exists in the include files.

UPD: the info type really is 0x206801, which is the GR_WSI_WIN_INFO_TYPE_DISPLAY_PROPERTIES value (used for GR_WSI_WIN_DISPLAY_PROPERTIES), so it checks display info. I think it's better to launch BF4 in windowed mode, by the way. Well, it will probably require grWsiWinGetDisplays anyway.
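To keep the values from the trace apart, a lookup along these lines may help. This is a sketch, not GRVK's code: the enum and function names are made up for illustration, the 0x206801 pairing is the one stated above, and the 0x206800 pairing with the queue properties type is an inference from the trace, where it is requested on the objects returned by grGetDeviceQueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as they appear in the trace; the constant names are illustrative. */
enum {
    WSI_WIN_INFO_QUEUE_PROPERTIES   = 0x206800, /* inferred pairing */
    WSI_WIN_INFO_DISPLAY_PROPERTIES = 0x206801, /* pairing stated in the comment */
};

/* Writes a readable name for a WSI info type; returns false if it is unknown. */
bool wsiWinInfoTypeName(uint32_t infoType, const char **name)
{
    assert(name != NULL);

    switch (infoType) {
    case WSI_WIN_INFO_QUEUE_PROPERTIES:
        *name = "GR_WSI_WIN_INFO_TYPE_QUEUE_PROPERTIES";
        return true;
    case WSI_WIN_INFO_DISPLAY_PROPERTIES:
        *name = "GR_WSI_WIN_INFO_TYPE_DISPLAY_PROPERTIES";
        return true;
    default:
        *name = NULL;
        return false;
    }
}
```

Note that 0x6801 is deliberately not matched here: the low 16 bits alone are a different value from 0x206801, which is why the first guess above did not hold.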

libcg commented 3 years ago

@Cherser-s correct, I'm handling that and using windowed mode, but it still crashes.

Cherser-s commented 3 years ago

Wait, have you already implemented the code that handles these 3 features? Can you push it to another branch at least, so I can look into it? I've partially implemented it myself, though.

Also, do you handle "present" queue flags info as well?

Cherser-s commented 3 years ago

Hm, weird, it doesn't initialize on my machine at all. It doesn't get past grGetGpuInfo and falls back to DXVK. The game probably didn't like the GPU info.

libcg commented 3 years ago

@Cherser-s That's because BF4 checks the driver version. I pushed a fix earlier today, and I'll post another branch tomorrow so you can take a look.

Cherser-s commented 3 years ago

Yeah, I got those commits and managed to reproduce the same problem as well. Interestingly, the crash happens much later, perhaps during game initialization, as there are no calls at all to the Mantle libraries.

It's really weird that the game itself (that is, the code in the executable) crashes.

xatornet commented 3 years ago

v0.4.0 has the same problem on DA:I


Here's grvk.log:

=== GRVK 0.4.0 ===
I/00003ADC/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/00003ADC/grInitAndEnumerateGpus: unhandled alloc callbacks
I/00003ADC/grCreateDevice: 10DE:2204 "NVIDIA GeForce RTX 3090" (Vulkan 1.2.168, driver 466.11.0)
W/00003ADC/grWsiWinGetDisplays: semi-stub
W/00003ADC/grGetObjectInfo: unsupported info type 0x206801
E/00003ADC/grGetGpuInfo: unsupported info type 0x6102

libcg commented 3 years ago

I have patches for this that I haven't posted yet.

xatornet commented 3 years ago

I have patches for this that I haven't posted yet.

I'll look out for those in later releases.

Osyfe commented 3 years ago

For me, DA:I is working just fine (Windows 10, Radeon RX 570) with versions 0.2.0 (about 30h of gameplay without any issues) and 0.3.0, although it seems that multisampling is not working in 0.2.0. However, with the newest version I get:

=== GRVK 0.4.0 ===
I/00006174/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/00006174/grInitAndEnumerateGpus: unhandled alloc callbacks
I/00006174/grCreateDevice: 1002:67DF "Radeon RX 570 Series" (Vulkan 1.2.159, driver 2.0.168)
W/00006174/decodeInstruction: unhandled opcode 258
W/00006174/emitInstr: unhandled instruction 258
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4096 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4097 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4098 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found

and the game crashes silently.

libcg commented 3 years ago

@Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

Osyfe commented 3 years ago

@Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

Oh, interesting. Is there a way to check which API is actually used by an application?

Cherser-s commented 3 years ago

@Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

Confirmed. If either mantleaxl64.dll isn't present or the API version isn't supported by the client (as is the case with FB3 games), the game just falls back to D3D11.

W/00006174/decodeInstruction: unhandled opcode 258
W/00006174/emitInstr: unhandled instruction 258

Hmmm, more UAV operations have to be implemented as well. Interesting how it doesn't crash after calling the WSI functions here...

Cherser-s commented 3 years ago

Try applying this patch to the latest commit from master branch to avoid those errors in shader translation:

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index 52e8e97..5b752d8 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -2125,13 +2125,26 @@ static void emitUavAtomicOp(
     IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
     IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);

-    if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
-        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
-                                   texelPtrId, scopeId, semanticsId, valueId);
-    } else {
+    IlcSpvWord operation;
+    switch (instr->opcode) {
+    case IL_OP_UAV_ADD:
+    case IL_OP_UAV_READ_ADD:
+        operation = SpvOpAtomicIAdd;
+        break;
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+        operation = SpvOpAtomicSMax;
+        break;
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+        operation = SpvOpAtomicSMin;
+        break;
+    default:
         assert(false);
+        break;
     }
-
+    readId = ilcSpvPutAtomicOp(compiler->module, operation, resource->texelTypeId,
+                               texelPtrId, scopeId, semanticsId, valueId);
     if (instr->dstCount > 0) {
         IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
         storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
@@ -2407,6 +2420,10 @@ static void emitInstr(
         break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
         emitUavAtomicOp(compiler, instr);
         break;
     case IL_OP_DCL_STRUCT_SRV:
diff --git a/src/amdilc/amdilc_decoder.c b/src/amdilc/amdilc_decoder.c
index 309fb1e..2003b73 100644
--- a/src/amdilc/amdilc_decoder.c
+++ b/src/amdilc/amdilc_decoder.c
@@ -99,6 +99,10 @@ static const OpcodeInfo mOpcodeInfos[IL_OP_LAST] = {
     [IL_OP_UAV_STORE] = { IL_OP_UAV_STORE, 0, 2, 0, false },
     [IL_OP_UAV_ADD] = { IL_OP_UAV_ADD, 0, 2, 0, false },
     [IL_OP_UAV_READ_ADD] = { IL_OP_UAV_READ_ADD, 1, 2, 0, false },
+    [IL_OP_UAV_MAX] = { IL_OP_UAV_MAX, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MAX] = { IL_OP_UAV_READ_MAX, 1, 2, 0, false },
+    [IL_OP_UAV_MIN] = { IL_OP_UAV_MIN, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MIN] = { IL_OP_UAV_READ_MIN, 1, 2, 0, false },
     [IL_OP_DCL_STRUCT_SRV] = { IL_OP_DCL_STRUCT_SRV, 0, 0, 1, false },
     [IL_OP_SRV_STRUCT_LOAD] = { IL_OP_SRV_STRUCT_LOAD, 1, 1, 0, false },
     [IL_DCL_STRUCT_LDS] = { IL_DCL_STRUCT_LDS, 0, 0, 2, false },

P.S.: I think the UAV atomics should be typed.
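One detail of the patch above: its `default` branch relies on `assert(false)`, and `assert` compiles to nothing when `NDEBUG` is defined, so `operation` would then be read uninitialized for an unexpected opcode. A possible variation is to report the unknown opcode to the caller instead. The sketch below uses stand-in enums with placeholder values (not the real AMD IL or SPIR-V encodings), and `atomicOpForOpcode` is a hypothetical helper, not a GRVK function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the IL opcodes touched by the patch; values are placeholders. */
typedef enum {
    OP_UAV_STORE,
    OP_UAV_ADD,
    OP_UAV_READ_ADD,
    OP_UAV_MAX,
    OP_UAV_READ_MAX,
    OP_UAV_MIN,
    OP_UAV_READ_MIN,
} IlOpcode;

/* Stand-ins for SpvOpAtomicIAdd, SpvOpAtomicSMax and SpvOpAtomicSMin. */
typedef enum {
    ATOMIC_IADD,
    ATOMIC_SMAX,
    ATOMIC_SMIN,
} AtomicOp;

/* Maps an opcode to its atomic operation; returns false if there is none. */
bool atomicOpForOpcode(IlOpcode opcode, AtomicOp *operation)
{
    assert(operation != NULL);

    switch (opcode) {
    case OP_UAV_ADD:
    case OP_UAV_READ_ADD:
        *operation = ATOMIC_IADD;
        return true;
    case OP_UAV_MAX:
    case OP_UAV_READ_MAX:
        *operation = ATOMIC_SMAX;
        return true;
    case OP_UAV_MIN:
    case OP_UAV_READ_MIN:
        *operation = ATOMIC_SMIN;
        return true;
    default:
        return false; /* caller can log and skip the instruction */
    }
}
```

With this shape, the caller would emit the atomic instruction only when the function returns true, and could log an error and return early otherwise, in the same way the surrounding code handles a missing resource.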

Osyfe commented 3 years ago

Try applying this patch to the latest commit from master branch to avoid those errors in shader translation:

I still get the same error.

Cherser-s commented 3 years ago

Ah, damn it, it seems I assumed the wrong opcode. Sorry, it wasn't the UAV operations. It was RAW_SRV resource handling, which is not implemented yet...

Cherser-s commented 3 years ago

Ok, I have added this missing instruction handling, please try it out. I also added dump handling for these instructions.

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index 52e8e97..b549301 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -1078,6 +1078,41 @@ static void emitTypedUav(
     addResource(compiler, &resource);
 }

+static void emitRawSrv(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint16_t id = GET_BITS(instr->control, 0, 13);
+
+    IlcSpvId arrayId = ilcSpvPutRuntimeArrayType(compiler->module, compiler->floatId, true);
+    IlcSpvId structId = ilcSpvPutStructType(compiler->module, 1, &arrayId);
+    IlcSpvId pointerId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              structId);
+    IlcSpvId resourceId = ilcSpvPutVariable(compiler->module, pointerId,
+                                            SpvStorageClassStorageBuffer);
+
+    IlcSpvWord arrayStride = sizeof(float);
+    IlcSpvWord memberOffset = 0;
+    ilcSpvPutDecoration(compiler->module, arrayId, SpvDecorationArrayStride, 1, &arrayStride);
+    ilcSpvPutDecoration(compiler->module, structId, SpvDecorationBlock, 0, NULL);
+    ilcSpvPutMemberDecoration(compiler->module, structId, 0, SpvDecorationOffset, 1, &memberOffset);
+    ilcSpvPutDecoration(compiler->module, resourceId, SpvDecorationNonWritable, 0, NULL);
+
+    ilcSpvPutName(compiler->module, arrayId, "rawSrv");
+    emitBinding(compiler, resourceId, ILC_BASE_RESOURCE_ID + id, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER);
+
+    const IlcResource resource = {
+        .id = resourceId,
+        .typeId = arrayId,
+        .texelTypeId = compiler->floatId,
+        .ilId = id,
+        .ilType = IL_USAGE_PIXTEX_UNKNOWN,
+        .strideId = 0,
+    };
+
+    addResource(compiler, &resource);
+}
+
 static void emitStructuredSrv(
     IlcCompiler* compiler,
     const Instruction* instr)
@@ -2125,13 +2160,38 @@ static void emitUavAtomicOp(
     IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
     IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);

-    if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
-        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
-                                   texelPtrId, scopeId, semanticsId, valueId);
-    } else {
+    IlcSpvWord operation;
+    switch (instr->opcode) {
+    case IL_OP_UAV_ADD:
+    case IL_OP_UAV_READ_ADD:
+        operation = SpvOpAtomicIAdd;
+        break;
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+        operation = SpvOpAtomicSMax;
+        break;
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+        operation = SpvOpAtomicSMin;
+        break;
+    case IL_OP_UAV_OR:
+    case IL_OP_UAV_READ_OR:
+        operation = SpvOpAtomicOr;
+        break;
+    case IL_OP_UAV_AND:
+    case IL_OP_UAV_READ_AND:
+        operation = SpvOpAtomicAnd;
+        break;
+    case IL_OP_UAV_XOR:
+    case IL_OP_UAV_READ_XOR:
+        operation = SpvOpAtomicXor;
+        break;
+    default:
         assert(false);
+        break;
     }
-
+    readId = ilcSpvPutAtomicOp(compiler->module, operation, resource->texelTypeId,
+                               texelPtrId, scopeId, semanticsId, valueId);
     if (instr->dstCount > 0) {
         IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
         storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
@@ -2202,6 +2262,65 @@ static void emitStructuredSrvLoad(
     storeDestination(compiler, dst, loadId, compiler->float4Id);
 }

+static void emitRawSrvLoad(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 7);
+    bool indexedResourceId = GET_BIT(instr->control, 12);
+
+    if (indexedResourceId) {
+        LOGW("unhandled indexed resource ID\n");
+    }
+
+    const IlcResource* resource = findResource(compiler, ilResourceId);
+    const Destination* dst = &instr->dsts[0];
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId srcId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+    IlcSpvId byteAddrId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_X, 1);
+
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+
+    // Read up to four components based on the destination mask
+    IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
+    IlcSpvId oneId = ilcSpvPutConstant(compiler->module, compiler->intId, 1);
+    IlcSpvId ptrTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              resource->texelTypeId);
+    IlcSpvId fZeroId = ilcSpvPutConstant(compiler->module, compiler->floatId, ZERO_LITERAL);
+    IlcSpvWord constituents[] = { fZeroId, fZeroId, fZeroId, fZeroId };
+
+    for (unsigned i = 0; i < 4; i++) {
+        if (dst->component[i] == IL_MODCOMP_NOWRITE) {
+            break;
+        }
+
+        if (i > 0) {
+            // Increment address
+            const IlcSpvId incrementIds[] = { wordAddrId, oneId };
+            wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId,
+                                      2, incrementIds);
+        }
+
+        const IlcSpvId indexIds[] = { zeroId, wordAddrId };
+        IlcSpvId ptrId = ilcSpvPutAccessChain(compiler->module, ptrTypeId, resource->id,
+                                              2, indexIds);
+        constituents[i] = ilcSpvPutLoad(compiler->module, resource->texelTypeId, ptrId);
+    }
+
+    IlcSpvId loadId = ilcSpvPutCompositeConstruct(compiler->module, compiler->float4Id,
+                                                  4, constituents);
+    storeDestination(compiler, dst, loadId, compiler->float4Id);
+}
+
+
 static void emitImplicitInput(
     IlcCompiler* compiler,
     SpvBuiltIn spvBuiltIn,
@@ -2407,6 +2526,16 @@ static void emitInstr(
         break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+    case IL_OP_UAV_AND:
+    case IL_OP_UAV_READ_AND:
+    case IL_OP_UAV_OR:
+    case IL_OP_UAV_READ_OR:
+    case IL_OP_UAV_XOR:
+    case IL_OP_UAV_READ_XOR:
         emitUavAtomicOp(compiler, instr);
         break;
     case IL_OP_DCL_STRUCT_SRV:
@@ -2415,6 +2544,12 @@ static void emitInstr(
     case IL_OP_SRV_STRUCT_LOAD:
         emitStructuredSrvLoad(compiler, instr);
         break;
+    case IL_OP_DCL_RAW_SRV:
+        emitRawSrv(compiler, instr);
+        break;
+    case IL_OP_SRV_RAW_LOAD:
+        emitRawSrvLoad(compiler, instr);
+        break;
     case IL_DCL_STRUCT_LDS:
         emitStructuredLds(compiler, instr);
         break;
diff --git a/src/amdilc/amdilc_decoder.c b/src/amdilc/amdilc_decoder.c
index 309fb1e..10ff8c4 100644
--- a/src/amdilc/amdilc_decoder.c
+++ b/src/amdilc/amdilc_decoder.c
@@ -99,8 +99,20 @@ static const OpcodeInfo mOpcodeInfos[IL_OP_LAST] = {
     [IL_OP_UAV_STORE] = { IL_OP_UAV_STORE, 0, 2, 0, false },
     [IL_OP_UAV_ADD] = { IL_OP_UAV_ADD, 0, 2, 0, false },
     [IL_OP_UAV_READ_ADD] = { IL_OP_UAV_READ_ADD, 1, 2, 0, false },
+    [IL_OP_UAV_MAX] = { IL_OP_UAV_MAX, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MAX] = { IL_OP_UAV_READ_MAX, 1, 2, 0, false },
+    [IL_OP_UAV_MIN] = { IL_OP_UAV_MIN, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MIN] = { IL_OP_UAV_READ_MIN, 1, 2, 0, false },
+    [IL_OP_UAV_AND] = { IL_OP_UAV_AND, 0, 2, 0, false },
+    [IL_OP_UAV_READ_AND] = { IL_OP_UAV_READ_AND, 1, 2, 0, false },
+    [IL_OP_UAV_OR] = { IL_OP_UAV_OR, 0, 2, 0, false },
+    [IL_OP_UAV_READ_OR] = { IL_OP_UAV_READ_OR, 1, 2, 0, false },
+    [IL_OP_UAV_XOR] = { IL_OP_UAV_XOR, 0, 2, 0, false },
+    [IL_OP_UAV_READ_XOR] = { IL_OP_UAV_READ_XOR, 1, 2, 0, false },
     [IL_OP_DCL_STRUCT_SRV] = { IL_OP_DCL_STRUCT_SRV, 0, 0, 1, false },
     [IL_OP_SRV_STRUCT_LOAD] = { IL_OP_SRV_STRUCT_LOAD, 1, 1, 0, false },
+    [IL_OP_DCL_RAW_SRV] = { IL_OP_DCL_RAW_SRV, 0, 0, 0, false },
+    [IL_OP_SRV_RAW_LOAD] = { IL_OP_SRV_RAW_LOAD, 1, 1, 0, false },
     [IL_DCL_STRUCT_LDS] = { IL_DCL_STRUCT_LDS, 0, 0, 2, false },
     [IL_OP_U_BIT_EXTRACT] = { IL_OP_U_BIT_EXTRACT, 1, 3, 0, false },
     [IL_OP_U_BIT_INSERT] = { IL_OP_U_BIT_INSERT, 1, 4, 0, false },
diff --git a/src/amdilc/amdilc_dump.c b/src/amdilc/amdilc_dump.c
index 6d2d173..28827ce 100644
--- a/src/amdilc/amdilc_dump.c
+++ b/src/amdilc/amdilc_dump.c
@@ -721,6 +721,36 @@ static void dumpInstruction(
     case IL_OP_UAV_READ_ADD:
         fprintf(file, "uav_read_add_id(%u)", GET_BITS(instr->control, 0, 13));
         break;
+    case IL_OP_UAV_MAX:
+        fprintf(file, "uav_max_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_MAX:
+        fprintf(file, "uav_read_max_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_MIN:
+        fprintf(file, "uav_min_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_MIN:
+        fprintf(file, "uav_read_min_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_OR:
+        fprintf(file, "uav_or_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_OR:
+        fprintf(file, "uav_read_or_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_AND:
+        fprintf(file, "uav_and_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_AND:
+        fprintf(file, "uav_read_and_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_XOR:
+        fprintf(file, "uav_xor_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_XOR:
+        fprintf(file, "uav_read_xor_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
     case IL_OP_DCL_STRUCT_SRV:
         fprintf(file, "dcl_struct_srv_id(%u) %u",
                 GET_BITS(instr->control, 0, 13), instr->extras[0]);
@@ -729,6 +759,13 @@ static void dumpInstruction(
         fprintf(file, "srv_struct_load%s_id(%u)",
                 GET_BIT(instr->control, 12) ? "_ext" : "", GET_BITS(instr->control, 0, 7));
         break;
+    case IL_OP_DCL_RAW_SRV:
+        fprintf(file, "dcl_raw_srv_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_SRV_RAW_LOAD:
+        fprintf(file, "srv_raw_load%s_id(%u)",
+                GET_BIT(instr->control, 12) ? "_ext" : "", GET_BITS(instr->control, 0, 7));
+        break;
     case IL_DCL_STRUCT_LDS:
         fprintf(file, "dcl_struct_lds_id(%u) %u, %u",
                 GET_BITS(instr->control, 0, 13), instr->extras[0], instr->extras[1]);
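The address math in `emitRawSrvLoad` from the patch above can be followed more easily on the CPU. The function below is only a model of that logic under stated assumptions, not GRVK code: `rawSrvLoadModel` is a hypothetical name, the buffer is a plain float array, the write mask is a bool array, and the bounds check is an addition of the model that the emitted shader code does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Models the raw SRV load: the byte address is divided by 4 to get a word
 * index, then one 32-bit word is read per enabled destination component.
 * Components that are not read stay at zero.
 */
void rawSrvLoadModel(const float *words, size_t wordCount, int32_t byteAddr,
                     const bool writeMask[4], float out[4])
{
    assert(words != NULL && writeMask != NULL && out != NULL);
    assert(byteAddr >= 0);

    size_t wordAddr = (size_t)byteAddr / 4;

    for (int i = 0; i < 4; i++) {
        out[i] = 0.0f;
    }

    for (int i = 0; i < 4; i++) {
        if (!writeMask[i]) {
            break; /* like the patch, stop at the first unwritten component */
        }
        if (i > 0) {
            wordAddr++; /* consecutive words for y, z, w */
        }
        if (wordAddr < wordCount) { /* bounds check exists only in this model */
            out[i] = words[wordAddr];
        }
    }
}
```

One consequence of the early `break` is visible in the model: with a non-contiguous mask such as x, y and w enabled but z disabled, the w component is never loaded and stays zero.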

Osyfe commented 3 years ago

Ok, I have added this missing instruction handling, please try it out. I also added dump handling for these instructions.

The instruction errors have disappeared, but the game still crashes:

=== GRVK 0.4.0 ===
I/0000029C/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/0000029C/grInitAndEnumerateGpus: unhandled alloc callbacks
I/0000029C/grCreateDevice: 1002:67DF "Radeon RX 570 Series" (Vulkan 1.2.170, driver 2.0.179)
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4096 not found
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4097 not found
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4098 not found

Cherser-s commented 3 years ago

Now that's weird, especially considering that there are no longer missing instructions in the log...

Cherser-s commented 3 years ago

I'm getting these errors while launching BF4 (I'm also getting graphical errors, obviously). I've also resolved the AMD IL opcodes myself:

Cherser-s commented 3 years ago

ok, so I've implemented some changes regarding some of these instructions:

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index a150d26..5d042d8 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -48,6 +48,7 @@ typedef struct {
     uint32_t ilId;
     uint8_t ilType;
     IlcSpvId strideId;
+    bool structured;
 } IlcResource;

 typedef struct {
@@ -1144,6 +1145,7 @@ static void emitResource(
         .ilId = id,
         .ilType = type,
         .strideId = 0,
+        .structured = false,
     };

     addResource(compiler, &resource);
@@ -1210,6 +1212,7 @@ static void emitTypedUav(
         .ilId = id,
         .ilType = type,
         .strideId = 0,
+        .structured = false,
     };

     addResource(compiler, &resource);
@@ -1219,6 +1222,7 @@ static void emitUav(
     IlcCompiler* compiler,
     const Instruction* instr)
 {
+    bool isStructured = instr->opcode == IL_OP_DCL_STRUCT_UAV;
     uint16_t id = GET_BITS(instr->control, 0, 13);

     IlcSpvId arrayId = ilcSpvPutRuntimeArrayType(compiler->module, compiler->floatId, true);
@@ -1234,7 +1238,7 @@ static void emitUav(
     ilcSpvPutDecoration(compiler->module, structId, SpvDecorationBlock, 0, NULL);
     ilcSpvPutMemberDecoration(compiler->module, structId, 0, SpvDecorationOffset, 1, &memberOffset);

-    ilcSpvPutName(compiler->module, arrayId, "structUav");
+    ilcSpvPutName(compiler->module, arrayId, isStructured ? "structUav" : "rawUav");
     emitBinding(compiler, resourceId, ILC_BASE_RESOURCE_ID + id, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER);

     const IlcResource resource = {
@@ -1244,7 +1248,9 @@ static void emitUav(
         .texelTypeId = compiler->floatId,
         .ilId = id,
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
-        .strideId = instr->extras[0],
+        .strideId = ilcSpvPutConstant(compiler->module, compiler->intId,
+                                      isStructured ? instr->extras[0] : 4),
+        .structured = isStructured,
     };

     addResource(compiler, &resource);
@@ -1283,6 +1289,35 @@ static void emitSrv(
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
         .strideId = ilcSpvPutConstant(compiler->module, compiler->intId,
                                       isStructured ? instr->extras[0] : 4),
+        .structured = isStructured,
+    };
+
+    addResource(compiler, &resource);
+}
+
+static void emitLds(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint16_t id = GET_BITS(instr->control, 0, 13);
+    unsigned length = instr->extras[0];
+
+    IlcSpvId lengthId = ilcSpvPutConstant(compiler->module, compiler->uintId, length);
+    IlcSpvId arrayId = ilcSpvPutArrayType(compiler->module, compiler->uintId, lengthId);
+    IlcSpvId pArrayId = ilcSpvPutPointerType(compiler->module, SpvStorageClassWorkgroup, arrayId);
+    IlcSpvId resourceId = ilcSpvPutVariable(compiler->module, pArrayId, SpvStorageClassWorkgroup);
+
+    ilcSpvPutName(compiler->module, arrayId, "rawLds");
+
+    const IlcResource resource = {
+        .resType = RES_TYPE_LDS,
+        .id = resourceId,
+        .typeId = arrayId,
+        .texelTypeId = compiler->uintId,
+        .ilId = id,
+        .ilType = IL_USAGE_PIXTEX_UNKNOWN,
+        .strideId = ilcSpvPutConstant(compiler->module, compiler->intId, 4),
+        .structured = false,
     };

     addResource(compiler, &resource);
@@ -1311,6 +1346,7 @@ static void emitStructuredLds(
         .ilId = id,
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
         .strideId = ilcSpvPutConstant(compiler->module, compiler->intId, stride),
+        .structured = true,
     };

     addResource(compiler, &resource);
@@ -2450,30 +2486,169 @@ static void emitUavStore(
     ilcSpvPutImageWrite(compiler->module, resourceId, addressId, elementId);
 }

-static void emitUavAtomicOp(
+static void emitStructUavStore(
     IlcCompiler* compiler,
     const Instruction* instr)
 {
     uint8_t ilResourceId = GET_BITS(instr->control, 0, 14);

     const IlcResource* resource = findResource(compiler, RES_TYPE_GENERIC, ilResourceId);
+    const Destination* dst = &instr->dsts[0];

     if (resource == NULL) {
         LOGE("resource %d not found\n", ilResourceId);
         return;
     }

-    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
-    IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassImage,
+    //IlcSpvId resourceId = ilcSpvPutLoad(compiler->module, resource->typeId, resource->id);
+    IlcSpvId srcId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+    IlcSpvId indexId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_X, 1);
+    IlcSpvId offsetId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_Y, 1);
+
+    IlcSpvId elementTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
+    IlcSpvId elementId = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, elementTypeId);
+
+    // addr = (index * stride + offset) / 4
+    const IlcSpvId mulIds[] = { indexId, resource->strideId };
+    IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+    const IlcSpvId addIds[] = { baseId, offsetId };
+    IlcSpvId byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+
+    IlcSpvId oneId = ilcSpvPutConstant(compiler->module, compiler->intId, 1);
+    IlcSpvId ptrTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              resource->texelTypeId);
+    // Write up to four components based on the destination mask
+    for (unsigned i = 0; i < 4; i++) {
+        if (dst->component[i] == IL_MODCOMP_NOWRITE) {
+            break;
+        }
+
+        if (i > 0) {
+            // Increment address
+            const IlcSpvId incrementIds[] = { wordAddrId, oneId };
+            wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId,
+                                      2, incrementIds);
+        }
+
+        IlcSpvId ptrId = ilcSpvPutAccessChain(compiler->module, ptrTypeId, resource->id,
+                                              1, &wordAddrId);
+        IlcSpvId componentId = emitVectorTrim(compiler, elementId, elementTypeId, i, 1);
+        ilcSpvPutStore(compiler->module, ptrId, componentId);
+    }
+}
+
+static void emitLdsAtomicOp(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 4);
+
+    const IlcResource* resource = findResource(compiler, RES_TYPE_LDS, ilResourceId);
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassWorkgroup,
                                                   resource->texelTypeId);
     IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
-    IlcSpvId trimAddressId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X,
-                                            getResourceDimensionCount(resource->ilType));
-    IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
-    IlcSpvId texelPtrId = ilcSpvPutImageTexelPointer(compiler->module, pointerTypeId, resource->id,
-                                                     trimAddressId, zeroId);
+    IlcSpvId byteAddrId;
+    if (resource->structured) {
+        IlcSpvId indexId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+        IlcSpvId offsetId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_Y, 1);
+        // addr = (index * stride + offset) / 4
+        const IlcSpvId mulIds[] = { indexId, resource->strideId };
+        IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+        const IlcSpvId addIds[] = { baseId, offsetId };
+        byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+    } else {
+        byteAddrId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+    }
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+    IlcSpvId bufferPtrId = ilcSpvPutAccessChain(compiler->module, pointerTypeId, resource->id,
+                                      1, &wordAddrId);
+    IlcSpvId readId = 0;
+    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
+    IlcSpvId scopeId = ilcSpvPutConstant(compiler->module, compiler->intId, SpvScopeDevice);
+    IlcSpvId semanticsId = ilcSpvPutConstant(compiler->module, compiler->intId,
+                                             SpvMemorySemanticsAcquireReleaseMask |
+                                             SpvMemorySemanticsImageMemoryMask);
+    IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
+    IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);
+
+    if (instr->opcode == IL_OP_LDS_ADD || instr->opcode == IL_OP_LDS_READ_ADD) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
+                                   bufferPtrId, scopeId, semanticsId, valueId);
+    } else if (instr->opcode == IL_OP_LDS_UMAX || instr->opcode == IL_OP_LDS_READ_UMAX) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicUMax, resource->texelTypeId,
+                                   bufferPtrId, scopeId, semanticsId, valueId);
+    } else {
+        assert(false);
+    }

+    if (instr->dstCount > 0) {
+        IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
+        storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
+    }
+}
+
+static void emitUavAtomicOp(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 14);
+
+    const IlcResource* resource = findResource(compiler, RES_TYPE_GENERIC, ilResourceId);
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId texelPtrId;
+
+    if (resource->strideId == 0) {
+        IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassImage,
+                                                      resource->texelTypeId);
+        IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+        IlcSpvId trimAddressId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X,
+                                                getResourceDimensionCount(resource->ilType));
+        IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
+        texelPtrId = ilcSpvPutImageTexelPointer(compiler->module, pointerTypeId, resource->id,
+                                                trimAddressId, zeroId);
+    } else {
+        IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                                      resource->texelTypeId);
+        IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+        IlcSpvId byteAddrId;
+        if (resource->structured) {
+            IlcSpvId indexId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+            IlcSpvId offsetId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_Y, 1);
+            // addr = (index * stride + offset) / 4
+            const IlcSpvId mulIds[] = { indexId, resource->strideId };
+            IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+            const IlcSpvId addIds[] = { baseId, offsetId };
+            byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+        } else {
+            byteAddrId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+        }
+        const IlcSpvId divIds[] = {
+            byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+        };
+        IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+        texelPtrId = ilcSpvPutAccessChain(compiler->module, pointerTypeId, resource->id,
+                                              1, &wordAddrId);
+    }
     IlcSpvId readId = 0;
+    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
     IlcSpvId scopeId = ilcSpvPutConstant(compiler->module, compiler->intId, SpvScopeDevice);
     IlcSpvId semanticsId = ilcSpvPutConstant(compiler->module, compiler->intId,
                                              SpvMemorySemanticsAcquireReleaseMask |
@@ -2484,6 +2659,9 @@ static void emitUavAtomicOp(
     if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
         readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
                                    texelPtrId, scopeId, semanticsId, valueId);
+    } else if (instr->opcode == IL_OP_UAV_UMAX || instr->opcode == IL_OP_UAV_READ_UMAX) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicUMax, resource->texelTypeId,
+                                   texelPtrId, scopeId, semanticsId, valueId);
     } else {
         assert(false);
     }
@@ -2777,7 +2955,9 @@ static void emitInstr(
     case IL_OP_DCL_TYPED_UAV:
         emitTypedUav(compiler, instr);
         break;
+    case IL_OP_DCL_STRUCT_UAV:
     case IL_OP_DCL_TYPELESS_UAV:
+    case IL_OP_DCL_RAW_UAV:
         emitUav(compiler, instr);
         break;
     case IL_OP_UAV_LOAD:
@@ -2786,10 +2966,21 @@ static void emitInstr(
     case IL_OP_UAV_STORE:
         emitUavStore(compiler, instr);
         break;
+    case IL_OP_UAV_STRUCT_STORE:
+        emitStructUavStore(compiler, instr);
+        break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_UMAX:
+    case IL_OP_UAV_READ_UMAX:
         emitUavAtomicOp(compiler, instr);
         break;
+    case IL_OP_LDS_ADD:
+    case IL_OP_LDS_READ_ADD:
+    case IL_OP_LDS_UMAX:
+    case IL_OP_LDS_READ_UMAX:
+        emitLdsAtomicOp(compiler, instr);
+        break;
     case IL_OP_DCL_RAW_SRV:
     case IL_OP_DCL_STRUCT_SRV:
         emitSrv(compiler, instr);
@@ -2800,6 +2991,9 @@ static void emitInstr(
     case IL_DCL_STRUCT_LDS:
         emitStructuredLds(compiler, instr);
         break;
+    case IL_DCL_LDS:
+        emitLds(compiler, instr);
+        break;
     case IL_DCL_GLOBAL_FLAGS:
         emitGlobalFlags(compiler, instr);
         break;

But some changes still need to be made:

1. According to the docs, atomic operations calculate addresses differently for structured, raw and typed UAVs. Currently only typed UAV atomics work properly.
2. Add an optional atomic counter to each resource.
3. LDS atomic operations have to be implemented as well.

UPD: 1 and 3 are implemented in the patch above.
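To make the addressing difference concrete, here is a minimal host-side sketch of the arithmetic that the patch emits as SPIR-V for structured and raw buffers. The function names are hypothetical and not part of GRVK, and the sketch assumes 32-bit elements, so byte addresses are divided by 4. Typed UAVs do not use this arithmetic: in the patch they take the image path (`ilcSpvPutImageTexelPointer`) with the coordinates used directly.

```c
#include <assert.h>
#include <stdint.h>

/* Structured buffer: srcs[0].x is the element index and srcs[0].y is the byte
 * offset inside the element. Mirrors the patch comment
 * "addr = (index * stride + offset) / 4". Names are hypothetical. */
static inline int32_t structuredWordAddr(int32_t index, int32_t strideBytes,
                                         int32_t offsetBytes)
{
    return (index * strideBytes + offsetBytes) / 4;
}

/* Raw buffer: srcs[0].x is already a byte address, so only the division by
 * the 4-byte word size remains. */
static inline int32_t rawWordAddr(int32_t byteAddr)
{
    return byteAddr / 4;
}

/* Small self-check with hand-computed values. */
static inline void checkWordAddrExamples(void)
{
    /* Element 2 of a 16-byte struct, field at byte 8: (2 * 16 + 8) / 4 = 10 */
    assert(structuredWordAddr(2, 16, 8) == 10);
    /* The same location expressed as a raw byte address: 40 / 4 = 10 */
    assert(rawWordAddr(40) == 10);
}
```

Both paths end up indexing the same array of 32-bit words, which is why the patch can share the final `SpvOpSDiv` and access chain between the structured and raw cases.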

bazookaben commented 2 years ago

Tried out 0.5.0 on a Radeon 5700xt w/ Ryzen 1700x in Windows 10.

For some reason, the game would only run in exclusive fullscreen mode. If I try to run in borderless I get a crash to desktop.

Also, I noticed v-sync is broken.

Beyond that, the only other bug I noticed is that some post-effects seem to break the graphics. Like if I go out of bounds or underwater, some or all parts of the world go black.

As far as performance goes, compared to DX11 it looks like the CPU is about 20% slower but the GPU is about 10% faster, just based on a quick look at the game's built-in performance graph.

The settings I use in DX11 walk a tightrope between GPU and CPU though, so on some maps I would be GPU limited and on others CPU limited. With GRVK that has become completely CPU limited (unless there is some memory bandwidth bottleneck that the in-game performance graph wouldn't show).

By the way, the Frostbite engine's performance graph is super useful for analyzing CPU/GPU performance. You can enable it in the dev console with PerfOverlay.DrawGraph 1

grvk.log

Also, I noticed no mantle cache was created in Documents/Battlefield 4/cache.

libcg commented 2 years ago

@bazookaben Thanks for the feedback. It's a small miracle that the game is running at all; looking at the logs, some image format is not supported. You're correct that Vsync is not implemented right now. I kinda forgot about it because I'm using FreeSync. I don't expect the native pipeline cache to ever be implemented, because Mantle assumes that the pipeline can be compiled on the spot, which isn't the case with the current state of Vulkan. I'll check out the performance graph for sure!
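For reference, a rough sketch of what Vsync handling usually comes down to on the Vulkan side: choosing a swapchain present mode. This is not GRVK code. The enum is a local mirror of the `VkPresentModeKHR` values so the snippet compiles without the Vulkan headers, and `pickPresentMode` is a hypothetical helper. It relies on `VK_PRESENT_MODE_FIFO_KHR` being the one mode the Vulkan spec requires every implementation to support.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local mirror of VkPresentModeKHR (values match the Vulkan headers). */
typedef enum {
    PRESENT_MODE_IMMEDIATE = 0,    /* no vsync, may tear */
    PRESENT_MODE_MAILBOX = 1,      /* no tearing, newest frame wins */
    PRESENT_MODE_FIFO = 2,         /* classic vsync, always supported */
    PRESENT_MODE_FIFO_RELAXED = 3, /* vsync, tears if a frame is late */
} PresentMode;

/* Hypothetical helper: pick a present mode from the list reported by the
 * surface. With vsync off, prefer IMMEDIATE when available; otherwise fall
 * back to FIFO, which is guaranteed to exist. */
static inline PresentMode pickPresentMode(const PresentMode* supported,
                                          size_t count, bool vsync)
{
    if (!vsync) {
        for (size_t i = 0; i < count; i++) {
            if (supported[i] == PRESENT_MODE_IMMEDIATE) {
                return PRESENT_MODE_IMMEDIATE;
            }
        }
    }
    return PRESENT_MODE_FIFO;
}
```

In a real implementation the `supported` list would come from `vkGetPhysicalDeviceSurfacePresentModesKHR`, and the vsync flag from whatever the game passes at present time.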

niobium93 commented 2 years ago

I'm not sure if I'm missing something obvious, but the Graphics API setting seems to be missing in my game?

[screenshot: 2021-12-07-23:15:39]

libcg commented 2 years ago

@niobium93 check that the DLLs are in the correct place, and upload grvk.log if you see it in the game folder.

niobium93 commented 2 years ago

They are right next to bf4.exe. No grvk.log is created by the game.

libcg commented 2 years ago

@niobium93 what's your setup like? are you using BF4 from Steam?

niobium93 commented 2 years ago

I'm on wine-tkg 6.22.r11.g61c3c024-326 and mesa-git-22.0.0_devel.147797.92d84f189c7. DXVK v1.9.2-62-gc13395db is also present. No Steam.

libcg commented 2 years ago

Moving to:

https://github.com/libcg/grvk/issues/37
https://github.com/libcg/grvk/issues/38
https://github.com/libcg/grvk/issues/39
https://github.com/libcg/grvk/issues/40