GPUOpen-Drivers / llpc

LLVM-Based Pipeline Compiler
MIT License

[WIP][LLPC] Always move non-uniform descriptor loads inside the waterfall loop #2859

Closed kmitropoulou closed 5 months ago

kmitropoulou commented 7 months ago

Currently, we bail out of scalarization if one of the operands of the image call is uniform. This patch enables scalarization for the non-uniform operands only. To do this, I refactored createWaterfallLoop().
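For readers unfamiliar with the technique, here is a minimal CPU-side sketch of what a waterfall loop does. It is not the LLPC implementation: `loadDescriptor` and `waterfall` are hypothetical names, and a wave is modelled as a plain vector with one descriptor index per lane. Each iteration takes the index of the first still-active lane (the role played by a readfirstlane on hardware), performs one scalar descriptor load, and retires every lane that shares that index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a descriptor load with a scalar (wave-uniform) index.
static uint32_t loadDescriptor(uint32_t index) { return 0x1000u + index; }

// Simulates a waterfall loop over one wave.
// laneIndex holds the (possibly non-uniform) descriptor index of each lane.
// out receives the descriptor each lane ends up using.
// Returns the number of loop iterations, i.e. the number of scalar loads.
int waterfall(const std::vector<uint32_t> &laneIndex, std::vector<uint32_t> &out) {
  const std::size_t lanes = laneIndex.size();
  std::vector<bool> active(lanes, true);
  out.assign(lanes, 0);
  int iterations = 0;

  for (;;) {
    // Find the first lane that has not been handled yet.
    std::size_t first = 0;
    while (first < lanes && !active[first])
      ++first;
    if (first == lanes)
      break; // every lane is done

    // Broadcast that lane's index; it is uniform for this iteration.
    const uint32_t scalarIndex = laneIndex[first];
    // The descriptor load sits inside the loop and uses the scalar index.
    const uint32_t descriptor = loadDescriptor(scalarIndex);
    ++iterations;

    // Retire all lanes whose index matches the broadcast value.
    for (std::size_t lane = 0; lane < lanes; ++lane) {
      if (active[lane] && laneIndex[lane] == scalarIndex) {
        out[lane] = descriptor;
        active[lane] = false;
      }
    }
  }
  return iterations;
}
```

With per-lane indices {3, 3, 5, 3, 5, 9} the loop runs once per distinct index, so three scalar loads serve six lanes.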

amdvlk-admin commented 7 months ago

Test summary for commit fe0df0b6d371c83c31724e412baa11db0c6fa310

CTS tests (Failed: 0/138378)
  • Built with version 1.3.5.2
  • Ubuntu navi3x, Srdcvk
    • Passed: 35162/69163 (50.8%)
    • Failed: 0/69163 (0.0%)
    • Not Supported: 34001/69163 (49.2%)
    • Warnings: 0/69163 (0.0%)
  • Ubuntu navi2x, Srdcvk
    • Passed: 35242/69215 (50.9%)
    • Failed: 0/69215 (0.0%)
    • Not Supported: 33973/69215 (49.1%)
    • Warnings: 0/69215 (0.0%)
amdvlk-admin commented 7 months ago

aaa28a512796ae19f7a8d4f75711491d1bb417f7 Jenkins build error.

/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/lgc/builder/BuilderImpl.cpp:722:30: error: ‘get32BitNonUniformIndex’ was not declared in this scope; did you mean ‘traceNonUniformIndex’?
  722 | Value new32BitValue = get32BitNonUniformIndex(nonUniformIndex);
      | ^~~~~~~
      | traceNonUniformIndex
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/lgc/builder/BuilderImpl.cpp:729:19: error: ‘getSharedIndex’ was not declared in this scope; did you mean ‘sharedIndex’?
  729 | sharedIndex = getSharedIndex(nonUniformIndices, nonUniformIndex32BitVal, traceNonUniformIndex, nonUniformInst);
      | ^~~~~~
      | sharedIndex
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/lgc/builder/BuilderImpl.cpp: At global scope:
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/lgc/builder/BuilderImpl.cpp:641:13: error: ‘bool instructionsEqual(llvm::Instruction, llvm::Instruction)’ defined but not used [-Werror=unused-function]
  641 | static bool instructionsEqual(Instruction lhs, Instruction rhs) {
      | ^~~~~
cc1plus: some warnings being treated as errors
[136/322] Building CXX object compiler/llpc/llvm/tools/Continuations/CMakeFiles/LLVMContinuations.dir/lib/RegisterBuffer.cpp.o
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/RegisterBuffer.cpp: In member function ‘llvm::Value llvm::RegisterBufferPass::computeMemAddr(llvm::IRBuilder<>&, llvm::Value*)’:

amdvlk-admin commented 7 months ago

Test summary for commit f1f1070bccd6d2909a16f0775cd5235747d8b43c

CTS tests (Failed: 0/138378)
  • Built with version 1.3.5.2
  • Ubuntu navi3x, Srdcvk
    • Passed: 35162/69163 (50.8%)
    • Failed: 0/69163 (0.0%)
    • Not Supported: 34001/69163 (49.2%)
    • Warnings: 0/69163 (0.0%)
  • Ubuntu navi2x, Srdcvk
    • Passed: 35242/69215 (50.9%)
    • Failed: 0/69215 (0.0%)
    • Not Supported: 33973/69215 (49.1%)
    • Warnings: 0/69215 (0.0%)
piotrAMD commented 7 months ago

I understand this PR supersedes #2759 as the first commit (195b936b8b8deb5b68a3e7d19d777dea8ea85415) is the same as in #2759. Can you describe what changes are being made in the other one (f1f1070bccd6d2909a16f0775cd5235747d8b43c)?

kmitropoulou commented 7 months ago

> I understand this PR supersedes #2759 as the first commit (195b936) is the same as in #2759. Can you describe what changes are being made in the other one (f1f1070)?

The first patch (#2759) has an initial implementation of the scalarization of descriptor loads. This patch enables scalarization of the non-uniform descriptor loads even if one of the other operands is uniform.
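To make the difference concrete, here is a small sketch of the operand selection, under stated assumptions. It is not code from either patch: `Operand`, `canScalarizeAllOrNothing`, `operandsToScalarize` and `waterfallIterations` are hypothetical names, and the `nonUniform` flag is assumed to be known up front (for example from a NonUniform decoration in the shader). The first function models the earlier behaviour, which gives up when any operand is uniform; the second picks only the non-uniform operands, so the waterfall key is built from those alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// One descriptor operand of an image call, modelled per lane (hypothetical type).
struct Operand {
  bool nonUniform;               // assumed to be known before the loop is built
  std::vector<uint32_t> perLane; // value of the operand in each lane
};

// Earlier behaviour as described in this thread: bail out if any operand is uniform.
bool canScalarizeAllOrNothing(const std::vector<Operand> &ops) {
  for (const Operand &op : ops)
    if (!op.nonUniform)
      return false;
  return !ops.empty();
}

// Behaviour described for this patch: select only the non-uniform operands.
std::vector<std::size_t> operandsToScalarize(const std::vector<Operand> &ops) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (ops[i].nonUniform)
      selected.push_back(i);
  return selected;
}

// Number of waterfall iterations = number of distinct key tuples across lanes,
// where the key is built from the selected operands only.
// If nothing is selected, every lane shares the same (empty) key.
std::size_t waterfallIterations(const std::vector<Operand> &ops,
                                const std::vector<std::size_t> &selected,
                                std::size_t lanes) {
  std::set<std::vector<uint32_t>> keys;
  for (std::size_t lane = 0; lane < lanes; ++lane) {
    std::vector<uint32_t> key;
    for (std::size_t i : selected)
      key.push_back(ops[i].perLane[lane]);
    keys.insert(key);
  }
  return keys.size();
}
```

For an image call with a non-uniform image index and a uniform sampler index, the all-or-nothing check rejects the call, while the selective version keys the loop on the image index alone and leaves the uniform sampler load untouched.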

kmitropoulou commented 7 months ago

ping

amdvlk-admin commented 6 months ago

Test summary for commit bd31f5954a3816b3898bf9f956f4bdbc8080fa08

CTS tests (Failed: 0/138378)
  • Built with version 1.3.5.2
  • Ubuntu navi3x, Srdcvk
    • Passed: 35162/69163 (50.8%)
    • Failed: 0/69163 (0.0%)
    • Not Supported: 34001/69163 (49.2%)
    • Warnings: 0/69163 (0.0%)
  • Ubuntu navi2x, Srdcvk
    • Passed: 35241/69215 (50.9%)
    • Failed: 0/69215 (0.0%)
    • Not Supported: 33974/69215 (49.1%)
    • Warnings: 0/69215 (0.0%)
kmitropoulou commented 6 months ago

ping

amdvlk-admin commented 6 months ago

Test summary for commit 6ffd47fb289d6e0810b80dd26278e718d51ca0d3

CTS tests (Failed: 568/138443)
  • Built with version 1.3.5.2
  • Ubuntu navi3x, Srdcvk
    • Passed: 35154/69228 (50.8%)
    • Failed: 57/69228 (0.1%)

      Failures:
      ```
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_bool_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_dvec2_requiredsubgroupsize32 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_dvec4 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_f16vec2 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_bvec2 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_bvec3 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_bvec4 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_double_requiredsubgroupsize32 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_dvec2 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_dvec4_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_f16vec4_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_float16_t Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_float_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_i16vec4_requiredsubgroupsize32 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_i8vec2_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_u16vec4_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_nonconst_vec3_requiredsubgroupsize32 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_u16vec3_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_u8vec4_requiredsubgroupsize64 Stack trace: Crash
      FAILURE: dEQP-VK.subgroups.ballot_broadcast.framebuffer.subgroupbroadcast_i16vec3geometry Stack trace: Crash
      ...
      ```

    • Not Supported: 34017/69228 (49.1%)
    • Warnings: 0/69228 (0.0%)
  • Ubuntu navi2x, Srdcvk
    • Passed: 34731/69215 (50.2%)
    • Failed: 511/69215 (0.7%)

      Failures:
      ```
      FAILURE: dEQP-VK.memory_model.message_passing.core11.u32.coherent.control_barrier.atomicwrite.subgroup.payload_nonlocal.image.guard_nonlocal.buffer.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.subgroup.payload_nonlocal.image.guard_local.buffer.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.subgroup.payload_nonlocal.workgroup.guard_nonlocal.image.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_local.buffer.guard_local.physbuffer.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_local.buffer.guard_nonlocal.image.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_local.physbuffer.guard_local.physbuffer.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_local.physbuffer.guard_nonlocal.image.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_local.physbuffer.guard_nonlocal.image.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.buffer.guard_local.image.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.buffer.guard_local.physbuffer.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.buffer.guard_nonlocal.image.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.image.guard_local.image.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicwrite.subgroup.payload_nonlocal.buffer.guard_nonlocal.physbuffer.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.coherent.atomic_atomic.atomicwrite.subgroup.payload_nonlocal.workgroup.guard_nonlocal.image.comp Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicrmw.subgroup.payload_local.image.guard_nonlocal.image.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicrmw.subgroup.payload_local.physbuffer.guard_local.buffer.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.image.guard_nonlocal.image.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.physbuffer.guard_local.image.vert Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicrmw.subgroup.payload_nonlocal.physbuffer.guard_nonlocal.physbuffer.frag Stack trace: Crash
      FAILURE: dEQP-VK.memory_model.message_passing.ext.f32.noncoherent.atomic_atomic.atomicwrite.subgroup.payload_local.buffer.guard_nonlocal.physbuffer.frag Stack trace: Crash
      ...
      ```

    • Not Supported: 33973/69215 (49.1%)
    • Warnings: 0/69215 (0.0%)
amdvlk-admin commented 6 months ago

ef070126fd46d5485f1448f28cc11f0a4460b598 Jenkins build error.

/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/DXILCont.cpp:314:19: error: ‘class llvm::IRBuilder<>’ has no member named ‘getInt8PtrTy’; did you mean ‘getIntPtrTy’?
  314 | auto PtrTy = B.getInt8PtrTy(static_cast(StackAddrspace));
      | ^~~~
      | getIntPtrTy
[131/322] Building CXX object compiler/llpc/CMakeFiles/llpcinternal.dir/translator/lib/SPIRV/libSPIRV/SPIRVEntry.cpp.o
[132/322] Building CXX object compiler/llpc/llvm/tools/Continuations/CMakeFiles/LLVMContinuations.dir/lib/LgcRtDialect.cpp.o
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/RegisterBuffer.cpp:197:22: error: ‘getWithSamePointeeType’ is not a member of ‘llvm::PointerType’
  197 | PointerType::getWithSamePointeeType(
      | ^~~~~~
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/RegisterBuffer.cpp:205:24: error: ‘getWithSamePointeeType’ is not a member of ‘llvm::PointerType’
  205 | PointerType::getWithSamePointeeType(
      | ^~~~~~
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/RegisterBuffer.cpp: In member function ‘llvm::Value llvm::RegisterBufferPass::handleSingleLoadStore(llvm::IRBuilder<>&, llvm::Type, llvm::Value, llvm::Value, llvm::Align, llvm::AAMDNodes, bool)’:
/jenkins/workspace/vulkan/sanitized-opensource/Github-PR/llpc-github-pr/driver_build/drivers/llpc/shared/continuations/lib/RegisterBuffer.cpp:245:29: error: ‘getWithSamePointeeType’ is not a member of ‘llvm::PointerType’
  245 | Address, PointerType::getWithSamePointeeType(AddressType,
      | ^~~~~~
[135/322] Building CXX object compiler/llpc/llvm/tools/Continuations/CMakeFiles/LLVMContinuations.dir/lib/LowerAwait.cpp.o
[136/322] Building CXX object compiler/llpc/llvm/tools/Continuations/CMakeFiles/LLVMContinuations.dir/lib/LegacyCleanupContinuations.cpp.o
[137/322] Building CXX object compiler/llpc/llvm/tools/Continuations/CMakeFiles/LLVMContinuations.dir/lib/RemoveTypesMetadata.cpp.o