DTolm / VkFFT

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
MIT License

Async multistream synchronization for CUDA and HIP backends #164

Open DejvBayer opened 6 months ago

DejvBayer commented 6 months ago

This commit should fix #163.

Changes

  1. InitializeApp
    • removed streamID variable initialization
  2. RunApp
    • VkFFTSync is now a no-op for the CUDA and HIP backends
    • added pre- and post-synchronization to the VkFFTAppend function, as proposed in the issue
    • removed streamCounter variable initialization
  3. DispatchPlan
    • kernels are launched on app->configuration.stream[0] if any streams were passed, otherwise on stream 0 (the default stream)
    • removed event recording completely
  4. Structs
    • removed the streamCounter and streamID members from the CUDA and HIP configuration
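
For readers unfamiliar with the pattern, here is a minimal CUDA sketch of what pre- and post-synchronization around a single-stream launch could look like. It is not the code from this PR: `waitOnStream`, `appendSketch`, and `scaleKernel` are hypothetical names, the kernel is a stand-in for an FFT stage, and the use of a short-lived event inside the helper is an assumption (the PR may use a different primitive). Only the CUDA runtime calls themselves are real API.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Trivial kernel standing in for one FFT stage (hypothetical).
__global__ void scaleKernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper, not part of VkFFT: work submitted to `waiter` after
// this call will not start until everything already queued on `signaler`
// has finished. The host thread is never blocked.
static cudaError_t waitOnStream(cudaStream_t waiter, cudaStream_t signaler) {
    cudaEvent_t ev;
    cudaError_t err = cudaEventCreateWithFlags(&ev, cudaEventDisableTiming);
    if (err != cudaSuccess) return err;
    err = cudaEventRecord(ev, signaler);
    if (err == cudaSuccess) err = cudaStreamWaitEvent(waiter, ev, 0);
    // Destroying a pending event is allowed; its resources are released
    // asynchronously once the event completes.
    cudaError_t destroyErr = cudaEventDestroy(ev);
    return err != cudaSuccess ? err : destroyErr;
}

// Hypothetical stand-in for an append call: all kernels go to streams[0]
// (or the default stream if no streams were passed).
static cudaError_t appendSketch(const std::vector<cudaStream_t>& streams,
                                float* d_data, int n) {
    cudaStream_t primary = streams.empty() ? (cudaStream_t)0 : streams[0];
    cudaError_t err;

    // Pre-synchronization: the primary stream waits for the other streams.
    for (size_t i = 1; i < streams.size(); ++i) {
        err = waitOnStream(primary, streams[i]);
        if (err != cudaSuccess) return err;
    }

    scaleKernel<<<(n + 255) / 256, 256, 0, primary>>>(d_data, 2.0f, n);
    err = cudaGetLastError();
    if (err != cudaSuccess) return err;

    // Post-synchronization: the other streams wait for the primary stream.
    for (size_t i = 1; i < streams.size(); ++i) {
        err = waitOnStream(streams[i], primary);
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

int main() {
    const int n = 1024;
    std::vector<cudaStream_t> streams(2);
    for (auto& s : streams)
        if (cudaStreamCreate(&s) != cudaSuccess) return 1;

    float* d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return 1;

    // Producer work on the secondary stream, queued before the append call.
    cudaMemsetAsync(d_data, 0, n * sizeof(float), streams[1]);

    cudaError_t err = appendSketch(streams, d_data, n);

    // Consumer work on the secondary stream, queued after the append call.
    std::vector<float> host(n);
    cudaMemcpyAsync(host.data(), d_data, n * sizeof(float),
                    cudaMemcpyDeviceToHost, streams[1]);
    cudaStreamSynchronize(streams[1]);

    std::printf("append status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    for (auto& s : streams) cudaStreamDestroy(s);
    return err == cudaSuccess ? 0 : 1;
}
```

With this ordering the caller can keep queuing work on any of the passed streams without blocking the host, which is presumably why a host-side sync call becomes unnecessary for these backends. The HIP equivalents (`hipEventRecord`, `hipStreamWaitEvent`, and so on) follow the same shape.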