ShatrovOA closed this issue 4 months ago
Hello,
creating the app structure like this:
    VkFFTApplication app = {};
makes it local to the enclosing scope, meaning it is deallocated as soon as the vkfft_create call finishes. What you likely want is to declare VkFFTApplication app = {}; outside vkfft_create and then pass the pointer &app to vkfft_create. Alternatively, you can allocate memory for app manually inside vkfft_create with
    VkFFTApplication* app = (VkFFTApplication*)calloc(1, sizeof(VkFFTApplication));
and free it later, for example in the vkfft_free call.
Best regards, Dmitrii
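The first option above (the caller owns the structure and passes a pointer to it) could look roughly like the sketch below. It is only an illustration under stated assumptions: PlanStub, stub_create and stub_three_dims are hypothetical names standing in for VkFFTApplication and vkfft_create, so that the snippet compiles without vkFFT.h.

```c
#include <assert.h>

/* Hypothetical stand-in for VkFFTApplication (assumption, not the real type). */
typedef struct {
    int size;
    int how_many;
} PlanStub;

/* The caller owns the storage; this function only fills it in.
   In real code, initializeVkFFT(app, config) would be called here. */
static void stub_create(int size, int how_many, PlanStub *app) {
    app->size = size;
    app->how_many = how_many;
}

/* One application per dimension, all alive at the same time. */
static int stub_three_dims(void) {
    PlanStub apps[3] = {{0}};
    const int sizes[3] = {64, 128, 256};
    for (int d = 0; d < 3; ++d) {
        stub_create(sizes[d], 8, &apps[d]);
    }
    /* Each array element has its own address, so no plan overwrites another. */
    assert(&apps[0] != &apps[1] && &apps[1] != &apps[2]);
    return apps[0].size + apps[1].size + apps[2].size;
}
```

In this sketch the array lives only as long as stub_three_dims runs. In a real program it would have to outlive every use of the plans, for example by being declared in main or as a static variable.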
Hello Dmitrii,
Thank you for your quick response. The second option made it work.
I changed the subroutine signature a bit:
    void vkfft_create(int size, int how_many, int double_precision, VkFFTApplication **app_handle) {
        VkFFTConfiguration config = {};
        VkFFTApplication* app = (VkFFTApplication*)calloc(1, sizeof(VkFFTApplication));
        // Populating config values
        // ...
        VKFFT_CALL(initializeVkFFT(app, config));
        *app_handle = app;
    }
and I can clearly see that app now points to a different location on each call. Thank you.
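The thread mentions freeing the allocation in a vkfft_free call but does not show one. A minimal sketch of such a create/free pair is given below. AppStub, stub_alloc, stub_release and stub_handles_are_distinct are hypothetical names standing in for VkFFTApplication, vkfft_create and vkfft_free; the comment about deleteVkFFT marks where VkFFT's own cleanup would presumably go and is an assumption about the surrounding code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for VkFFTApplication (assumption, not the real type). */
typedef struct {
    int size;
    int how_many;
} AppStub;

/* Heap-allocate one application per call and return it through the handle. */
static int stub_alloc(int size, int how_many, AppStub **app_handle) {
    AppStub *app = calloc(1, sizeof *app);
    if (app == NULL) {
        *app_handle = NULL;
        return -1;              /* allocation failed */
    }
    app->size = size;
    app->how_many = how_many;
    *app_handle = app;
    return 0;
}

/* Counterpart of stub_alloc. Real code would release VkFFT's internal
   resources first (presumably with deleteVkFFT) before freeing the struct. */
static void stub_release(AppStub **app_handle) {
    free(*app_handle);          /* free(NULL) is a no-op */
    *app_handle = NULL;         /* guard against reuse and double free */
}

/* Two live plans must have different addresses. Returns 1 on success. */
static int stub_handles_are_distinct(void) {
    AppStub *a = NULL, *b = NULL;
    int ok = 0;
    if (stub_alloc(64, 8, &a) == 0 && stub_alloc(128, 4, &b) == 0) {
        ok = (a != b) && a->size == 64 && b->size == 128;
    }
    stub_release(&a);
    stub_release(&b);
    return ok && a == NULL && b == NULL;
}
```

Resetting the handle to NULL after freeing is a design choice that makes an accidental second call to the release function harmless.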
I have another, unrelated question. Is there an estimate of how much memory VkFFT allocates internally for M batches of DCT2 transforms of size N?
For DCT2, the additional memory usage depends on the system size. If the system fits in the shared memory of a GPU (length below approximately 4096), it will not use additional memory, regardless of M. For bigger sequences, it depends on the algorithm: if the system is decomposable into small primes or can be done with Rader's algorithm, the additional size will be 2x the system size (M*N). If the system uses Bluestein's algorithm, the size will be 4x. Some small additional memory is also used for twiddle factors (at least M times smaller).
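The rule of thumb above can be turned into a rough estimator. The sketch below is not part of VkFFT: EstAlgo, EST_SHARED_MEMORY_LIMIT and est_dct2_extra_bytes are hypothetical names, the 4096 threshold is only the approximate figure quoted above, the caller has to say which algorithm applies, and the small twiddle-factor storage is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Which algorithm the transform length requires (supplied by the caller). */
typedef enum {
    EST_SMALL_PRIMES_OR_RADER,
    EST_BLUESTEIN
} EstAlgo;

/* Approximate length below which the system fits in GPU shared memory. */
#define EST_SHARED_MEMORY_LIMIT 4096

/* Rough extra memory in bytes for m batches of DCT2 of length n.
   elem_bytes is the size of one element (e.g. 4 for float, 8 for double). */
static size_t est_dct2_extra_bytes(size_t n, size_t m, size_t elem_bytes,
                                   EstAlgo algo) {
    if (n < EST_SHARED_MEMORY_LIMIT) {
        return 0;                              /* no extra buffer, any m */
    }
    size_t system_bytes = m * n * elem_bytes;  /* the "system size" M*N */
    size_t factor = (algo == EST_BLUESTEIN) ? 4 : 2;
    return factor * system_bytes;
}
```

For example, 10 batches of length 8192 in double precision would come out at 2 * 10 * 8192 * 8 = 1310720 bytes under this estimate for the small-primes or Rader case, and twice that for Bluestein.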
Thanks for the clarification. I'm closing this issue.
I should start by saying that I am not that good at C, so forgive me if the answer is obvious.
I am trying to run a 3D DCT on a cluster with GPUs. VkFFT is used to perform 1D batched FFTs. Then the data is redistributed across GPUs and the FFT along another dimension is launched.
I tried VkFFT along a single direction first. Everything was fine and I was very impressed by its performance, so I started implementing it in multiple directions. I iterate over the dimensions and create a different VkFFTApplication plan for each, with its own transform length and number of batches.
Below is how I create the VkFFTApplication object:
I store the app_handle pointer and use it later to execute the plan. Everything works on the first iteration over dimensions, but the second iteration fails on a calloc call: https://github.com/DTolm/VkFFT/blob/master/vkFFT/vkFFT/vkFFT_AppManagement/vkFFT_InitializeApp.h#L1485
Program received SIGSEGV
It turns out that when the empty app structure is created like this:
    VkFFTApplication app = {};
app has the same address that it had on the previous iteration. So, basically, it is trying to calloc already allocated data.
My question is: how should I initialize the VkFFTApplication structure so that it points to a different memory address on every iteration?
Thanks!