Closed tfarago closed 5 years ago
Pinging @rshkarin if he knows about existing issues …
Hello guys,
I have carefully checked frequency indexing once more in September. Since all frequencies appear in equations in the second power, the indexing is OK providing that your implementation of fft returns results in a conventional order like f.i. fftw library. Famous galaxy-long line 23: frequency indices are combined in the expression float sin_arg = prefac (n_idy n_idy + n_idx * n_idx) / 2.0f so it does not matter whether Nyquist frequency is treated as the largest negative or positive frequency; indices are evaluated before in expression float n_idx = (idx >= floor((float)width / 2.0f)) ? idx - width : idx which moves amplitude at Nyquist frequency (index N/2) to the largest negative frequency.
What is interesting though that at least half of arithmetic operations done on the line 23 are absolutely unnecessary. For instance, it is assumed that dimensions of the data set can be odd numbers. Even though fftw3 supports odd-dimensions, in practice it NEVER happens because one has to pad images with something to be sure that boundary artifacts are minimized. Expressions n_idx = n_idx / (1 - normalize (1 - width)); n_idy = n_idy / (1 - normalize (1 - height)); on line 23 also unnecessary complicated. Normalize parameter has default value=1 which essentially simplifies expression to n_idx=n_idx/width. Moreover any other value of normalize parameter would truly produce mismatch between fft and phase retrieval frequencies. I wonder what was the purpose of this operation?
Have a good weekend, Sergei.
Thanks for the investigation! I just untangled the monstrous macro, so it's easier to read now.
Normalize parameter has default value=1 which essentially simplifies expression to n_idx=n_idx/width. Moreover any other value of normalize parameter would truly produce mismatch between fft and phase retrieval frequencies. I wonder what was the purpose of this operation?
Yeah, I noticed that too. And for the time being, will remove it both from the kernel and the calling task code. @rshkarin can chime in, if he wants it back or clarify it's purpose. But I don't see any as well.
Hello again,
it has occurred to me that Tomas is right. It is still necessary to check whether the frequencies are treated in the same way in both filter and Fourier transformed projection. The Nyquist frequency must be considered as the largest negative frequency in both cases (if f>=N/2 then f-=N), otherwise there is a mismatch between spatial frequencies of a tomographic view and the phase retrieval filters.
Hi all, I've been looking at the kernels and found out the check of n_idx == 0 && n_idy == 0. Why do we do that when the regularization rate always makes sure that the denominator is not zero?
You mean the idx == 0 && idy == 0
check? Looks like an optimization to me but should only be correct for the tie_method
kernel. Ignoring this issue I doubt it will improve performance much in any way.
Why do we do that when the regularization rate always makes sure that the denominator is not zero?
Actually, it's pow
that makes sure that the denominator does not become zero. End-of-Klugscheiß ;-)
By correct you mean influence the global offset to be correct? Because that's exactly my point, it doesn't seem correct to me. Tomorrow I am going to discuss this with some people to make sure my suspicions are correct. In any case, does it just generate random offsets for the other methods? If so, one more reason to get rid of it.
Regarding optimization, one thread optimization via an if statement is the exact opposite of optimization for GPUs if I remember correctly, code branching really hurts stuff. I would actually get rid of that if statement.
Regarding optimization, I would rather get rid of the sine computation in the tie
kernel, because that's a real performance killer. I know the setup stuff will get divided, but at least for the most used method I would like to guarantee that we are doing it a) correctly and b) most efficiently. I will wait for tomorrow and fix the stuff accordingly after the meeting. Anyone feel free to chip in any insights.
I would rather get rid of the sine computation in the tie kernel, because that's a real performance killer.
Yeah, I noticed that too when I refactored the code some time ago. I just pushed a change that does that but leave the correctness for you ;-)
Thanks for your generosity :smile:
Now I have to fix merge confilcts in the code I had prepared before you committed :rofl: Just kidding, those are tiny changes, all good.
Hi guys,
besides computation of the sine (which is totally useless for TIE) further improvement (and seemingly quite significant) can be achieved if you assume that only images with even-number of pixels in width/length are subjected for processing. You can then get rid of if statements which check for odd/even length when setting frequencies. I know that you like to keep things "general" but this does not reduce functionality of phase retrieval in any way, since one would always pad to the next power of two before producing FFT of the input image.
Cheers.
That is true, thanks for the remark!
Especially if the frequency indexing in the filter fits FFT indexing.