astrogewgaw closed this 7 months ago
It's one of those things that I imagine myself doing "when I get some free time", but that's unlikely to happen soon. It's also harder and less useful than you might think. With a handful of CPUs you can already cover the long-period regime with good phase resolution; even if you could obtain a large speedup, you'd mostly gain the ability to cover shorter periods, where the extra sensitivity over the FFT is not particularly dramatic. Searching shorter periods also requires more tightly spaced DM trials, so there's a double whammy cost increase in that regime (I talk about this in the paper: the b^3 term, where b is the number of phase bins). Another thing to consider is that the peak finding algorithm consumes a fair amount of the run time at the moment, so if you made the FFA + matched filtering, say, 10x faster, you'd have to make peak finding much faster as well for it to matter (and it needs to remain as good as it is now, otherwise the extra sensitivity of the FFA is compromised). Lastly, the processing model where I use non-integer downsampling to maintain a roughly constant phase resolution might not be the way to go on the GPU. So all in all, there are lots of things to think about carefully.
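To make the non-integer downsampling point concrete, here is a minimal NumPy sketch of the general technique, not the code actually used in this project. The function name `downsample_real` is hypothetical, and the sketch assumes each output sample is the sum (not the mean) of the input over a window of `factor` samples, with edge samples weighted by their fractional overlap.

```python
import numpy as np


def downsample_real(x, factor):
    """Downsample a 1D series by a real-valued factor >= 1.

    Hypothetical helper for illustration only. Output sample k is the
    integral of the input, treated as piecewise constant, over
    [k * factor, (k + 1) * factor).
    """
    x = np.asarray(x, dtype=float)
    if factor < 1.0:
        raise ValueError("factor must be >= 1")
    n_out = int(np.floor(x.size / factor))
    # Cumulative sum evaluated at the integer sample boundaries 0..n
    csum = np.concatenate(([0.0], np.cumsum(x)))
    # Fractional positions of the output window edges
    edges = np.arange(n_out + 1) * factor
    # The cumulative integral of a piecewise-constant signal is piecewise
    # linear, so linear interpolation is exact at fractional edges
    cum_at_edges = np.interp(edges, np.arange(x.size + 1), csum)
    return np.diff(cum_at_edges)
```

The irregular, data-dependent window edges are what make this awkward to map onto a GPU compared to a plain integer-factor reduction.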
PS: If you're bold enough to try implementing this, please do it in small increments, i.e. start by implementing a working FFA transform kernel on the GPU, then a matched filtering kernel, and only then think about interfacing it with the existing codebase. No pull requests with 500+ lines changed in a single commit please :smile: If you can demonstrate something with solid potential, then I'd be happy to make deeper changes in the code to accommodate it.
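A small first increment could be a slow CPU reference to compare a GPU kernel against. The sketch below is an assumption-laden illustration, not this project's implementation and not the FFA itself: `brute_force_fold` is a hypothetical name, and it computes the folded profiles by direct summation in O(m^2 p) rather than with the FFA's shared partial sums. It assumes the input is an array of m rows (signal periods) by p phase bins, and that shift trial s rotates row i left by `round(i * s / (m - 1))` bins, rounding half up. A real FFA may use a different shift direction or rounding convention, so its output need not match bin for bin.

```python
import numpy as np


def brute_force_fold(data):
    """Direct-summation folding over all m shift trials.

    Hypothetical reference for illustration only. Row s of the output is
    the sum of the input rows after rotating row i left by
    round(i * s / (m - 1)) bins.
    """
    data = np.asarray(data, dtype=float)
    m, p = data.shape
    out = np.zeros((m, p))
    for s in range(m):
        for i in range(m):
            # Integer arithmetic for round-half-up of i * s / (m - 1)
            shift = 0 if m == 1 else (2 * i * s + (m - 1)) // (2 * (m - 1))
            out[s] += np.roll(data[i], -shift)
    return out
```

A simple check is to inject a pulse that drifts by a known number of bins across the rows and confirm that the matching shift trial recovers the full summed amplitude in a single bin.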
I'm going to close this because let's face it, it's not happening, but I'm happy to be proven wrong in the future :smile: