Neighboring channel bleed-over: combine multiple FFT bins?

charlie-foxtrot commented 4 years ago

I'm picking up multiple NFM channels, some really well, but others are getting a lot of bleed-over from neighboring channels. Channels are 5 kHz wide with 7.5 kHz spacing. I'm using an FFT size of 1024 and a sample rate of 2.4 MHz, so bins are ~4.69 kHz wide.

I think what I want to do is increase the FFT size to reduce the bleed-over, but when I do that things get worse on the "good" channels, likely because only a single bin's worth of signal is being used. So now I'm thinking of increasing the FFT size and using multiple bins based on the signal bandwidth.

This would mean adding a bandwidth setting to the channel configuration and, if it is set, combining as many bins as necessary to cover at least that bandwidth. An FFT size of 2048 with 3 bins combined would give ~7.03 kHz, and an FFT size of 4096 with 5 bins would give ~5.86 kHz.

Does this approach sound right? Are there better alternatives? Any pointers on how / where / when to combine multiple FFT bins (I know enough to get myself into trouble, but maybe not enough to get out)?

szpajder commented 4 years ago

A better way would be to use larger bins and then band-pass-filter and downconvert every individual channel that falls into the chosen bin.

charlie-foxtrot commented 4 years ago

Larger bin and bandpass I understand, but if a channel's center frequency is at a bin edge / spanning bins, that won't be enough.

I don't understand the trade-offs between the FFT approach and filtering. Is there a reasonable upper limit to "larger bins"? Taking this approach to the extreme would be a single "bin" (i.e. no FFT at all) and just using bandpass filters to pull out each channel.

szpajder commented 4 years ago

What you have just described is the most straightforward way of signal channelization, called direct downconversion (DDC). Of course it works, but its main drawback is that it is very computationally expensive, because filtering and downconversion need to be performed on the wideband signal at the full sampling rate (be it 2.5 Msps for the RTL or even 10 or more Msps for other, more capable receivers). What's more, the computational load increases linearly with the number of narrowband channels being extracted from the wideband signal. This is not a big deal for big fat multicore x86-64 CPUs, but all these little single-board computers would struggle with just a couple of channels. This is why other ways of signal channelization have been invented, including the FFT channelizer used in this program. No matter how many channels you configure, the CPU load is more or less constant, because the only two operations performed at the full sampling rate are signal windowing and the FFT. The break-even with DDC is AFAIR at about 3-4 channels.

This program dates back to the times of the Raspberry Pi v1, which had just a single-core ARMv6 CPU. With a signal of 2.5 Msps it couldn't do DDC with more than 1 narrowband channel. FFT channelization was therefore a must and it had to be done on the GPU, because the main CPU was unable to do 8000 FFTs of size 512 per second.

As newer generations of the RPi (and other SBCs) came out, this became less of an issue. Over the course of 3 generations the main CPU of the RPi got a much bigger performance boost than its GPU, so it's no longer a problem to run the program without the GPU. But still, FFT channelization is a lot more computationally efficient than direct downconversion. The result is that you can do more channels simultaneously before the CPU gets saturated.
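To make the cost argument concrete, here is a minimal sketch of direct downconversion for one channel. It is an illustration only, not code from rtl_airband: the function name `ddc` is hypothetical, and a plain boxcar average stands in for a proper low-pass filter. The 42.5 kHz offset is the distance between the 151.1025 MHz center and the 151.145 MHz signal mentioned later in this thread.

```python
import cmath
import math

def ddc(samples, sample_rate, offset_hz, decimation):
    """Mix the channel at offset_hz down to 0 Hz, then average and decimate.

    Hypothetical helper: the boxcar average is a crude stand-in for a
    real low-pass filter.
    """
    out = []
    acc = 0j
    for n, x in enumerate(samples):
        # One complex multiply per wideband sample, for every channel.
        acc += x * cmath.exp(-2j * math.pi * offset_hz * n / sample_rate)
        if (n + 1) % decimation == 0:
            out.append(acc / decimation)
            acc = 0j
    return out

if __name__ == "__main__":
    fs = 2_400_000  # wideband sample rate in samples per second
    tone = [cmath.exp(2j * math.pi * 42_500 * n / fs) for n in range(1500)]
    # Decimating by 150 gives a 16000 samples-per-second narrowband stream.
    narrowband = ddc(tone, fs, 42_500, 150)
    print(len(narrowband), abs(narrowband[0]))
```

The inner loop touches every wideband sample once per channel, so extracting N channels costs N times as much. That is the linear growth described above, whereas the FFT channelizer does its windowing and FFT once regardless of the channel count.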

charlie-foxtrot commented 4 years ago

@szpajder thanks for the detailed answer, that really helps.

Looking around it seems that doing a lowpass filter after down converting is a common approach so I've been playing with that. I have it working well to drop the power of the side channels, but because it is after the squelch logic, the channel is still "activated". I've tried moving the filtering before the squelch but hit performance issues quickly (rpi4 with 54 channels).

While I play with various approaches, it would be nice to run the exact same input through each time. Do you have a way to play a recording through? If not, I was thinking of adding an input-file option that can take in an rtl_sdr recording (or any binary file for that matter).
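For reference, an rtl_sdr recording is a stream of interleaved unsigned 8-bit I and Q values. A minimal sketch of turning such a file into complex samples could look like the following. The helper names are hypothetical and this is not the input code that was later added to the program; the scaling to roughly [-1, 1] is one common convention, assumed here.

```python
def iq_from_bytes(raw):
    """Convert interleaved unsigned 8-bit I/Q bytes to complex samples."""
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing unpaired byte
    return [
        complex((raw[i] - 127.5) / 127.5, (raw[i + 1] - 127.5) / 127.5)
        for i in range(0, usable, 2)
    ]

def read_iq_file(path, chunk_bytes=1 << 16):
    """Yield complex samples from a raw recording, one chunk at a time.

    chunk_bytes is even, so I/Q pairs are not split across chunks when
    reading a regular file.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            yield from iq_from_bytes(chunk)
```

Feeding the same file through every configuration removes the antenna and the live signal as variables, which is the point of the request above.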

szpajder commented 4 years ago

Yep, rtl_airband is cheating on squelch (and a few other things) for speed.

I thought a few times that an IQ file input would be a nice to have, but never had time to actually code it.

charlie-foxtrot commented 4 years ago

@szpajder I've been trying to wrap my head around the demodulate() function and, although I'm sure I'm still missing something, I think when using 2.4 Msps targeting NFM channels with 7.5kHz spacing, the minimum FFT size is 1024.

I have a recording I'm using for testing that is centered at 151.1025 MHz, sample rate of 2.4Msps, is ~6.5 sec long, and has ~3 sec of my target signal at 151.145 MHz:

Using #174 I'm playing the same file through under multiple configurations. I also am writing out I/Q files of the FFT output as well as post-rotation.

Because my problem is with bleeding across channels, I'm using a configuration file with channels every 7.5 kHz and trying different FFT sizes. When my FFT size is 1024 I get recordings for the primary channel plus channels one higher and one lower. When I look at the primary channel post-rotation it is centered at zero:

while the two other channels are off zero:

meaning a low-pass filter post rotation will easily clean them up.

Next I tried an FFT size of 512 and things got bad. I now have recordings on the primary channel plus three channels above and two channels below (6 channels total). When I look at the post rotation of the channels two above and two below, my signal is showing up close to zero in both cases:

This means that a low-pass filter post rotation won't be able to filter out noise that's two channels away.

I admit I'm not following the interplay between the FFT size, windowing, down-mixing, etc. But taking a step back, the channels two above and two below are each 15 kHz away from the signal, while the WAVE_RATE for NFM is 16 kHz. So I think what I'm seeing is an FFT of 512 trying to stick more than 16 kHz worth of signal in and causing "overlap". What I don't follow is how dividing 2.4 Msps across the 256 bins in an FFT of size 512 gives you more than 16 kHz, but maybe it's something with the windowing 🤷‍♂️

Thinking back to when I first was playing with my setup, I was getting audio from channels I had not set up to record. I thought this was an issue with my device offset, but it was more likely the default FFT size of 512 picking up a lot of overlap.

Another thing I find very interesting is that the FFT output for 256, 512, and 1024 all look identical and there isn't the bin truncation that I was expecting to see. This brought me back to thinking about using an FFT size of 2048 to solve my problem . . . apparently I don't actually have to merge bins to get the full signal because of something I don't quite understand. But maybe this won't work for channels "close" to the bin's edge?

Here's what the post-FFT looks like for 256, 512, 1024, and 2048:

Does this make sense? Is it reasonable that 1024 is the lowest FFT I can use? Does an FFT less than 1024 ever work for NFM (except for a lack of neighboring channels)? Is there some equation that takes in the FFT size, WAVE_RATE, and sample rate to warn that there will be overlap? Does a filter after shifting the signal (and some amount of re-running squelch) still seem like a better approach than increasing the FFT size?

szpajder commented 4 years ago

The bin size is not the only parameter which determines the channelizer selectivity. Another one is the window function which is cutting the infinite input signal into portions of fft_size samples. See this article: https://en.wikipedia.org/wiki/Window_function for a description of various windows and their properties.

To achieve the highest possible channelizer selectivity we need a window which:

has a very narrow main lobe,
has the lowest possible sidelobe level

Unfortunately these are conflicting requirements, as shown on various graphs in this article (in particular this one: https://en.wikipedia.org/wiki/Window_function#/media/File:Window_functions_in_the_frequency_domain.png).

rtl_airband uses a Blackman-Harris window. Its central lobe width is approx. 5 bins and the highest side lobe level is below -90 dB. Compare this to other window functions. It's not that bad, eh?

Here is another, even better reading on this topic: http://www.dtic.mil/get-tr-doc/pdf?AD=ADA034956 (although it's a little old and does not know about Blackman-Harris). Chapter 5, "Harmonic resolution", describes how the window function influences the ability to discern closely spaced spectral lines. See the graphs showing how various window functions perform at this task.

Hopefully this answers your question of whether increasing fft_size would increase selectivity. Yes, it would, because the bin would be narrower and so the central lobe would be narrower as well, as its width depends on the fft_size. So feel free to increase fft_size; however, this is going to become computationally expensive.
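To get a feel for how much a windowed signal spills into nearby bins, the sketch below evaluates the DFT of a Blackman-Harris window at small bin offsets. It assumes the standard 4-term coefficients in their periodic form and a tone that sits exactly on a bin center; whether rtl_airband builds its window in exactly this way is an assumption, and the function names are illustrative.

```python
import cmath
import math

# Standard 4-term Blackman-Harris coefficients.
A = (0.35875, 0.48829, 0.14128, 0.01168)

def blackman_harris(n_points):
    """Periodic 4-term Blackman-Harris window of length n_points."""
    return [
        A[0]
        - A[1] * math.cos(2 * math.pi * n / n_points)
        + A[2] * math.cos(4 * math.pi * n / n_points)
        - A[3] * math.cos(6 * math.pi * n / n_points)
        for n in range(n_points)
    ]

def leakage_db(n_points, bin_offset):
    """Level seen bin_offset bins away from a bin-centered tone, in dB
    relative to the tone's own bin."""
    w = blackman_harris(n_points)

    def dft(k):
        return sum(
            w[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points)
        )

    return 20 * math.log10(abs(dft(bin_offset)) / abs(dft(0)))

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, round(leakage_db(64, m), 1))
```

Under these assumptions the adjacent bin is only about 3 dB down, the second about 14 dB and the third about 36 dB. Measured in bins, these figures do not depend on the FFT size, so a larger fft_size shrinks the same leakage pattern into a narrower span in Hz.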

Now on the overlap thing that you are seeing. This is aliasing caused by the limited output rate.

There is always some spectral leakage between FFT bins. This is unavoidable when analyzing non-deterministic finite-time signals. So if your channel of interest is centered, let's say, in the bin number 300, you will also see it (albeit with a smaller energy) in bins: 297, 298, 299, 301, 302, 303. The question is: where will this spurious signal appear on the bin spectrum?

This depends on the WAVE_RATE, i.e. the sample rate of the output stream. When NFM support is not compiled in, WAVE_RATE is 8000, which is enough to represent 8 kHz of bandwidth using complex samples (i.e. from -4 kHz to +4 kHz). When NFM is enabled, WAVE_RATE is 16000, so the representable bandwidth is from -8 to +8 kHz. Any signal outside this range will fold back into it and appear at a different frequency than it really is.

Example:

WAVE_RATE=16000, FFT bin width: 5 kHz, centerfreq = 100 MHz, channel of interest: 101.000 MHz.

The channel of interest will land in bin 201 and its frequency will appear in this bin at freq=0.

Now let's add a second channel at 100.995 MHz, keeping centerfreq unchanged. The new channel frequency will land in bin 200 and will get shifted to 0. However, due to spectral leakage the signal at 101.000 will also be visible in this bin. Where will it show up? The answer is: at +5 kHz, because this is the distance between this signal and the configured channel frequency.

Now the same thing, but with WAVE_RATE=8000, which is enough to represent the bandwidth from -4 to +4kHz. The spurious signal at +5kHz won't fit and will fold back to -3kHz. This is where it will show up in the spectrum.

Let's do this again - new channel freq: 100.990 MHz. This lands in bin 199. The signal at 101.0 is +10 kHz away. With WAVE_RATE=16000 it will show up at -6kHz (wrap-around from +8kHz). With WAVE_RATE=8000 it will appear at +2kHz (4kHz to the right from DC, then wrap-around to -4kHz and then another +6kHz to the right).
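The wrap-around arithmetic in these examples can be written as a small function. This is a sketch of the folding rule only; the name `apparent_offset` is hypothetical and nothing here is taken from the rtl_airband source.

```python
def apparent_offset(true_offset_hz, wave_rate):
    """Fold a frequency offset into the representable range
    [-wave_rate/2, +wave_rate/2) of a complex stream sampled at wave_rate."""
    half = wave_rate / 2
    return (true_offset_hz + half) % wave_rate - half

if __name__ == "__main__":
    for offset, rate in [(5000, 16000), (5000, 8000),
                         (10000, 16000), (10000, 8000),
                         (15000, 16000)]:
        print(offset, rate, apparent_offset(offset, rate))
```

The first four cases reproduce the figures above: +5 kHz stays at +5 kHz with WAVE_RATE=16000 and folds to -3 kHz with WAVE_RATE=8000, while +10 kHz lands at -6 kHz and +2 kHz respectively. The last case is the one from the earlier post: a signal 15 kHz away folds to -1 kHz with WAVE_RATE=16000, which is why it showed up close to zero and could not be removed by a low-pass filter after rotation.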

charlie-foxtrot commented 4 years ago

@szpajder thanks for all the detailed answers. I opened a PR here: #184 and will close this issue.

charlie-foxtrot / RTLSDR-Airband

Neighboring channel bleed-over: combine multiple FFT bins? #169