glidernet / ogn-rf

This software listens to OGN radio messages and sends them to the Open Glider Network.
GNU General Public License v3.0

The GPU accelerated package does not work on the Raspberry Pi 4 #36

Closed rreckel closed 4 years ago

rreckel commented 5 years ago

This is more a problem of the Raspberry Pi 4 firmware.

This issue exists only to tell everyone to use the standard ARM package on a RPI4, until the firmware is up to date.

rreckel commented 5 years ago

The GPU accelerated version is unsupported on RPI4: https://github.com/raspberrypi/firmware/issues/1205

apurvaaeron commented 5 years ago

Hi, I am using the 4B version. What is the solution for this issue?

rreckel commented 5 years ago

There is none. Just use the unaccelerated version. The GPU changed a lot with the new RPI. But anyway, the RPI4 has enough power to run ogn on the main CPU ;-)

wimrijnders commented 3 years ago

Any interest in a Pi4 version of hello_fft? I'm holding myself back implementing it. Give me a reason.

Credibility: doing Pi4 GPU programming already.

snip commented 3 years ago

Thanks @wimrijnders for the proposal, but I don't think it is necessary, as the Pi4 is powerful enough. The added value would perhaps be not needing two versions for the Pi. Adding an option in the code to detect VideoCore IV could be an alternative. @pjalocha any thoughts?

wimrijnders commented 3 years ago

'powerful enough': ok, but with the usage of the VideoCore VI it becomes even more powerful, doesn't it?

In any case, thanks for responding.

pjalocha commented 3 years ago

For a multicore Raspberry Pi the GPU is really not that important. It somewhat reduces the CPU usage of the FFT thread, but the receiver runs with the same performance with or without it; there is just a small difference in CPU usage. The GPU was important for the original Raspberry Pi, where it was really life or death: it works or it doesn't. That is no longer the case.

But I noticed that, with the GPU FFT being fast, I spend significant CPU time on applying the window: since I need to do a sliding-window FFT, I have to multiply each batch by a given (constant) window. I think if this part could be done by the GPU as well, it would help a lot; the CPU gain would be really significant. So if I could encourage you to take this direction... The window is constant, so it could be preloaded into GPU memory, and each FFT batch would be multiplied by it, which should be a very easy and quick process. Also, for the sliding window, part of the batch could be kept in GPU memory, reducing the transfer. So a lot of things could be optimized.

If you could produce such code, I think it would be worth using.
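To make the request above concrete, here is a minimal CPU-side sketch in Python of the two ideas: a constant window that is computed once, and a buffer that keeps the overlapping half of the previous batch so that only the new half has to be transferred. It is not GPU code and not the ogn-rf implementation. The class name `SlidingWindower`, its `push` method, and the exact window formula `sin(pi * n / size)` are assumptions made for illustration.

```python
import math


class SlidingWindower:
    """Hypothetical helper: turns a sample stream into windowed, half-overlapping slices."""

    def __init__(self, size=4096):
        self.size = size
        self.hop = size // 2
        # The window is constant, so it is computed once and reused for every batch
        # (on a GPU this would be the "preloaded" part). Assumed shape: half a sine period.
        self.window = [math.sin(math.pi * n / size) for n in range(size)]
        # Second half of the previous slice, retained between calls so that
        # only `hop` new samples need to be supplied each time.
        self.tail = [0.0] * self.hop

    def push(self, new_samples):
        """Accept exactly `hop` new samples and return one windowed slice of `size` samples."""
        if len(new_samples) != self.hop:
            raise ValueError("expected exactly %d new samples" % self.hop)
        block = self.tail + list(new_samples)
        self.tail = block[self.hop:]  # keep the newest half for the next slice
        return [w * x for w, x in zip(self.window, block)]
```

Each call to `push` returns a slice that is ready to be handed to an FFT; the samples may be real or complex.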

wimrijnders commented 3 years ago

@pjalocha Thanks for the explanation! I understand now that the FFT is part of another application involving a receiver, which explains the description of 'powerful enough'.

I sort of understand the sliding window thing you are describing. I have some idea how to handle this, but not the full picture, meaning that I have to think about it. I'll get back to you when I know what I'm going to do.

Again, thanks for responding.

pjalocha commented 3 years ago

Yes, we use the GPU FFT as the first processing stage of a receiver which processes 1 MHz or 2 MHz of bandwidth at 868 MHz, i.e. the ISM band (or the 915 MHz band in other regions). The sliding-window FFT allows a continuous signal to be processed in the frequency domain, so we can extract the selected frequency channels from the full SDR bandwidth and then process them further. Implementing the sliding-window FFT and inverse FFT would be fairly useful for such applications. A sliding window basically means taking the input in slices which partly overlap. In this case I use slices 4096 points long which overlap by half, thus 2048 points. Each slice is processed with an FFT, but just before that it is multiplied by a window 4096 points long, which in our case is simply half a sine period, from 0 to 180 deg.
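The slicing described above can be sketched in plain Python as follows. This is only an illustration of the technique, not the ogn-rf code: the function names `fft` and `sliding_window_fft` are hypothetical, the FFT is a simple recursive radix-2 version rather than the GPU one, and the window is assumed to be `sin(pi * n / size)` for `n = 0 .. size - 1`.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def sliding_window_fft(samples, size=4096):
    """Return one spectrum per slice; consecutive slices overlap by size // 2."""
    hop = size // 2
    # Half a sine period (0 to 180 deg), computed once because it is constant.
    window = [math.sin(math.pi * n / size) for n in range(size)]
    spectra = []
    for start in range(0, len(samples) - size + 1, hop):
        block = samples[start:start + size]
        spectra.append(fft([w * s for w, s in zip(window, block)]))
    return spectra
```

For example, with `size=64` and a 256-sample complex tone centred on bin 8, `sliding_window_fft` yields 7 spectra, and the largest magnitude in each falls in bin 8. A frequency channel could then be extracted by picking the relevant range of bins from every spectrum.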