jketterl / openwebrx

Open source, multi-user SDR receiver software with a web interface
https://www.openwebrx.de
GNU Affero General Public License v3.0

Cannot run on CPUs without SSE4 / AVX #38

Closed sm4xas closed 4 years ago

sm4xas commented 4 years ago

Hello, I just installed your openwebrx fork on a brand new Linux Mint 19. Trying to run it in docker, I get these errors. The web server runs, but there is no audio/RX data.

docker run --device /dev/bus/usb -p 8073:8073 -v openwebrx-config:/etc/openwebrx jketterl/openwebrx

2019-12-26 20:12:38,810 - owrx.sdr - INFO - SDR sources loaded. Availables SDRs: RTL-SDR USB Stick, Airspy HF+, SDRPlay RSP2
2019-12-26 20:12:51,596 - owrx.connection - DEBUG - client connection intitialized
2019-12-26 20:12:51,606 - owrx.source - DEBUG - activating profile 70cm
2019-12-26 20:12:51,609 - owrx.source - DEBUG - activating profile 20m
2019-12-26 20:12:51,610 - owrx.source - DEBUG - activating profile 20m
2019-12-26 20:12:51,624 - owrx.source - INFO - Started rtl source: rtl_connector -p 4950 -c 34763 -s 2400000 -f 438800000 -g 30 -P 0
Found 1 device(s):
0: Realtek, RTL2838UHIDIR, SN: 00000001

Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
[R82XX] PLL not locked!
IQ worker thread started
socket setup complete, waiting for connections
setting up control socket...
control socket started on 34763
2019-12-26 20:12:52,329 - owrx.source.connector - DEBUG - opening control socket...
control connection established
2019-12-26 20:12:52,329 - owrx.dsp - DEBUG - received STATE_RUNNING, attempting DspSource restart
client connection establised
2019-12-26 20:12:52,330 - csdr.csdr - DEBUG - Command = nc -v 127.0.0.1 4950 | csdr shift_addition_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_shift_pipe | csdr fir_decimate_cc 217 0.0006912442396313364 HAMMING | csdr bandpass_fir_fft_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_bpf_pipe 0.028933333333333332 HAMMING | csdr squelch_and_smeter_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_squelch_pipe --outfifo /tmp/openwebrx/openwebrx_pipe_140442144764304_smeter_pipe 5 1 | csdr fmdemod_quadri_cf | csdr limit_ff | csdr fractional_decimator_ff 1.0031662434559077 | csdr deemphasis_nfm_ff 11025 | csdr convert_f_s16 | csdr encode_ima_adpcm_i16_u8
2019-12-26 20:12:52,341 - owrx.dsp - DEBUG - adding new output of type audio
Connection to 127.0.0.1 4950 port [tcp/] succeeded!
client connection establised
csdr squelch_and_smeter_cc: csdr bandpass_fir_fft_cc: csdr shift_addition_cc: fifo control mode on fifo control mode on fifo control mode on csdr fractional_decimator_ff: csdr squelch_and_smeter_cc: closing client socket
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
closing client socket
Illegal instruction (core dumped)
Illegal instruction (core dumped)
2019-12-26 20:12:52,850 - csdr.csdr - DEBUG - dsp thread ended with rc=0
2019-12-26 20:12:52,850 - csdr.csdr - DEBUG - restarting since rc = 0, self.running = true, and no modification
2019-12-26 20:12:52,851 - csdr.csdr - DEBUG - Command = nc -v 127.0.0.1 4950 | csdr shift_addition_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_shift_pipe | csdr fir_decimate_cc 217 0.0006912442396313364 HAMMING | csdr bandpass_fir_fft_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_bpf_pipe 0.028933333333333332 HAMMING | csdr squelch_and_smeter_cc --fifo /tmp/openwebrx/openwebrx_pipe_140442144764304_squelch_pipe --outfifo /tmp/openwebrx/openwebrx_pipe_140442144764304_smeter_pipe 5 1 | csdr fmdemod_quadri_cf | csdr limit_ff | csdr fractional_decimator_ff 1.0031662434559077 | csdr deemphasis_nfm_ff 11025 | csdr convert_f_s16 | csdr encode_ima_adpcm_i16_u8
2019-12-26 20:12:52,859 - owrx.dsp - DEBUG - adding new output of type audio
Connection to 127.0.0.1 4950 port [tcp/] succeeded!
csdr shift_addition_cc: client connection establised fifo control mode on csdr squelch_and_smeter_cc: fifo control mode on csdr bandpass_fir_fft_cc: fifo control mode on csdr fractional_decimator_ff: csdr squelch_and_smeter_cc:
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
closing client socket
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
Illegal instruction (core dumped)
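For readers following the csdr command line in the log above, the sample rates along the chain can be worked out from the numbers it contains. This is only a small sketch of that arithmetic: the function name `chain_rates` is made up for illustration, and the inputs are the `-s 2400000` passed to rtl_connector, the decimation factor 217 passed to fir_decimate_cc, and the ratio passed to fractional_decimator_ff.

```python
def chain_rates(input_rate, decimation, fractional_decimation):
    """Return the sample rate after fir_decimate_cc and after fractional_decimator_ff.

    Hypothetical helper for illustration; it only divides the rates.
    """
    after_fir = input_rate / decimation
    after_fractional = after_fir / fractional_decimation
    return after_fir, after_fractional


# Values copied from the log above.
after_fir, audio_rate = chain_rates(2400000, 217, 1.0031662434559077)
print(round(after_fir, 3), round(audio_rate, 3))
```

The result is roughly 11059.9 samples per second after the FIR decimator and 11025 after the fractional decimator, which matches the 11025 passed to deemphasis_nfm_ff in the same command.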

jketterl commented 4 years ago

I may have gone over the top with the compiler optimization flags. what's your CPU? would you mind posting the "flags" line from /proc/cpuinfo?

sm4xas commented 4 years ago

Hi

See below

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
microcode : 0xb6
cpu MHz : 1599.891
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips : 4799.41
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
microcode : 0xb6
cpu MHz : 1601.518
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips : 4799.41
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
microcode : 0xb6
cpu MHz : 1602.277
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips : 4799.41
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
microcode : 0xb6
cpu MHz : 1599.939
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti tpr_shadow vnmi flexpriority dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips : 4799.41
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

jketterl commented 4 years ago

oh wow. i was expecting avx to be the culprit, but this is a rather old cpu it seems, it doesn't even have sse4. i'll try to do a custom build, but this may well fail at a later stage. software defined radio profits a lot from cpu optimized machine code, so if this causes performance issues on newer cpus, i won't be able to keep it.

i'll update here with the new version. i'll give it a separate tag in docker.
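The check on the "flags" line discussed above can be scripted. The following is a minimal sketch, not part of OpenWebRX: the helper names `cpu_flags` and `missing_features` are made up for illustration, and the sample text is the flags line posted earlier in this thread. On Linux, SSE4.1, SSE4.2 and AVX appear in /proc/cpuinfo as `sse4_1`, `sse4_2` and `avx`.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(flags, wanted=("sse4_1", "sse4_2", "avx")):
    """List the wanted feature flags that the CPU does not report."""
    return [name for name in wanted if name not in flags]


# Flags line of the Core 2 Quad Q6600 as posted in this thread.
SAMPLE = (
    "flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
    "pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni "
    "dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti "
    "tpr_shadow vnmi flexpriority dtherm"
)

print(missing_features(cpu_flags(SAMPLE)))
```

For the Q6600 this reports all three extensions as missing. To check a live machine, the contents of /proc/cpuinfo can be passed to `cpu_flags` instead of `SAMPLE`.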

sm4xas commented 4 years ago

Yes, this was an old computer I had lying around. I am planning to replace it shortly once I know exactly what I need :)

But I know that I'm not the only HAM using old computers for SDR :)

73

jketterl commented 4 years ago

I just pushed jketterl/openwebrx:sse3-latest-x86_64 to the hub. Please try that, and let me know if it works. I will need to run some performance tests with that image myself.

sm4xas commented 4 years ago

Hi!

The new version works fine, I've only had time to do a quick test however.

Thank you!

jketterl commented 4 years ago

alright :) i have updated the title accordingly. i still need to test the performance impact on newer cpus, which will require some more refined setup... if it's not too hard, i can actually drop the newer extensions.

ofadam commented 4 years ago

Is this version also available via Github?

ofadam commented 4 years ago

Or is there a particular dependency that I need to swap to make it work on SSE3 CPUs?

jketterl commented 4 years ago

There's no swapping. Things need to be re-compiled with the optimization settings for the CPU in question.

shruggie commented 4 years ago

Hi, had the same error yesterday:

csdr fractional_decimator_ff: csdr squelch_and_smeter_cc: closing client socket Illegal instruction (core dumped)

Looks like the L5640 is missing AVX. Can you elaborate on what I have to recompile? Is it enough to recompile csdr?

jketterl commented 4 years ago

Am I reading that right, a Xeon family processor without AVX? That's a bit of a surprise for me. It was released in 2010, but still, that's unexpected.

Either way... the components affected (to my knowledge) are csdr and the owrx_connector. Are you installing natively or using docker? For the latter, there's a build.sh available in the project root.

shruggie commented 4 years ago

Correct, it's an old Xeon that lives in an R410, I think from 2009. Yesterday I installed from the Debian repo; I will try docker and/or a manual build this evening. Thanks for the suggestions.

Alex, DC5AJ

jketterl commented 4 years ago

Possible solution: https://lwn.net/Articles/691932/

jketterl commented 4 years ago

hit another speedbump: once again, the musl library in alpine linux does not support things. this time, there is no ifunc in musl, which is required to make target_clones work (in fact, it won't even compile without it).

jketterl commented 4 years ago

OK, so here's the current situation: I have dabbled with this and it seems to be possible, but it will require dumping Alpine as the base for the docker builds. It's going to be tedious, and even though I have been defending Alpine before, this is the tipping point. Before I do that, however, I need to know if it's even working, which largely depends on how the package builds are doing:

The current package builds for csdr do now contain the gcc FMV solution mentioned above for some methods (fir_decimate_cc and shift_addfast_cc, the former one plays an essential role in OpenWebRX demodulation), so they should now work across many platforms (I specifically included this list: "avx", "sse4.2", "sse3", "sse2", which should cover most cases, and I can still add to that if necessary).
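As a rough illustration of the idea behind function multi-versioning: several variants of a routine are built, and the most capable one the CPU supports is selected at run time. The sketch below is only a conceptual analogy in Python, not how gcc implements it (gcc resolves the choice through ifunc when the binary is loaded). The name `pick_target` is hypothetical; the target list is the one quoted above, and the mapping to /proc/cpuinfo flag names is the usual Linux one, where SSE3 is reported as `pni`.

```python
# (target name as listed above, corresponding /proc/cpuinfo flag), best first.
TARGETS = [
    ("avx", "avx"),
    ("sse4.2", "sse4_2"),
    ("sse3", "pni"),
    ("sse2", "sse2"),
]


def pick_target(flags):
    """Return the first (most capable) target the CPU supports, else "default"."""
    for target, flag in TARGETS:
        if flag in flags:
            return target
    return "default"


# A CPU reporting SSE2, SSE3 and SSSE3 but neither SSE4.2 nor AVX.
print(pick_target({"sse", "sse2", "pni", "ssse3"}))
```

For such a CPU the sketch selects the "sse3" variant, while a CPU that reports `avx` would get the "avx" variant.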

You can find the updated packages in the experimental repositories. You can find more information about them here (Debian) and here (Ubuntu). When testing, please make sure that you're using the version from the packages, not any locally compiled binaries that may have been installed before.

I will need any feedback I can get about the packages, since this will have a massive influence over what will happen to the docker builds. If the feedback is as expected, I can backport the same solution into the owrx_connector, too.

In the meantime, the current docker builds in :latest have been reset to use a very light -O2 compiler optimization setting. That should make them work across many platforms, but will probably drive the CPU load up for all existing users. I'd appreciate anybody who can confirm or contradict these two assumptions as well since I don't have much hardware available to test myself!

For all feedback, please state clearly if you're using docker or packages. If you can, include package versions or docker image hashes for eventual analysis.

If anybody wants to test compiling manually: The default build still works as before and is locally optimized. You can switch to the FMV by using the following commands: make clean && CSDR_PACKAGE_BUILD=1 CSDR_FMV=1 make. This may become default if all works out well.

jketterl commented 4 years ago

With very little reassuring feedback (I have one confirmation that it's now working where it didn't before), I'm going out on a limb here... I spent the day restructuring the docker build to debian:buster-slim. The images have gained some size, it seems: +30MB compressed, and about +100MB on the hard drive. Build times have gone up, too, though there are usually not many changes that require a full rebuild.

This should in theory finally resolve this issue, but I'm not fully convinced. I'm still hoping for some kind of feedback... I guess if it's broken, I'll know soon...

jketterl commented 4 years ago

additional input came in on #111 - seems like this does not only affect my software. I have tracked down and set some SIMD flags that are available in the LimeSuite (the package that allows connections to LimeSDR devices). This seems to be narrowing down on docker builds now.

jketterl commented 4 years ago

I just pushed new packages of csdr (0.16.0) and owrx_connector (0.2.0) into the stable repositories for Debian and Ubuntu, so this should be resolved for everybody on package installs as soon as the updates are applied.

ofadam commented 4 years ago

I'm getting a "OWARNING: Soapy overflow" message shortly after launching openwebrx on an SSE3 CPU - is that related?

jketterl commented 4 years ago

no, that's just the buffers overflowing when they're not processed fast enough.

ofadam commented 4 years ago

One other possibly related question – I got everything working on the SSE3 CPU machine, but am getting stuttering audio. Strangely, both the CPU percentage on the host machine and the one showing in the openwebrx status bar show around 25% or even less utilization.

jketterl commented 4 years ago

I wouldn't think so. The bug at hand prevented the binaries from running, so you'd be getting no audio at all in that case. Have you had a look if there's anything peculiar in the logs?

Stuttering audio can be caused by some other factors... The first thing to check is the sample rate (is it supported by the device? is the device actually using it? does your waterfall look as expected? use csdr through to measure actual data rates). The other thing to keep in mind is that even though the demodulation process is split into a series of smaller processes to spread things among cores, these processes are of different complexity, and as such CPU usage varies among them. What that means is that if the most complex of these processes exceeds the capacity of a single core, you will probably get stuttering audio, too.
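The last point can be made concrete with a little arithmetic: a single process pinned at one full core shows up as only a fraction of the total on a multi-core machine. The sketch below uses invented per-process figures purely for illustration (they are not measurements from this thread), and the helper names `overall_percent` and `saturated` are hypothetical.

```python
def overall_percent(per_process_percent, cores):
    """Whole-machine utilisation, assuming each process is single-threaded."""
    return sum(per_process_percent) / cores


def saturated(per_process_percent, threshold=95.0):
    """Indices of processes that are close to using one full core."""
    return [i for i, p in enumerate(per_process_percent) if p >= threshold]


# Hypothetical per-process CPU usage (percent of one core) for a chain
# of five processes running on a four-core machine.
loads = [98.0, 6.0, 4.0, 2.0, 2.0]
print(overall_percent(loads, 4), saturated(loads))
```

With these made-up numbers the machine as a whole reads 28% busy, yet the first process is at its single-core limit. Looking at per-process usage (for example in top) rather than the overall percentage is what would reveal such a case.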

jketterl commented 4 years ago

Stuttering may also happen on the client side, depending on the implementation. The "AudioNode" api (the old and established one) has some issues since it needs to share the thread with the rendering (and that's pretty busy with the waterfall). The new "AudioWorklets" api is supposed to improve that, but it's only available on chrome, and only over https (or localhost, may be relevant). I have found the new audio api to be somewhat unreliable on linux clients, but only at times. No idea what's happening there, but it occurs across all applications (including the AudioWorklets demo apps), so it's probably not related to OpenWebRX.

This is mostly relevant when running on less capable machines; are you running the browser on the same machine?

ofadam commented 4 years ago

Strangely, the waterfall works as expected with signals present - and the audio does as well, other than stuttering a whole lot.

I see nothing amiss in the openwebrx log that's running. I'm using an SDRPlay RSP1A device with it.

[Screenshot: Screen Shot 2020-06-05 at 11 53 04 PM]

I'm running the browser on a different machine, although it's the same result when using a browser on the host machine.

I checked all four logical cores on the host and they have room to spare (generally a bit under 50% each.)

I appreciate all the help you've already given - this isn't a critical project, but one I was hopeful to use with an older (yet quad-core) machine.