jontio / JAERO

Demodulate and decode Aero signals. These signals contain SatCom ACARS messages as used by planes beyond VHF ACARS range
https://jontio.zapto.org/hda1/jaero.html
MIT License

Problem running under Linux #28

Closed sv1 closed 4 years ago

sv1 commented 6 years ago

For some strange reason I cannot run JAERO 1.4.0.7 under Linux:

(gdb) r
Starting program: /opt/linux/JAERO/JAERO/JAERO
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffe9b65700 (LWP 3468)]
[New Thread 0x7fffe8f4f700 (LWP 3469)]
[New Thread 0x7fffcffff700 (LWP 3470)]
[New Thread 0x7fffcf7fe700 (LWP 3471)]
[New Thread 0x7fffceffd700 (LWP 3472)]
[New Thread 0x7fffce7fc700 (LWP 3473)]
[New Thread 0x7fffcdffb700 (LWP 3474)]
[New Thread 0x7fffcd7fa700 (LWP 3475)]

Thread 1 "JAERO" received signal SIGILL, Illegal instruction.
0x00007ffff7bd421f in fill_table () from /usr/local/lib/libcorrect.so

jontio commented 6 years ago

Strange. I tried a VM with uname -a == Linux qos 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux. The processor was an Intel G3220 and knows MMX,SSE,SSE2,SSE3,SSE4.1,SSE4.2,EM64T and VT-x.

I compiled JAERO then libcorrect. I cloned JAERO ( git clone https://github.com/jontio/JAERO.git ) as it is the same as version 1.4.0.7. I had to add LIBS += -lcorrect to the JAERO.pro file. libcorrect was the one that is in the JAERO repo. I then installed the libcorrect library. I then ran gdb, set the exe file and ran it, waited for a bit, then quit JAERO...

(gdb) exec-file ./JAERO
(gdb) r
Starting program: /home/jonti/Desktop/JAERO/JAERO/JAERO JAERO
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffe9b7d700 (LWP 20700)]
[New Thread 0x7fffe3fff700 (LWP 20701)]
[New Thread 0x7fffe37fe700 (LWP 20702)]
PulseAudioService: pa_context_connect() failed
[Thread 0x7fffe37fe700 (LWP 20702) exited]
[New Thread 0x7fffe37fe700 (LWP 20704)]
[New Thread 0x7fffe2ffd700 (LWP 20705)]
[New Thread 0x7fffe27fc700 (LWP 20706)]
[New Thread 0x7fffe1ffb700 (LWP 20707)]
[Thread 0x7fffe37fe700 (LWP 20704) exited]
using null input device, none available
using null input device, none available
using null input device, none available
using null input device, none available
[Thread 0x7fffe27fc700 (LWP 20706) exited]
[Thread 0x7fffe1ffb700 (LWP 20707) exited]
[Thread 0x7fffe2ffd700 (LWP 20705) exited]
[Thread 0x7fffe9b7d700 (LWP 20700) exited]
[Thread 0x7fffe3fff700 (LWP 20701) exited]
[Inferior 1 (process 20699) exited normally]
(gdb)  

So I didn't manage to reproduce the illegal instruction in libcorrect's fill_table function.

What libcorrect version did you use? The one with JAERO or the latest one from https://github.com/quiet/libcorrect ?

sv1 commented 6 years ago

Hello Jonti,

On Fri, 2018-07-06 at 15:37 -0700, Jonti Olds wrote:

Strange. I tried a VM with uname -a == Linux qos 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux. The processor was an Intel G3220 and knows MMX,SSE,SSE2,SSE3,SSE4.1,SSE4.2,EM64T and VT-x.

I am compiling on a Debian Linux machine, Linux apollo 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux

with an Intel Core2 Duo E8400 CPU

I compiled JAERO then libcorrect. I cloned JAERO ( git clone https://github.com/jontio/JAERO.git ) as it is the same as version 1.4.0.7.

Are you sure?

$ ./JAERO -v
JAERO 1.0.4.4

I had to add LIBS += -lcorrect to the JAERO.pro file.

So did I, but I cannot compile JAERO without first compiling and installing libcorrect. I get the error "-lcorrect not found".

If I compile/install libcorrect first, then JAERO compilation runs without errors.

libcorrect was the one that is in the JAERO repo. I then installed the libcorrect library.

I have tried both: the one in the JAERO repo and the one from the "quiet" repo.

Do we need to enable the "libfec compatibility layer" of libcorrect? I have not tried it yet because I already have a libfec installation.

Thanks again for your help


unixpunk commented 6 years ago

I vaguely remember having a similar issue when there is no input sound device available on the system. HTH

sv1 commented 6 years ago

On Sun, 2018-07-08 at 10:44 -0700, unixpunk wrote:

I vaguely remember having a similar issue when there is no input sound device available on the system. HTH

$ lspci |grep -i audio
05:00.0 Multimedia audio controller: Creative Labs EMU10k2/CA0100/CA0102/CA10200 [Sound Blaster Audigy Series] (rev 03)
05:01.0 Multimedia audio controller: Ensoniq ES1371/ES1373 / Creative Labs CT2518 (rev 06)

jontio commented 6 years ago

It's hard to figure out what is going on when I can't reproduce the error. However what I think is happening is this...

Your CPU doesn't support SSE4.2. In SSE4.2 the instruction POPCNT was introduced, which is a fast way of counting bits set to 1. In fill_table in lookup.c of libcorrect, line 15 says out |= (popcount(i & poly[j]) % 2) ? mask : 0; . In portable.h we have

#ifdef __GNUC__
#define HAVE_BUILTINS
#endif

#ifdef HAVE_BUILTINS
#define popcount __builtin_popcount
#define prefetch __builtin_prefetch
#else

static inline int popcount(int x) {
    /* taken from the helpful http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel */
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    return ((x + (x >> 4) & 0x0f0f0f0f) * 0x01010101) >> 24;
}

static inline void prefetch(void *x) {}

#endif

When you are running cmake, __GNUC__ will be defined, so HAVE_BUILTINS is defined too. I think gcc thinks you have SSE4.2 support and compiles __builtin_popcount as POPCNT rather than defining a function itself.

I don't know if that is a bug with gcc or not. When I tried to compile libcorrect on a pi I didn't get a POPCNT instruction used.

I think the easiest way to fix the problem would simply be commenting out #define HAVE_BUILTINS in portable.h (change line 2 of portable.h to //#define HAVE_BUILTINS).

sv1 commented 6 years ago

You are absolutely correct!!!

Commenting out the #define HAVE_BUILTINS in libcorrect/include/portable.h solved the problem.

I think that this is an issue of libcorrect. Libcorrect should check for different SSE versions and compile accordingly.
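One way to read "check and compile accordingly" is a check made while the program runs. The following is a minimal sketch of that idea, not libcorrect's code and not a description of how the library was later fixed. The names soft_popcount, hw_popcount, checked_popcount, popcount_paths_agree and HAVE_HW_POPCOUNT_PATH are hypothetical. It assumes GCC or Clang on x86, where __builtin_cpu_supports and the target("popcnt") function attribute are available; everywhere else it falls back to the portable bit trick.

```c
#include <assert.h>
#include <stdint.h>

/* Portable bit count: the same bit-twiddling idea as the fallback in
 * portable.h, written on an unsigned type so every shift is well defined. */
static int soft_popcount(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    return (int)((((x + (x >> 4)) & 0x0f0f0f0fu) * 0x01010101u) >> 24);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_HW_POPCOUNT_PATH 1
/* The attribute allows POPCNT inside this one function only, so the rest
 * of the file is still built for the baseline CPU. */
__attribute__((target("popcnt")))
static int hw_popcount(uint32_t x) {
    return __builtin_popcount(x);
}
#endif

/* Hypothetical entry point: ask the running CPU before using POPCNT. */
int checked_popcount(uint32_t x) {
#ifdef HAVE_HW_POPCOUNT_PATH
    if (__builtin_cpu_supports("popcnt")) {
        return hw_popcount(x);
    }
#endif
    return soft_popcount(x);
}

/* Returns the bit count and asserts that the chosen path matches the
 * portable one. */
int popcount_paths_agree(uint32_t x) {
    int n = checked_popcount(x);
    assert(n == soft_popcount(x));
    return n;
}
```

The cost is one branch per call, so for a table that is filled once, as in fill_table, simply using the portable version everywhere is probably just as good.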


jontio commented 5 years ago

Can you try typing gcc -mavx2 -dM -E - < /dev/null | egrep "SSE|AVX" | sort on your computer that doesn't have SSE4.2?

I tried it on a computer that didn't have AVX but still got...

#define __AVX__ 1
#define __AVX2__ 1
#define __SSE__ 1
#define __SSE_MATH__ 1
#define __SSE2__ 1
#define __SSE2_MATH__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE4_2__ 1
#define __SSSE3__ 1

So it's looking like that isn't a solution for libcorrect but worth a try at least.
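A likely reason for that output is that GCC's predefined feature macros follow the -m flags given on the command line (here -mavx2, which also turns on the SSE levels below it), not the CPU the compiler happens to run on. The sketch below puts the two questions side by side: what the build was told to assume, and what the running CPU reports. The function names are hypothetical, and the runtime half assumes GCC or Clang on x86, where __builtin_cpu_supports exists; on other targets it reports "unknown".

```c
#include <assert.h>
#include <stdio.h>

/* 1 if this file was compiled with SSE4.2 enabled (for example through
 * -msse4.2 or -mavx2), whatever machine did the compiling. */
int compiled_for_sse42(void) {
#ifdef __SSE4_2__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the CPU running the program reports SSE4.2, 0 if it does not,
 * -1 if this sketch has no way to ask (non-x86 or non-GNU compiler). */
int cpu_has_sse42(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.2") ? 1 : 0;
#else
    return -1;
#endif
}

/* 1 when the build assumes SSE4.2 but the CPU lacks it, which is the
 * combination that can end in SIGILL. */
int build_exceeds_cpu(void) {
    int cpu = cpu_has_sse42();
    assert(cpu >= -1 && cpu <= 1);
    return compiled_for_sse42() == 1 && cpu == 0;
}

/* Prints both answers so they can be compared on a given machine. */
void report_sse42(void) {
    printf("compiled for SSE4.2: %d\n", compiled_for_sse42());
    printf("CPU reports SSE4.2:  %d\n", cpu_has_sse42());
    printf("build exceeds CPU:   %d\n", build_exceeds_cpu());
}
```

Calling report_sse42() from a program built with the same flags as libcorrect would show whether the build assumes more than the E8400 provides.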

jontio commented 5 years ago

I think this problem has been raised and solved in the libcorrect repo: https://github.com/quiet/libcorrect/pull/24 . Not sure how it works as I'm not that up to speed with cmake, but interesting.