Closed df7cb closed 3 years ago
Thanks for the report. We will look into it.
@df7cb to get the full picture: was all OK before? if yes, what has been changed? (gcc? hamlib? ...) Thanks.
@zcsahok,
here are the full list of logs of buildd with Tlf 1.4.1-1 (before the hamlib4).
The Sid
version of Debian is continuously upgraded, so it may be the cause of GCC - but I don't think that's a Hamlib issue.
FYI trying to reproduce this but it's not possible to install current [20201202] sid mips64el/mipsel using QEMU 1:3.1+dfsg-8+deb10u8 due to kernel panic. (previous installer [20200314] does boot, however) Now installing buster and will try to upgrade.
1.4.1-1 built on 19 May 2020 with gcc-9_9.3.0-12 was fine, while 1.4.1-2 built on 05 Jan 2021 with gcc-10_10.2.1-3 was failing. But that was surely not the only toolchain change... Thanks for looking!
With #211 fixed I retried building git head on mipsel, and the tests are all passing now.
Could you tag a new release? That would be the easiest way forward I think.
Kind of magic, as the changes were just at unrelated places.
That might mean the bug is still there... :( Still, a release would fix the current problem that tlf doesn't build on all Debian release architectures.
Or we can make a patch from the upstream for the 1.4.1. I'll take a look it soon.
@df7cb What is the difference between 1.4.1-1 and 1.4.1-2?
Thanks to the lot of changes in the last year a new release is really due. But does it make sense to have a new release with the unresolved problem of a potential failing test.
Best would be to compile on mipsel64 with -g and run the test (run_cabrillo) separate in a gdb session to narrow down the problem.
@zcsahok if you could manage to set up a system in qemu that would be great.
Code-wise there is no difference, but it's now built with hamlib 4, and the toolchain changed a lot.
I can do more debugging on the mipsel/mips64el machine tomorrow.
Building with -fsanitize=address on amd64 gives:
==12985==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55fc9b0c4afc at pc 0x55fc9b0b2ec5 bp 0x7ffc26e11a80 sp 0x7ffc26e11a78
WRITE of size 4 at 0x55fc9b0c4afc thread T0
#0 0x55fc9b0b2ec4 in write_log_fm_cabr src/readcabrillo.c:91
#1 0x55fc9b0b39ee in cab_qso_to_tlf src/readcabrillo.c:363
#2 0x55fc9b0b14af in test_cabToTlf_ParseQSO test/test_cabrillo.c:157
#3 0x7fc9cde8e238 (/lib/x86_64-linux-gnu/libcmocka.so.0+0x5238)
#4 0x7fc9cde8ebd8 in _cmocka_run_group_tests (/lib/x86_64-linux-gnu/libcmocka.so.0+0x5bd8)
#5 0x55fc9b0b0601 in main test/run_cabrillo.c:37
#6 0x7fc9cdcead09 in __libc_start_main ../csu/libc-start.c:308
#7 0x55fc9b0b0829 in _start (/tmp/tlf-1.4.1/test/run_cabrillo+0xf829)
0x55fc9b0c4afc is located 4 bytes to the left of global variable 'qsoflags_for_qtc' defined in 'test_cabrillo.c:15:5' (0x55fc9b0c4b00) of size 80000
0x55fc9b0c4afc is located 56 bytes to the right of global variable 'bandinx_spy' defined in 'test_cabrillo.c:26:5' (0x55fc9b0c4ac0) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow src/readcabrillo.c:91 in write_log_fm_cabr
The bug is in https://sources.debian.org/src/tlf/1.4.1-2/src/readcabrillo.c/#L91
When nr_qsos
is 0 you are overwriting something outside the array, and on some architectures this might just be the value of bandinx_spy
(on amd64 there seems to be some space between the variables so the write might be harmless there).
Random changes of the code might just change what gets wrongly overwritten.
I've confirmed that an if(nr_qsos)
before that line fixes the test on MIPS, I do not know whether this is the correct fix.
Looks promising.
A quick check of the function in question shows that two lines up store_qso() is called. In that function nr_qsos gets incremented. So normally the variable is at least one.
But in the testcode (test_cabrillo.c) we stub out store_qso() with an empty definition. We should modify it as follow:
void store_qso() { nr_qsos++; }
That should fix it.
void store_qso() { nr_qsos++; }
That should fix it.
Thanks, works.
This fixes the test problem on mipsel (32) as well. Thanks! I'll upload that as hotfix so it's no longer blocking the hamlib4 transition in Debian.
Fixed in PR #212. If no negative comment I will pick it up in next two days.
@AdrianBunk '-fsanitize=address' showed some more problems. Will be fixed by above mentioned commits. Thanks for pointing in that direction.
Fix e95db78 for Cabrillo test confirmed on QEMU mips64el system using
user@dmips64el:~/tlf$ gcc --version
gcc (Debian 10.2.1-3) 10.2.1 20201224
user@dmips64el:~/tlf$ dpkg -l | grep hamlib
ii libhamlib-dev:mips64el 4.0-3 mips64el Development library to control radio transceivers and receivers
ii libhamlib4:mips64el 4.0-3 mips64el Run-time library to control radio transceivers and receivers
It just takes ages to install party due to broken kernel (had to go the buster->bullseye->sid way). And a full Tlf build takes well over an hour.
Ok. Will merge PR #212 now.
tlf 1.4.1 fails on mips64el:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=979377 https://buildd.debian.org/status/fetch.php?pkg=tlf&arch=mips64el&ver=1.4.1-2&stamp=1609846479&raw=0
... and similarly on other architectures: https://buildd.debian.org/status/logs.php?pkg=tlf&ver=1.4.1-2