libimobiledevice / libplist

A library to handle Apple Property List format in binary or XML
https://libimobiledevice.org
GNU Lesser General Public License v2.1
535 stars 304 forks source link

Testsuite failures on Linux sparc64 (Bus error) #118

Closed glaubitz closed 5 years ago

glaubitz commented 6 years ago

The testsuite fails on Linux sparc64 machines due to bus errors:


Making check in test
make[2]: Entering directory '/<<PKGBUILDDIR>>/test'
make  check-TESTS
make[3]: Entering directory '/<<PKGBUILDDIR>>/test'
make[4]: Entering directory '/<<PKGBUILDDIR>>/test'
FAIL: empty.test
FAIL: small.test
FAIL: medium.test
FAIL: large.test
FAIL: huge.test
PASS: bigarray.test
FAIL: emptycmp.test
FAIL: smallcmp.test
FAIL: mediumcmp.test
FAIL: largecmp.test
FAIL: hugecmp.test
PASS: bigarraycmp.test
FAIL: dates.test
FAIL: timezone1.test
FAIL: timezone2.test
PASS: signedunsigned1.test
PASS: signedunsigned2.test
PASS: signedunsigned3.test
PASS: hex.test
PASS: order.test
PASS: recursion.test
FAIL: entities.test
PASS: empty_keys.test
PASS: amp.test
PASS: invalid_tag.test
PASS: cdata.test
PASS: offsetsize.test
PASS: refsize.test
PASS: malformed_dict.test
=========================================
   libplist 2.0.0: test/test-suite.log
=========================================

# TOTAL: 29
# PASS:  15
# SKIP:  0
# XFAIL: 0
# FAIL:  14
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: empty
===========

./empty.test: line 11:  7023 Bus error               $top_builddir/test/plist_test $DATASRC/$TESTFILE $DATAOUT/$TESTFILE.out
FAIL empty.test (exit status: 138)

FAIL: small
===========

./small.test: line 11:  7043 Bus error               $top_builddir/test/plist_test $DATASRC/$TESTFILE $DATAOUT/$TESTFILE.out
FAIL small.test (exit status: 138)

FAIL: medium
============

./medium.test: line 11:  7063 Bus error               $top_builddir/test/plist_test $DATASRC/$TESTFILE $DATAOUT/$TESTFILE.out
FAIL medium.test (exit status: 138)

FAIL: large
===========

./large.test: line 11:  7083 Bus error               $top_builddir/test/plist_test $DATASRC/$TESTFILE $DATAOUT/$TESTFILE.out
FAIL large.test (exit status: 138)

FAIL: huge
==========

./huge.test: line 11:  7103 Bus error               $top_builddir/test/plist_test $DATASRC/$TESTFILE $DATAOUT/$TESTFILE.out
FAIL huge.test (exit status: 138)

FAIL: emptycmp
==============

File does not exists
FAIL emptycmp.test (exit status: 2)

FAIL: smallcmp
==============

File does not exists
FAIL smallcmp.test (exit status: 2)

FAIL: mediumcmp
===============

File does not exists
FAIL mediumcmp.test (exit status: 2)

FAIL: largecmp
==============

File does not exists
FAIL largecmp.test (exit status: 2)

FAIL: hugecmp
=============

File does not exists
FAIL hugecmp.test (exit status: 2)

FAIL: dates
===========

./dates.test: line 12:  7313 Bus error               $top_builddir/tools/plistutil -i $DATAOUT0 -o $DATAOUT1
FAIL dates.test (exit status: 138)

FAIL: timezone1
===============

./timezone1.test: line 16:  7375 Bus error               $top_builddir/test/plist_cmp $DATAOUT0 $DATAOUT1
FAIL timezone1.test (exit status: 138)

FAIL: timezone2
===============

./timezone2.test: line 15:  7409 Bus error               TZ=UTC $top_builddir/tools/plistutil -i $DATAOUT0 -o $DATAOUT1
FAIL timezone2.test (exit status: 138)

FAIL: entities
==============

./entities.test: line 12:  7639 Bus error               $top_builddir/test/plist_cmp $DATAIN0 $DATAOUT0
FAIL entities.test (exit status: 138)

============================================================================
Testsuite summary for libplist 2.0.0
============================================================================
# TOTAL: 29
# PASS:  15
# SKIP:  0
# XFAIL: 0
# FAIL:  14
# XPASS: 0
# ERROR: 0
============================================================================
See test/test-suite.log
Please report to https://github.com/libimobiledevice/libplist/issues
============================================================================
make[4]: *** [Makefile:772: test-suite.log] Error 1
make[4]: Leaving directory '/<<PKGBUILDDIR>>/test'
make[3]: *** [Makefile:880: check-TESTS] Error 2
make[3]: Leaving directory '/<<PKGBUILDDIR>>/test'
make[2]: *** [Makefile:943: check-am] Error 2
make[2]: Leaving directory '/<<PKGBUILDDIR>>/test'
make[1]: *** [Makefile:438: check-recursive] Error 1
make[1]: Leaving directory '/<<PKGBUILDDIR>>'```

This is usually a result of unaligned access. I'll provide a gdb backtrace later.
nikias commented 6 years ago

Yes backtrace would be great!

glaubitz commented 6 years ago

Interesting, I just tried with the version from git and cannot reproduce the issue there. So, it seems it was fixed after 2.0.0 which is what Debian currently ships. I'll still get a backtrace.

glaubitz commented 6 years ago

Here's the backtrace:

Program received signal SIGBUS, Bus error.
parse_real_node (bnode=bnode@entry=0x7feffffee58, size=<optimized out>, size@entry=3 '\003') at bplist.c:286
286             data->realval = *(double *) buf;
(gdb) bt
#0  parse_real_node (bnode=bnode@entry=0x7feffffee58, size=<optimized out>, size@entry=3 '\003') at bplist.c:286
#1  0xffff80010012eb04 in parse_date_node (size=3 '\003', bnode=0x7feffffee58) at bplist.c:301
#2  parse_bin_node (object=0x7feffffee58, bplist=0x7feffffefd8) at bplist.c:680
#3  parse_bin_node_at_index (bplist=bplist@entry=0x7feffffefd8, node_index=node_index@entry=15) at bplist.c:785
#4  0xffff80010012edac in parse_dict_node (size=9, bnode=0x7feffffef18, bplist=0x7feffffefd8) at bplist.c:506
#5  parse_bin_node (object=0x7feffffef18, bplist=0x7feffffefd8) at bplist.c:727
#6  parse_bin_node_at_index (bplist=bplist@entry=0x7feffffefd8, node_index=node_index@entry=0) at bplist.c:785
#7  0xffff80010012fa4c in plist_from_bin (
    plist_bin=0x1000010eeb0 "bplist00\331\001\003\005\006\b\n\f\016\020\002\004\002\a\t\v\r\017\021_\020\021Some ASCII stringP_\020\021Some UTF8 strings\251\002\002\002\002\002\002\002\002\002_\020\021Keys & \"entities\"WBoolean\b_\020\017Another Boolean\tXSome Int\020", length=<optimized out>, plist=0x7fefffff0d8) at bplist.c:880
#8  0x0000010000000bd8 in main (argc=<optimized out>, argv=<optimized out>) at plist_test.c:92
(gdb)

The *(double *) is a classic example for unaligned access. You cast a pointer from one type to another which means the compiler can no longer make sure the access is aligned.

Looking at the current code, there is still the potentially problematic use of pointer casts. It just seems that the accesses don't become unaligned for some reason.

FWIW, systemd has some helper code for unaligned access which might be useful, see: https://github.com/systemd/systemd/blob/4d09e1c8bab1d684172b1f277f3213825b30d2d9/src/basic/unaligned.h

nikias commented 6 years ago

There is the use of get_unaligned, maybe that is sufficient? https://github.com/libimobiledevice/libplist/blob/master/src/bplist.c?#L271-L278

nikias commented 6 years ago

Not sure if it would make sense to add another get_unaligned around the double and float casts?

glaubitz commented 6 years ago

Not sure if it would make sense to add another get_unaligned around the double and float casts?

I think that would definitely be a very good idea. Whenever you make these casts, you risk of creating unaligned access which causes exceptions on some platforms. Using get_unaligned() avoids this problem.

This works for me:

diff --git a/src/bplist.c b/src/bplist.c
index a9724b8..f4abd6e 100644
--- a/src/bplist.c
+++ b/src/bplist.c
@@ -270,11 +270,11 @@ static plist_t parse_real_node(const char **bnode, uint8_t size)
     {
     case sizeof(uint32_t):
         *(uint32_t*)buf = float_bswap32(get_unaligned((uint32_t*)*bnode));
-        data->realval = *(float *) buf;
+        data->realval = get_unaligned((float *) buf);
         break;
     case sizeof(uint64_t):
         *(uint64_t*)buf = float_bswap64(get_unaligned((uint64_t*)*bnode));
-        data->realval = *(double *) buf;
+        data->realval = get_unaligned((double *) buf);
         break;
     default:
         free(data);
nikias commented 5 years ago

I was wondering if I missed something here. There is already a get_unaligned() when reading the original value (original commit 292994b09fcfac64e14de3b20eab7821614e33dd), also shown in your above diff. It was reported by ASAN so I added it there; I think this should be enough. It will be stored in a new local unsigned char[8] which should be properly aligned.

glaubitz commented 5 years ago

It seems that the issue was fixed in a different way. The package now builds fine in Debian:

https://buildd.debian.org/status/logs.php?pkg=libplist&arch=sparc64

From the Debian changelog:

libplist (2.0.0-5) unstable; urgency=medium

  * QA upload.

  [ Yves-Alexis Perez ]
  * Add myself to Uploaders
  * Use Salsa git packaging repository under imobiledevice-team.
  * Remove useless X-Python-Version field from debian/control.
  * Add debian/gbp.conf.

  [ Boyuan Yang ]
  * Backport upstream bugfixes till 2018-09-04.
    + Drop no-unaligned-access.patch, fixes upstream in another way.

 -- Boyuan Yang <byang@debian.org>  Tue, 18 Sep 2018 15:17:32 -0400
nikias commented 5 years ago

Seems fixed according to previous comment. Closing.

Cogitri commented 5 years ago

Hello,

we hit about the same thing on Alpine, but on armv7. We're using the 2.0.0 release tarball for building, here's the log:

=========================================
   libplist 2.0.0: test/test-suite.log
=========================================

# TOTAL: 29
# PASS:  16
# SKIP:  0
# XFAIL: 0
# FAIL:  13
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: empty
===========

File ../test/data/1.plist is open
Bus error
FAIL empty.test (exit status: 135)

FAIL: small
===========

File ../test/data/2.plist is open
Bus error
FAIL small.test (exit status: 135)

FAIL: medium
============

File ../test/data/3.plist is open
Bus error
FAIL medium.test (exit status: 135)

FAIL: large
===========

File ../test/data/4.plist is open
Bus error
FAIL large.test (exit status: 135)

FAIL: huge
==========

File ../test/data/5.plist is open
Bus error
FAIL huge.test (exit status: 135)

FAIL: emptycmp
==============

File does not exists
FAIL emptycmp.test (exit status: 2)

FAIL: smallcmp
==============

File does not exists
FAIL smallcmp.test (exit status: 2)

FAIL: mediumcmp
===============

File does not exists
FAIL mediumcmp.test (exit status: 2)

FAIL: largecmp
==============

File does not exists
FAIL largecmp.test (exit status: 2)

FAIL: hugecmp
=============

File does not exists
FAIL hugecmp.test (exit status: 2)

FAIL: dates
===========

Bus error
FAIL dates.test (exit status: 135)

FAIL: timezone1
===============

Bus error
FAIL timezone1.test (exit status: 135)

FAIL: timezone2
===============

Bus error
FAIL timezone2.test (exit status: 135)

Sadly I can't provide you with a backtrace for that because I don't have a ARMv7 machine, this is from a CI run.

glaubitz commented 5 years ago

@Cogitri Can you try whether a current git snapshot works for you? Debian is using git snapshots for both libplist and libimobiledevice.

Cogitri commented 5 years ago

Hm, maybe it's time to a new release of libplist then. I don't have a iDevice to test it, so I'd rather not do the validation work for this.

glaubitz commented 5 years ago

FWIW, gcc-9 has a new warning option called address-of-packed-member which will hopefully reduce the amount of code with broken alignment in the future.