sshock / AFFLIBv3

AFF is an open and extensible file format to store disk images and associated metadata.

affsign -k: segmentation fault in s390x, ppc64 and sparc64 (Debian) #41

Closed eribertomota closed 4 years ago

eribertomota commented 4 years ago

Hi @sshock,

In my last Debian revision of afflib, I re-enabled the upstream tests (disabled since 2010). The package built on several architectures but failed to build from source on three: s390x, ppc64 and sparc64. I attached the build logs, and the relevant part is below:

Signing AFF file...
affsign -k /tmp/basevP50b.agent.pem /tmp/basevP50b.evidence.aff
Segmentation fault
affsign failed
FAIL test_signing.sh (exit status: 1)

============================================================================
Testsuite summary for AFFLIB 3.7.18
============================================================================
# TOTAL: 5
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0
============================================================================

I have SSH access to all the machines and can help run tests. On an s390x machine, running the command by hand, I see:

./affsign -k /tmp/basevP50b.agent.pem /tmp/basevP50b.evidence.aff
Signing segments...
Calculating BOM for segment badflag...   
Calculating BOM for segment badsectors...   
Calculating BOM for segment afflib_version...   
Calculating BOM for segment aff_file_type...   
Calculating BOM for segment acquisition_commandline...   
Calculating BOM for segment pagesize...   
Calculating BOM for segment sectorsize...   
Calculating BOM for segment page0...   
Calculating BOM for segment page1...   
Segmentation fault

Please, let me know if you need more details or tests.

While I'm here, I would like to ask you to use another source to generate the rawevidence.raw file. Currently, it uses the contents of /usr/share/dict/. That is a problem for me because I have to force the build system to install dictionaries. See below:

TEST ./test_signing.sh
=== MAKING THE TEST FILES ===
Making the random ISO rawevidence.raw
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
cat: '/usr/share/dict/*': No such file or directory
-rw-r--r-- 1 buildd buildd 35127296 Jun 30 03:31 rawevidence.raw
MD5(rawevidence.raw)= a02ae7822b19f78ee40f18d3740567f4

Thanks a lot in advance.

Regards,

Eriberto

afflib-logs.tar.gz

sshock commented 4 years ago

Hi @eribertomota . My suspicion is an endian-ness bug somewhere, but that is just a guess. Can you get me a call stack where it seg faults?

For the rawevidence.raw, I am open to suggestions on how to change that script. I think we could just have it grab files from somewhere else like /bin/ instead. Would that work?

eribertomota commented 4 years ago

> Hi @eribertomota . My suspicion is an endian-ness bug somewhere, but that is just a guess. Can you get me a call stack where it seg faults?

Thanks for your very quick reply. Can you guide me on how to get a call stack?

> For the rawevidence.raw, I am open to suggestions on how to change that script. I think we could just have it grab files from somewhere else like /bin/ instead. Would that work?

I think the rawevidence.raw must be ASCII only (because the old upstream focused on dictionaries). I suggest using local .cpp files as the source, or a file from /etc. What do you think about it?

sshock commented 4 years ago

> Thanks for your very quick reply. Can you guide me on how to get a call stack?

If you run aftest inside gdb (e.g., gdb lib/aftest and then inside gdb type run and hit enter), then inside gdb after it seg faults just type bt and hit enter and that should spit out a back trace.

> I think the rawevidence.raw must be ASCII only (because the old upstream focused on dictionaries). I suggest using local .cpp files as the source, or a file from /etc. What do you think about it?

I can't see anything that indicates they have to be text files, but I like your idea better, so I'll try that.

eribertomota commented 4 years ago

aftest or affsign?

sshock commented 4 years ago

oh yeah, sorry; affsign

sshock commented 4 years ago

This worked for me (I had to use .libs/affsign because affsign ends up being a libtool wrapper script or something):

gdb .libs/affsign
set args -k /tmp/basekIHqK.agent.pem /tmp/basekIHqK.evidence.aff
run

eribertomota commented 4 years ago

In s390x:

/home/eriberto/afflib/afflib-3.7.18/tools/.libs/affsign: error while loading shared libraries: libafflib.so.0: cannot open shared object file: No such file or directory
[Inferior 1 (process 13801) exited with code 0177]

eribertomota commented 4 years ago

Maybe I need to install afflib in the environment...

sshock commented 4 years ago

fyi, I just put in the fix to make it not rely on /usr/share/dict: 15ebb83

eribertomota commented 4 years ago

OK, I did a test on my local machine. I need to install afflib and use a prebuilt tmp/name.pem. The problem is that I don't have permission to install afflib on the remote machine... Is there a way to use a local libafflib.so.0?

eribertomota commented 4 years ago

> fyi, I just put in the fix to make it not rely on /usr/share/dict: 15ebb83

Good!

eribertomota commented 4 years ago

I just installed afflib on the remote machine. A moment please...

eribertomota commented 4 years ago

Well... I don't understand...

$ gdb .libs/affsign
(gdb) set args -k /tmp/baseD7ezF.agent.pem /tmp/baseD7ezF.evidence.aff
(gdb) run
Starting program: /home/eriberto/afflib/afflib-3.7.18/tools/.libs/affsign -k /tmp/baseD7ezF.agent.pem /tmp/baseD7ezF.evidence.aff
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/s390x-linux-gnu/libthread_db.so.1".
Signing segments...
Calculating BOM for segment badflag...   
Calculating BOM for segment badsectors...   
Calculating BOM for segment afflib_version...   
Calculating BOM for segment aff_file_type...   
Calculating BOM for segment acquisition_commandline...   
Calculating BOM for segment pagesize...   
Calculating BOM for segment sectorsize...   
Calculating BOM for segment page0...   
Calculating BOM for segment page1...   
Calculating BOM for segment page2...   
Calculating BOM for segment imagesize...   
Calculating BOM for segment md5...   
Calculating BOM for segment sha1...   
Calculating BOM for segment image_gid...   
Calculating BOM for segment acquisition_date...   
Calculating BOM for segment cert-sha256...   
Calculating BOM for segment badflag/sha256...   
Calculating BOM for segment badsectors/sha256...   
Calculating BOM for segment afflib_version/sha256...   
Calculating BOM for segment aff_file_type/sha256...   
Calculating BOM for segment acquisition_commandline/sha256...   
Calculating BOM for segment pagesize/sha256...   
Calculating BOM for segment sectorsize/sha256...   
Calculating BOM for segment page0/sha256...   
Calculating BOM for segment page1/sha256...   
Calculating BOM for segment page2/sha256...   
Calculating BOM for segment imagesize/sha256...   
Calculating BOM for segment md5/sha256...   
Calculating BOM for segment sha1/sha256...   
Calculating BOM for segment image_gid/sha256...   
Calculating BOM for segment acquisition_date/sha256...   
Calculating BOM for segment cert-sha256/sha256...   

[Inferior 1 (process 55096) exited normally]
(gdb) bt
No stack.
(gdb) 

No fail, no stack.

eribertomota commented 4 years ago

However, I ran the build command again and I got the same error.

sshock commented 4 years ago

Hmm, that is very odd for it to work inside of gdb but not outside, crazy!

How about this. Can you make it dump a core file? I think you can enable that by running ulimit -c unlimited. Then go ahead and run ./affsign -k /tmp/baseD7ezF.agent.pem /tmp/baseD7ezF.evidence.aff. It should create a core file. Then you can open the core file with gdb using gdb .libs/affsign core. Then I think you will be able to immediately run bt in gdb to get the call stack.

eribertomota commented 4 years ago

I created a valgrind log for you... I will try gdb.

valgrind.gz

eribertomota commented 4 years ago

(sid_s390x-dchroot)eriberto@zelenka:~/afflib/afflib-3.7.18/tools$ gdb .libs/affsign core
GNU gdb (Debian 9.2-1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "s390x-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from .libs/affsign...
[New LWP 1877]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/s390x-linux-gnu/libthread_db.so.1".
Core was generated by `/home/eriberto/afflib/afflib-3.7.18/tools/.libs/affsign -k /tmp/baseeqOfU.agent'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000003ff8581bd58 in af_get_page (af=0x2aa2cfa7040, pagenum=<optimized out>, data=<optimized out>, 
    bytes=0x3ffe4dfe3b0) at afflib_pages.cpp:308
308     afflib_pages.cpp: No such file or directory.
(gdb) bt
#0  0x000003ff8581bd58 in af_get_page (af=0x2aa2cfa7040, pagenum=<optimized out>, data=<optimized out>, 
    bytes=0x3ffe4dfe3b0) at afflib_pages.cpp:308
#1  0x000002aa2b482fa4 in affsign (fn=<optimized out>) at affsign.cpp:159
#2  0x000003ff84fab5b6 in __libc_start_main (main=0x2aa2b4828d8 <main(int, char**)>, argc=<optimized out>, 
    argv=0x3ffe4dff288, init=<optimized out>, fini=0x2aa2b484730 <__libc_csu_fini>, 
    rtld_fini=0x3ff858904e8 <_dl_fini>, stack_end=0x3ffe4dff1d0) at libc-start.c:308
#3  0x000002aa2b482b34 in _start () at affsign.cpp:305
(gdb) 

eribertomota commented 4 years ago

---> 308 afflib_pages.cpp: No such file or directory ???

sshock commented 4 years ago

> I created a valgrind log for you... I will try gdb.
>
> valgrind.gz

wow, so many errors!

sshock commented 4 years ago

> ---> 308 afflib_pages.cpp: No such file or directory ???

I think the seg fault happened on line 308 of afflib_pages.cpp, but the No such file or directory just means it couldn't open the source code for some reason.

Here are lines 306-309:

306:    size_t bytes_left_in_sector = (SECTOR_SIZE - (*bytes % SECTOR_SIZE)) % SECTOR_SIZE;
307:    for(size_t i=0;i<bytes_left_in_sector;i++){
308:        data[*bytes + i] = 0;
309:    }

sshock commented 4 years ago

I'm scouring the code here but not seeing the problem yet. Seems likely to be something such as the data buffer not getting allocated to the full size it should be (pagesize).
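
To make the arithmetic on line 306 concrete, here is a minimal sketch. The helper name `bytes_left_in_sector` and the sector size of 512 used in the examples are assumptions for illustration only; they are not taken from AFFLIB. The point is that the padding count is always smaller than one sector, so the loop on lines 307-309 writes at most `SECTOR_SIZE - 1` zero bytes. A crash on line 308 therefore suggests that either the buffer is too small or the starting offset `*bytes` is itself already wrong.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper that mirrors the expression on line 306.
// Returns how many zero bytes are needed to pad `bytes` up to the
// next multiple of `sector_size` (0 if it is already aligned).
std::size_t bytes_left_in_sector(std::size_t bytes, std::size_t sector_size) {
    assert(sector_size != 0);  // a zero sector size would divide by zero
    return (sector_size - (bytes % sector_size)) % sector_size;
}
```

With an assumed sector size of 512, `bytes_left_in_sector(1000, 512)` is 24 and `bytes_left_in_sector(1024, 512)` is 0, so the loop alone cannot run far past a correctly sized page buffer.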

sshock commented 4 years ago

can you tell me what sizeof(long) is on those systems?

eribertomota commented 4 years ago

How to get it?

eribertomota commented 4 years ago

s390x $ cpp -dD /dev/null | grep __SIZEOF_LONG__

#define __SIZEOF_LONG__ 8

hurd $ cpp -dD /dev/null | grep __SIZEOF_LONG__

#define __SIZEOF_LONG__ 4
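
For anyone who prefers to check this from a compiled program rather than the preprocessor, a small sketch follows. The function names `host_is_big_endian` and `describe_host` are made up for this example. It reports both `sizeof(long)` and the byte order, since both matter when raw bytes are reinterpreted as integers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: inspects the first byte of a known 32-bit value.
bool host_is_big_endian() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    // Mixed-endian hosts are out of scope for this sketch.
    assert(first == 0x01 || first == 0x04);
    return first == 0x01;
}

// Hypothetical helper: one-line summary of the two properties.
std::string describe_host() {
    return "sizeof(long)=" + std::to_string(sizeof(long)) +
           (host_is_big_endian() ? " big-endian" : " little-endian");
}
```

Printing `describe_host()` on each build machine would give the same information as the `cpp -dD` commands above, plus the byte order.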

sshock commented 4 years ago

ok, yeah I think I found the problem; one minute...

eribertomota commented 4 years ago

good

sshock commented 4 years ago

please try changing line 244 in lib/afflib_pages.cpp, from this:

*bytes = ntohl(*(long *)compressed_data);

to this:

*bytes = ntohl(*(uint32_t *)compressed_data);

and let me know if it runs better
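
A sketch of why the cast matters, assuming (as the `ntohl` call implies) that the first four bytes at `compressed_data` hold a 32-bit big-endian length. The three function names below are hypothetical and only simulate the behaviour with plain byte arithmetic, so the example gives the same results on any host. Where `long` is 8 bytes, `*(long *)` reads eight bytes instead of four. On a little-endian host the low half of that value is still the first four bytes, so truncating to 32 bits and byte-swapping happens to give the right answer. On a big-endian host the low half is the following four bytes, so the length comes out as whatever data happens to sit there.

```cpp
#include <cassert>
#include <cstdint>

// Portable decode of a 32-bit big-endian length (what the code intends).
std::uint32_t read_be32(const unsigned char* p) {
    assert(p != nullptr);
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Simulates ntohl(*(long *)p) on a big-endian host with 8-byte long:
// eight bytes are loaded, the value is truncated to its low 32 bits,
// and ntohl is a no-op there.
std::uint32_t long_read_on_be64(const unsigned char* p) {
    assert(p != nullptr);
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) v = (v << 8) | p[i];
    return static_cast<std::uint32_t>(v);
}

// Simulates the same expression on a little-endian host with 8-byte long:
// truncation keeps bytes 0..3, then ntohl swaps them into host order.
std::uint32_t long_read_on_le64(const unsigned char* p) {
    assert(p != nullptr);
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i) v = (v << 8) | p[i];
    const std::uint32_t low = static_cast<std::uint32_t>(v);
    return (low >> 24) | ((low >> 8) & 0x0000FF00u) |
           ((low << 8) & 0x00FF0000u) | (low << 24);
}
```

For the bytes `00 00 10 00 DE AD BE EF`, `read_be32` and `long_read_on_le64` both give 4096, while `long_read_on_be64` gives 0xDEADBEEF. That would explain why the bug stayed hidden on little-endian 64-bit machines, and why hurd with a 4-byte `long` is not affected by the wide read. Casting to `uint32_t` reads exactly four bytes on every platform.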

eribertomota commented 4 years ago

Worked fine!

sshock commented 4 years ago

awesome! committing the fix and adding in a sanity test to go with it now...

eribertomota commented 4 years ago

If you want, I can prepare a patch and upload it to Debian experimental to test on all archs before you release a new version.

sshock commented 4 years ago

@eribertomota yes, that would be great, thanks!

here is the commit with the fix, which is also in master now: e386977

eribertomota commented 4 years ago

I will send master to Debian. For Debian it is 3.7.18+git20200701.e386977

sshock commented 4 years ago

remind me when this is all done to create a new version (tag), since it has been a while

eribertomota commented 4 years ago

Sure!