Closed Swatinem closed 1 year ago
I wouldn't rely solely on that man page. As noted in #90, other programs such as readelf sometimes use an alignment of 8, so I don't think changing to always use 4 is a correct solution.
Can you provide a file to test with, or steps to create such a file?
Maybe I was a bit overeager and creates a PR with a test right away: https://github.com/m4b/goblin/pull/365
Unfortunately, I can’t share the file I am looking at as it contains sensitive customer data.
Can you determine which compiler was used to create the file?
The file is not an executable, but a core file from a certain proprietary ELF-based platform that puts a bunch of custom info into various NOTEs. I’m trying to intentionally keep this as vague as possible :-D
A possibility is that the proprietary ELF-based platform is incompatible with other platforms.
I'll see if I can find a file that does require 8 byte alignment for notes.
In case this is indeed incompatible with other ELF platforms, I can also parse the NOTEs manually instead of going through iter_note_headers
, that is perfectly fine for my use cases :-)
Ah okay, I found out the confusion here.
Both LLVM and binutils align only namesz
, and not the whole pointer offset:
Whereas goblin aligns the whole pointer.
in my case namesz == 16
, so it is already 8-byte aligned, and no change is being made in readelf
. But goblin sees the 12-byte NOTE header + 16 byte namesz
and aligns that. (Or rather, the file-based offset of that.) That is where my 4-byte offset is coming from presumably.
I’m still so confused how this whole thing could potentially have ever worked reading theNT_GNU_BUILD_ID
, which has a 4-byte namesz
.
Okay no, wait a second, that alignment is taking into account the offsetof
:
So it does align taking into account the whole struct, which is why GNU\0
aligns correctly.
For my use-case, readelf
seems to align and read things correctly, but goblin does not. Maybe it is simply using a different alignment
value for some reason, even though the program headers state that the NOTEs are supposed to be 8-byte aligned.
The only files I've found on my system that have alignment 8 are the ".note.gnu.property" sections, which happen to be readable using an alignment of both 4 and 8.
For the test you added in #365, have you kept the same length for the name and descriptor?
If I write the data from that test to a file with section alignment 8, then readelf fails to read it:
$ readelf -n note8.o
Displaying notes found in: .note
Owner Data size Description
readelf: Warning: note with invalid namesz and/or descsz found at offset 0x0
readelf: Warning: type: 0x1, namesize: 0x00000008, descsize: 0x00000014, alignment: 8
llvm-readobj does succeed because it always uses alignment 4. In my opinion this is an LLVM bug, and it has had similar bugs before (https://reviews.llvm.org/D70962).
The ELF generic ABI states that padding is 4 or 8 depending on ELF class: https://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section. OS and CPU ABIs can override that, but in any case I would expect the section alignment to match the alignment that is used.
Thanks for helping me deep dive into all this. I was indeed using llvm-readobj
/ llvm-readelf
as a reference, which was working just fine. It looks like the core dump that I am dealing with might just advertise a wrong alignment. But I should be able to work around that on my end. Thanks for all the help!
The ELF manpage is pretty clear on this using a
4
byte alignment:https://www.mankier.com/5/elf#Description-Notes_(Nhdr)
The example given there is also hardcoding a
4
byte alignment even though the example is usingElf64_Nhdr
.However, goblin uses the
alignment
as defined in the program headers to align thedescription
after reading the name here:https://github.com/m4b/goblin/blob/87821fd7b85cbcaa0ce9ae542809313b18dec2f2/src/elf/note.rs#L222
I am running into this problem with a NOTE that has a 16-byte
name
, thus 12-bytes for the Note header + 16-bytes for the name are not8
byte aligned, so it erroneously shifts the beginning offset of the description by 4 bytes, which then leads to an error parsing parsing the description bytes.