Open willglynn opened 4 years ago
Checking binutils
readelf
, it appears that various other machine types use an identical build attributes scheme. I'll hoist goblin::elf::arm::build_attributes
to goblin::elf::build_attributes
.
I don't like the duplication and control flow inversion between ParameterType::{Ntbs,Uleb128}
and Attribute::{Ntbs,Uleb128}
. I'm pretty sure there's a trait to extract here which provides the attribute control over its parsing and its runtime representation, such that AttributesIter
would call T::parse(tag_number)
to produce a T
and then package it into an Attribute<T>
or something. Besides being generally better, this refactor would particularly benefit Arm's Tag_compatibility
and Tag_also_compatible_with
, which are both a Uleb128
followed by an Ntbs
.
I'm also toying with the idea that the NUL-terminated byte string attributes should all be separate types. The argument is that while cpu_name
might truly be a &[u8]
, it is a specific kind of &[u8]
. Maybe it's better modeled as an arm::CpuName<'a>
which is AsRef<[u8]>
and From<&'a [u8]>
/Into<&'a [u8]>
and so on. This has benefits with localizing associated constants (e.g. CpuName
could hold the list of CPU names) and other field-specific tricks. This might also tie into the trait I sense I'm missing: &'a [u8]
is just a bag of bytes, whereas arm::CpuName<'a>
could know that its own tag number is 5. If every field knew its own tag number, then perhaps Attribute
wouldn't need to exist at all.
:thinking: Maybe: Remove Attribute
and AttributesIter
. Make all the attributes separate types which can parse themselves out of a &[u8]
. Make Aeabi::parse(attrs: Attributes)
own the loop which reads a tag number, dispatches to the appropriate attribute's parse function, and assigns the field. No weird separation/inversion of attribute parsing concerns. Hmm…
That feels better. pub struct Attributes<'a>
represents a bag of bytes containing zero or more unparsed attributes, and pub(crate) struct SemiParsedAttribute<'a>
represents a parsed tag number and a bag of bytes containing both the attribute's data and subsequent unparsed attributes. Both types have fallible pub(crate)
methods which consume the value and return its counterpart. This keeps the knowledge of build_attributes
layout strictly in build_attributes
while allowing the vendor section to drive the parsing.
The consuming code in the vendor section looks like:
pub struct Vendor {
field: Option<Field>,
}
impl Vendor {
fn parse(&mut self, mut attrs: Attributes) -> Result<(), AttributeParseError> {
while let Some(attr) = attrs.next()? {
match attr.tag_number() {
1 => {
let (value, rest) = Field::parse(attr)?;
self.field = Some(value);
attrs = rest;
},
_ => return Err(attr.unrecognized()),
}
}
Ok(())
}
}
pub enum Field {
/* … */
}
impl Field {
fn parse<'a>(attr: SemiParsedAttribute<'a>) -> Result<(Self, Attributes<'a>), AttributeParseError> {
let (value, rest) = attr.parse_uleb128()?;
Ok((value.into(), rest)
}
}
…which the build_attributes!
macro now generates.
Let me know when you’d like to me to review :)
@willglynn so this is marked a WIP, is this ready for review, or what should happen here? Interested in getting this merged/in :) Let me know whenever you have the time, thanks!
Well, I'm not happy with the test coverage but I got sidetracked before improving them and haven't returned to it yet. It seems to work though.
There exist Arm extensions for ELF. This PR aims to add support to
goblin
.goblin::elf::arm
defines some contents and some extension traits which add Arm-specific functionality to existinggoblin
types. In particular,elf::Header::arm()
returns anOption<ArmElfHeader>
which (for Arm executables) can distinguish hard-float from soft-float, and distinguish Arm entrypoints from Thumb entrypoints.elf::Section::arm_special_section()
returns anOption<ArmSpecialSection>
which among other things can identify anArmSpecialSection::BuildAttributes
section.goblin::elf::arm::build_attributes
is the largest single part of this PR. It adds a bunch of types for parsing through the Arm build attributes hierarchy, mostly in (data type, fallible iterator, iterator error) sets. It also defines a particularly gory macro for use bybuild_attributes
submodules. You see, Arm build attributes are organized into vendor-specific sections, and each vendor is free to define attributes as they see fit. Modeling this in Rust in a way that preservesgoblin
API stability is challenging.The most interesting build attribute vendor is the Arm embedded ABI (
aeabi
) vendor, captured here ingoblin::elf::arm::build_attributes::aeabi
. Every Arm tool produces and consumesaeabi
build attributes. The original spec describes attributes like:That macro lets me write this as:
The macro expands to:
…along with all the boring, duplicative mapping and conversion code.
This all needs considerably better test coverage before it's ready to merge. I'm opening this as a draft PR for feedback and discussion.