swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

ELF: Please provide a way to statically access Swift metadata without using a runtime call. #76698

Open grynspan opened 1 week ago

grynspan commented 1 week ago

Motivation

ELF images are generally stripped of their section header info, which means it's impossible to find sections at runtime. Swift loves storing metadata in sections.

Proposed solution

What if we used a "note" program header to contain a copy of this information? Program headers are not stripped, and "note" PHs are intended for arbitrary vendor use. We could imagine a note whose desc field (i.e. payload) consists of a sequence of structures:

struct SectionInfo {
  char name[24];
  uintptr_t start;
  uintptr_t stop;
};

// Compiler or linker then emits something equivalent to this C array:
const SectionInfo desc[] = {
  { "swift5_protocols", &__start_swift5_protocols, &__stop_swift5_protocols },
  { "swift5_type_metadata", &__start_swift5_type_metadata, &__stop_swift5_type_metadata },
  ...
};

This data is then discoverable statically or at runtime by tools that include elf.h. This solves the problem of @_section data being undiscoverable in ELF binaries and helps Swift Testing do runtime metadata lookup without relying on the runtime-internal function swift_enumerateAllMetadataSections().
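For illustration, here is a minimal sketch of how a tool might look up one record once it has the note's desc bytes in hand. The helper name find_section_info is hypothetical, and the layout simply mirrors the SectionInfo structure proposed above. Records are copied out with memcpy because a note payload may be only 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the structure proposed in the issue. */
struct SectionInfo {
  char name[24];
  uintptr_t start;
  uintptr_t stop;
};

/* Hypothetical helper: scans a desc payload laid out as an array of
   SectionInfo records and copies the first record named `name` into *out.
   Returns 1 on a match, 0 otherwise. */
static int find_section_info(const void *desc, size_t descsz,
                             const char *name, struct SectionInfo *out) {
  assert(name != NULL && out != NULL);
  assert(desc != NULL || descsz == 0);

  /* A name longer than the fixed-size field can never match. */
  if (strlen(name) > sizeof out->name) return 0;

  const unsigned char *bytes = desc;
  for (size_t off = 0; off + sizeof *out <= descsz; off += sizeof *out) {
    struct SectionInfo rec;
    /* Copy instead of casting: the payload may not be aligned for uintptr_t. */
    memcpy(&rec, bytes + off, sizeof rec);
    if (strncmp(rec.name, name, sizeof rec.name) == 0) {
      *out = rec;
      return 1;
    }
  }
  return 0;
}
```

A fixed-width name field keeps every record the same size, so a reader can index the payload without first parsing a string table.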

Linux, FreeBSD, and at least some other OSes that use ELF include dl_iterate_phdr() for enumerating program headers easily.
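As a rough sketch of what such a lookup could look like on Linux with glibc, the code below walks every loaded image's PT_NOTE segments and returns the payload of the first note matching an owner name and type. The owner "SwiftDemo", the type 0x5357, the section name .note.swift_demo, and the helper names are all invented for this example and are not anything the Swift toolchain emits. The embedded demo note carries a plain integer rather than section addresses, so it sidesteps the relocation question entirely; it also assumes GCC or Clang, which give sections whose names start with .note the SHT_NOTE type.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical owner name and note type, invented for this sketch. */
#define DEMO_OWNER "SwiftDemo"
#define DEMO_TYPE 0x5357u

/* A note embedded in this image so the walker has something to find.
   The payload is a plain integer, so no relocations are involved. */
struct demo_note {
  ElfW(Nhdr) hdr;
  char name[12]; /* owner string, NUL-terminated, padded to 4 bytes */
  uint32_t desc;
};

__attribute__((section(".note.swift_demo"), aligned(4), used))
static const struct demo_note demo_note = {
  { sizeof(DEMO_OWNER), sizeof(uint32_t), DEMO_TYPE },
  DEMO_OWNER,
  42,
};

struct note_query {
  const char *owner;
  uint32_t type;
  const void *desc; /* out: payload of the first match */
  size_t descsz;    /* out: payload size in bytes */
};

static size_t align_up(size_t n, size_t a) { return (n + a - 1) & ~(a - 1); }

/* dl_iterate_phdr callback: scans the PT_NOTE segments of one image. */
static int visit_image(struct dl_phdr_info *info, size_t size, void *ctx) {
  struct note_query *q = ctx;
  (void)size;
  for (size_t i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type != PT_NOTE) continue;
    /* Notes are padded to 4 bytes unless the segment asks for 8. */
    size_t a = (ph->p_align == 8) ? 8 : 4;
    const unsigned char *p =
        (const unsigned char *)(info->dlpi_addr + ph->p_vaddr);
    size_t left = ph->p_memsz;
    while (left >= sizeof(ElfW(Nhdr))) {
      ElfW(Nhdr) nh;
      memcpy(&nh, p, sizeof nh);
      size_t desc_off = align_up(sizeof nh + nh.n_namesz, a);
      if (desc_off > left || nh.n_descsz > left - desc_off) break;
      const char *name = (const char *)p + sizeof nh;
      if (nh.n_type == q->type && nh.n_namesz == strlen(q->owner) + 1 &&
          memcmp(name, q->owner, nh.n_namesz) == 0) {
        q->desc = p + desc_off;
        q->descsz = nh.n_descsz;
        return 1; /* a nonzero return stops dl_iterate_phdr */
      }
      size_t next = align_up(desc_off + nh.n_descsz, a);
      if (next >= left) break;
      p += next;
      left -= next;
    }
  }
  return 0;
}

/* Returns 1 and fills *desc / *descsz if a matching note is loaded. */
static int find_note(const char *owner, uint32_t type,
                     const void **desc, size_t *descsz) {
  assert(owner && desc && descsz);
  struct note_query q = { owner, type, NULL, 0 };
  if (!dl_iterate_phdr(visit_image, &q)) return 0;
  *desc = q.desc;
  *descsz = q.descsz;
  return 1;
}
```

On a typical Linux toolchain, find_note(DEMO_OWNER, DEMO_TYPE, &desc, &len) should locate the embedded note. A real design would still have to put section bounds into the payload, and that is where the relocation concerns raised later in the thread come in.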

Alternatives considered

N/A

Additional information

N/A

compnerd commented 1 week ago

I'm not particularly fond of this approach. PT_NOTE segments are not guaranteed to be preserved, as they are not required for loading. strip is permitted to remove any bits that are not required for loading the image.

grynspan commented 1 week ago

Well, what else in ELF could we borrow? Any ideas?

grynspan commented 6 days ago

My reading of the man page for strip is that it's permitted to remove these bits, but not required to do so nor expected to do so unless you explicitly tell it to.

I suppose we could place the data in a PT_LOAD with read-only permissions and add a sufficiently long magic byte sequence at the start to avoid any statistically significant risk of conflict with other loadable segments.

compnerd commented 5 days ago

We could add a custom section with the data that we need. As long as the section is marked with the load requirement, I think that it should be preserved.

grynspan commented 5 days ago

There doesn't appear to be any way to distinguish such a section from other LOADs, is there?

al45tair commented 2 days ago

FWIW, I'd previously experimented with using a PT_NOTE for this and abandoned that as a strategy. I don't recall exactly why — I think maybe there were issues with cross-section relocations? (I don't think we need worry too much about PT_NOTEs getting stripped, FWIW; that doesn't appear to happen in practice, and they're already used for some things where you wouldn't really want them stripped.)

> We could add a custom section with the data that we need. As long as the section is marked with the load requirement, I think that it should be preserved.

As @grynspan says, that doesn't really help. Sections aren't preserved; it's only program headers (segments) that are preserved and those don't have names. In principle we could stuff all the metadata into a program header with a well-known segment type (the same way GCC puts .eh_frame_hdr data into PT_GNU_EH_FRAME, for instance), but doing so requires the use of a custom linker script. Or linker patches, but that will tie us to lld if we do that.
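To make the "well-known segment type" idea concrete, here is a small sketch (Linux with glibc assumed) of how a consumer could find a segment purely by its p_type, which is how PT_GNU_EH_FRAME is located today. The helper names are hypothetical, and any Swift-specific segment type would have to be allocated first; none exists in this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

struct segment_query {
  uint32_t type;    /* p_type to look for, e.g. PT_GNU_EH_FRAME */
  const void *addr; /* out: runtime address of the segment */
  size_t size;      /* out: p_memsz of the segment */
};

/* dl_iterate_phdr callback: stops at the first segment of the wanted type. */
static int match_segment(struct dl_phdr_info *info, size_t size, void *ctx) {
  struct segment_query *q = ctx;
  (void)size;
  for (size_t i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type == q->type) {
      /* dlpi_addr is the load bias; p_vaddr is relative to it. */
      q->addr = (const void *)(info->dlpi_addr + ph->p_vaddr);
      q->size = ph->p_memsz;
      return 1;
    }
  }
  return 0;
}

/* Returns 1 and fills *addr / *size if any loaded image has such a segment. */
static int find_segment(uint32_t type, const void **addr, size_t *size) {
  assert(addr && size);
  struct segment_query q = { type, NULL, 0 };
  if (!dl_iterate_phdr(match_segment, &q)) return 0;
  *addr = q.addr;
  *size = q.size;
  return 1;
}
```

The lookup side is trivial; as the comment above points out, the hard part is getting a linker to emit a segment with a custom type in the first place.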

grynspan commented 14 hours ago

I might be shooting myself in the foot here, but I do have a proof of concept for our use case (Swift Testing) that uses PT_NOTE effectively. I'm curious what caused you to abandon your previous efforts @al45tair and whether or not the addition of @_section changes the algebra.

al45tair commented 12 hours ago

> I'm curious what caused you to abandon your previous efforts @al45tair and whether or not the addition of @_section changes the algebra.

I don't recall the detail but I couldn't get it to work — there was some issue I think with relocations from the PT_NOTE to the sections containing the data not being supported.