Open connorkuehl opened 5 years ago
Here's a setup procedure:
rfcv2
branchasm-goto
branch into it (git merge --no-ff asm-goto
-- you might need to check this branch out locally or prepend origin/
to the branch name) OR rebase rfcv2 onto asm-goto
.rm -rf /home/your-user/path/to/llvm-project/build/*
cd /home/your/path/to/llvm-project/build
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;lld' ../llvm
make -j $(nproc)
mkdir /home/user/kernels
)cd /home/user/kernels
git clone --single-branch --branch kssp/clang/randstruct https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
)cd /where/you/cloned/the/linux/repo/linux
build-essential
, libssl-dev
, bison
, flex
, bc
make defconfig
should suffice for our purposes)make -j $(nproc) CC=/path/to/our/llvm/repo/llvm-project/build/bin/clang
Ty->getAs<RecordType>()
is returning a nullptr
when randstruct is active but I'm not sure why.
I confirmed this by printing out a message when that method returns a nullptr.
Commenting out the call to commit
allows the kernel to compile successfully but that's less than ideal since our new order will not be committed to the DeclContext.
I suspect we are destroying pieces of the DeclContext by the way we update it. I sent an e-mail to the cfe-dev mailing list asking for advice on how we can safely manipulate the DeclContext's order of a RecordDecl's fields, but no response yet.
I've also tried updating our commit
method to remove the fields from the RecordDecl's context with removeDecl
and then adding them in the new order with addDecl
but we end up experiencing the same segmentation fault anyways.
I built the kernel with make -k
with Clang build with debug symbols so that it would try to build as much of the kernel as it possibly could so that we can see what other structures cause the compiler to abort and hopefully see what they all have in common. The names of the structures suggest that the compiler fails to emit code for structures that are used primarily as dispatch tables.
I sampled 5 of them randomly. The ones I randomly selected are declaring instances of a struct like so:
static const struct <name of struct> <name of instance> = { ...snipped inits... }
Code generation for these structures are causing the Clang compiler to abort.
The compiler aborts due to failing this assertion:
clang-9: /home/connor/src/llvm-project/llvm/include/llvm/Support/Casting.h:254: typename cast_retty<X, Y >::ret_type llvm::cast(Y ) [X = llvm::PointerType, Y = llvm::Type]: Assertion `isa
I'm unable to reproduce this on our simple testing C program with something like this:
struct ptrs {
int (* hi)();
int (* bye)();
int (* why)();
} __attribute__((randomize_layout));
static const struct ptrs global = {
.hi = 0,
.bye = 0,
.why = 0
};
Here's the list of structures (actually, instances of the structures):
I'm unable to reproduce this on our simple testing C program with something like this:
struct ptrs { int (* hi)(); int (* bye)(); int (* why)(); } __attribute__((randomize_layout)); static const struct ptrs global = { .hi = 0, .bye = 0, .why = 0 };
I looked at a few examples from the list you provided, and though I didn't look at all of them, they seem to include nested structures as members. Have you tried to reproduce the problem using with a nested structure test? Perhaps even cut-n-paste an actual instance that fails?
Have you tried to reproduce the problem using with a nested structure test?
Yes, it compiles cleanly and runs.
Perhaps even cut-n-paste an actual instance that fails?
Not yet, but I'll start looking through for a structure that fails but that isn't encumbered by a lot of "scaffolding"
Although I haven't quite figured it out, I believe your problem is in bool ConstStructBuilder::Build(InitListExpr *ILE)
. See ElementNo in the loop...
The initializer list is created during parsing and uses the pre-randomized order. Perhaps a call to getASTRecordLayout()
before creating the initializer list would solve the problem.
I need to recompile with your code in order to test this, but if this is the case, you should be able to print out an initialized structure with incorrect values without the need to make it assert or crash.
Verified that initialization is busted when reordering fields. Here's a little program that demonstrates the problem. Try it with/without the randomize_layout attribute to see the difference.
I'll dig a little deeper, but at this point, you should probably go back to the list and ask for advice on how to deal with designator lists while reordering fields. Richard Smith is probably the right guy to ask.
#include <iostream>
struct foo {
double i;
int j;
} __attribute__((randomize_layout));
int main(int argc, char *argv[]) {
foo f = { .i=1.1, .j=2};
std::cout << "1.1 = " << f.i << "\n2 = " << f.j << std::endl;
return 0;
}
I don't think messing around with the initializer list creation is a good idea -- probably couldn't permute the field order there anyway. But, if you could attach a map (old_index->new->index) to the RecordDecl when you reorder it, codegen could use the map to match them back up.
The key here is that while designator lists can be sparse, the holes get filled in via InitListChecker::FillInEmptyInitializations()
, so you can safely iterate over the fields and grab the correct initializer from the map.
Wow, thank you so much for all your help! We'll circle back to implement this. Sincerely appreciate your help diagnosing this. You rock.
You're more than welcome. Looking forward to this patch landing...
Btw, I'm not an authority on which approach should or should not be taken, so before spending a lot of time implementing something, I'd encourage you to seek advice from the list and/or D59254.
Again, looking forward to this landing...
Btw, I'm not an authority on which approach should or should not be taken
Don't worry, same here.
I'd encourage you to seek advice from the list and/or D59254.
Absolutely. :-)
Calling Context.getASTRecordLayout(Record)
at the end of Sema::ActOnFields
is one way to fix this, though not sure it's complete, but it should get most of the kernel compilation units going. Although, I guess it may be better to take only the relevant randomization bits from getASTRecordLayout
and not the whole thing, and if we intend to support C++, remember the declaration order somehow so that initializers list would function properly.
Kees found this in testing.
I pulled latest changes from LLVM/Clang and applied the ASM goto series. Also applied Randstruct patches.
With Randstruct disabled the kernel builds fine.
With Randstruct enabled the compiler will segfault.
I find this part of the stack trace particularly interesting:
But when I look at that method nothing is glaringly obviously wrong about it. I think there's got to be some corruption in the DeclContext as a result of our rearranging.