SecurityLab-UCD / IRFuzzer

Apache License 2.0

building IRFuzzer #49

Open regehr opened 5 months ago

regehr commented 5 months ago

hello, I'd like to try out IRFuzzer but when I run ./build.sh I'm getting an error, below.

also, is there any version of IRFuzzer that works with LLVM top of tree? all of the fuzzing that I do is against the latest version. thanks!

....
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/ccc-analyzer
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/c++-analyzer
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/analyze-c++
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/analyze-cc
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/intercept-c++
clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04/libexec/intercept-cc
fatal: not a git repository (or any of the parent directories): .git
regehr@ohm:~/IRFuzzer$

DataCorrupted commented 5 months ago

Hi John! Thanks for trying our tool! Unfortunately, I'm travelling this week. I will get back to you next week.

regehr commented 5 months ago

no rush. I'll give a bit of background-- we're doing LLVM backend testing using a fork of Alive2 that can do translation validation for the AArch64 backend. we're finding plenty of bugs using our own fuzzer, but the more fuzzers the better! we'd love to try out yours if possible, and of course we would be happy to tag you in any bugs that we found using the help of your tool.

DataCorrupted commented 5 months ago

I heard about that from Yoyo at the LLVM Developers' Meeting 2022. Validating a backend is way harder than finding ICEs; we are pushing in that direction as well. Maybe we should schedule a talk or something :)

regehr commented 5 months ago

that sounds great, let's talk sometime in March

regehr commented 5 months ago

I was thinking a little more about how we could work together. one way is we build your fuzzer and then have it call our tools. but, also, if you want to run our translation validation tool, then that is easy too. I would think it would just plug into your workflow pretty seamlessly. basically you just hand it an IR function and it either tells you that it verified or else signals some sort of error.

I'd be happy to help you get started using our software if you wanted to do that. it's all easy stuff. all I would ask is that you let us know about any miscompiles that you find, since we're still accumulating evidence that our stuff works well, in preparation for writing it up for publication.

so far we've found and reported 29 silent miscompiles in LLVM's AArch64 backend. I'm sure there are a lot more bugs remaining but we need fuzzer magic to discover them.

DataCorrupted commented 5 months ago

We have found some bugs and are still deduplicating/pinpointing them. Our experiment runs on the X86 backend, so we can compile for our host machine directly without needing an emulator, run the result, and see the difference between O0 and O3.
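
The O0-versus-O3 comparison described above can be sketched roughly as follows. This is a minimal illustration, not IRFuzzer's actual harness: the helper names (`build_commands`, `behaviors_differ`, `run_differential`) are hypothetical, and it assumes a `clang` on the PATH, an input file that defines `main`, and an existing working directory.

```python
import subprocess
from pathlib import Path


def build_commands(source: Path, workdir: Path, compiler: str = "clang") -> dict:
    """Map each optimization level to the command that builds `source` at that level."""
    commands = {}
    for level in ("O0", "O3"):
        # Absolute output path so the binary can be executed directly afterwards.
        binary = (workdir / f"{source.stem}_{level}").resolve()
        commands[level] = [compiler, f"-{level}", str(source), "-o", str(binary)]
    return commands


def behaviors_differ(results: dict) -> bool:
    """True if the recorded (exit code, stdout) pairs are not all identical."""
    return len(set(results.values())) > 1


def run_differential(source: Path, workdir: Path, timeout: float = 10.0) -> bool:
    """Build at -O0 and -O3, run both binaries, and report whether they disagree."""
    results = {}
    for level, cmd in build_commands(source, workdir).items():
        subprocess.run(cmd, check=True, timeout=timeout)  # compile step
        run = subprocess.run([cmd[-1]], capture_output=True, timeout=timeout)
        results[level] = (run.returncode, run.stdout)
    return behaviors_differ(results)
```

A disagreement found this way is only a candidate: if the input has undefined behavior, the two builds are allowed to differ, so each hit still needs manual triage.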

DataCorrupted commented 5 months ago

git clone https://github.com/SecurityLab-UCD/IRFuzzer.git -b irfuzzer-0.3
cd IRFuzzer
./init.sh
./build.sh

I separated the initialization and building process. Please try this to reproduce the results we had in our paper. Upgrading to the latest LLVM turned out to be harder than I thought; I will ping you once it's done (probably in IRFuzzer 0.4). The reason is that we have to customize LLVM for some of our needs to track the TableGen, and that part isn't being nice when bumped to the latest LLVM. :(

Besides, I don't think IRFuzzer 0.3 is a good fit for your needs. IRFuzzer is known to generate UB (e.g., store undef), and the situation got worse after we introduced architecture-agnostic intrinsic insertion. It gives you code that compiles but is unlikely to be runnable and may not be verifiable.

Internally, we have a verify-asm branch that is customized to generate only defined behavior for that purpose. We should definitely talk about your needs, and maybe we can customize them for you as well. :)

regehr commented 5 months ago

hi Peter, thanks!! I'll try this out soon.

my testing workflow is based on Alive2 and it is extremely robust with respect to undefined behavior. even so, I think it would be best if you avoided undef because it is going to go away and also the LLVM people are very reluctant to fix miscompiles that are triggered by undef. but poison and immediate UB are perfectly fine.

regehr commented 5 months ago

well, it got pretty far this time, here's what I ended up with on an x86-64 machine running Ubuntu 22.04:

[3455/3569] Linking CXX executable bin/llvm-cov
[3496/3569] Linking CXX executable bin/llvm-profgen

[3569/3569] Linking CXX executable bin/llvm-reduce
-- The C compiler identification is unknown
-- The CXX compiler identification is unknown
-- The ASM compiler identification is unknown
-- Didn't find assembler
CMake Error at CMakeLists.txt:47 (project):
  The CMAKE_C_COMPILER:

    clang

  is not a full path and was not found in the PATH.

  Tell CMake where to find the compiler by setting either the environment
  variable "CC" or the CMake cache entry CMAKE_C_COMPILER to the full path to
  the compiler, or to the compiler name if it is in the PATH.

CMake Error at CMakeLists.txt:47 (project):
  The CMAKE_CXX_COMPILER:

    clang++

  is not a full path and was not found in the PATH.

  Tell CMake where to find the compiler by setting either the environment
  variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.

CMake Error at CMakeLists.txt:47 (project):
  No CMAKE_ASM_COMPILER could be found.

  Tell CMake where to find the compiler by setting either the environment
  variable "ASM" or the CMake cache entry CMAKE_ASM_COMPILER to the full path
  to the compiler, or to the compiler name if it is in the PATH.

-- Warning: Did not find file Compiler/-ASM
-- Configuring incomplete, errors occurred!
See also "/home/regehr/IRFuzzer/llvm-project/build-release/CMakeFiles/CMakeOutput.log".
See also "/home/regehr/IRFuzzer/llvm-project/build-release/CMakeFiles/CMakeError.log".
regehr@ohm:~/IRFuzzer$

regehr commented 5 months ago

(the build above was using the system compiler, gcc 11.4.0. before that, I tried building using clang 17, but in that case it didn't get nearly as far)

regehr commented 5 months ago

anyway, I'm happy to wait for IRFuzzer 0.4

like I said, I do not require UB-free code, I just need functions in LLVM IR that can be processed by the AArch64 backend. if llc -march=aarch64 foo.ll can process the file, then it is likely to work for me

DataCorrupted commented 5 months ago

As for IRFuzzer 0.4, I think I will trim off some features so that IRFuzzer can rely directly on LLVM instead of our customized LLVM, which would make updating easier. (I think calling it IRFuzzer-Alive2 would be more appropriate lol).

I'm bogged down by ICSE 25, job hunting, and some other projects, so I don't have a firm estimate, but you can expect it by next Sunday, I guess.

regehr commented 5 months ago

hi Peter, thanks! of course no rush on this. we already have a fuzzer to combine with our tools, but yours looks great and we hopefully can find more bugs using it!! :)

DataCorrupted commented 4 months ago

@regehr branch irfuzzer-alive is available now. Since you mentioned you are testing AArch64, this branch only compiles that backend (you can modify build.sh to change that). Feel free to let me know if you need anything else.

Looking back at your comment about undef and poison, I am tempted to change how IRFuzzer generates these.

regehr commented 4 months ago

thanks Peter, I'll try this out!

here's a miscompile I reported last night: https://github.com/llvm/llvm-project/issues/84718

but this was the only one found after running our fuzzer for a couple of days on a pretty big machine. I'm afraid our fuzzer is running out of bugs to find -- hopefully yours isn't :)

regehr commented 4 months ago

regarding undef, it would be best if you completely avoided generating undef. LLVM is trying to eliminate undef from the IR, and in the meantime people aren't really fixing bugs that contain undefs. so it's best to just not even try to find those bugs (which totally exist).

regarding poison and immediate UB, you should feel free to generate tests that contain these. however, you want to avoid generating tests that are undefined on all paths: these are useless for finding miscompilation bugs (because they can be lowered to anything).

regehr commented 4 months ago

ok, I'm doing a docker build, but this also failed, it ended with this:

In file included from /IRFuzzer/mutator/./include/InsertIntrinsicStrategy.h:3,
                 from /IRFuzzer/mutator/src/irfuzzer.cpp:1:
/root/clang+llvm/include/llvm/FuzzMutate/IRMutator.h:70:8: note: candidate: 'void llvm::IRMutator::mutateModule(llvm::Module&, int, size_t, size_t)'
   70 |   void mutateModule(Module &M, int Seed, size_t CurSize, size_t MaxSize);
      |        ^~~~~~~~~~~~
/root/clang+llvm/include/llvm/FuzzMutate/IRMutator.h:70:8: note:   candidate expects 4 arguments, 3 provided
[12/17] Building CXX object CMakeFiles/AFLFuzzMutate.dir/src/libfuzzer.cpp.o
ninja: build stopped: subcommand failed.
The command '/bin/sh -c ./build.sh' returned a non-zero code: 1

regehr commented 4 months ago

just to be clear, I checked out the head of the irfuzzer-alive branch and then ran:

docker build .

DataCorrupted commented 4 months ago

I messed up the Docker setup a bit, but I finally sorted it out. Can you try again with docker build . --no-cache? It's working on my end now. I'm also pushing an image to docker.io; it should be finished by tomorrow morning...

regehr commented 4 months ago

excellent-- this works for me now! let's try to figure out the next steps.

first, I think that I should modify your dockerfile so that arm-tv (our fork of Alive2 that does translation validation for the AArch64 backend) gets built too. alas, this is going to require a third LLVM build since the one I need is top-of-tree + exceptions and RTTI. I'll also build a Z3 since we like to use the latest release and the ubuntu package is a few releases old.

does it sound good if I give you a pull request that modifies your Dockerfile to do these things?

second, we need to figure out how to run arm-tv on every IR function that you generate, and to log the results. I don't know the best way to fit that into your infrastructure, so maybe you can work on that part after I do the other thing?

DataCorrupted commented 4 months ago

> does it sound good if I give you a pull request that modifies your Dockerfile to do these things?

Yes, sending me a PR is perfectly fine. I'll try to review it ASAP.

> we need to figure out how to run arm-tv on every IR function that you generate

My question is, do you want to run it online with IRFuzzer, or run it offline when IRFuzzer is done?

You could run IRFuzzer, wait until it finishes, grab all the seeds it generated, and run arm-tv somewhere else. In that case, you don't have to modify the Dockerfile at all; just running the scripts and copying the results out of the container should be fine.
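
The offline mode could look roughly like the sketch below: walk a directory of generated files, run a checker on each, and log anything that does not exit cleanly. The function name `run_oracle` and the convention that the checker exits non-zero on a problem are assumptions made for illustration; arm-tv's real command line is not shown in this thread, so the command is passed in by the caller rather than spelled out.

```python
import subprocess
from pathlib import Path


def run_oracle(validator_cmd, seed_dir, log_path, timeout=60.0):
    """Run `validator_cmd <file>` on every file in seed_dir.

    Writes one "<name>\t<status>" line per file to log_path and returns the
    names of files whose run timed out or exited with a non-zero code.
    """
    flagged = []
    with open(log_path, "w") as log:
        for seed in sorted(Path(seed_dir).iterdir()):
            if not seed.is_file():
                continue
            try:
                proc = subprocess.run(
                    [*validator_cmd, str(seed)],
                    capture_output=True, text=True, timeout=timeout,
                )
                status = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
            except subprocess.TimeoutExpired:
                status = "timeout"
            if status != "ok":
                flagged.append(seed.name)
            log.write(f"{seed.name}\t{status}\n")
    return flagged
```

Keeping the log outside the seed directory avoids the log file itself being picked up as an input.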

However, it would be interesting (in terms of research) if you can provide some kind of feedback to IRFuzzer to guide its mutation. In terms of the infrastructure, the whole thing is based on AFL++. Adding some new feedback shouldn't be rocket science.

DataCorrupted commented 4 months ago

> thanks Peter, I'll try this out!
>
> here's a miscompile I reported last night: llvm/llvm-project#84718
>
> but this was the only one found after running our fuzzer for a couple of days on a pretty big machine. I'm afraid our fuzzer is running out of bugs to find -- hopefully yours isn't :)

I'm surprised that a miscompile can be as simple as this -- and that the current unit tests couldn't catch it. Maybe I should run IRFuzzer again to see if we get more bugs (the last large-scale run was almost a year ago).

regehr commented 4 months ago

ok, let's start with the simplest thing -- I'll run IRFuzzer for a while and grab the seeds and run them through arm-tv, and I'll report back here with what I learned.

if the initial results look promising, it seems perhaps worth figuring out how to run arm-tv inside the fuzzing loop, if nothing else to make the workflow smoother and avoid having to work with giant directories full of IR files.

in terms of providing additional feedback to AFL++, that's a very interesting question. the only thing I can think of offhand is to use arm-tv as a source of coverage feedback, instead of the LLVM backend. presumably this would not be too difficult. I don't have any idea what the results would look like, but it would seem worth trying out if it's not too hard.

DataCorrupted commented 4 months ago

Just updated this branch to include my commit to LLVM. You can run docker build . --build-arg IRFUZZER_COMMIT=95047b292617d7b2 to rebuild it.

DataCorrupted commented 4 months ago

> in terms of providing additional feedback to AFL++

Here's one easy but hopefully effective idea: only instrument subclasses of Pass (or even an allowlist of passes) to get more fine-grained branch coverage. With so little code to instrument, we could even use N-gram edge coverage if we wanted to :)
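
For readers unfamiliar with N-gram coverage, here is a toy model of the idea: instead of counting single blocks or single edges, the tracker counts each distinct sequence of the last N visited blocks. The class `NGramCoverage` is hypothetical and uses an exact set of tuples for clarity; a real fuzzer would typically hash the history into a fixed-size map instead, and this sketch does not reproduce AFL++'s actual implementation.

```python
from collections import deque


class NGramCoverage:
    """Toy coverage tracker keyed on the last `n` visited basic-block IDs."""

    def __init__(self, n: int = 2):
        self.history = deque(maxlen=n)  # sliding window of recent block IDs
        self.seen = set()               # every n-gram observed so far

    def visit(self, block_id: int) -> bool:
        """Record a block visit; return True if it completes an unseen n-gram."""
        self.history.append(block_id)
        key = tuple(self.history)
        is_new = key not in self.seen
        self.seen.add(key)
        return is_new
```

With `n=1` this degenerates to plain block coverage, and `n=2` behaves like edge coverage; larger `n` distinguishes longer paths at the cost of more map entries, which is why restricting instrumentation to a small set of passes makes it more affordable.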

DataCorrupted commented 3 months ago

@regehr Just checking in: how's IRFuzzer working for you?

regehr commented 3 months ago

hi @DataCorrupted! it was going just fine up until I got interrupted by a bunch of complications that are now mostly behind me. I'll get back to this within the next week or two!!!

DataCorrupted commented 3 months ago

Sounds good.

regehr commented 3 months ago

hi Peter, I came back to this but I think I'm going to need more explicit instructions about what to do.

as I said, running "docker build ." succeeds for me, but when I try to run the fuzzing script I just get:

regehr@john-home:~/IRFuzzer$ python3.10 scripts/fuzz.py -i seeds -o fuzzing -r 1 --set="aarch64" --type=docker --isel=dagisel --fuzzer=irfuzzer --time=1d -j 16 --on_exist=force
Traceback (most recent call last):
  File "/home/regehr/IRFuzzer/scripts/fuzz.py", line 7, in <module>
    from tap import Tap
  File "/home/regehr/.local/lib/python3.10/site-packages/tap.py", line 6, in <module>
    from mc_bin_client import mc_bin_client, memcacheConstants as Constants
  File "/home/regehr/.local/lib/python3.10/site-packages/mc_bin_client/mc_bin_client.py", line 278
    except MemcachedError, e:
           ^^^^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized
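
The traceback shows `from tap import Tap` resolving to a `tap.py` under `~/.local` that imports a memcached client written with Python 2 syntax, which suggests that a different package named `tap` is shadowing the one fuzz.py expects. Which package fuzz.py actually depends on is not stated in this thread (typed-argument-parser is one package that provides `tap.Tap`, but treating it as the intended dependency is an assumption). A small check like the following, with the hypothetical helper `locate_module`, shows which file a module name resolves to without importing it:

```python
import importlib.util


def locate_module(name: str):
    """Return the file that `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling `locate_module("tap")` with the same interpreter that runs fuzz.py would reveal whether the module comes from the user site-packages directory seen in the traceback.
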
regehr commented 3 months ago

actually, maybe there's an easier solution here, where we can get some results without running each other's software. Peter would you mind sending me the final output of a week-long (or however long) run of IRFuzzer on the AArch64 backend (with or without global isel, or both) and I'll pass these through arm-tv and let you know what the results are?

DataCorrupted commented 3 months ago

I was investigating the previous issue. I'll let you know when I find something. It looks like an issue with our dependencies, so it may take some time.

As for the artifact, check here: you can find our baseline fuzzing results, which should include AArch64.

regehr commented 3 months ago

thanks! I can work with this. will let you know the results.

regehr commented 3 months ago

ok-- arm-tv is running on my 128-thread machine on all of the bitcode files in your artifact. will let you know if anything good comes up!

unrelated, something that might be interesting for you is Kostya's new fuzzer, Centipede: https://github.com/google/fuzztest/tree/main/centipede

something complex like an LLVM backend seems like a perfect target for this fuzzer. see the slide deck linked to the project's README.md

regehr commented 3 months ago

arm-tv has signaled some wrong code bugs, here's the summary of the ones I've looked at so far:

so anyway, that's interesting and fun but no real miscompiles so far. but it has only gone through a couple percent of the IR files so far, and I'll keep looking.

DataCorrupted commented 3 months ago

I had my eye on Centipede, but never had time to dig into it: I was running multiple things in parallel and simply ran out of bandwidth :( Now that I have finished my thesis and job hunting, I can take a look before I start my job :)

As for your findings, that's expected for IRFuzzer: it only cares about semantic correctness (i.e., the code should compile), and it often generates code that doesn't make sense. Such behavior can be adjusted if you modify the source code.

All in all, sounds very interesting! Looking forward to more exciting findings!

regehr commented 3 months ago

update:

I ran every bitcode file in the irfuzzer subtree through arm-tv. Alive signaled a bunch of poison- and pointer-related problems, but these things happen a lot due to the nasty ubiquitous ptrtoint/inttoptr found in lifted code. I didn't look through them carefully yet. this run didn't find any of the nastiest kind of miscompile, where either immediate UB is introduced or else a function straight-up returns the wrong answer.

then I started a new run, this time using global isel. it has not finished, but a miscompile already popped up. Peter, I am going to go ahead and just CC you in any bug that I report, that comes from IRFuzzer-generated code. does that sound like a good plan?

all of the testing that I'm doing is using the generic AArch64 backend. we've not yet even started to look at the more specific AArch64 targets or target options.

link to bug:

https://github.com/llvm/llvm-project/issues/90532

regehr commented 3 months ago

ok, the global isel run finished without finding any further big problems. I still have plenty of alarms to look through, but most of them appear to correspond to known weaknesses in our pointer support (this is hard stuff) that we'll hopefully get fixed this summer. at that point I can run all of your test cases again.

at one level it's good news that all of these tests resulted in just one (obvious) miscompile, it means that this backend is pretty strong right now. but on the other hand, I'm surprised that a new fuzzer doesn't turn up more stuff. I'm not really sure what to do next, let me know if you have ideas.

DataCorrupted commented 3 months ago
regehr commented 3 months ago
  • Which image did you use? I just checked [irfuzzer-alive]

I think I used the head of that branch. But if you don't mind, I'd be happy to keep working in our current mode where you run your tool and I run mine! This seems very easy for both of us.

But also I'm happy to keep trying to build+run IRFuzzer.

Another thing we should investigate is whether IRFuzzer should run arm-tv. Maybe the custom mutator could fork a process for arm-tv? My thinking is that it would only do this some of the time, like when new coverage is found, or else probabilistically. We don't want to run arm-tv too much since it's slow and we don't want to take all CPU time away from fuzzing. So we could maybe adjust the probability so that arm-tv is always using, for example, fewer than 25% of the cores.
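
One way to read the proposal above is as a small gate in front of the oracle: always consider inputs that found new coverage, sample the rest with a small probability, and refuse to start another oracle process once a core budget is used up. The sketch below uses a hard cap rather than an adaptive probability, which is a simplification of what the comment suggests; the class name `OracleThrottle` and its default values are hypothetical.

```python
import os
import random


class OracleThrottle:
    """Decide when a slow oracle may run, capping its share of the cores."""

    def __init__(self, total_cores=None, max_fraction=0.25, probability=0.05, rng=None):
        cores = total_cores or os.cpu_count() or 1
        # Budget of concurrent oracle processes: about a quarter of the cores, at least one.
        self.max_running = max(1, int(cores * max_fraction))
        self.probability = probability
        self.rng = rng or random.Random()
        self.running = 0

    def try_acquire(self, new_coverage: bool) -> bool:
        """Return True (and reserve a slot) if the oracle should run on this input."""
        if self.running >= self.max_running:
            return False  # budget exhausted; keep the cores for fuzzing
        if not new_coverage and self.rng.random() >= self.probability:
            return False  # uninteresting input that was not sampled
        self.running += 1
        return True

    def release(self) -> None:
        """Free a slot once an oracle process has finished."""
        self.running = max(0, self.running - 1)
```

The caller would pair each successful `try_acquire` with a `release` when the oracle process exits, for example from the code that reaps child processes.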

  • In terms of the pointer support that's bothering alive2, I think we can modify IRFuzzer so that it doesn't generate pointer operations. Currently, it generates a lot of them because every time it needs a sink, it may allocate stack memory and then load from it.

I think this is worth trying. If you need to sink a value, maybe just combine it somehow with the function's return value?

DataCorrupted commented 3 months ago

https://hub.docker.com/layers/datacorrupted/irfuzzer/alive/images/sha256-74c29df36824bcb472dc3d06952ab69df4aba0a5a332dbd82b47e62fcfb3bb0c?context=explore

(I tagged the wrong link to irfuzzer-alive image)

DataCorrupted commented 3 months ago

I didn't fully get your point on arm-tv. Shall we schedule a zoom to sort that out?

Also, I want to understand more about how limited alive2 is in terms of pointers, so that I can better modify the mutator.

regehr commented 3 months ago

hi Peter, yes, it would be great to talk. now that the semester has ended my schedule is pretty open. maybe suggest a time next week?

it's not Alive2 that is limited in pointers, but my tool arm-tv which builds on Alive2 to do translation validation for the AArch64 backend. the fundamental problem is that the ARM assembly code freely mixes pointers and integers, so when I lift that code back to LLVM IR, there are a ton of ptrtoint and inttoptr instructions, and these are fundamentally difficult to reason about. basically we're running into open research problems here. Nuno has implemented lots of partial solutions for me, but problems remain. we hope to get them solved this summer.

DataCorrupted commented 3 months ago

Sounds good. I just sent you a Zoom invitation for next Tuesday. Feel free to adjust the time as you wish: I am free all day that day.

regehr commented 3 months ago

oops I have a meeting right at 2pm on the 7th, can we do an hour earlier?

regarding arm-tv, I'm just saying that it needs to get run somewhere, and I think it's more elegant and efficient to run it as part of the fuzzing process, rather than running it as a batch job afterwards. in this mode it will not participate in fuzzing actively, it simply acts as a passive test oracle, logging errors into a file somewhere.

DataCorrupted commented 2 months ago

Just rescheduled. I don't think it would be too difficult, nothing that can't be done with some scripting. But I do need to learn how to use arm-tv first.

regehr commented 2 months ago

excellent. I'll give you an arm-tv demo when we talk.

also, our own fuzzer has gone for >24 hours without finding any miscompiles. if it finishes its run (in 2-3 days) without finding any, that would be the first time this has ever happened.

so maybe between our two fuzzing efforts (+ whoever else is doing this kind of work), we've mined out most of the easy stuff from the AArch64 backend