Quuxplusone / LLVMBugzillaTest

Fuzzing lld/ELF #29513

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR30540
Status NEW
Importance P normal
Reported by Davide Italiano (ditaliano@apple.com)
Reported on 2016-09-27 10:22:11 -0700
Last modified on 2016-11-16 13:50:47 -0800
Version unspecified
Hardware PC Linux
CC dblaikie@gmail.com, emaste@freebsd.org, filcab@gmail.com, grimar@accesssoftek.com, llvm-bugs@lists.llvm.org, peter@pcc.me.uk, rafael@espindo.la
Fixed by commit(s)
Attachments crashes.tar (71680 bytes, application/x-tar)
Blocks
Blocked by PR30915
See also

Trying to track crashes here. I did a first run with AFL and will upload the first attachment soon.

Quuxplusone commented 7 years ago

Attached crashes.tar (71680 bytes, application/x-tar): crashes

Quuxplusone commented 7 years ago

There may be duplicates, and this needs further analysis. Uploading the result of the afl run here so it doesn't get lost in the black hole of my filesystem.

Quuxplusone commented 7 years ago

The following set of patches seems to fix all of the provided test cases:

D25091, D25090, D25087, D25085, D25083, D25082, D25081, D25025, D25016, D25015

Quuxplusone commented 7 years ago

Now that I've moved ELF.h to Expected, I think we're in a much better position for fuzzing (as Rafael previously pointed out).
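
Roughly, the calling pattern that Expected enables looks like the sketch below. This is illustrative only; parseSectionCount() is a made-up stand-in, not the actual ELF.h API.

#include "llvm/Support/Error.h"
#include "llvm/Support/raw_ostream.h"

// Hypothetical parser: stands in for the ELF.h accessors, which now return
// Expected<T> instead of asserting on malformed input.
llvm::Expected<int> parseSectionCount(bool validInput) {
  if (!validInput)
    return llvm::make_error<llvm::StringError>("corrupt section header table",
                                               llvm::inconvertibleErrorCode());
  return 42;
}

int main() {
  if (llvm::Expected<int> count = parseSectionCount(false)) {
    llvm::outs() << "sections: " << *count << "\n";
  } else {
    // Malformed input becomes a reported error rather than a crash.
    llvm::logAllUnhandledErrors(count.takeError(), llvm::errs(), "lld: ");
    return 1;
  }
  return 0;
}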

Quuxplusone commented 7 years ago
I looked into fuzzing lld/ELF with libFuzzer a while back -- you can see my
experimental branch at https://github.com/pcc/llvm-project/tree/lld-fuzz
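
For a rough idea of the shape of such a harness, here is a minimal sketch of a libFuzzer entry point for a linker. Only the libFuzzer side is real here; linkElfObject() is a stub standing in for the actual lld driver call (whose entry point and signature live in the lld sources and have changed over time).

#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <unistd.h>

// Stub for the real linker invocation: an actual harness would call into
// lld's ELF driver, patched to report errors instead of calling exit().
// Here it only checks the ELF magic so the sketch is self-contained.
static bool linkElfObject(const char *objPath) {
  std::FILE *f = std::fopen(objPath, "rb");
  if (!f)
    return false;
  unsigned char ident[4] = {0, 0, 0, 0};
  size_t n = std::fread(ident, 1, 4, f);
  std::fclose(f);
  return n == 4 && std::memcmp(ident, "\x7f" "ELF", 4) == 0;
}

// Standard libFuzzer entry point: called repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Persist the mutated bytes so the linker can open them as an input file.
  char path[] = "/tmp/lld-fuzz-XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0)
    return 0;
  if (write(fd, Data, Size) != static_cast<ssize_t>(Size)) {
    close(fd);
    unlink(path);
    return 0;
  }
  close(fd);

  // Any crash or sanitizer report triggered here is recorded by libFuzzer.
  linkElfObject(path);

  unlink(path);
  return 0;
}

With a recent clang, a harness like this builds with -fsanitize=fuzzer,address and is then pointed at a seed corpus directory.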

I have a few thoughts on hardening lld in general:
- there's probably no point in hardening some parts of lld and not all of it
- hardening should be weighed against other project priorities such as code
cleanliness and performance
- it's difficult to measure the overall performance penalty of hardening as a
whole if it's done incrementally -- probably the best way to do it is to
compare a non-hardened lld against a fully hardened lld

If we did want to harden lld, here's what I might do:
- continue working on a "hardened" branch of lld until it can survive (say) 1
hour of fuzzing with libFuzzer
- take a diff of the fully hardened branch against upstream; look at the
changes holistically and see if there's a good clean way to implement them
- once the code looks clean, measure the performance penalty, and if it looks
good:
- contribute the hardening changes upstream
Quuxplusone commented 7 years ago
My thoughts are as follows:

>I have a few thoughts on hardening lld in general:
>- there's probably no point in hardening some parts of lld and not all of it
>- hardening should be weighed against other project priorities such as code
>cleanliness and performance

I can't really agree with these. Hardening helps to reduce the number of crashes,
for example, and you can probably never know when the "everything is hardened"
point has been reached. Hardening all of LLD might help to create a testing bot
configuration, but please look at D25279 and D25467: they show how even a small
problem can create an unresolvable discussion.

> - continue working on a "hardened" branch of lld until it can survive (say) 1
> hour of fuzzing with libFuzzer

The above makes this part (IMHO, of course) not very helpful. From my limited
experience with AFL, one hour of work on a regular two-year-old high-end i7 with
an SSD is really nothing; my last run took a few days. And I received a lot of
comments and opinions on each patch I posted, so it is probably impossible to
fix everything at once.

I do not think there should be a strict trade-off between LLD stability and
performance; both are important independently. Hopefully stability checks will
not consume much time.

So I would suggest continuing to post patches for each issue that the fuzzer
reveals.
Quuxplusone commented 7 years ago

> You can probably never know when the "everything is hardened" point has been reached.

True, but you can probably get close enough for collecting data.

> Hardening all of LLD might help to create a testing bot configuration, but please look at D25279 and D25467: they show how even a small problem can create an unresolvable discussion.

FWIW, I think this argues in favour of working on a branch. That would allow you to work quickly and come back with data showing the overall cost of hardening (in terms of performance and code complexity), which may overcome the objections in those discussions.

> The above makes this part (IMHO, of course) not very helpful. From my limited experience with AFL, one hour of work on a regular two-year-old high-end i7 with an SSD is really nothing; my last run took a few days.

Out of curiosity: what kind of initial corpus were you using with AFL? For my libFuzzer experiments I used a corpus based on the contents of tools/lld/test/ELF/Output in my build directory, which uncovered issues within a few seconds.
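
As an aside, collecting such a corpus can be as simple as copying the object files out of the test output tree. A small sketch, assuming a C++17 toolchain and a build/tools/lld/test/ELF/Output layout (both paths are assumptions, not a fixed convention):

#include <filesystem>

int main() {
  namespace fs = std::filesystem;
  const fs::path src = "build/tools/lld/test/ELF/Output"; // assumed build tree
  const fs::path corpus = "corpus";
  fs::create_directories(corpus);
  // Copy every object file produced by the test suite into the seed corpus.
  for (const auto &entry : fs::recursive_directory_iterator(src)) {
    if (entry.is_regular_file() && entry.path().extension() == ".o")
      fs::copy_file(entry.path(), corpus / entry.path().filename(),
                    fs::copy_options::overwrite_existing);
  }
  return 0;
}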

> Hopefully stability checks will not consume much time.

That may very well be the case, but I think you will need data to convince the other lld developers (including me) of it.

Quuxplusone commented 7 years ago
(In reply to comment #7)
> Out of curiosity: what kind of initial corpus were you using with AFL?

I think I used the initial binaries from this bug's attachment the whole time.
Quuxplusone commented 7 years ago
(In reply to comment #8)
> (In reply to comment #7)
> > Out of curiosity: what kind of initial corpus were you using with AFL?
>
> I think I used the initial binaries from this bug's attachment the whole time.

And just in case: it might reveal new issues, or it might not. As far as I
understand the AFL algorithm, it just goes from start to end and switches bytes
according to some patterns. So what I want to say here is that 1-N hours of
running is probably not something worth taking into account.
Quuxplusone commented 7 years ago
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Out of curiosity: what kind of initial corpus were you using with AFL?
> >
> > I think I used the initial binaries from this bug's attachment the whole time.
>
> And just in case: it might reveal new issues, or it might not. As far as I
> understand the AFL algorithm, it just goes from start to end and switches
> bytes according to some patterns. So what I want to say here is that 1-N
> hours of running is probably not something worth taking into account.

Most modern fuzzers (including AFL and libFuzzer) are guided by code coverage,
so you will likely get the best results if you start with a corpus that already
provides good coverage (such as the test suite).
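
To make the "guided by coverage" part concrete, here is a toy sketch of the loop; the target and the coverage measurement are stand-ins, not the real instrumentation. Inputs that reach new coverage are kept and mutated further, which is why a seed corpus that already reaches deep into the code saves the fuzzer a lot of blind searching.

#include <cstddef>
#include <cstdlib>
#include <set>
#include <string>
#include <vector>

// Stand-in target: pretend each matching prefix byte of "ELF!" is a new path.
static int coverageOf(const std::string &input) {
  const std::string magic = "ELF!";
  int covered = 0;
  for (size_t i = 0; i < magic.size() && i < input.size() && input[i] == magic[i]; ++i)
    ++covered;
  return covered;
}

// Random byte flips, occasionally growing the input.
static std::string mutate(std::string s) {
  if (s.empty() || std::rand() % 4 == 0)
    s.push_back('A');
  s[std::rand() % s.size()] = static_cast<char>(std::rand() % 128);
  return s;
}

int main() {
  // Seed corpus: inputs that already exercise interesting paths (e.g. real
  // objects from the test suite) give the loop a head start.
  std::vector<std::string> corpus = {"E"};
  std::set<int> seenCoverage = {coverageOf(corpus[0])};

  for (int iter = 0; iter < 100000; ++iter) {
    std::string candidate = mutate(corpus[std::rand() % corpus.size()]);
    // Keep only mutants that reach code the corpus has not reached before.
    if (seenCoverage.insert(coverageOf(candidate)).second)
      corpus.push_back(candidate);
  }
  return 0;
}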
Quuxplusone commented 7 years ago
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #8)
> > > (In reply to comment #7)
> > > > Out of curiosity: what kind of initial corpus were you using with AFL?
> > >
> > > I think I used the initial binaries from this bug's attachment the whole time.
> >
> > And just in case: it might reveal new issues, or it might not. As far as I
> > understand the AFL algorithm, it just goes from start to end and switches
> > bytes according to some patterns. So what I want to say here is that 1-N
> > hours of running is probably not something worth taking into account.
>
> Most modern fuzzers (including AFL and libFuzzer) are guided by code
> coverage, so you will likely get the best results if you start with a corpus
> that already provides good coverage (such as the test suite).

Yes, I understand the theory. In fact, most of the results I got came from the
fuzzer mixing flags and the like, so fuzzing the header revealed lots of
results, and the corpus was probably significant only for passing the initial
checks, i.e. that the input is an ELF file and so on.
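
To make that last point concrete, a seed only needs enough structure to get past those initial checks. An illustrative sketch (not from this bug, and assuming a little-endian host) that emits a bare 64-bit little-endian ELF relocatable header as a seed file:

#include <cstdint>
#include <cstring>
#include <fstream>

int main() {
  uint8_t ehdr[64] = {0};
  ehdr[0] = 0x7f; ehdr[1] = 'E'; ehdr[2] = 'L'; ehdr[3] = 'F';
  ehdr[4] = 2;                 // EI_CLASS: ELFCLASS64
  ehdr[5] = 1;                 // EI_DATA: little endian
  ehdr[6] = 1;                 // EI_VERSION: current
  uint16_t e_type = 1;         // ET_REL (relocatable object)
  uint16_t e_machine = 62;     // EM_X86_64
  uint32_t e_version = 1;
  uint16_t e_ehsize = 64;      // size of this header
  // Field offsets in the ELF64 header; memcpy assumes a little-endian host.
  std::memcpy(ehdr + 16, &e_type, 2);
  std::memcpy(ehdr + 18, &e_machine, 2);
  std::memcpy(ehdr + 20, &e_version, 4);
  std::memcpy(ehdr + 52, &e_ehsize, 2);

  std::ofstream out("seed.o", std::ios::binary);
  out.write(reinterpret_cast<const char *>(ehdr), sizeof(ehdr));
  return 0;
}

Everything after the header is left empty on purpose; the fuzzer is expected to explore that space itself.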