merbanan / rtl_433

Program to decode radio transmissions from devices on the ISM bands (and other frequencies)
GNU General Public License v2.0

Add fuzz tests #1062

Open mhansen opened 5 years ago

mhansen commented 5 years ago

I think this project could get a lot of use out of fuzz tests - tests that send random input to the decoders and check they don't crash or segfault while running under ASAN.

Given that rtl_433:

I think fuzz tests might find some issues pretty quickly?

I imagine this would be a bit of work to set up, but wanted to file an issue to track any ideas. The first place I'd try: AFL Fuzz, which is pretty neat - AFL instruments the binary to discover new inputs that reach new internal binary states http://lcamtuf.coredump.cx/afl/

zuckschwerdt commented 5 years ago

Sounds interesting. There is already the -y option to feed arbitrary bitbuffers to the decoders. That mechanism could easily be used programmatically. I already planned to use it for quick(er) regression tests on each build -- the full regression tests with rtl_433_tests are too heavy and slow for that.

mhansen commented 5 years ago

That's pretty cool, I didn't see the -y option. Sounds like a good addition to the regression tests for sure.

Mindavi commented 5 years ago

I created a small script that uses the -y option to mash random data into rtl_433. This helped me find some decoders that were greedy (matched on bogus data) and fix them. It's not a smart script but might help with finding bugs related to decoders that match on too many inputs.

Of course, some decoders don't have any checksums so will happily accept any data at all.

https://gist.github.com/Mindavi/3b972526384dac1248e4aded49e8aff0
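
For readers who want to try the same idea, here is a minimal sketch of a random bitbuffer generator. It is not the script from the gist above. The `{bits}hex` code format and the function names `random_bitbuffer` and `command_for` are assumptions made for illustration; only the -y option itself comes from this thread.

```python
import random
import re

def random_bitbuffer(rng, max_bits=128):
    """Return one random test code such as '{12}a5f'.

    Assumption: rtl_433 -y accepts a bit count in braces followed by hex digits.
    """
    bits = rng.randint(1, max_bits)
    nibbles = (bits + 3) // 4              # hex digits needed to hold `bits` bits
    value = rng.getrandbits(nibbles * 4)   # any bits beyond `bits` are just padding
    return f"{{{bits}}}{value:0{nibbles}x}"

def command_for(code, binary="rtl_433"):
    """Build the argv list for one run; nothing is executed here."""
    return [binary, "-y", code]

if __name__ == "__main__":
    rng = random.Random(433)               # fixed seed so a run can be repeated
    for _ in range(5):
        print(" ".join(command_for(random_bitbuffer(rng))))
```

Each argv list could be handed to `subprocess.run`; a negative return code on POSIX means the process was killed by a signal, and any decoder output on random data points at a greedy decoder.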

zuckschwerdt commented 5 years ago

Great to see this finally done! Some devices will only trigger with N identical bitbuffers, some only with a minimum number of rows, some need a first row with sync (something like a single bit). Maybe some ideas to explore?
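
A rough sketch of how those ideas could be turned into test inputs: take one row and emit variants with N identical rows, with and without a short leading sync row. The row separator `/` and the one-bit sync row `{1}8` are assumptions about the bitbuffer text format and should be checked against the rtl_433 version in use; `variants` is a hypothetical helper name.

```python
def variants(row, max_repeats=4, sync_row="{1}8", sep="/"):
    """Yield multi-row test codes built from a single row.

    Assumptions: rows are joined with `sep`, and `sync_row` is a one-bit row.
    """
    for n in range(1, max_repeats + 1):
        body = sep.join([row] * n)
        yield body                      # n identical rows
        yield sync_row + sep + body     # the same rows after a short sync row

if __name__ == "__main__":
    for code in variants("{8}aa", max_repeats=3):
        print(code)
```

Feeding each variant through -y would exercise decoders that require repeated rows or a minimum row count, which purely random single rows rarely satisfy.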

mhansen commented 5 years ago

This is really cool, Rick! I'm impressed that you found some greedy decoders so quickly.

On Fri, Oct 18, 2019 at 7:39 AM Christian W. Zuckschwerdt < notifications@github.com> wrote:

Great to see this finally done! Some devices will only trigger with N identical bitbuffers, some only with a minimum number of rows, some need a first row with sync (something like a single bit). Maybe some ideas to explore?


Mindavi commented 5 years ago

I'm otherwise not very familiar with instrumented fuzzing, but if I find the time I'd really like to learn how to do it and implement it for this application. I'm sure it will be helpful in finding some interesting edge cases.

Mindavi commented 5 years ago

Well, according to the online sources afl should run a lot faster, but it does find stuff anyway. These are my first results after just a couple of minutes:

rick@rick-GL552VW:~/hobby/rtl-sdr/433mhz/afl-fuzz$ afl-whatsup output/
status check tool for afl-fuzz by <lcamtuf@google.com>

Individual fuzzers
==================

>>> fuzzer01 (0 days, 0 hrs) <<<

  cycle 1, lifetime speed 59 execs/sec, path 0/306 (0%)
  pending 158/306, coverage 5.63%, no crashes yet

>>> fuzzer02 (0 days, 0 hrs) <<<

  cycle 1, lifetime speed 44 execs/sec, path 24/686 (3%)
  pending 216/682, coverage 5.93%, crash count 17 (!)

>>> fuzzer03 (0 days, 0 hrs) <<<

  cycle 1, lifetime speed 59 execs/sec, path 0/306 (0%)
  pending 158/306, coverage 5.63%, no crashes yet

>>> fuzzer04 (0 days, 0 hrs) <<<

  cycle 1, lifetime speed 59 execs/sec, path 0/306 (0%)
  pending 158/306, coverage 5.63%, no crashes yet

Summary stats
=============

       Fuzzers alive : 4
      Total run time : 0 days, 0 hours
         Total execs : 0 million
    Cumulative speed : 221 execs/sec
       Pending paths : 690 faves, 1600 total
  Pending per fuzzer : 172 faves, 400 total (on average)
       Crashes found : 17 locally unique

I'm currently running it against the rtl_433_tests files (or at least some of them), so that's also why it's kinda slow. rtl_433 needs a small change to be able to use the -y option (needs to read either from stdin or from a file). I'd assume this will make the process hundreds of times faster, so I'll look into that later.

zuckschwerdt commented 5 years ago

Done. You can now read bitbuffers (one per line) with:

zuckschwerdt commented 5 years ago

I've also added an optional prefix of [nn] to each test bitbuffer to select a single decoder. This should allow us to build a test pattern file for quick regression tests. Surely needs some nice refactoring -- but the feature is there ;)

Mindavi commented 5 years ago

Not sure about that [nn] feature, the command line does support it too and a regression test could be easily written to give the input to rtl_433 using a device id.

Great to see the options implemented this quickly! I'll review the fuzzing results later and will send some reports in. This morning I already found a couple hundred unique crashes, so I guess it's working well and finding lots of issues.

@zuckschwerdt or @merbanan, should I send in the results privately or is sharing here ok? I'm not sure how easy the issues are to misuse (you'd have to send out a very specific 433 MHz signal or something, I guess). If it's ok to post them here, I'll make a zip and upload all the samples.

I also made a quick'n'dirty implementation for reading in test bitbuffers, so that one will have run for a while too this evening. It was indeed a lot faster, instead of averaging around 70 execs/second, it does about 900 execs/second. Because the test bitbuffers are a lot shorter, it finds issues a lot quicker. However, of course it doesn't hit the demod code that would otherwise also be tested.

I'll put some time in writing down how it works later, but it's not very hard. I've watched this talk and that helped a ton: https://www.youtube.com/watch?v=DFQT1YxvpDo. The steps are basically:

  1. Download AFL
  2. Compile (and optionally install) AFL
  3. Compile rtl_433 using afl-clang-fast or afl-gcc instead of the default C compiler (CC=afl-clang-fast cmake ..). This will add instrumentation to the binary so afl can see which branches are taken.
  4. Create a test corpus (I just used the rtl_433_tests repository).
  5. Shrink the corpus to a minimal corpus (using afl-cmin, also an AFL tool). This ensures that there are no test files that take the same branches as others, which can slow down fuzzing.
  6. Start the fuzzer with afl-fuzz, which is called something like this: afl-fuzz -i corpus -o output -- rtl_433 -r @@.
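
The steps above can be sketched as a small helper that only assembles the command lines, without running anything. The directory names, the `./rtl_433` path and the function name `build_commands` are placeholders. Using `-M`/`-S` to start several cooperating afl-fuzz instances is an assumption about how a parallel run like the fuzzer01..fuzzer04 one might be launched.

```python
def build_commands(corpus="corpus", minimized="corpus-min", output="output",
                   target="./rtl_433", workers=4):
    """Return the commands for steps 3, 5 and 6 as argv lists."""
    # Step 3: configure the build with the instrumenting compiler.
    configure = ({"CC": "afl-clang-fast"}, ["cmake", ".."])
    # Step 5: drop corpus files that add no new coverage.
    minimize = ["afl-cmin", "-i", corpus, "-o", minimized,
                "--", target, "-r", "@@"]
    # Step 6: one main instance plus secondaries sharing the output dir.
    fuzz = []
    for i in range(1, workers + 1):
        role = "-M" if i == 1 else "-S"
        fuzz.append(["afl-fuzz", "-i", minimized, "-o", output,
                     role, f"fuzzer{i:02d}", "--", target, "-r", "@@"])
    return {"configure": configure, "minimize": minimize, "fuzz": fuzz}

if __name__ == "__main__":
    cmds = build_commands()
    env, argv = cmds["configure"]
    print(" ".join(f"{k}={v}" for k, v in env.items()), " ".join(argv))
    print(" ".join(cmds["minimize"]))
    for argv in cmds["fuzz"]:
        print(" ".join(argv))
```

Printing the commands first makes it easy to review them before pasting each into its own terminal.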

zuckschwerdt commented 5 years ago

The [nn] prefix makes it possible to create a (single) test bitbuffer file with e.g. a few selected codes for each decoder. Not for fuzzing but for a quick regression test. The output can then be checked to remain invariant when adding patches, features, or new devices. Basically it adds back something like the missing timing parameters (bitbuffers from the demod only go to a few decoders, those with matching timings). I still need to add a debug feature to dump the used bitbuffers when a decoder is successful so we can easily build that file.
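
A minimal sketch of how such a pattern file could drive a regression check. It is an illustration, not the project's implementation. The assumptions: each line is an optional `[nn]` prefix followed by a bitbuffer code, `nn` is the decoder number that could also be given on the command line with -R, and outputs are compared as plain strings against a stored baseline. All function names are hypothetical.

```python
import re

def parse_pattern_line(line):
    """Split '[nn]code' into (decoder number or None, code); None for blank lines."""
    text = line.strip()
    if not text:
        return None
    m = re.match(r"\[(\d+)\]\s*(.+)", text)
    if m:
        return int(m.group(1)), m.group(2)
    return None, text

def command_for(protocol, code, binary="rtl_433"):
    """Build the argv for one test code, restricted to one decoder if given."""
    argv = [binary]
    if protocol is not None:
        argv += ["-R", str(protocol)]
    return argv + ["-y", code]

def changed_cases(baseline, current):
    """Return the pattern lines whose output no longer matches the baseline."""
    return sorted(k for k in baseline if current.get(k) != baseline[k])
```

The baseline would be recorded once from a known-good build; any entry returned by `changed_cases` after a patch is a decoder whose output changed.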

merbanan commented 5 years ago

@Mindavi you need some really tricky signal to be able to create an exploit. We do limit the number of pulses a message can have and most bitstreams need very little parsing, so I'd say it is very unlikely that someone will be able to create an exploit. But on the other hand, if the code crashes, who knows what is possible.

So send the data in private, I'll take a look and if I don't think there is anything really serious then it can be posted publicly.

mhansen commented 5 years ago

It might be possible to exploit using the TCP mode of rtl_433? Then you don't need to make a radio signal, just a TCP signal.

On Wed, Oct 23, 2019 at 6:02 AM Benjamin Larsson notifications@github.com wrote:

@Mindavi https://github.com/Mindavi you need some really tricky signal to be able to create an exploit. We do limit the amount of pulses a message can have and most bitstreams need very little parsing so I'd say that it is very unlikely that someone will be able to create an exploit. But on the other hand if the code crashes who knows what is possible.

So send the data in private, I'll take a look and if I don't think there is anything really serious then it can be posted publicly.


zuckschwerdt commented 5 years ago

The first step, like done here, is to fuzz bitbuffers and make sure there are no hidden assumptions about received bitbuffers (perhaps some unchecked length fields in messages). Then we can move on to fuzz the pulse format (-r OOK:-). Not sure if fuzzing an actual radio signal is feasible?

mhansen commented 5 years ago

I had a go at fuzzing raw radio signal and failed completely; the radio data's structure was too diffuse for afl to latch onto.

On Wed, 23 Oct 2019 at 19:13, Christian W. Zuckschwerdt < notifications@github.com> wrote:

The first step, like done here, is to fuzz bitbuffers and make sure there are no hidden assumptions about received bitbuffers (perhaps some unchecked length fields in messages). Then we can move on to fuzz the pulse format (-r OOK:-). Not sure if fuzzing an actual radio signal is feasible?


Mindavi commented 5 years ago

AFL is definitely made to fuzz small files, so the big radio files are quite hard on the fuzzer. I sent the files to @merbanan, but after reviewing the files this morning it seems to boil down to one bug.

The issue here is that afl will give a 'uniq crash' for every different path. Of course, if multiple decoders bail out earlier/later or generate an output, this is detected by afl as a different path, while the cause of the crash is always the same.

So, while it definitely resulted in some crashes, it's possible that might be the only bug that can be found with the big binary files. At least that's one out of the way. The issue with the big files is also that it takes a lot of time to parse them, which slows down rtl_433.

It's important to have a good test corpus. That will definitely improve the chances afl finds something.
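
One common way to deal with many "unique" crashes that share a single cause is to group them by the top frames of the sanitizer stack trace instead of by execution path. The sketch below assumes each crashing input has been replayed under ASAN and its report captured as text. The frame pattern reflects the usual `#N 0xADDR in function file:line` layout, and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Matches stack frame lines such as "    #0 0x4f1a2b in some_function /path/file.c:10"
FRAME = re.compile(r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)")

def signature(report, depth=3):
    """Return the top `depth` function names of the first stack trace."""
    names = []
    for line in report.splitlines():
        m = FRAME.match(line)
        if m:
            names.append(m.group(1))
            if len(names) == depth:
                break
        elif names:
            break                      # the first trace has ended
    return tuple(names)

def bucket(reports, depth=3):
    """Group crash ids by stack signature; `reports` maps id -> report text."""
    groups = defaultdict(list)
    for crash_id, text in reports.items():
        groups[signature(text, depth)].append(crash_id)
    return dict(groups)
```

If a couple hundred crash files collapse into one bucket, that supports the conclusion that they all stem from the same bug.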

Mindavi commented 5 years ago

> The first step, like done here, is to fuzz bitbuffers and make sure there are no hidden assumptions about received bitbuffers (perhaps some unchecked length fields in messages). Then we can move on to fuzz the pulse format (-r OOK:-). Not sure if fuzzing an actual radio signal is feasible?

How is the -r OOK:- command used? Does it also accept hex input? Or does it accept binary input?

zuckschwerdt commented 5 years ago

It's just (base-10 text) pulse-gap pairs. You can dump decoded pulse data to stdout with rtl_433 -w OOK:- and take a look.
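
Because the pulse format is plain text, seed files for it can also be perturbed with a few lines of script before handing them to a fuzzer. The sketch below assumes that a data line consists of exactly two whitespace-separated base-10 integers (pulse, gap) and passes every other line through untouched. `mutate_pulses` is a hypothetical helper, not part of rtl_433 or AFL.

```python
import random

def mutate_pulses(text, rng, max_delta=50):
    """Jitter every pulse/gap pair by up to +/- max_delta, keeping other lines."""
    out = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            pulse, gap = (max(0, int(p) + rng.randint(-max_delta, max_delta))
                          for p in parts)
            out.append(f"{pulse} {gap}")
        else:
            out.append(line)           # headers, comments, anything unrecognised
    return "\n".join(out) + "\n"
```

A dump written with rtl_433 -w OOK:- could serve as the input text, and the mutated result could be fed back through -r OOK:-.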

Mindavi commented 5 years ago

linecoverage.zip

This is the coverage I've achieved so far by fuzzing with cu8 files. It's pretty impressive, I think. I'll now use the -G flag to also check whether the disabled devices can be decoded.

Most missing coverage is either: verbose output, disabled devices, too big test files as input (which I removed from the corpus instead of minifying). The last case can be mitigated by reducing the size while keeping enough signal to trigger the decoders.

Mindavi commented 5 years ago

I uploaded some corpus files to my personal stack:

cu8 files https://mindavi.stackstorage.com/s/9pEGpzZvEitjEMU

bitbuffer files https://mindavi.stackstorage.com/s/Xg9PYFogKAZgUsH

These can be used to seed AFL-fuzz (or another fuzzer) so it can find new paths quickly. The coverage of the cu8 is pretty high. The bitbuffer coverage is a little bit lower as of now, but I'm currently running fuzzing with some new patches, so I hope this will improve the coverage.

Either way, they're both good to seed a fuzzer. They might also help hitting all the edges of the code when trying to improve some things.

I'd have to look into a better place to store them, but for now this works for me.

merbanan commented 5 years ago

Fuzzing of baseband (cu8) files will not work. The low-pass filtering step will take care of most induced errors. What could be fuzzed instead is the pre-step to the bit buffer, which is the pulse array. The pulse array is what gets transformed into the bit buffer.

Mindavi commented 4 years ago

I've generated a corpus that gives good coverage already, so I'll put it here if anyone else might want to try fuzzing. corpus.zip

The corpus gives a line coverage of 72.1% on src/devices (3660/5075 lines). Bear in mind this is without verbose logging enabled, which would give higher coverage but doesn't add much in terms of finding bugs.

Mindavi commented 4 years ago

I've created a repository with the fuzzing corpus I've generated. https://github.com/Mindavi/rtl_433_fuzzing_corpus

gdt commented 1 year ago

How do we make progress on this issue? Is it just adding a link to some other test repo to our README? If so, PR please. If something else, please explain.