sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0
149 stars 20 forks source link

Add fuzzing #190

Closed gnieto closed 6 years ago

gnieto commented 6 years ago

Add fuzzing

Add a fuzzing target for a full parquet file using AFL. You can follow the instructions on parquet-fuzz to find problematic input files.

Results

I've seen that there are many crashes due to explicit crashes on record/reader.rs or unrwaps.

Before (trying to) solve some of the crashes, I would like to know if you agree on changing some of the explicit panics to methods returning Result (obviously, depending on the context).

One of the crashes detected by the fuzzer hits this line: https://github.com/sunchao/parquet-rs/blob/master/src/record/reader.rs#L497

coveralls commented 6 years ago

Pull Request Test Coverage Report for Build 669


Files with Coverage Reduction New Missed Lines %
file/properties.rs 4 90.97%
<!-- Total: 4 -->
Totals Coverage Status
Change from base Build 661: -0.03%
Covered Lines: 12993
Relevant Lines: 13578

💛 - Coveralls
gnieto commented 6 years ago

What does this tool do? Does it generate test files?

Fuzzing is a testing technique that mutates valid states (in this case, the files on the in folder) and tries to generate inputs that panics or hangs. It will, for sure, generate invalid parquet files that probably won't be readable on other systems, but hopefully does not lead to a crash of the program.

Why do you need to add parquet files that we seem to have in "data" directory?

The fuzzer, in this case AFL, needs some samples to start the fuzzing process. I've used the files on data directory, but excluding large files (otherwise, AFL refused to start the fuzzing). I can symlink the files if you prefer it.

Can you send me that file, I would like to know why record reader crashed there.

I'll add, temporary, a commit with several files that makes parquet-reader crash. As I said before, this tool will probably generate invalid parquet files, but I think that those cases should be gracefully handled.

sadikovi commented 6 years ago

I would rather you either point fuzzer to files in "data", or symlink them.

gnieto commented 6 years ago

@sadikovi I've added some of the crashes on a temporal commit.

I'll symlink the input files instead of copying them.

sadikovi commented 6 years ago

Could you just create a separate issue and attach the files, and revert the commit.

Symlinks are fine, as long as there is no confusion when adding and deleting files.

This tool can be useful to see where our code panics, but it generates invalid files anyway, so it is quite hard to say if it was an invalid layout of the file or an actual bug that we should investigate - which, of course, are addressed differently.

I wish it could generate all different valid files. Actually, we can just write a tool using parquet-mr that would generate files with different schemes, and try reading it with our code. Let me know if this sounds like something we want to have!

gnieto commented 6 years ago

@sadikovi Removed the commit with the examples. I've tried to symlink the data files on input folder, but AFL refuses to work when using symlinks

This tool can be useful to see where our code panics, but it generates invalid files anyway, so it is quite hard to say if it was an invalid layout of the file or an actual bug that we should investigate - which, of course, are addressed differently.

Yes, this tool generates invalid files and I understand that found issues will have distinct priority to get resolved. Anyway, I think that all of them are important: from my point of view, as a library, parsing should never panic.

I wish it could generate all different valid files. Actually, we can just write a tool using parquet-mr that would generate files with different schemes, and try reading it with our code. Let me know if this sounds like something we want to have!

Even a tool like this could be useful, it would be orthogonal to the proposed one and is not fuzzing testing. Anyway, the most testing tools, the better; It could be nice to have a tool like the one you describe.

@sunchao Fixed your comments

sunchao commented 6 years ago

@gnieto I got the following error when running cargo afl build:

➜  fuzz git:(fuzz) cargo afl build
   Compiling rayon-core v1.4.1
   Compiling rayon v1.0.3
   Compiling either v1.5.0
   Compiling proc-macro2 v0.4.23
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-m64" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-core-3167ba751795c106/build_script_build-3167ba751795c106.build_script_build.3vea1i82-cgu.0.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-core-3167ba751795c106/build_script_build-3167ba751795c106.build_script_build.3vea1i82-cgu.1.rcgu.o" "-o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-core-3167ba751795c106/build_script_build-3167ba751795c106" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-core-3167ba751795c106/build_script_build-3167ba751795c106.4wx7sp0ghblr683w.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/chao/git/parquet-rs/fuzz/target/debug/deps" "-L" "/Users/chao/.local/share/afl.rs/rustc-1.32.0-nightly-9fefb67/afl.rs-0.4.2/afl-llvm-rt" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "-lafl-llvm-rt" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-1b66d2d49ee19492.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-0fc6280c17916cf2.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-616b9d3aedc220c5.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-6677ee0ff3f20586.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-8cd9fba02444a18e.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-467ba95b94992ba8.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-b95c24a751b6af8a.rlib" "-lSystem" "-lresolv" "-lc" "-lm" "-fuse-ld=gold"
  = note: clang: error: invalid linker name in argument '-fuse-ld=gold'

error: aborting due to previous error

error: Could not compile `rayon-core`.
warning: build failed, waiting for other jobs to finish...
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-m64" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.0.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.1.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.10.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.11.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.12.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.13.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.14.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.15.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.2.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.3.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.4.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.5.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.6.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.7.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.8.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.build_script_build.7q8qirm9-cgu.9.rcgu.o" "-o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/rayon-2c40eb4a6336a2c6/build_script_build-2c40eb4a6336a2c6.7k2dqv7gyvfq06h.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/chao/git/parquet-rs/fuzz/target/debug/deps" "-L" "/Users/chao/.local/share/afl.rs/rustc-1.32.0-nightly-9fefb67/afl.rs-0.4.2/afl-llvm-rt" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "-lafl-llvm-rt" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-1b66d2d49ee19492.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-0fc6280c17916cf2.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-616b9d3aedc220c5.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-6677ee0ff3f20586.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-8cd9fba02444a18e.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-467ba95b94992ba8.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-b95c24a751b6af8a.rlib" "-lSystem" "-lresolv" "-lc" "-lm" "-fuse-ld=gold"
  = note: clang: error: invalid linker name in argument '-fuse-ld=gold'

error: aborting due to previous error

error: Could not compile `rayon`.
warning: build failed, waiting for other jobs to finish...
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-m64" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.0.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.1.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.10.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.11.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.12.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.13.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.14.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.15.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.2.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.3.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.4.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.5.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.6.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.7.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.8.rcgu.o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.build_script_build.50o8j5zu-cgu.9.rcgu.o" "-o" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3" "/Users/chao/git/parquet-rs/fuzz/target/debug/build/proc-macro2-09b3edaeb23042a3/build_script_build-09b3edaeb23042a3.1r66nlkbxpqlqqo2.rcgu.o" "-Wl,-dead_strip" "-nodefaultlibs" "-L" "/Users/chao/git/parquet-rs/fuzz/target/debug/deps" "-L" "/Users/chao/.local/share/afl.rs/rustc-1.32.0-nightly-9fefb67/afl.rs-0.4.2/afl-llvm-rt" "-L" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib" "-lafl-llvm-rt" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-1b66d2d49ee19492.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libpanic_unwind-0fc6280c17916cf2.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libunwind-616b9d3aedc220c5.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liblibc-6677ee0ff3f20586.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/liballoc-8cd9fba02444a18e.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcore-467ba95b94992ba8.rlib" "/Users/chao/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libcompiler_builtins-b95c24a751b6af8a.rlib" "-lSystem" "-lresolv" "-lc" "-lm" "-fuse-ld=gold"
  = note: clang: error: invalid linker name in argument '-fuse-ld=gold'

error: aborting due to previous error

error: Could not compile `proc-macro2`.

To learn more, run the command again with --verbose.

Have you encountered similar issue before?

gnieto commented 6 years ago

To be honest, I didn't saw this error before. I work on Linux, but OSX seems to be supported (https://fuzz.rs/book/afl/setup.html).

It seems that is trying to use the gold linker, which is an ELF linker, which won't work on OSX. I'll do some research regarding afl, rust and osx.

Sorry about that!

gnieto commented 6 years ago

@sunchao Could you try using afl version 0.4.1?

cargo install afl --force --version 0.4.1

I think that 0.4.2 may have a regression on OSX. Unfortunately, I've limited access to OSX and I can't confirm it.

sunchao commented 6 years ago

@gnieto yes 0.4.1 works on Mac, although it still requires some settings but the instruction is pretty clear. Would you update the README for Mac? After that, I'm ready to merge this. @sadikovi : let me know if you want to take another look.

sadikovi commented 6 years ago

LGTM

gnieto commented 6 years ago

@sunchao Added instructions for OSX.

sunchao commented 6 years ago

Merged. Thanks @gnieto .