FabrizioSandri / RcppDeepState

RcppDeepState, a simple way to fuzz test code in Rcpp packages
https://fabriziosandri.github.io/gsoc-2022-blog/

Valgrind for initial pass #7

Closed FabrizioSandri closed 2 years ago

FabrizioSandri commented 2 years ago

While writing the GitHub Action for pull request #6, I again ran into the strange segmentation fault mentioned in issue #2. I had assumed that the lack of debug symbols was the cause, but this does not appear to be the case. The segmentation fault I am referring to occurs for the rcpp_use_after_deallocate function in the testSAN package.

Steps to reproduce

First, I ran the test harness compilation procedure deepstate_harness_compile_run, which successfully generated the compiled test harness. The first execution, however, leaves the rcpp_use_after_deallocate_output directory empty, so I manually ran the harness with the same seed that produced the segmentation fault (5) to understand the problem.

$ ../rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: 467495432
input ends
[1]    52869 segmentation fault (core dumped)  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5

It appears that the segmentation fault occurs before DeepState can generate the output file. If I run the same program under Valgrind, the output folder is instead filled with a test file.

$ valgrind ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
==52956== Memcheck, a memory error detector
==52956== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==52956== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==52956== Command: ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output
==52956== 
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: 467495432
input ends
==52956== Warning: set address range perms: large range [0xd65c040, 0x29432a48) (undefined)
==52956== Warning: set address range perms: large range [0xd65c028, 0x29432a60) (noaccess)
==52956== Invalid read of size 1
==52956==    at 0x4D539AA: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:8)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==  Address 0xd65c045 is 5 bytes inside a block of size 467,495,432 free'd
==52956==    at 0x4849A7F: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==52956==    by 0x4D539A9: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:7)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==  Block was alloc'd at
==52956==    at 0x48472E3: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==52956==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956== 
INFO: Done fuzzing! Ran 1 tests (1 tests/second) with 0 failed/1 passed/0 abandoned tests
==52956== 
==52956== HEAP SUMMARY:
==52956==     in use at exit: 51,641,784 bytes in 10,583 blocks
==52956==   total heap usage: 33,306 allocs, 22,723 frees, 560,059,861 bytes allocated
==52956== 
==52956== LEAK SUMMARY:
==52956==    definitely lost: 0 bytes in 0 blocks
==52956==    indirectly lost: 0 bytes in 0 blocks
==52956==      possibly lost: 0 bytes in 0 blocks
==52956==    still reachable: 51,641,784 bytes in 10,583 blocks
==52956==                       of which reachable via heuristic:
==52956==                         newarray           : 4,264 bytes in 1 blocks
==52956==         suppressed: 0 bytes in 0 blocks
==52956== Rerun with --leak-check=full to see details of leaked memory
==52956== 
==52956== For lists of detected and suppressed errors, rerun with: -s
==52956== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The inverse problem

@tdhock, I found this old conversation in issue https://github.com/akhikolla/RcppDeepState/issues/62 about using Valgrind in the first steps, right after the harness compilation. Based on that discussion, I discovered that when the seed is set to 2 and the test is run under Valgrind, no output file is produced.

As you can see, running this test standalone generates an output file.

$ ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: -92322737
input ends
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
ERROR: Failed: _
INFO: Saved test case in file `rcpp_use_after_deallocate_output/520d9b63d5d9e5fa249b7fae87d2621c419ddf0a.fail`
input starts
array_size values: 1957496050
input ends
[1]    53624 segmentation fault (core dumped)  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2

If Valgrind is used instead, the program is aborted, as stated in the message: "cannot throw exceptions and so is aborting instead. Sorry."

$ valgrind  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
==53705== Memcheck, a memory error detector
==53705== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==53705== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==53705== Command: ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output
==53705== 
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: -92322737
input ends
==53705== Argument 'size' of function __builtin_vec_new has a fishy (possibly negative) value: -92322737
==53705==    at 0x48472E3: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==53705==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==53705==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==53705==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705== 
**53705** new/new[] failed and should throw an exception, but Valgrind
**53705**    cannot throw exceptions and so is aborting instead.  Sorry.
==53705==    at 0x484551C: ??? (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4847355: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==53705==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==53705==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==53705==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705== 
==53705== HEAP SUMMARY:
==53705==     in use at exit: 51,656,156 bytes in 10,684 blocks
==53705==   total heap usage: 33,302 allocs, 22,618 frees, 92,559,782 bytes allocated
==53705== 
==53705== LEAK SUMMARY:
==53705==    definitely lost: 0 bytes in 0 blocks
==53705==    indirectly lost: 0 bytes in 0 blocks
==53705==      possibly lost: 0 bytes in 0 blocks
==53705==    still reachable: 51,656,156 bytes in 10,684 blocks
==53705==                       of which reachable via heuristic:
==53705==                         newarray           : 4,264 bytes in 1 blocks
==53705==         suppressed: 0 bytes in 0 blocks
==53705== Rerun with --leak-check=full to see details of leaked memory
==53705== 
==53705== For lists of detected and suppressed errors, rerun with: -s
==53705== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
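
For context on the "fishy" value reported above: operator new[] receives an unsigned size (the log shows operator new[](unsigned long)), so a negative int turns into an enormous request once converted. The sketch below only illustrates that conversion; it is not code from testSAN, and to_allocation_size is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper (not part of testSAN or DeepState): shows what a
// negative int becomes when converted to the unsigned type that
// operator new[] takes as its argument.
std::size_t to_allocation_size(int array_size) {
  // Conversion to an unsigned type wraps modulo 2^N, so a negative value
  // becomes a number close to the maximum of std::size_t.
  return static_cast<std::size_t>(array_size);
}
```

Calling to_allocation_size(-92322737) therefore yields a value just below the maximum of std::size_t, far more than any allocator can satisfy, which would be consistent with the std::bad_alloc in the standalone run and the abort under Valgrind.
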

FabrizioSandri commented 2 years ago

I think the problem is that DeepState creates the output file only once the test has completed. A segmentation fault in the test harness can therefore crash DeepState before it gets a chance to write the input data. I am working on this.
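
The ordering problem can be illustrated outside DeepState. The following is a minimal sketch, not DeepState's implementation: save_input, load_input and save_then_run are hypothetical helpers, and the plain-text file format is an assumption. The point is only that the input is written and the file closed before the function under test gets a chance to crash.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write the generated input and close the file
// before any risky code runs.
bool save_input(const std::string& path, int value) {
  std::ofstream out(path, std::ios::trunc);
  if (!out) return false;
  out << value << '\n';
  out.close();          // flushes buffered data and closes the file
  return !out.fail();   // close() sets failbit if the write failed
}

// Hypothetical helper: read the input back, for example in a later run.
bool load_input(const std::string& path, int& value) {
  std::ifstream in(path);
  return static_cast<bool>(in >> value);
}

// Save first, then call the function under test. If `fn` crashes the
// process, the file written by save_input is already complete.
template <typename Fn>
bool save_then_run(const std::string& path, int value, Fn fn) {
  if (!save_input(path, value)) return false;
  fn(value);
  return true;
}
```

With this ordering, a crash inside fn loses the test result but not the input that triggered it, so the same input can be replayed afterwards.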

FabrizioSandri commented 2 years ago

I've finally found a solution to this issue: create two tests inside the same harness. The first test (the generator) only generates fresh input data, without running the function being tested. The second test (the runner) executes the function being tested, taking as input the outputs of the generator. In fact, when we run the first pass using the function deepstate_harness_analyze_pkg, all we want at the end of this step is a test harness with its built binary file and a set of inputs, so executing the tested function is not necessary in this first pass. The execution based on the inputs should be done by the deepstate_harness_analyze_pkg function.

Let's look at an example to clarify this solution. The test harness generated for the rcpp_read_out_of_bound function of the testSAN package will look like the following fragment: a generator that produces the inputs without actually executing rcpp_read_out_of_bound, and a runner that takes the inputs of the generator.

#include <fstream>
#include <RInside.h>
#include <iostream>
#include <RcppDeepState.h>
#include <qs.h>
#include <DeepState.hpp>

RInside Rinstance;

int rcpp_read_out_of_bound(int rbound);

TEST(testSAN, generator){
  std::cout << "input starts" << std::endl;
  IntegerVector rbound(1);
  rbound[0]  = RcppDeepState_int();
  qs::c_qsave(rbound,"/home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_read_out_of_bound/inputs/rbound.qs",
        "high", "zstd", 1, 15, true, 1);
  std::cout << "rbound values: "<< rbound << std::endl;
  std::cout << "input ends" << std::endl;
}

TEST(testSAN, runner){
  std::cout << "input starts" << std::endl;
  IntegerVector rbound(1);
  rbound[0]  = RcppDeepState_int();
  std::cout << "rbound values: "<< rbound << std::endl;
  std::cout << "input ends" << std::endl;
  try{
    rcpp_read_out_of_bound(rbound[0]);
  }
  catch(Rcpp::exception& e){
    std::cout<<"Exception Handled"<<std::endl;
  }
}

We can begin by creating the inputs with the following command, where rcpp_read_out_of_bound_DeepState_TestHarness is the compiled test harness.

$ ./rcpp_read_out_of_bound_DeepState_TestHarness --fuzz \
    --fuzz_save_passing --output_test_dir ./rcpp_read_out_of_bound_output \
    --input_which_test testSAN_generator

As you can see, the --input_which_test parameter specifies that the generator test is run during this phase. Afterwards, the rcpp_read_out_of_bound_output directory will contain some input files. We can analyze one of them using Valgrind in the following way:

$ valgrind ./rcpp_read_out_of_bound_DeepState_TestHarness --input_test_file \
    ./rcpp_read_out_of_bound_output/00009f0507a2dbd3bdf0ca6b7ebb8049149a4904.pass \
    --input_which_test testSAN_runner

In this case the parameter --input_which_test is used to specify that we want to run the runner test.
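
How a single binary can expose both tests and pick one by name can be sketched without DeepState. This is an illustration only, not how DeepState implements --input_which_test; registry, run_named_test and the stub test bodies are hypothetical. The only detail taken from the commands above is that TEST(testSAN, generator) is selected as testSAN_generator and TEST(testSAN, runner) as testSAN_runner.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins for the two tests; each returns a label so the
// caller can see which one ran.
std::string generator_test() { return "generated inputs only"; }
std::string runner_test() { return "ran function under test"; }

// Hypothetical registry keyed by "<suite>_<test>", mirroring the names
// passed to --input_which_test in the commands above.
const std::map<std::string, std::function<std::string()>>& registry() {
  static const std::map<std::string, std::function<std::string()>> tests = {
      {"testSAN_generator", generator_test},
      {"testSAN_runner", runner_test},
  };
  return tests;
}

// Run the test with the given name; returns an empty string if the
// name is not registered.
std::string run_named_test(const std::string& name) {
  const auto it = registry().find(name);
  return it == registry().end() ? std::string() : it->second();
}
```

Selecting the test by name is what lets the first pass produce inputs safely and the Valgrind pass replay them through the runner, using the same compiled harness in both cases.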

I'm going to implement this change in pull request #6.

tdhock commented 2 years ago

sounds great thanks

FabrizioSandri commented 2 years ago

The solution for this issue has been implemented in pull request #6, more precisely in the following commits: 1ebc81c, 71edb6a, 05a7ed8