akhikolla / RcppDeepState

RcppDeepState, a simple way to fuzz test code in Rcpp packages
https://akhikolla.github.io./

SaveRDS doesn't work in TestHarness #48

Closed akhikolla closed 4 years ago

akhikolla commented 4 years ago

Hello @tdhock,

As we discussed earlier, saveRDS works fine when run from a main function but throws the same error again when run from the test harness.

Testharness Output:

TRACE: Running: deepstate_test_datatype from rinside-problem.cpp(5)
EXTERNAL: integer
EXTERNAL: �
EXTERNAL: Error in readRDS("mat.RDs") : unknown input format

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error evaluating: cat(class(ms));saveRDS(ms,"mat.RDs",compress = TRUE); r <- readRDS("mat.RDs") ; cat(r)
ERROR: Failed: deepstate_test_datatype

I am unable to read mat.RDs because it is created as a binary file rather than a gzip archive. When the same code is executed from main instead of the test harness, it creates a gzip archive.
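One way to check which kind of file was written, without going through R, is to look at its first two bytes: a gzip stream (which is what saveRDS writes with compress = TRUE) starts with 0x1f 0x8b. The following is only a minimal sketch; the helper names is_gzip and describe_file are made up for illustration, and the demo uses throwaway files rather than mat.RDs.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when file $1 starts with the gzip
# magic bytes 0x1f 0x8b.
is_gzip() {
  local magic
  magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "1f8b" ]
}

# Hypothetical helper: print a one-line verdict for file $1.
describe_file() {
  if is_gzip "$1"; then
    echo "$1: gzip stream"
  else
    echo "$1: not gzip"
  fi
}

# Demo on two throwaway files so the sketch runs without R installed.
tmp=$(mktemp -d)
printf 'hello' | gzip -c > "$tmp/compressed.bin"
printf 'hello' > "$tmp/plain.bin"
describe_file "$tmp/compressed.bin"
describe_file "$tmp/plain.bin"
rm -rf "$tmp"
```

Running describe_file mat.RDs after each of the two executables would show whether the harness build really writes something other than a gzip stream.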

Testharness code:


#!/bin/bash
set -o errexit
cat > rinside-problem.cpp <<EOF
#include <RInside.h>
#include <iostream>
#include <RcppDeepState.h>
#include <DeepState.hpp>
TEST(deepstate_test,datatype){
  RInside R;
  R["ms"] = 7;
  std::string cmd = "cat(class(ms));saveRDS(ms,\"mat.RDs\",compress = TRUE); r <- readRDS(\"mat.RDs\") ; cat(r)";
  R.parseEval(cmd);
}
EOF
RCPP=$(Rscript --vanilla -e 'cat(system.file(package="Rcpp"))')
RINSIDE=$(Rscript --vanilla -e 'cat(system.file(package="RInside"))')
R_HOME=$(Rscript --vanilla -e 'cat(R.home())')
R_DS=$(Rscript --vanilla -e 'cat(system.file(package="RcppDeepState"))')
FLAGS="-I$R_DS/include -I$R_HOME/include -I$RCPP/include -I$RINSIDE/include"
COMPILE="clang++ $FLAGS rinside-problem.cpp -o rinside-problem.o -c"
echo $COMPILE
rm -f rinside-problem.o
$COMPILE
du rinside-problem.o
LINK="clang++ -L$RINSIDE/lib -Wl,-rpath=$RINSIDE/lib -L$R_HOME/lib -Wl,-rpath=$R_HOME/lib -L${HOME}/.RcppDeepState/deepstate-master/build -Wl,-rpath=${HOME}/.RcppDeepState/deepstate-master/build -ldeepstate -lR -lRInside rinside-problem.o -o rinside-problem"
echo $LINK
$LINK
./rinside-problem

tdhock commented 4 years ago

I confirm

(base) tdhock@maude-MacBookPro:~/R$ R_LIBS_USER=/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0 bash testharness-problem.sh
clang++ -I/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0/RcppDeepState/include -I/home/tdhock/lib/R/include -I/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0/Rcpp/include -I/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0/RInside/include rinside-problem.cpp -o rinside-problem.o -c
140 rinside-problem.o
clang++ -L/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0/RInside/lib -Wl,-rpath=/home/tdhock/R/x86_64-pc-linux-gnu-library/4.0/RInside/lib -L/home/tdhock/lib/R/lib -Wl,-rpath=/home/tdhock/lib/R/lib -L/home/tdhock/.RcppDeepState/deepstate-master/build -Wl,-rpath=/home/tdhock/.RcppDeepState/deepstate-master/build -ldeepstate -lR -lRInside rinside-problem.o -o rinside-problem
TRACE: Running: deepstate_test_datatype from rinside-problem.cpp(5)
EXTERNAL: integer
EXTERNAL: 
EXTERNAL: Error in readRDS("mat.RDs") : unknown input format

terminate called after throwing an instance of 'std::runtime_error'
  what():  Error evaluating: cat(class(ms));saveRDS(ms,"mat.RDs",compress = TRUE); r <- readRDS("mat.RDs") ; cat(r)
ERROR: Failed: deepstate_test_datatype
(base) tdhock@maude-MacBookPro:~/R$ 

I don't know how to solve this.

akhikolla commented 4 years ago

  1. Observed that the RDs file is created as a gzip archive when the code runs without the test harness, whereas the test harness executable creates a binary file. Tried writing an ASCII-format file instead of a binary one: still the same error, and it still creates a binary file.
  2. Compared the ldd output of both executables; the linked libraries are the same:
akhila@akhila-VirtualBox:~/R$ ldd rinside
    linux-vdso.so.1 (0x00007ffce6556000)
    libR.so => /home/akhila/lib/R/lib/libR.so (0x00007f6f75dff000)
    libRInside.so => /home/akhila/R/x86_64-pc-linux-gnu-library/3.6/RInside/lib/libRInside.so (0x00007f6f75bdb000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f6f75852000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f6f754b4000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f6f7529c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6f74eab000)
    libRblas.so => /usr/lib/libRblas.so (0x00007f6f74c80000)
    libreadline.so.7 => /lib/x86_64-linux-gnu/libreadline.so.7 (0x00007f6f74a37000)
    libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f6f747c5000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f6f7459f000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f6f7438f000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f6f74172000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f6f73f6a000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f6f73d66000)
    libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007f6f739ae000)
    libicui18n.so.60 => /usr/lib/x86_64-linux-gnu/libicui18n.so.60 (0x00007f6f7350d000)
    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f6f732de000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6f730bf000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f6f76488000)
    libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007f6f72e95000)
    libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007f6f712ec000)
akhila@akhila-VirtualBox:~/R$ ldd rinside-problem
    linux-vdso.so.1 (0x00007ffe8b3ba000)
    libR.so => /home/akhila/lib/R/lib/libR.so (0x00007fe8007a9000)
    libRInside.so => /home/akhila/R/x86_64-pc-linux-gnu-library/3.6/RInside/lib/libRInside.so (0x00007fe800585000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe8001fc000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe7ffe5e000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe7ffc46000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe7ff855000)
    libRblas.so => /usr/lib/libRblas.so (0x00007fe7ff62a000)
    libreadline.so.7 => /lib/x86_64-linux-gnu/libreadline.so.7 (0x00007fe7ff3e1000)
    libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007fe7ff16f000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fe7fef49000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007fe7fed39000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fe7feb1c000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fe7fe914000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe7fe710000)
    libicuuc.so.60 => /usr/lib/x86_64-linux-gnu/libicuuc.so.60 (0x00007fe7fe358000)
    libicui18n.so.60 => /usr/lib/x86_64-linux-gnu/libicui18n.so.60 (0x00007fe7fdeb7000)
    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fe7fdc88000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe7fda69000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe800e32000)
    libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007fe7fd83f000)
    libicudata.so.60 => /usr/lib/x86_64-linux-gnu/libicudata.so.60 (0x00007fe7fbc96000)
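Because the load addresses in parentheses differ between the two listings, comparing them by eye is error-prone. A rough sketch of an automated comparison follows; strip_load_addresses and compare_ldd are hypothetical helper names, and the executable names in the usage note are the ones from this thread.

```shell
#!/bin/bash
# Hypothetical helper: drop the trailing load address, e.g.
# " (0x00007f6f75dff000)", from each ldd line read on stdin, then sort
# so that two listings can be compared line by line.
strip_load_addresses() {
  sed -E 's/[[:space:]]*\(0x[0-9a-f]+\)[[:space:]]*$//' | LC_ALL=C sort
}

# Hypothetical helper: compare the shared-library lists of two
# executables. Prints a diff when they differ, "same libraries" otherwise.
compare_ldd() {
  local tmp
  tmp=$(mktemp -d)
  ldd "$1" | strip_load_addresses > "$tmp/a"
  ldd "$2" | strip_load_addresses > "$tmp/b"
  if diff "$tmp/a" "$tmp/b"; then
    echo "same libraries"
  fi
  rm -rf "$tmp"
}
```

With both executables in the current directory, the comparison would be run as compare_ldd rinside rinside-problem.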

akhikolla commented 4 years ago

Hello @tdhock, I think I found a way to serialize the Rcpp objects that we create in the test harness: qs::c_qsave() from the qs package, which has an interface similar to saveRDS and readRDS.

I tested c_qsave() with the test harness: it saves files to disk, and they can be read back much like with saveRDS and readRDS. The package can be installed with install.packages("qs").

code.sh

#!/bin/bash
set -o errexit
cat > rinside.cpp <<EOF
#include <RInside.h>
#include <iostream>
#include <RcppDeepState.h>
#include <qs.h>
#include <DeepState.hpp>
TEST(deepstate_test,datatype){
RInside R;
Rcpp::NumericMatrix vec= RcppDeepState_NumericMatrix();
qs::c_qsave(vec, "~/R/RcppDeepState/inst/testvec.qs", "high", "zstd", 1, 15, true, 1);
}
EOF
RCPP=$(Rscript --vanilla -e 'cat(system.file(package="Rcpp"))')
RINSIDE=$(Rscript --vanilla -e 'cat(system.file(package="RInside"))')
QS=$(Rscript --vanilla -e 'cat(system.file(package="qs"))')
R_HOME=$(Rscript --vanilla -e 'cat(R.home())')
R_DS=$(Rscript --vanilla -e 'cat(system.file(package="RcppDeepState"))')
FLAGS="-I$R_DS/include -I$R_HOME/include -I$RCPP/include -I$RINSIDE/include -I$QS/include" 
COMPILE="clang++ $FLAGS rinside.cpp -o rinside.o -c"
echo $COMPILE
rm -f rinside.o
$COMPILE
du rinside.o
LINK="clang++ -L$RINSIDE/lib -Wl,-rpath=$RINSIDE/lib -L$R_HOME/lib -Wl,-rpath=$R_HOME/lib -L${HOME}/.RcppDeepState/deepstate-master/build -Wl,-rpath=${HOME}/.RcppDeepState/deepstate-master/build -ldeepstate -lR -lRInside rinside.o -o rinside"
echo $LINK
$LINK
./rinside --fuzz

The parameters of c_qsave() are, in order: the value to save, file path, preset, algorithm, compression level, shuffle_control, check_hash, and number of threads. In the call above these are vec, the .qs path, "high", "zstd", 1, 15, true, and 1.

> qread("/home/akhila/R/RcppDeepState/inst/testvec.qs")
               [,1]          [,2]          [,3]
 [1,] 2.233534e+267 1.217284e+284 1.491756e-176
 [2,]  1.376962e-75 1.473632e-162 1.749670e+162
 [3,] 4.292852e+195 6.256146e+250  1.170040e-30
 [4,] 5.630632e+184 4.599630e+149  7.001611e-43
 [5,] 1.555324e+306  3.077184e+29 6.145424e+254
 [6,]  1.047978e+17  2.205198e-44 1.719098e-278
 [7,]  4.207836e-53 1.811399e-265 1.102357e-114
 [8,]  9.267488e+23  2.716983e+88 4.172402e+275
 [9,]  8.968062e-62 2.429756e+144  1.596374e+97
[10,] 1.096121e+245 2.509878e+160 4.517320e+260

tdhock commented 4 years ago

cool! I didn't know about qs. But if that works, great, let's use it!

tdhock commented 4 years ago

Hi, I confirm this works on my system too. Have you tried using it inside of your deepstate_compile_run?

akhikolla commented 4 years ago

Yes, I tried using it with the compile run and it worked.

tdhock commented 4 years ago

is this on current master? I would like to try it. Can you please update the readme so that it explains how to access the inputs which caused problems, using this new approach?

akhikolla commented 4 years ago

The current master build fails with this update; I am still trying to fix that. Yes, I'll update the readme accordingly.

tdhock commented 4 years ago

ok please tell me when it is fixed and I can try.

akhikolla commented 4 years ago

Sure @tdhock

akhikolla commented 4 years ago

Resolved the issue and updated the readme. Please try running the code and let me know if there are any issues.

tdhock commented 4 years ago

please fix readme formatting under https://github.com/akhikolla/RcppDeepState#functionalities

tdhock commented 4 years ago

Also, can you please clarify how to use analyze_one? From the readme it is not clear (#53). After each command in the readme, can you also include the expected output? It would be useful for me (and others) to be able to check and see if your code is working as expected.

tdhock commented 4 years ago

also do you have a compelling reason to include analyze_one as a user step? It seems to me that it should be used as the last step inside of compile_run, which should then return the table of problems detected for the user.

akhikolla commented 4 years ago

deepstate_harness_analyze_one() analyzes each binary crash or fail file, which takes a lot of time depending on the number of crash files generated, and the Travis build does not succeed in this case (it throws a time limit exceeded error). deepstate_harness_compile_run() alone takes 46 minutes in the recent Travis build https://travis-ci.org/github/akhikolla/RcppDeepState/builds/729609566. So I had to separate the two functions and use only deepstate_harness_compile_run() on Travis.

tdhock commented 4 years ago

hmmmm what takes so long in the tests? maybe only run one or two tests on travis, instead of all of them? you can write code in your test to check for the presence of one of these environment variables https://docs.travis-ci.com/user/environment-variables/#default-environment-variables e.g.

(base) tdhock@maude-MacBookPro:~/teaching/cs499-599-fall-2020/homeworks$ Rscript --vanilla -e "if(identical(Sys.getenv('TRAVIS'), 'true'))print('do only one or two tests') else print('do all tests')"
[1] "do all tests"
(base) tdhock@maude-MacBookPro:~/teaching/cs499-599-fall-2020/homeworks$ TRAVIS=true Rscript --vanilla -e "if(identical(Sys.getenv('TRAVIS'), 'true'))print('do only one or two tests') else print('do all tests')"
[1] "do only one or two tests"
(base) tdhock@maude-MacBookPro:~/teaching/cs499-599-fall-2020/homeworks$ 

Another solution is to just create another function for the user, compile_run_analyze, which first runs compile_run and then runs analyze_one. What do you think?

akhikolla commented 4 years ago

Yes. Instead of creating another function, compile_run_analyze, I can just extend the current function (compile_run): if we are in a Travis environment I can analyze only one crash/fail file, which takes less time; otherwise I'll analyze all the generated binaries (i.e. run analyze_one).

tdhock commented 4 years ago

To be clear, you should write the if(on travis) code in your tests/ files, not in your R/ files. To implement what you propose, you would probably need to do something like the following in your test:

on_travis <- identical(Sys.getenv("TRAVIS"), "true")  # TRAVIS is "true" on Travis CI
max_inputs <- if(on_travis) 1 else Inf
deepstate_compile_run("path/to/pkg", max_inputs=max_inputs)

where the max_inputs arg controls how many input files you analyze
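The same idea can be sketched at the shell level, for example in a CI script that decides how many crash/fail files to hand to the analysis step. This is only an illustration under assumptions: choose_max_inputs and limit_inputs are hypothetical helpers, not part of RcppDeepState, and "Inf" is used here merely as a marker meaning no limit.

```shell
#!/bin/bash
# Hypothetical helper: pick how many inputs to analyze.
# Travis CI sets TRAVIS=true in its build environment.
choose_max_inputs() {
  if [ "${TRAVIS:-}" = "true" ]; then
    echo 1
  else
    echo Inf
  fi
}

# Hypothetical helper: list at most $2 entries of directory $1.
# The marker "Inf" means list everything.
limit_inputs() {
  if [ "$2" = "Inf" ]; then
    ls "$1"
  else
    ls "$1" | head -n "$2"
  fi
}

# Demo with a throwaway directory standing in for the crash/fail output.
demo=$(mktemp -d)
touch "$demo/a.crash" "$demo/b.crash" "$demo/c.fail"
limit_inputs "$demo" "$(choose_max_inputs)"
rm -rf "$demo"
```

Keeping the environment check outside the function that does the work mirrors the advice above: the limit is a plain argument, and only the test or CI script knows about Travis.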

akhikolla commented 4 years ago

Created the function as suggested.

tdhock commented 4 years ago

ok thanks can this issue be closed?

akhikolla commented 4 years ago

Yes, Issue Resolved.