Anirban166 / data.table.threads

Other
0 stars 1 forks source link

comments about data table mutation testing blog #12

Closed tdhock closed 1 week ago

tdhock commented 1 week ago

hi @Anirban166

this is a comment about https://anirban166.github.io/data%20table/Mutation%20Testing/

I am posting here because it says on this page https://anirban166.github.io/data%20table/ to "post an issue on the repositories I’m working on" including this one.

Can you please add a section Conclusions which summarizes your experience overall? How many mutants did you investigate? How many resulted in PRs that added new test cases? How many did not? and why not? What would you do differently next time for investigating the other mutants? or do you think it is not worth it to investigate the others?

Also the following is non-standard, so you should explain better why you did this instead of something more standard. To re-compile R packages the standard way is just R CMD INSTALL, or data.table uses cc(), or Rstudio uses devtools::load_all(). You should not have to use R CMD SHLIB, nor dyn.load. Also Why .External? (.Call is more common) "resorted to using R CMD SHLIB and then R CMD INSTALL to generate the shared object which I then loaded onto the R session via dyn.load. For functions that do not have a function to call them directly in R code, I combine that with a wrapper (using .Call) to call the corresponding C routine from within R (given that there mostly aren’t exported objects in the data.table namespace for some those functions, e.g. fast mean). I then switched to using .External to call C routines via symbol names."

Finally in all of the code blocks, it would be useful to see the output of the commands (right now only the code is shown, with no output).

Also please edit the comment in https://github.com/Rdatatable/data.table/issues/6114 to re-categorize the ones you have investigated, which are probably currently under "Not classified yet (TODO)"

tdhock commented 1 week ago

also please add a section which discusses the PR that I filed based on the mutation testing results, https://github.com/Rdatatable/data.table/pull/6115 Is there anything different about the ones that you investigated? Why was I able to create a test for that mutant, but you were not able to create any tests/PRs for the mutants you investigated?

tdhock commented 1 week ago

How could someone pick mutants in the future, to maximize their likelihood of picking one that could result in creating a new test/PR?

Anirban166 commented 1 week ago

I am posting here because it says on this page https://anirban166.github.io/data%20table/ to "post an issue on the repositories I’m working on" including this one.

Yes this/here is perfectly appropriate!

Can you please add a section Conclusions which summarizes your experience overall? How many mutants did you investigate? How many resulted in PRs that added new test cases? How many did not? and why not? What would you do differently next time for investigating the other mutants? or do you think it is not worth it to investigate the others?

Sure I suppose the questions are what I should be answering in that section? (or did you want me to respond here?)

It's worth investigating some of the mutants (not all, especially since some aren't even within the execution path or out of coverage), and I think there are a few interesting ones out there. I want to delve deeper into this and create tests/PRs, but it would be helpful for me to have a slightly detailed example of going about one test case. For e.g., after you make changes in the C file (or if we even need to do so?), whether you recompile C code manually or just uninstall and install data.table, the testing procedure, the commands ran, etc. (given the lack of guidance back then, I was not sure how to proceed tbh)

Finally in all of the code blocks, it would be useful to see the output of the commands (right now only the code is shown, with no output).

Some of them don't have output sections (a few do have O/P) since I was just sharing the test cases I wrote, but you're right that it might be useful or at least for consistency to have them all showing output (I'll rerun them and post the results)

Also please edit the comment in https://github.com/Rdatatable/data.table/issues/6114 to re-categorize the ones you have investigated, which are probably currently under "Not classified yet (TODO)"

Thanks for pointing that out, will do!

Anirban166 commented 1 week ago

I made the suggested changes, can you please check again?

tdhock commented 1 week ago

please link the diff

Anirban166 commented 1 week ago

O/P additions and some minor changes aside, it would be these two sections:

tdhock commented 1 week ago

headers typo rblindlist

meaning it’s hard to get construct tests -> get or construct?

I will maybe try fuzzing inputs -> I don't think that is relevant, please delete or clarify your proposition here. Fuzzing is a different kind of testing. With mutation tests we are supposed to be able to examine the mutants to determine new tests/inputs (random generation should not be required).

if you really were using .External can you please give a detailed example about how that worked? (code + output)

I still don't see an answer to these questions Is there anything different about the ones that you investigated? Why was I able to create a test for that mutant, but you were not able to create any tests/PRs for the mutants you investigated? How could someone pick mutants in the future, to maximize their likelihood of picking one that could result in creating a new test/PR?

in particular please discuss the mutants below, which also involve changing a binary comparison operator. I managed to create a new test/PR for these, and you did not, why?

rbindlist

if (nrow==0 && ncol==0) return(R_NilValue); # Original
if (nrow>=0 && ncol==0) return(R_NilValue); # Mutated

subset

if (elem<1 || elem>max) continue;
if (elem<1 || elem==max) continue;

fifelse

if (!na_n && len3!=1 && len3!=len0)
if (!na_n && len3!=1 && len3<len0)

my fread

if (length(key) == 1L) { # Original
if (length(key) < 1L)  { # Mutant

also for the rbindlist example, you should mention that it is not possible to have a data table with ncol=0 and nrow>0

Anirban166 commented 1 week ago

headers typo rblindlist

Thanks!

meaning it’s hard to get construct tests -> get or construct?

construct, Thanks!

I will maybe try fuzzing inputs -> I don't think that is relevant, please delete or clarify your proposition here. Fuzzing is a different kind of testing. With mutation tests we are supposed to be able to examine the mutants to determine new tests/inputs (random generation should not be required).

I deleted it, and it was to help with input generation, as like you said too with mutation testing we are trying to 'determine new tests/inputs', and it's kinda tricky to think of different inputs for some of the mutant cases we see.

if you really were using .External can you please give a detailed example about how that worked? (code + output)

There is already an example I provided for the fastmean case that uses .External:

library(data.table)
options(datatable.optimize=1)

meanComparison <- function(x, ...) 
{
  baseR <- mean(x, ...)
  fastmean <- .External("Cfastmean", x, ...)
  cat("Results as computed by:\nBase R's mean:", baseR, "\ndata.table's fast mean:", fastmean, "\n")
  fifelse(identical(baseR, fastmean), "Passed", "Failed")
}

testInputs <- list(
  c(rep(1e308, 1e3), rep(-1e308, 1e3)),
  c(rnorm(1e6, mean = 0, sd = 1e5), rep(1, 1e6), .Machine$double.xmax)
)

for(i in seq_along(testInputs))
{
  cat("Test case ", i, ":\n", sep = "")
  cat(meanComparison(testInputs[[i]], na.rm = TRUE), "\n")
}

O/P:

Test case 1:
Results as computed by:
Base R's mean: -8.147083e+291 
data.table's fast mean: Inf 
Failed 
Test case 2:
Results as computed by:
Base R's mean: 8.988461e+301 
data.table's fast mean: -1.508633e+304 
Failed 

.External is only necessary when calling C routines directly, so I'm only using it for functions which do not have an R implementation directly.

I still don't see an answer to these questions Is there anything different about the ones that you investigated? Why was I able to create a test for that mutant, but you were not able to create any tests/PRs for the mutants you investigated?

in particular please discuss the mutants below, which also involve changing a binary comparison operator. I managed to create a new test/PR for these, and you did not, why?

rbindlist

if (nrow==0 && ncol==0) return(R_NilValue); # Original
if (nrow>=0 && ncol==0) return(R_NilValue); # Mutated

subset

if (elem<1 || elem>max) continue;
if (elem<1 || elem==max) continue;

fifelse

if (!na_n && len3!=1 && len3!=len0)
if (!na_n && len3!=1 && len3<len0)

my fread

if (length(key) == 1L) { # Original
if (length(key) < 1L)  { # Mutant

But I did write about why the rbindlist and fifelse mutants I explored above didn't result in a test/PR in their respective sections, and mentioned the fifelse operator case in the third point talking regarding the 'issues in making a test/PR' part of the conclusion too.

As for subset, I previously tried inputs assuming elem to represent an index or value being checked (more specifically, it is used to access elements from the source array in subset.c) and max being a boundary condition (upper bound of allowable indices) but it didn't work, so I'm not sure what triggers that case. Meanwhile for the fread case, the variable key is clear to make test cases out of and a direct argument. (I'll include this information)

I managed to create a new test/PR for these You mean for fread right? (cause the rest you didn't or I don't see any PRs) But I got your point, you're right I should write about why things are different than the one you explored. On it!

How could someone pick mutants in the future, to maximize their likelihood of picking one that could result in creating a new test/PR?

This I don't have a clear answer to given my inexperience in doing so (but I will try to based on the reasoning above)

also for the rbindlist example, you should mention that it is not possible to have a data table with ncol=0 and nrow>0

Yup that is exactly what I meant (or that since ncol=0, nrow won't matter), but I'll clarify it with that added sentence.

Anirban166 commented 1 week ago

I made the changes a while ago, please take a look! (and let me know if it answers your questions or not - I'll follow up tomorrow; Conclusions link for quick reference)

tdhock commented 1 week ago

great thanks

tdhock commented 1 week ago

how can your blog be found? can you please add links to it? from

Anirban166 commented 1 week ago

how can your blog be found? can you please add links to it? from

  • data table wiki
  • your main blog page

Good point, and done for both!