NorskRegnesentral / shapr

Explaining the output of machine learning models with more accurately estimated Shapley values
https://norskregnesentral.github.io/shapr/

Harmonize batch distribution ++ #359

Closed LHBO closed 7 months ago

LHBO commented 10 months ago

Done in this PR:


OLD text:

…er of batches (when not provided in the explain function call) to a number which then throws an error in the check_n_batches function. This was a logical error, as get_default_n_batches could previously set n_batches to a larger value than n_combinations - 1. We subtract one because the check_n_batches function requires n_batches to be strictly less than n_combinations.

The bug occurred, for example, with:

# The other stuff here is the variables from the first code block in the vignette.
> explanation <- explain(
+   model = model,
+   x_explain = x_explain,
+   x_train = x_train,
+   approach = "gaussian",
+   prediction_zero = p0,
+   n_combinations = 8
+ )
Note: Feature classes extracted from the model contains NA.
Assuming feature classes from the data are correct.

Setting parameter 'n_batches' to 10 as a fair trade-off between memory consumption and computation time.
Reducing 'n_batches' typically reduces the computation time at the cost of increased memory consumption.

Error in check_n_batches(internal) : 
  `n_batches` (10) must be smaller than the number of feature combinations/`n_combinations` (8)
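
In short, the fix caps the default like this (an illustrative sketch with a hypothetical helper name and preferred value, not the exact shapr internals):

# Hedged sketch: the default number of batches is capped so that it is always
# strictly smaller than the number of feature combinations, as required by
# check_n_batches. Subtracting one enforces n_batches < n_combinations.
suggest_n_batches <- function(n_combinations, preferred = 10) {
  min(preferred, n_combinations - 1)
}

suggest_n_batches(n_combinations = 8)
#> [1] 7   # instead of the previous (invalid) default of 10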
LHBO commented 10 months ago

The tests failed because I changed the printout. I added the word "of" as I thought that was more grammatically correct, but one can just remove it again; the intention of the sentence is clear.

The new printout is: n_batches (32) must be smaller than the number of feature combinations/n_combinations (32)

LHBO commented 10 months ago

This should not be merged yet, as another bug occurred while I was working on the combined approaches.

explain_numeric = explain(
  model = model_lm_numeric,
  x_explain = x_explain_numeric,
  x_train = x_train_numeric,
  approach = c("independence", "empirical", "gaussian", "copula", "empirical"),
  prediction_zero = p0,
  n_batches = NULL,
  timing = FALSE)
Setting parameter 'n_batches' to 2 as a fair trade-off between memory consumption and computation time.
Reducing 'n_batches' typically reduces the computation time at the cost of increased memory consumption.

explain_numeric$internal$objects$S_batch
$`1`
[1]  2  3  4  5  6 32

$`2`
 [1]  7  8  9 10 11 12 13 14 15 16

$`3`
 [1] 17 18 19 20 21 22 23 24 25 26

$`4`
[1] 27 28 29 30 31

> explain_numeric$internal$parameters$n_batches
[1] 2
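
A quick sanity check for the inconsistency above (my own sketch, not code from this PR): the number of batches actually created should match the stored n_batches parameter, but here it does not.

# The list of batches and the n_batches parameter should agree; currently 4 vs 2.
length(explain_numeric$internal$objects$S_batch)   # 4
explain_numeric$internal$parameters$n_batches      # 2
stopifnot(
  length(explain_numeric$internal$objects$S_batch) ==
    explain_numeric$internal$parameters$n_batches
)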
LHBO commented 10 months ago

There are also some additional bugs with the combined approaches, e.g., setting the seed does not work.
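
A minimal reproduction idea for the seed issue (my own sketch, not code from the PR; it assumes the returned object's shapley_values element as in the package documentation): two identically seeded runs should give identical results, but with combined approaches they currently can differ.

# Run the same combined-approach explanation twice with the same seed.
set.seed(2024)
e1 <- explain(model = model_lm_numeric, x_explain = x_explain_numeric,
              x_train = x_train_numeric,
              approach = c("independence", "empirical", "gaussian", "copula", "empirical"),
              prediction_zero = p0, timing = FALSE)
set.seed(2024)
e2 <- explain(model = model_lm_numeric, x_explain = x_explain_numeric,
              x_train = x_train_numeric,
              approach = c("independence", "empirical", "gaussian", "copula", "empirical"),
              prediction_zero = p0, timing = FALSE)
all.equal(e1$shapley_values, e2$shapley_values)  # expected TRUE, currently not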

LHBO commented 10 months ago

Looks like some tests fail. I will look at that tomorrow.

martinju commented 9 months ago

I looked at the tests here and almost all of them are OK (just adding two new objects), except for the test involving the independence approach, where 'something' is slightly off, causing a small numerical change (10th decimal place or so). It is probably fine, but it would be good to know what causes it. That will be easier to see once #356 is merged, as this PR contains all those changes as well, making it harder to spot what has actually changed here.

LHBO commented 9 months ago

> I looked at the tests here and almost all of them are OK (just adding two new objects), except for the test involving the independence approach, where 'something' is slightly off, causing a small numerical change (10th decimal place or so). It is probably fine, but it would be good to know what causes it. That will be easier to see once #356 is merged, as this PR contains all those changes as well, making it harder to spot what has actually changed here.

I have merged #356 into this PR now. Will try to look more into it.

The file test-setup.R gave no failures, except for two caused by me adding the word "of" to the printout of an error message.

There are a lot of errors in test-output.R:

  1. Some of the failures are related to the combined-approach explainers where n_batches is set to 1, which is now invalid input, as n_batches must be equal to or larger than the number of unique approaches used. This was not the case before: the user could set n_batches = 1, but shapr would silently overwrite it. I would recommend using n_batches = 10 (see the sketch after this list); then we test both that the random distribution of the batches between the approaches is the same each time and the generated MC samples.
  2. I have tried to look more into it, but I am not sure about the pipeline/workflow. E.g., when I call testthat::snapshot_review('output/') to review the changes, I just get errors:
    Error: embedded nul in string: '\037\x8b\b\0\0\0\0\0\0\003\xed\032... [remaining binary (gzip) content omitted]'
  3. Talk with Martin and see if it works on his computer.
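
A sketch of the kind of test call I have in mind for point 1 (variable names reused from the earlier example in this thread; the exact snapshot test may look different):

# With the combined approach vector below there are 4 unique approaches, so
# n_batches = 10 is valid under the new check (n_batches >= number of unique
# approaches) and also exercises the random allocation of batches to approaches.
set.seed(123)  # keep the random batch allocation reproducible across runs
explain_combined <- explain(
  model = model_lm_numeric,
  x_explain = x_explain_numeric,
  x_train = x_train_numeric,
  approach = c("independence", "empirical", "gaussian", "copula", "empirical"),
  prediction_zero = p0,
  n_batches = 10,
  timing = FALSE
)
explain_combined$internal$objects$S_batch  # inspect which batch uses which rows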
LHBO commented 7 months ago

Added some comments.

martinju commented 7 months ago

The failing tests on R 4.2 and R 4.1 (oldrel-1 and oldrel-2) seem to be due to a change in how errors are reported to the terminal. In previous versions, a missing input argument would not throw an error until one tried to access the missing argument, while in R 4.3 an error is thrown immediately. This now causes a failure due to an update in testthat (remember to update to v3.2.0 if you run locally), which now shows which function is throwing the error, and that function differs between the two R versions. I will try to look into how to get around this. One possibility is to simply ignore it, as we know the reason for the error and it does not cause practical problems.
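
One possible workaround (just a sketch, assuming we accept not comparing the error text on older R versions; the test name and call are illustrative) is to skip the affected snapshot test below R 4.3:

# Sketch: skip the snapshot comparison on R < 4.3, where the missing-argument
# error is raised at a different point and testthat therefore reports a
# different calling function in the snapshot.
test_that("erroneous input: missing x_explain", {
  skip_if(getRversion() < "4.3.0",
          "Missing-argument errors are reported differently before R 4.3")
  expect_snapshot(
    explain(
      model = model_lm_numeric,
      x_train = x_train_numeric,   # x_explain deliberately omitted
      approach = "independence",
      prediction_zero = p0
    ),
    error = TRUE
  )
})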