NorskRegnesentral / shapr

Explaining the output of machine learning models with more accurately estimated Shapley values
https://norskregnesentral.github.io/shapr/
Other
141 stars 33 forks source link

Bugfix in `prepare_data()` related to vector of approaches. When usin… #356

Closed LHBO closed 1 year ago

LHBO commented 1 year ago

…g several approaches; the old version only used the first approach. I verified this by adding a temporary print in each prepare_data.approach() function and saw that only the first approach in internal$parameters$approach was printed. It is not possible to spot the logical error in any other way to the best of my knowledge, hence I have not been able to make a script demonstrating the error, and I have made no test.

We could, in all prepare_data.approach() functions, add an extra column in the returned data.table indicating the used approach to generate the MC samples. This would make it easier to test that the specified approaches were indeed used.

Remove code comments before the pull request is accepted, and maybe Martin has a better solution to get the approach(?).

Also updated roxygen2 for the function, as it seemed that it was reflecting the old version of shapr(?) due to arguments that are no longer present.

However, we then get a warning when creating the roxygen2 documentation. Discuss some solutions as comments below. Discuss with Martin.

LHBO commented 1 year ago

It seems that the checks fail as the tests say that the combined approach now gives different results/explanations compared to what it did before. This might be a good thing because it illustrates that the updated version uses different approaches.

martinju commented 1 year ago

Well done spotting this! Sorry you had to deal with this. As I guess you realized, I had most of this setup up, but apparently I had completely forgotten to handle it in prepare_data.

I verified manually that your fix works as intended. As you wrote, I think it is hard to make a test for this (the mistake was to make an output test with the wrong output in the first place). An error was introduced in the output_lm_mixed_comb test, so that needs to be handled. The others test failures are purely based on the incorrect output test being updated (I will update those test files once the output_lm_mixed_comb error is fixed). Feel free to take a look at what is going on with output_lm_mixed_comb, otherwise I will do it later this week.

We inherit documentation from one/two sources when using internal to avoid copying text in lots of places (which makes maintenance more cumbersome). Thus, you should not write documentation for internal in other functions, so we should remove those again. The warning is that you say you want to inherit function documentation, but all input arguments are already documented.

LHBO commented 1 year ago

The error in output_lm_mixed_comb is due to the same thing as in #362. Have you seen/read it?

LHBO commented 1 year ago

Thanks for clarifying the documentation stuff :) That makes sense. I could definitely apply that in VAEAC documentation as I have a lot of copy-pasted @param there (we take that later).

LHBO commented 1 year ago

I verified manually that your fix works as intended. As you wrote, I think it is hard to make a test for this (the mistake was to make an output test with the wrong output in the first place). An error was introduced in the output_lm_mixed_comb test, so that needs to be handled. The others test failures are purely based on the incorrect output test being updated (I will update those test files once the output_lm_mixed_comb error is fixed). Feel free to take a look at what is going on with output_lm_mixed_comb, otherwise I will do it later this week.

I have solved the bug at its origin. Will update soon.

LHBO commented 1 year ago

Now I did something I did not intend to do... OR maybe not. This is maybe just the effect of me pulling the changes you made here on github? But I pushed before I pulled, hence, maybe why it looks strange.

LHBO commented 1 year ago

The check fails, but I think that is due to old test-values? I'll wait and see what Martin says :)

martinju commented 1 year ago

/style