NorskRegnesentral / shapr

Explaining the output of machine learning models with more accurately estimated Shapley values
https://norskregnesentral.github.io/shapr/

Add support for models with prediction output size above 1. #323

Open jonlachmann opened 1 year ago

jonlachmann commented 1 year ago

This adds support for models that have predict functions with size over 1. For example a time-series forecast model which forecasts 3 steps ahead. In general, any model which has multiple outputs in its predict function should now be supported.

After discussion with Martin I have made it so that there are no new arguments to the explain function, rather the input "prediction_zero" must be of the same size as the outputs from the predict function of the model.
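The size requirement described above can be illustrated with a small, language-agnostic sketch. This is not shapr code: the function `check_prediction_zero` and the toy forecaster `forecast_3_steps` are hypothetical names made up for illustration, and the check only assumes that the predict function returns one row of outputs per input row.

```python
from typing import Callable, List, Sequence


def check_prediction_zero(
    predict_fn: Callable[[Sequence[Sequence[float]]], List[List[float]]],
    x_rows: Sequence[Sequence[float]],
    prediction_zero: Sequence[float],
) -> int:
    """Return the model's output size if prediction_zero matches it, else raise."""
    preds = predict_fn(x_rows)
    widths = {len(row) for row in preds}
    if len(widths) != 1:
        raise ValueError("predict_fn must return rows of one common length")
    n_out = widths.pop()
    if len(prediction_zero) != n_out:
        raise ValueError(
            f"prediction_zero has length {len(prediction_zero)}, "
            f"but the model returns {n_out} outputs per row"
        )
    return n_out


# Hypothetical 3-step-ahead forecaster: each horizon damps the last observed value.
def forecast_3_steps(rows):
    return [[0.9 * r[-1], 0.81 * r[-1], 0.729 * r[-1]] for r in rows]


x = [[1.0, 2.0], [3.0, 4.0]]
n_outputs = check_prediction_zero(forecast_3_steps, x, [0.0, 0.0, 0.0])
```

Passing a single-element `prediction_zero` to the same forecaster would raise a `ValueError`, which is the failure mode a setup test could check for.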

martinju commented 1 year ago

I renamed `parallel` to `use_future`, as it better reflects the option. Note: The failing setup tests are expected due to the changed behavior. Don't worry about that, we'll handle that in the end.

Other things we need to do:

Things we need to discuss/think through:

jonlachmann commented 1 year ago

I renamed `parallel` to `use_future`, as it better reflects the option. Note: The failing setup tests are expected due to the changed behavior. Don't worry about that, we'll handle that in the end.

Other things we need to do:

* [ ]  We need new tests for the multiple output situation. I'm thinking of a single test in test-output using a basic arima model fitted with stats::arima(), then maybe two tests under test-setup to a) check that we get the right dimension out of explain(), and b) check that the method fails when we pass a prediction function with multiple outcomes while providing a single prediction_zero.

* [ ]  Add a basic example to the vignette on how to use the multiple output module

I can get started on some tests tomorrow. Do you want it to explain say 3 lags, or some exogenous variables? Once we have a test, an example is quite easy since it can just use the same code. One idea is that a basic VAR model with for example some basic weather data may be both interesting and provide something intuitive for the example in the vignette. It could also showcase the grouping feature if we want to, to group lags for the same variable.
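As a rough illustration of grouping lags of the same variable, the sketch below builds a group mapping from column names. It is not shapr code: the helper `group_lags` and the column naming scheme (`Temp.1`, `Wind.2`, ...) are assumptions made for this example, and how such a mapping would be handed to the grouping feature is not shown here.

```python
from collections import defaultdict


def group_lags(columns, sep="."):
    """Map each base variable to its lagged columns, e.g. Temp -> [Temp.1, Temp.2]."""
    groups = defaultdict(list)
    for col in columns:
        base, _, lag = col.rpartition(sep)
        # Columns without a numeric lag suffix form their own single-member group.
        key = base if base and lag.isdigit() else col
        groups[key].append(col)
    return dict(groups)


# Hypothetical lagged weather features plus one exogenous variable.
columns = ["Temp.1", "Temp.2", "Temp.3", "Wind.1", "Wind.2", "Month"]
groups = group_lags(columns)
```

With a mapping like this, each variable would receive one contribution covering all of its lags, rather than one contribution per lag.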

Things we need to discuss/think through:

* [ ]  While using `prediction_zero` to tell `explain()` that the multiple output module is in use is nice, I was originally thinking of changing the behavior of `get_predict_model()` to extract this dimension directly from the "test" of predict_model. The downside of my original idea is that we then miss this test of the output of predict_model being of the right dimension. Let's just think a bit about this.

I know that you are against having too many inputs in explain, but I do think that an explicit argument may be more clear. Another option is of course to have a wrapper function called say explain_multiple... Not sure about this, but just putting it out there for consideration.

* [ ]  I see that you require the output of predict_model to be a data.frame in the multiple outcome situation. Isn't it more common to output a standard matrix in such cases?

A matrix would be preferred, but here my knowledge of data.table ended... It seemed that it did not really work to have a matrix mapped to it correctly. I am sure that it should be possible somehow, and it is definitely preferred.

* [ ]  How should plotting work for such multiple output models? The easiest is probably to just let the user specify which of the predictions should be plotted (and default to the first one with a note written to the console). It might be nice to have them all in the same plot, however.

This is something that is also interesting for our application of it to forecasting. I am not even sure how such a plot would look, but I imagine vertical stacked barplots somehow.

* [ ]  Should we also handle multi-class classification in the same setting? If so, an example of this should also be put into the vignette.

That would probably be useful, the code should work the same way I think. It would require a good example to make it intuitive. I will try to think of something.
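To make the "works the same way" intuition concrete, here is a minimal sketch that treats each class probability as a separate output and computes exact Shapley values per output. It is not how shapr estimates anything: the 3-class softmax model `predict_proba` is hypothetical, and features outside a coalition are simply replaced by a single baseline point, which is a simplifying assumption instead of the conditional expectations shapr targets. The per-output baseline `phi0` plays the role of a vector-valued `prediction_zero`.

```python
from itertools import combinations
from math import exp, factorial


def predict_proba(x):
    """Hypothetical 3-class softmax model over 3 features."""
    weights = [[1.0, -0.5, 0.2], [0.3, 0.8, -0.4], [-0.6, 0.1, 0.9]]
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    z = [exp(s) for s in scores]
    total = sum(z)
    return [v / total for v in z]


def shapley_per_output(predict, x, baseline):
    """Exact Shapley values for every output, using baseline replacement."""
    n = len(x)
    n_out = len(predict(x))

    def value(subset):
        # Features in the coalition keep their value; the rest use the baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = [[0.0] * n for _ in range(n_out)]
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_j = value(set(subset) | {j})
                without_j = value(set(subset))
                for o in range(n_out):
                    phi[o][j] += weight * (with_j[o] - without_j[o])
    phi0 = predict(baseline)  # one baseline prediction per output
    return phi0, phi


x = [0.5, -1.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi0, phi = shapley_per_output(predict_proba, x, baseline)
```

For each class, `phi0[o] + sum(phi[o])` recovers the predicted probability of that class. Because the probabilities sum to one, the contributions of any single feature sum to zero across classes, which is one reason a good vignette example would be needed to keep the interpretation intuitive.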

jonlachmann commented 1 year ago

I have now added an AR model (using the arima model from stats to make predictions based on a specific vector without the forecast package is a pain), which has a test using the temperature data used in the other tests. Please have a look and see whether it is acceptable.

martinju commented 1 year ago

Thanks for all this!

Do you want it to explain say 3 lags, or some exogenous variables? Once we have a test, an example is quite easy since it can just use the same code. One idea is that a basic VAR model with for example some basic weather data may be both interesting and provide something intuitive for the example in the vignette. It could also showcase the grouping feature if we want to, to group lags for the same variable.

I have looked into your example and it works well, but I also started to play around with the idea of explaining time series lags at the same time as exogenous variables. That would be very helpful in practice, I believe. I'll play around a bit more and let you know when I've got something.

I know that you are against having too many inputs in explain, but I do think that an explicit argument may be more clear. Another option is of course to have a wrapper function called say explain_multiple... Not sure about this, but just putting it out there for consideration.

I actually think it is a good idea to separate the multiple output case into its own function. If we also distinguish between, say, forecasting different lags and multiple outcome classification, we could make the former more user friendly by formatting the data for the user (i.e. not require the user to provide all the lagged time series). Something to think about.

A matrix would be preferred, but here my knowledge of data.table ended... It seemed that it did not really work to have a matrix mapped to it correctly. I am sure that it should be possible somehow, and it is definitely preferred.

No problem, I can deal with this.

That would probably be useful, the code should work the same way I think. It would require a good example to make it intuitive. I will try to think of something.

Great!