martinju closed this 1 month ago
Just some notes for myself on where to catch up after the holiday:
Depending on how things go as I get back on this in August, @LHBO might take a look at the code structure some time after point 3 is done.
Slowly reaching a steady state. Apart from the list of undone tasks above, here is a list of components which are currently not in a good state:
@LHBO
OK, some more work done now.
A lot of minor code changes as I have moved from combinations to coalitions everywhere, i.e. changed n_combinations to n_coalitions, id_combinations -> id_coalitions and so on. The key data.table X also got new (general) column names: features -> coalitions, n_features -> coalition_size and so on. Note that features is still added as a new column in the end, making it easier to create the binary matrix S for both groups and features.
I have also added a few extra helper parameters: n_shapley_values is equal to n_features for features, and n_groups for groups. coal_feature_list is the same as the previous group_num, except that it also exists for feature-wise explanation.
As a consequence of the name generalizations and helper parameters, I have removed many of the almost-identical functions that were specific to groups (the old feature_exact/group_exact). These now share the common name exact_coalition_table. Similar generalizations are done elsewhere.
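To illustrate the generalization described above, here is a minimal base-R sketch of how one function can serve both feature-wise and group-wise explanation. It is not the package's actual exact_coalition_table: the function name exact_coalition_table_sketch is hypothetical, it returns a plain list rather than the data.table X, and it assumes coal_feature_list holds one integer vector of feature indices per feature or group.

```r
# Hypothetical sketch, NOT the package implementation.
# coal_feature_list: one element per "player" (a feature or a group), holding
#   the indices of the original features that the player covers (assumption).
# n_features: total number of original features.
exact_coalition_table_sketch <- function(coal_feature_list, n_features) {
  # Number of Shapley values = number of players (features or groups).
  n_shapley_values <- length(coal_feature_list)

  # All non-empty coalitions of players, ordered by coalition size.
  non_empty <- unlist(
    lapply(seq_len(n_shapley_values), function(k) {
      utils::combn(n_shapley_values, k, simplify = FALSE)
    }),
    recursive = FALSE
  )
  coalitions <- c(list(integer(0)), non_empty)

  # Map each coalition of players to the original features it contains.
  features <- lapply(coalitions, function(coal) {
    as.integer(unlist(coal_feature_list[coal]))
  })

  # Binary matrix S: one row per coalition, one column per original feature.
  S <- do.call(rbind, lapply(features, function(f) {
    row <- integer(n_features)
    row[f] <- 1L
    row
  }))

  list(
    id_coalition = seq_along(coalitions),
    coalitions = coalitions,
    coalition_size = lengths(coalitions),
    features = features,
    S = S
  )
}

# Group-wise: two groups covering features {1, 2} and {3, 4}.
by_group <- exact_coalition_table_sketch(list(c(1L, 2L), c(3L, 4L)), 4L)

# Feature-wise: each feature is its own player.
by_feature <- exact_coalition_table_sketch(as.list(1:3), 3L)
```

In the group-wise call, coalitions indexes groups while features and S are still expressed in terms of the original features, which is why keeping a features column makes building S the same operation in both cases.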
Tests are updated after the changes. Something is wrong with groups for forecast, but we'll just ignore that for now.
Feel free to take a quick look at the main components (no need to look at the details at this stage), and let me know if you have comments on the generalizations, name changes etc.
Hi @aredelmeier @LHBO @jonlachmann
This is closing in on a merge. Just a few things missing now, I think. I hope to be able to merge this some time this weekend.
Here is what remains:
@jonlachmann You may want to merge this into your forecast fixing branch.
@LHBO If you want to, I actually think you can safely start on the asymmetric stuff from the current stage. What remains will not change much of the code.
Very early draft. Lots of cleanup and moving things around remains, but the general overall structure will probably be close to what we have here.
To be done in this PR (some may be removed here and handled in separate PRs):
verbose: Add verbose = c("basic", "shapley", "vS_details"), with "basic" as the default. "basic" shows what is currently going on in the function, the filename of the tempfile, and what iteration we are at (+ later an estimate of the remaining computation time). NULL or "" should give no printout at all. "shapley" means printing intermediate Shapley estimates. "vS_details" means printing results while estimating the vS_functions (where this is done in more than a single step).
Note: All non-exact methods fail now (also the Shapley value estimates) since shapley_setup is now called after setup_approach. All tests for Shapley values pass if these calls are put back in the original order (but we don't want that in the future).
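As a rough illustration of the proposed verbose argument, the following base-R sketch validates the value and gates printouts on it. The helper names check_verbose_sketch and report_iteration_sketch are hypothetical, the messages are placeholders, and it assumes that any subset of the three levels may be combined, which the task description does not settle.

```r
# Hypothetical helpers for illustration only; not part of the package.
check_verbose_sketch <- function(verbose = "basic") {
  allowed <- c("basic", "shapley", "vS_details")

  # NULL or "" means: no printout at all.
  if (is.null(verbose) || identical(verbose, "")) {
    return(character(0))
  }

  # Assumption: any subset of the allowed levels is accepted.
  if (!is.character(verbose) || !all(verbose %in% allowed)) {
    stop(
      "`verbose` must be NULL, \"\", or a subset of: ",
      paste(allowed, collapse = ", ")
    )
  }
  unique(verbose)
}

report_iteration_sketch <- function(verbose, iter, shapley_est = NULL) {
  verbose <- check_verbose_sketch(verbose)

  # "basic": report progress through the iterations.
  if ("basic" %in% verbose) {
    message("Iteration ", iter)
  }

  # "shapley": report intermediate Shapley value estimates, if available.
  if ("shapley" %in% verbose && !is.null(shapley_est)) {
    message(
      "Current Shapley estimates: ",
      paste(round(shapley_est, 3), collapse = ", ")
    )
  }
  invisible(verbose)
}

# Prints progress and intermediate estimates via message().
report_iteration_sketch(c("basic", "shapley"), iter = 2, shapley_est = c(0.12, -0.4))

# Prints nothing.
report_iteration_sketch(NULL, iter = 2)
```

Normalizing NULL and "" to an empty character vector up front lets every later check be a simple `%in%` test.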