mayer79 / missRanger

Fast multivariate imputation by random forests.
https://mayer79.github.io/missRanger/
GNU General Public License v2.0

Possibility to access Ranger object for Shap values #54

Closed (calogerobra closed this issue 1 year ago)

calogerobra commented 1 year ago

The project could be extended to support computing SHAP values and other metrics that the underlying ranger package provides. Is there already a way to do this in the current version?

mayer79 commented 1 year ago

Interesting idea, thanks.

The output of missRanger() is simply a data.frame with optionally some OOB performance results attached, so this is not possible at the moment.

A complete API change is not possible (too many dependencies). I am considering the following idea:

mr <- missRanger(data, ..., output = c("data.frame", "missRanger"))

This would not break current code, while offering the necessary flexibility for further analysis.

What do you think?

calogerobra commented 1 year ago

Hi, I think that would indeed be a good way forward to start building some statistics on top of what the model produces.

How long do you think it will take to implement and test?

mayer79 commented 1 year ago

I will ping you when a Pull Request is ready to be installed for a quick cross-check.

mayer79 commented 1 year ago

Implemented in https://github.com/mayer79/missRanger/pull/55

You can use the new version via

devtools::install_github("mayer79/missRanger")

library(missRanger)

irisWithNA <- generateNA(iris, seed = 34)

imp <- missRanger(
  irisWithNA, pmm.k = 3, num.trees = 100, data_only = FALSE, keep_forests = TRUE
)
imp

summary(imp)

imp$forests$Species

piebel commented 1 year ago

Thanks @mayer79, this implementation is great. Would it be possible to apply ranger functions such as importance_pvalues() to the missRanger object? With this implementation we get more information within the missRanger object, but can we actually run further analyses on the object itself with other functions, as we can with a ranger fit?

Thank you!

mayer79 commented 1 year ago

With the new, extended data_only = FALSE logic, adding methods is now much more natural. I don't have specific plans yet, but your issue was clearly the first step towards more functionality!

calogerobra commented 1 year ago

Thanks a million, @mayer79. Looks great. Wouldn't a simple solution to @piebel's idea be to return the full ranger object in the above-mentioned list?

mayer79 commented 1 year ago

In the above example, all ranger objects are attached in the $forests slot. But I think it was just an example he mentioned.
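
As a minimal sketch of what this looks like in practice (assuming a missRanger version that includes the pull request above; the `verbose` and `seed` settings and the use of `predict()` are illustrative choices, not something prescribed in this thread), each element of `$forests` can be pulled out and handled like any other ranger fit:

```r
library(missRanger)

# Same setup as in the example above; verbose = 0 and seed = 1 are
# illustrative additions to keep the output quiet and reproducible.
irisWithNA <- generateNA(iris, seed = 34)
imp <- missRanger(
  irisWithNA, pmm.k = 3, num.trees = 100,
  data_only = FALSE, keep_forests = TRUE, verbose = 0, seed = 1
)

# One stored forest per imputed variable
names(imp$forests)

# Each element is a regular ranger object
fit <- imp$forests$Species
class(fit)
fit$prediction.error  # out-of-bag error of the stored forest

# Standard ranger tooling works on it, e.g. predict()
pred <- predict(fit, data = iris)$predictions
head(pred)
```

Whether a given ranger helper works on these forests depends on how they were fitted. For example, importance-based functions need an importance mode to have been set when the forest was grown, so treat that part as something to verify against your installed version.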