Closed: khammouna closed this issue 6 years ago

First of all, thank you for bringing LIME to R! I am trying to explain 100,000 observations with LIME, but it doesn't work; I get a memory error. Is it possible to avoid this error? Thanks in advance.
Hi. It is very difficult to help you with such a small amount of information. What does your code look like? What is the error message?
That being said, my eventual answer will probably be that LIME is not intended to be run on that number of observations. It is a pretty calculation-heavy procedure that takes around 1-2 seconds per observation, even with fast models, so your dataset would take more than a day's worth of computation (100,000 × ~1.5 s ≈ 42 hours). Further, LIME is intended to produce human-interpretable explanations, meaning you'd have to look through 100,000 explanations for your effort to be worth it.
Select the observations you're curious about and run LIME on those - that is the intended usage.
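A minimal sketch of what that looks like (assuming an explainer already built with `lime()`; the `test_set` object and the row indices are purely illustrative):

```r
# Explain only a handful of rows you actually care about,
# instead of the whole dataset
cases_of_interest <- test_set[c(1, 42, 117), ]
explanation <- explain(cases_of_interest, explainer,
                       n_labels = 1, n_features = 5)
```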
Thank you for your quick response. I have an xgboost model built with the h2o package and a training frame of 34 features and 1,700,000 observations, and it's a classification problem.
Here is my code:

```r
explainer <- lime(Train_sans, XGB, n_bins = 5)
```
I want to explain data_LIME (100,000 observations) to gather statistics about which features explain the most observations.
```r
library(lime)
library(data.table)

explanations <- data.table(explain(
  data_LIME,
  explainer = explainer,
  n_permutations = 5000,
  dist_fun = "manhattan",
  kernel_width = 3,
  n_features = 5,
  feature_select = "highest_weights",
  n_labels = 1
))[, .(feature, feature_weight, model_r2)]
```
I get the error: `cannot allocate vector of size 3.7 Gb`.
However, I am working on a server with 500 GB of memory...
Thank you in advance for any information.
Despite having a large server, you may not have access to all of its resources. I cannot help you debug this, as it is not a LIME problem but a matter of you pushing LIME and your system beyond what they were intended for.
Your best course of action would probably be to calculate the explanations in smaller chunks if you insist on computing explanations for your whole set, as sketched below.
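A minimal sketch of that chunked approach, reusing the `explainer` and `explain()` call from your snippet above; the chunk size and the `rbindlist()` accumulation are illustrative choices, and it assumes `data_LIME` is an ordinary data.frame:

```r
library(lime)
library(data.table)

chunk_size <- 1000
n <- nrow(data_LIME)
starts <- seq(1, n, by = chunk_size)

# Explain one chunk at a time so intermediate objects stay small
results <- lapply(starts, function(i) {
  chunk <- data_LIME[i:min(i + chunk_size - 1, n), ]
  expl <- explain(
    chunk,
    explainer = explainer,
    n_permutations = 5000,
    dist_fun = "manhattan",
    kernel_width = 3,
    n_features = 5,
    feature_select = "highest_weights",
    n_labels = 1
  )
  # Keep only the columns needed for the summary statistics
  as.data.table(expl)[, .(feature, feature_weight, model_r2)]
})

explanations <- rbindlist(results)
```

You could also write each chunk's result to disk inside the loop instead of accumulating in memory, which keeps the peak footprint at roughly one chunk's worth.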
OK, I'll try to sort it out. Thank you very much!