gbm-developers / gbm

Gradient boosted models (the old gbm package)

protection stack overflow #38

Closed Rostamabd closed 4 years ago

Rostamabd commented 5 years ago

One of my colleagues at the University of Florida and I are using the gbm3 package to analyze SNP genotype data. When the number of predictors (features) exceeds roughly 40,000, we get the following error:

Error: protect(): protection stack overflow

Is there a limit in the gbm3 package on the number of predictors (explanatory variables)? We would appreciate any help resolving this issue. Here is the call, with its arguments, that we use to run the boosting method:

fit1 = gbm(DPR ~ ., n.trees = 500, distribution = 'gaussian', interaction.depth = 3, data = data1, verbose = TRUE, shrinkage = 0.1, n.minobsinnode = 10, bag.fraction = 0.2, train.fraction = 0.8, keep.data = TRUE)

harrysouthworth commented 5 years ago

Sorry for the delay in replying.

Also sorry for not being very helpful. I haven't had any input into the underlying C code and really don't know the answer. However, I'd be surprised if gbm were a good approach to your problem. Most of your predictors are likely to be uninformative, which creates problems. Approaches based on the lasso and elastic net might be a better first step: https://web.stanford.edu/~hastie/StatLearnSparsity/

Harry
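
For illustration, the lasso / elastic net suggestion above could look roughly like the sketch below. It assumes the glmnet package is installed and uses simulated genotype counts in place of the real SNP data; the names `x`, `y`, `beta`, and `selected` and the choice `alpha = 0.5` are hypothetical, not taken from this thread.

```r
# Sketch only: simulated 0/1/2 genotype counts stand in for the real SNP matrix.
set.seed(1)
n <- 100   # samples
p <- 500   # predictors (the real problem has more than 40,000)
x <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n,
            dimnames = list(NULL, paste0("snp", seq_len(p))))
beta <- c(rep(1, 5), rep(0, p - 5))   # assumption: only the first 5 SNPs carry signal
y <- as.numeric(x %*% beta + rnorm(n))

if (requireNamespace("glmnet", quietly = TRUE)) {
  # alpha = 1 is the lasso; 0 < alpha < 1 gives an elastic net.
  cv_fit <- glmnet::cv.glmnet(x, y, alpha = 0.5)
  # Coefficients at the cross-validated lambda; nonzero rows are the kept predictors.
  coefs <- as.matrix(coef(cv_fit, s = "lambda.min"))
  selected <- setdiff(rownames(coefs)[coefs[, 1] != 0], "(Intercept)")
  print(head(selected))
}
```

Because glmnet takes a numeric matrix and a response vector directly, no formula with tens of thousands of terms is ever built. The selected predictors could then be passed to a boosted model if nonlinear effects are still of interest.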


bgreenwell commented 4 years ago

Closing this as it concerns gbm3, and not gbm.
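
A possible workaround, not verified against gbm3: a commonly reported trigger for "protect(): protection stack overflow" is R expanding a formula such as `DPR ~ .` into tens of thousands of terms. Two things that may help are starting R with a larger pointer protection stack (`R --max-ppsize=500000`) or skipping the formula interface altogether. The sketch below shows the second option with `gbm.fit()` from the gbm package (this repository); the data are simulated, and `x`, `y`, and the use of `nTrain` in place of `train.fraction` are assumptions made for the example. gbm3 may offer a similar non-formula entry point; check its documentation.

```r
# Sketch only: simulated stand-in for data1; DPR is the response, as in the original call.
set.seed(1)
n <- 200
p <- 50   # kept small here; the report concerns p > 40,000
data1 <- as.data.frame(matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n))
data1$DPR <- rnorm(n)

# Split response and predictors by name instead of expanding DPR ~ .
y <- data1$DPR
x <- data1[, setdiff(names(data1), "DPR"), drop = FALSE]

if (requireNamespace("gbm", quietly = TRUE)) {
  # Same settings as the original call; nTrain replaces train.fraction = 0.8.
  fit1 <- gbm::gbm.fit(x = x, y = y, distribution = "gaussian",
                       n.trees = 500, interaction.depth = 3,
                       shrinkage = 0.1, n.minobsinnode = 10,
                       bag.fraction = 0.2, nTrain = floor(0.8 * n),
                       keep.data = TRUE, verbose = FALSE)
}
```

Whether this avoids the error at 40,000 predictors depends on where the overflow actually occurs, which the thread does not establish.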