david-cortes / isotree

(Python, R, C/C++) Isolation Forest and variations such as SCiForest and EIF, with some additions (outlier detection + similarity + NA imputation)
https://isotree.readthedocs.io
BSD 2-Clause "Simplified" License

Error in fit_model(pdata$X_num, pdata$X_cat, unname(pdata$ncat), pdata$Xc, : std::bad_alloc #27

Closed: anishjoni closed this issue 3 years ago

anishjoni commented 3 years ago

Hello,

Please let me know if you need more code or explanation; I'm new to reporting issues.

I'm getting the following error when trying to build an isolation forest with 360K observations.

Code: iso_forest <- isolation.forest(data, ntrees = 100, nthreads = 1)

Error: Error in fit_model(pdata$X_num, pdata$X_cat, unname(pdata$ncat), pdata$Xc, : std::bad_alloc

I have a hunch it is because I'm using it on 360K observations. Is there a limit to the number of observations isotree can be used on?

david-cortes commented 3 years ago

There's no limit, but the error message is saying that a memory allocation failed. It could be a bug with the library, or could be that you don't have enough RAM for the model.

anishjoni commented 3 years ago

Thanks for your quick response!

I think it may have to do with RAM. My memory.limit() is 7839.

Is there a way I can scale the algorithm to the whole data.frame of 360K observations with my current RAM?

anishjoni commented 3 years ago

Just found the answer to my question in the documentation (isolation.forest, for people with the same question in the future). Thank you for your time, awesome package!!
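
For readers landing here with the same error: the thread does not say which setting in the `isolation.forest` documentation resolved it. One plausible candidate, offered only as an assumption, is the `sample_size` argument, which makes each tree fit on a sub-sample of rows instead of the full data. The sketch below uses synthetic data as a stand-in for the 360K-row data frame, and the value 256 is an illustrative choice, not something taken from this issue.

```r
library(isotree)

# Hypothetical stand-in for the 360K-row data frame from the report.
set.seed(1)
data <- data.frame(x1 = rnorm(360000), x2 = rnorm(360000))

# Assumption: limiting the rows each tree sees is what keeps memory bounded.
# sample_size = 256 is an illustrative value, not one stated in this thread.
iso_forest <- isolation.forest(data, sample_size = 256, ntrees = 100, nthreads = 1)

# Score every row with the fitted model; higher scores indicate more outlying rows.
scores <- predict(iso_forest, data)
head(scores)
```

If memory is still tight, lowering `ntrees` is another knob to try, since the fitted model grows with the number of trees.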