david-cortes / isotree

(Python, R, C/C++) Isolation Forest and variations such as SCiForest and EIF, with some additions (outlier detection + similarity + NA imputation)
https://isotree.readthedocs.io
BSD 2-Clause "Simplified" License
186 stars, 38 forks

Question: When categorical attributes are split into subsets, are the subsets random? #37

Closed: krishnangovindraj closed this issue 2 years ago

krishnangovindraj commented 2 years ago

Sorry for the likely stupid question, but I can't seem to figure this out from the code or anywhere else.

When categ_split_type == Subset, if a certain categorical attribute is chosen for the split, is the subset computed randomly each time? (I imagine it is, but I just need to confirm.)

Thanks!

david-cortes commented 2 years ago

If you do not pass any gain-related parameters and do not change other parameters that would make it less random, then yes, that's how it should work.
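
To make "random subset" concrete, here is a minimal Python sketch of what a fully random subset split of a categorical attribute could look like. It is an illustration only, not isotree's actual code: the helper name random_category_subset, the 0.5 assignment probability, and the redraw-until-both-sides-are-non-empty rule are all assumptions made for the example.

```python
import random


def random_category_subset(categories, rng):
    """Hypothetical helper (not part of isotree): randomly partition the
    categories present at a node into a left and a right branch."""
    categories = list(categories)
    if len(categories) < 2:
        raise ValueError("need at least two categories to split")
    while True:
        # Assumption: each category goes left with probability 0.5.
        left = {c for c in categories if rng.random() < 0.5}
        right = set(categories) - left
        # Redraw if one side is empty, since that would not be a split.
        if left and right:
            return left, right


rng = random.Random(0)
cats = ["red", "green", "blue", "yellow"]

# Each call draws a fresh partition, so repeated splits on the same
# attribute generally send different categories to each branch.
splits = [random_category_subset(cats, rng) for _ in range(20)]
for left, right in splits[:3]:
    print(sorted(left), "|", sorted(right))
```

With gain-related parameters in play, the partition would instead be chosen to optimize a criterion, which is the "less random" case mentioned above.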

krishnangovindraj commented 2 years ago

Thanks!