neurodata / scikit-tree

Scikit-learn compatible decision trees beyond those offered in scikit-learn
https://docs.neurodata.io/scikit-tree/dev/index.html

Unsupervised tree is slow #275

Open mkgeneral opened 1 month ago

mkgeneral commented 1 month ago

Training an Unsupervised tree model takes a long time.

A 25K-sample dataset with 7 or 15 features takes between 1.5 and 2 hours to train with either the "twomeans" or the "fastbic" criterion.

A 50K-sample dataset with the same number of features runs for hours, and then the process disappears without any error message.

Do you have examples of training an unsupervised tree or an unsupervised forest on large datasets (hundreds of thousands or millions of samples), along with timings?

Thank you.

adam2392 commented 1 month ago

Hi @mkgeneral, thanks for using the package. Do you have a code snippet showing how you are calling the unsupervised tree model?

In addition, are you able to provide details on the dataset? E.g. discrete, continuous?
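
For anyone trying to reproduce the timings, a minimal sketch of a scaling benchmark might look like the following. It is not taken from the package: `MeanSplitStub`, `make_data`, and `time_fits` are hypothetical names, the data is synthetic continuous Gaussian noise, and the stub estimator is only a placeholder to be swapped for the actual unsupervised tree (which would most likely need the data converted to a NumPy array first).

```python
import random
import time


class MeanSplitStub:
    """Hypothetical stand-in estimator; replace with the real model under test."""

    def fit(self, X, y=None):
        # Compute per-feature means so that fit() performs some measurable work.
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self


def make_data(n_samples, n_features, seed=0):
    """Generate synthetic continuous data (standard normal) as a list of rows."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for _ in range(n_samples)
    ]


def time_fits(make_estimator, sample_sizes, n_features):
    """Fit a fresh estimator at each sample size; return (n_samples, seconds) pairs."""
    results = []
    for n in sample_sizes:
        X = make_data(n, n_features)
        start = time.perf_counter()
        make_estimator().fit(X)
        results.append((n, time.perf_counter() - start))
    return results


if __name__ == "__main__":
    # Start small and double the sample size to see how fit time grows.
    for n, seconds in time_fits(MeanSplitStub, [1_000, 2_000, 4_000], n_features=7):
        print(f"n_samples={n:>6}  fit_time={seconds:.4f}s")
```

Doubling the sample size at each step makes the growth rate visible before a run reaches the multi-hour range, and reporting these numbers together with the criterion and the feature types would make the slowdown easier to diagnose.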

adam2392 commented 3 weeks ago

Hi @mkgeneral, just following up to see whether your problem was resolved.