nel215 / mondrianforest

An online random forest implementation written in Python.
MIT License
40 stars 12 forks

Batch mode? #8

Open Nirav31 opened 8 years ago

Nirav31 commented 8 years ago

Hi, I am not able to figure out how to use this code in batch mode. Has this functionality been implemented?

nel215 commented 8 years ago

Does MondrianForestClassifier#fit not work?

Nirav31 commented 8 years ago

The fit method does work, but it loops over the data points and updates the forest once per point. That is online learning, and it is very slow when the number of data points is large.

I believe we should be able to run the algorithm in a 'batch mode', in which we learn a forest from a large training dataset, and also in an 'online mode', in which we further update the forest with a stream of training data.

nel215 commented 8 years ago

Thanks. To support batch mode, I think it's necessary to implement SampleMondrianBlock from the reference papers. However, there is no plan to implement it soon.
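
For anyone reading along, here is a rough sketch of what a batch SampleMondrianBlock could look like, following the procedure as I understand it from the Mondrian forest papers: compute the bounding box of the points in the block, draw a split time from an exponential distribution whose rate is the sum of the box side lengths, and recurse if that time is still under the lifetime budget. This is not code from this repository. The names `Node`, `sample_mondrian_block`, `collect_leaves`, and the `budget` parameter are all made up for illustration, and it only builds the tree structure (no label statistics or prediction).

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node; not a class from this repository."""
    lower: List[float]                   # per-dimension minimum of the block's points
    upper: List[float]                   # per-dimension maximum of the block's points
    split_time: float                    # split time (equals the budget for leaves)
    split_dim: Optional[int] = None
    split_value: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    indices: Optional[List[int]] = None  # training indices, stored on leaves only


def sample_mondrian_block(X: Sequence[Sequence[float]], indices: List[int],
                          budget: float, parent_time: float = 0.0,
                          rng: Optional[random.Random] = None) -> Node:
    """Recursively build one Mondrian tree over the rows of X listed in indices."""
    rng = rng or random.Random()
    n_dims = len(X[0])

    # Bounding box of the data that falls in this block.
    lower = [min(X[i][d] for i in indices) for d in range(n_dims)]
    upper = [max(X[i][d] for i in indices) for d in range(n_dims)]
    extents = [u - l for l, u in zip(lower, upper)]
    rate = sum(extents)

    # A box with zero extent (one point, or duplicates) can never be split.
    if rate > 0:
        split_time = parent_time + rng.expovariate(rate)
    else:
        split_time = float("inf")

    if split_time >= budget:
        return Node(lower, upper, budget, indices=list(indices))

    # Split dimension is chosen proportionally to side length,
    # split location uniformly inside the box along that dimension.
    dim = rng.choices(range(n_dims), weights=extents)[0]
    value = rng.uniform(lower[dim], upper[dim])
    left_idx = [i for i in indices if X[i][dim] <= value]
    right_idx = [i for i in indices if X[i][dim] > value]

    # Guard against floating-point rounding that leaves one side empty.
    if not left_idx or not right_idx:
        return Node(lower, upper, budget, indices=list(indices))

    node = Node(lower, upper, split_time, split_dim=dim, split_value=value)
    node.left = sample_mondrian_block(X, left_idx, budget, split_time, rng)
    node.right = sample_mondrian_block(X, right_idx, budget, split_time, rng)
    return node


def collect_leaves(node: Node) -> List[Node]:
    """Return all leaf nodes under node, left to right."""
    if node.left is None:
        return [node]
    return collect_leaves(node.left) + collect_leaves(node.right)
```

Calling `sample_mondrian_block(X, list(range(len(X))), budget=float("inf"))` grows the tree until every leaf holds a single distinct point, while a smaller budget stops earlier. A batch forest would call this once per tree over the whole dataset instead of inserting points one at a time.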

Nirav31 commented 8 years ago

Okay, thanks for the reply!