Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

Time measurement for AdaptiveRandomForest in parallel mode #228

Closed by Jwata 3 years ago

Jwata commented 3 years ago

DoTask measures the CPU time of the TaskThread: https://github.com/Waikato/moa/blob/f8b4d485253ff4917240435b388d0d26fc7b325a/moa/src/main/java/moa/DoTask.java#L225

I'm not a Java expert, so I may be wrong.
But when running the AdaptiveRandomForest learner with multiple threads (that is, with the numberOfJobs option set), I suspect the time spent on the AdaptiveRandomForest worker threads isn't included in the TaskThread CPU time that DoTask reports. https://github.com/Waikato/moa/blob/f8b4d485253ff4917240435b388d0d26fc7b325a/moa/src/main/java/moa/classifiers/meta/AdaptiveRandomForest.java#L160

Is my understanding correct? If so, is it ok to use wall clock time for measuring ARF runtime?

hmgomes commented 3 years ago

Hi @Jwata, your understanding is correct. You should measure the wall clock time in this case.
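
To illustrate the point outside of MOA, here is a minimal sketch using only the JDK. It is not MOA code: the class `CpuVsWallClock` and its methods `spin` and `measure` are hypothetical names made up for this example. The caller thread hands CPU-bound work to a thread pool and then waits, so its own CPU time (read via `ThreadMXBean.getCurrentThreadCpuTime()`) stays small, while `System.nanoTime()` captures the full elapsed time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of MOA.
final class CpuVsWallClock {

    // Busy-loop for roughly the given number of milliseconds to burn CPU.
    static long spin(long millis) {
        long end = System.nanoTime() + millis * 1_000_000L;
        long iterations = 0;
        while (System.nanoTime() < end) {
            iterations++;
        }
        return iterations;
    }

    // Returns {callerCpuNanos, wallNanos} for work executed on a worker pool.
    static long[] measure(int workers, long millisPerWorker) throws Exception {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long cpuStart = bean.getCurrentThreadCpuTime(); // CPU time of this thread only
            long wallStart = System.nanoTime();             // wall clock reference point

            Callable<Long> job = () -> spin(millisPerWorker);
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                futures.add(pool.submit(job));
            }
            for (Future<Long> f : futures) {
                f.get(); // the caller blocks here and uses almost no CPU
            }

            long wall = System.nanoTime() - wallStart;
            long cpu = bean.getCurrentThreadCpuTime() - cpuStart;
            return new long[] {cpu, wall};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long[] r = measure(4, 200);
        System.out.printf("caller CPU: %d ms, wall clock: %d ms%n",
                r[0] / 1_000_000, r[1] / 1_000_000);
    }
}
```

On a typical JVM the caller's CPU time should come out far below the wall clock time, because the work done on the pool threads is not attributed to the thread that submitted it. Exact numbers depend on the machine and JVM.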

Jwata commented 3 years ago

@hmgomes Thank you for confirming. I will measure wall clock time.