TheDigitalFrontier / parallel-decision-trees

Semester project in CS205 Computing Foundations for Computational Science at Harvard School of Engineering and Applied Sciences, spring 2020.
MIT License

OpenMP of basic data structures #96

Closed: johannes-kk closed this issue 4 years ago

johannes-kk commented 4 years ago

Parallelise the basic data structures with OpenMP: LossFunction, DataFrame, etc.

johannes-kk commented 4 years ago

Most DataFrame functions operate on DataVector objects, which occasionally trigger bad malloc requests when used inside OpenMP parallel regions. These functions also contain little looping, so the only basic data structure operation that seemed worth parallelising was DataFrame.sample() with replacement. See #112.
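For illustration, here is a minimal sketch of what an OpenMP-parallel sample-with-replacement step could look like. It is not the code from #112: `sample_with_replacement` and `splitmix64` are hypothetical names, and the sketch draws row indices into a preallocated `std::vector` rather than touching DataFrame or DataVector. Each draw is derived from `seed + i`, so no random-number state is shared between threads and nothing is allocated inside the parallel loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// SplitMix64 mixing step: turns (seed + i) into a well-mixed 64-bit value.
// Hypothetical helper, used here instead of a shared RNG object.
inline std::uint64_t splitmix64(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Draw n_samples row indices in [0, n_rows) with replacement.
// Hypothetical function, not the project's DataFrame.sample().
std::vector<std::size_t> sample_with_replacement(std::size_t n_rows,
                                                 std::size_t n_samples,
                                                 std::uint64_t seed) {
    assert(n_rows > 0 && "cannot sample from an empty frame");

    // Allocate the output once, before the parallel region.
    std::vector<std::size_t> indices(n_samples);
    const std::int64_t n = static_cast<std::int64_t>(n_samples);

    // Iterations are independent: each writes only to its own slot.
    // Without OpenMP enabled, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static)
    for (std::int64_t i = 0; i < n; ++i) {
        const std::uint64_t r = splitmix64(seed + static_cast<std::uint64_t>(i));
        // The modulo introduces a slight bias, negligible when n_rows << 2^64.
        indices[static_cast<std::size_t>(i)] =
            static_cast<std::size_t>(r % n_rows);
    }
    return indices;
}
```

Because each index depends only on `seed` and its position, the result is the same regardless of thread count, which makes it easy to compare against a serial run. Sampling without replacement does not decompose this way, since each draw depends on the previous ones.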

johannes-kk commented 4 years ago

For Losses, the only real loop runs over the response classes, which in the binary case means just two iterations. Since we focus on binary classification, parallelising Losses would bring no benefit.
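As a rough illustration of the point, assume a Gini-style impurity computed from per-class counts (the project's actual LossFunction may differ; `gini_impurity` is a hypothetical name). The only loop has one iteration per class, so for binary labels it runs twice, and the overhead of starting an OpenMP team would far outweigh the work.

```cpp
#include <cstddef>
#include <vector>

// Gini impurity from per-class counts: 1 - sum_k p_k^2.
// Hypothetical sketch, not the project's LossFunction.
double gini_impurity(const std::vector<std::size_t>& class_counts) {
    std::size_t total = 0;
    for (std::size_t count : class_counts) {
        total += count;
    }
    if (total == 0) {
        return 0.0;  // treat an empty node as pure
    }

    double sum_sq = 0.0;
    // One iteration per response class: only two in the binary case.
    for (std::size_t count : class_counts) {
        const double p = static_cast<double>(count) / static_cast<double>(total);
        sum_sq += p * p;
    }
    return 1.0 - sum_sq;
}
```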