NiklasPfister / adaXT

adaXT: tree-based machine learning in Python
https://niklaspfister.github.io/adaXT/
BSD 3-Clause "New" or "Revised" License

multiprocessing nogil #25

Closed svbrodersen closed 10 months ago

svbrodersen commented 11 months ago

Test how badly the fitting performs when run with multiprocessing (https://docs.python.org/3/library/multiprocessing.html), and see whether it makes more sense to instead move everything over to nogil, so that we can do actual multithreading.
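The benchmark script itself is not attached to the issue, so the following is only a minimal sketch of the kind of timing comparison described: fitting many trees sequentially versus via multiprocessing.Pool. sklearn's DecisionTreeRegressor stands in for an adaXT tree, and the names and parameters here are illustrative assumptions, not the project's actual code.

```python
# Sketch: time fitting 100 trees sequentially vs. with multiprocessing.Pool.
import time
from multiprocessing import Pool

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
Y = rng.standard_normal(1000)


def fit_one(args):
    # The full (X, Y) arrays are pickled and shipped to the worker for every
    # task, which is the per-tree overhead discussed in this issue.
    X_task, Y_task = args
    return DecisionTreeRegressor(criterion="squared_error").fit(X_task, Y_task)


if __name__ == "__main__":
    tasks = [(X, Y)] * 100

    start = time.perf_counter()
    trees = [fit_one(t) for t in tasks]
    print(f"sequential:      {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    with Pool() as pool:
        trees = pool.map(fit_one, tasks)
    print(f"multiprocessing: {time.perf_counter() - start:.2f}s")
```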

svbrodersen commented 11 months ago

Did some testing of the multiprocessing on my PC. First, the specs of my laptop, which is not that impressive:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
CPU family: 6
Model: 142
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Stepping: 9
CPU max MHz: 3100.0000
CPU min MHz: 400.0000
BogoMIPS: 5399.81

Note the small number of cores, as this plays a big role for parallel performance. I ran the test with 100 trees in both the sklearn random forest and ours, multiprocessing only the fitting. Each tree is fitted with the squared-error criterion on 1000 data points with 5 features each. (In our case each tree sees 100 % of the data, so we are technically fitting on a larger dataset; sklearn "randomly" chooses a subset of the data to fit each tree on.) With this in mind I got the following: [figure: multiproc]
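For reference, a rough sketch of how the sklearn side of such a comparison could be set up (100 squared-error trees on 1000 samples with 5 features); this is an assumption about the setup, not the script used above. By default sklearn's forest bootstraps the data per tree, which is the "randomly choosing some data" difference mentioned; bootstrap=False would make each tree see 100 % of the data instead.

```python
# Sketch: time sklearn's forest on the dataset size described above.
import time

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
Y = rng.standard_normal(1000)

forest = RandomForestRegressor(
    n_estimators=100,
    criterion="squared_error",
    bootstrap=False,  # fit every tree on the full dataset, as in our forest
    n_jobs=-1,        # sklearn parallelises over trees via joblib
)

start = time.perf_counter()
forest.fit(X, Y)
print(f"sklearn, 100 trees: {time.perf_counter() - start:.3f}s")
```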

WilliamHeuser commented 10 months ago

I ran the same code on my machine with the following result: [figure: multiproc_both100]. With only our DecisionTrees: [figure: multiproc_our100], and only sklearn: [figure: multiproc_sklearn100].

WilliamHeuser commented 10 months ago

I tried building 100 trees using shared memory for the X and Y inputs, so that each process wouldn't need to receive the entire dataset but could just access the one already in memory. Although sharing the data was faster than the version where we did not share it, the running time was nowhere near that of sklearn. The following graph shows this: [figure: multiproc_both100]. Note: the time is measured in milliseconds, not seconds as the axis title says.
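The shared-memory variant can be sketched roughly as follows with multiprocessing.shared_memory: the dataset is copied into a named shared block once, and each worker re-attaches to it by name instead of receiving a pickled copy. The helper name `_fit_from_shared` and the use of sklearn's DecisionTreeRegressor are illustrative assumptions, not the actual code from this experiment.

```python
# Sketch: fit 100 trees in worker processes that read X and Y from a shared
# memory block instead of receiving the dataset with every task.
from multiprocessing import Pool, shared_memory

import numpy as np
from sklearn.tree import DecisionTreeRegressor

N, D = 1000, 5


def _fit_from_shared(args):
    # Re-attach to the existing shared-memory block by name and build a
    # NumPy view on it; no copy of the dataset is sent to this worker.
    shm_name, shape = args
    shm = shared_memory.SharedMemory(name=shm_name)
    data = np.ndarray(shape, dtype=np.float64, buffer=shm.buf)
    tree = DecisionTreeRegressor(criterion="squared_error").fit(
        data[:, :-1], data[:, -1]
    )
    del data     # drop the view before detaching from the block
    shm.close()
    return tree


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.column_stack([rng.standard_normal((N, D)), rng.standard_normal(N)])

    # Copy the dataset into a named shared-memory block once.
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data

    try:
        with Pool() as pool:
            trees = pool.map(_fit_from_shared, [(shm.name, data.shape)] * 100)
    finally:
        del shared   # release the view before closing and unlinking the block
        shm.close()
        shm.unlink()
```

Even with this approach, each task still pays for process start-up, task dispatch, and pickling the fitted tree back to the parent, which is consistent with the gap to sklearn seen in the plot.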

svbrodersen commented 10 months ago

For now we have decided to move forward with multiprocessing, as we want users to be able to create their own Criteria without having to bother with nogil.