Closed johannes-kk closed 4 years ago
I also added file speedup/speedup_plot.py with Python code to plot two lists of speedup numbers. Feel free to change to fit your tests. Run it locally so you don't need to install Python on the AWS instance.
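As a rough sketch of what such a script could look like (this is not the actual contents of speedup/speedup_plot.py; the function names, the `workers` axis, and the choice of matplotlib are assumptions made for illustration):

```python
"""Hypothetical sketch: compute speedup ratios and plot two speedup lists."""


def speedup(serial_seconds, parallel_seconds):
    """Return serial/parallel runtime ratios, one per parallel timing."""
    if serial_seconds <= 0:
        raise ValueError("serial_seconds must be positive")
    return [serial_seconds / t for t in parallel_seconds]


def plot_speedups(workers, series_a, series_b, labels=("A", "B"),
                  out_path="speedup.png"):
    """Plot two speedup lists against worker counts and save the figure."""
    # Imported lazily so speedup() still works where matplotlib is missing.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(workers, series_a, marker="o", label=labels[0])
    ax.plot(workers, series_b, marker="s", label=labels[1])
    # Reference line: speedup equal to the number of workers.
    ax.plot(workers, workers, linestyle="--", label="ideal (linear)")
    ax.set_xlabel("Number of workers")
    ax.set_ylabel("Speedup (serial time / parallel time)")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```

For example, `speedup(8.0, [8.0, 4.0, 2.0])` gives `[1.0, 2.0, 4.0]` (made-up timings, not measurements from this project); two such lists can then be passed to `plot_speedups` together with the matching worker counts.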
We should consider acquiring a larger dataset for testing, since a 3.5-second serial run leaves little room to measure speedup.
And a much higher ntrees. 10 is very low; the default for sklearn (n_estimators) is 100, I think.
Thanks for the instructions @wfseaton !
FYI, I included the -g3 flag in the compile scripts for testing, but it's the "debug" flag. It makes it easier to get a traceback in GDB, but probably impacts performance.
In our experiments I think we should choose a different flag as our baseline.
We've been using -O0 thus far when testing, as it seems like the least disruptive compilation option, though it does in fact seem to speed things up quite a bit compared to no flag at all, which is odd, as I thought -O0 was the default ¯\_(ツ)_/¯
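GCC's documentation does list -O0 as the default optimization level, so the gap between "no flag" and -O0 may be worth re-measuring with everything else held equal. A minimal sketch for generating one compile command per candidate baseline follows; the helper names and output naming scheme are hypothetical, and only the source path, -std=c++14, and -g3 come from the commands in this thread:

```python
"""Hypothetical sketch: build one g++ command per optimization flag."""

# Source path as used in the compile command in this thread.
SOURCE = "../demo/demo_rf_serial.cpp"


def build_command(opt_flag, source=SOURCE, debug=False):
    """Return (argv, output_name) for one optimization flag, or None for no flag."""
    suffix = opt_flag.lstrip("-") if opt_flag else "noflag"
    output = f"demo_rf_serial_{suffix}"  # assumed naming scheme
    argv = ["g++", "-std=c++14"]
    if debug:
        argv.append("-g3")  # debug info, as in the current compile scripts
    if opt_flag:
        argv.append(opt_flag)
    argv += [source, "-o", output]
    return argv, output


def print_commands(flags=(None, "-O0", "-O2", "-O3")):
    """Print the compile command for each flag so they can be run by hand."""
    for flag in flags:
        argv, _ = build_command(flag)
        print(" ".join(argv))
```

Building each variant into its own binary keeps the comparison to a single changed flag, and passing `debug=True` for every variant would show whether -g3 itself changes the timings.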
Closing as testing is underway in AWS by Hardik
I was able to get bin/test_random_forest and demo/demo_rf_serial.cpp to compile and run on AWS. I chose a t2.2xlarge instance with Ubuntu Server 16.04 and no other modifications. This runs demo_rf_serial.cpp in ~3.5 seconds, so the instance is plenty large.
We should consider acquiring a larger dataset for testing, since a 3.5-second serial run leaves little room to measure speedup.
To configure an AWS instance to run the serial version, install g++ with the commands below:
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install g++
$ g++ --version  # To test g++ installation and version
Next, compile and run demo/demo_rf_serial.cpp using the commands:
$ g++ -std=c++14 -g3 ../demo/demo_rf_serial.cpp -o demo_rf_serial
$ time ./demo_rf_serial
Make sure you compile the demo file from within the /demo/ folder; otherwise the absolute path to the dataset will fail.
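A single `time` run can be noisy on a shared cloud instance, so it may help to repeat the measurement. The sketch below is one way to do that from Python; the function names and the default of five repeats are assumptions, not part of the project's scripts:

```python
"""Hypothetical sketch: time a command several times and summarize the runs."""
import statistics
import subprocess
import time


def time_command(argv, repeats=5, cwd=None):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the program exits with a non-zero status.
        subprocess.run(argv, cwd=cwd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return the min, median, and max of a list of timings."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

For instance, `summarize(time_command(["./demo_rf_serial"]))` run from the folder containing the binary would report the spread across five runs. The timings include process start-up, which matters little for a run of a few seconds but would for much shorter ones.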