mlpack / benchmarks

Machine Learning Benchmark Scripts

Reduce output messages when running benchmarks #125

Closed geektoni closed 6 years ago

geektoni commented 6 years ago

This PR reduces the number of messages printed to the terminal when running make run. I do not know whether that print statement was added for debugging and then forgotten, or whether it was placed there for a more sensible reason, so I just guarded it.
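To illustrate what guarding the print means here, a minimal sketch follows. The function name collect_methods, its verbose parameter, and the sample entries are hypothetical and are not taken from the benchmark scripts; the sketch only assumes that the dictionary is printed inside a loop while it is being built, which would explain why each printed line in the output below is longer than the previous one.

```python
def collect_methods(entries, verbose=False):
    """Build a dict of method settings, printing progress only when asked.

    Hypothetical helper for illustration; not the actual benchmark code.
    """
    collected = {}
    for name, settings in entries:
        collected[name] = settings
        if verbose:
            # Debug-only output: dumps the whole accumulated dict on every
            # iteration, so each printed line grows as entries are added.
            print(collected)
    return collected


# Hypothetical sample entries, loosely modelled on the method names in the log.
sample_entries = [("DTC", {"run": 3}), ("PCA", {"run": 3})]
quiet_result = collect_methods(sample_entries)  # prints nothing by default
```

With verbose left at its default, nothing is printed and only the returned dict is used; passing verbose=True restores the per-iteration dump for debugging.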

Before patch output:

/home/uriel/Applications/miniconda3/bin/python3 benchmark/run_benchmark.py -c config.yaml -b shogun -l False -u False -m ALLKNN --f "" --n False -r "" -p ""
[WARN ] No module named simplejson
[INFO ] CPU Model:  Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz
[INFO ] Distribution: debian stretch/sid
[INFO ] Platform: x86_64
[INFO ] Memory: 7.6943359375 GB
[INFO ] CPU Cores: 4
{'general': dict_items([('timeout', 9000), ('databaseHost', 'localhost'), ('port', 3306), ('database', 'benchmarks'), ('driver', 'mysql'), ('keepReports', 20), ('bootstrap', 10), ('libraries', ['mlpack', 'shogun', 'weka', 'scikit', 'mlpy', 'flann', 'ann', 'annoy', 'mrpt', 'dlibml']), ('version', ['HEAD', '3.2.0', '3.6.11', '0.15.1', '3.5.0', '1.8.4', '1.1.2', '1.8.3', '0.1', '19.4'])]), 'DTC': {'{}': [('mlpack', [['datasets/iris_train.csv', 'datasets/iris_test.csv', 'datasets/iris_labels.csv'], ['datasets/oilspill_train.csv', 'datasets/oilspill_test.csv', 'datasets/oilspill_labels.csv'], ['datasets/scene_train.csv', 'datasets/scene_test.csv', 'datasets/scene_labels.csv'], ['datasets/webpage_train.csv', 'datasets/webpage_test.csv', 'datasets/webpage_labels.csv'], ['datasets/isolet_train.csv', 'datasets/isolet_test.csv', 'datasets/isolet_labels.csv'], ['datasets/mammography_train.csv', 'datasets/mammography_test.csv', 'datasets/mammography_labels.csv'], ['datasets/reuters_train.csv', 'datasets/reuters_test.csv', 'datasets/reuters_labels.csv'], ['datasets/abalone19_train.csv', 'datasets/abalone19_test.csv', 'datasets/abalone19_labels.csv'], ['datasets/sickEuthyroid_train.csv', 'datasets/sickEuthyroid_test.csv', 'datasets/sickEuthyroid_labels.csv'], ['datasets/abalone7_train.csv', 'datasets/abalone7_test.csv', 'datasets/abalone7_labels.csv'], ['datasets/satellite_train.csv', 'datasets/satellite_test.csv', 'datasets/satellite_labels.csv'], ['datasets/ecoli_train.csv', 'datasets/ecoli_test.csv', 'datasets/ecoli_labels.csv']], 3, 'methods/mlpack/decision_tree.py', ['csv', 'txt', 'arff'], ['metric'], 'None', ['None'])]}}
{'general': dict_items([('timeout', 9000), ('databaseHost', 'localhost'), ('port', 3306), ('database', 'benchmarks'), ('driver', 'mysql'), ('keepReports', 20), ('bootstrap', 10), ('libraries', ['mlpack', 'shogun', 'weka', 'scikit', 'mlpy', 'flann', 'ann', 'annoy', 'mrpt', 'dlibml']), ('version', ['HEAD', '3.2.0', '3.6.11', '0.15.1', '3.5.0', '1.8.4', '1.1.2', '1.8.3', '0.1', '19.4'])]), 'DTC': {'{}': [('mlpack', [['datasets/iris_train.csv', 'datasets/iris_test.csv', 'datasets/iris_labels.csv'], ['datasets/oilspill_train.csv', 'datasets/oilspill_test.csv', 'datasets/oilspill_labels.csv'], ['datasets/scene_train.csv', 'datasets/scene_test.csv', 'datasets/scene_labels.csv'], ['datasets/webpage_train.csv', 'datasets/webpage_test.csv', 'datasets/webpage_labels.csv'], ['datasets/isolet_train.csv', 'datasets/isolet_test.csv', 'datasets/isolet_labels.csv'], ['datasets/mammography_train.csv', 'datasets/mammography_test.csv', 'datasets/mammography_labels.csv'], ['datasets/reuters_train.csv', 'datasets/reuters_test.csv', 'datasets/reuters_labels.csv'], ['datasets/abalone19_train.csv', 'datasets/abalone19_test.csv', 'datasets/abalone19_labels.csv'], ['datasets/sickEuthyroid_train.csv', 'datasets/sickEuthyroid_test.csv', 'datasets/sickEuthyroid_labels.csv'], ['datasets/abalone7_train.csv', 'datasets/abalone7_test.csv', 'datasets/abalone7_labels.csv'], ['datasets/satellite_train.csv', 'datasets/satellite_test.csv', 'datasets/satellite_labels.csv'], ['datasets/ecoli_train.csv', 'datasets/ecoli_test.csv', 'datasets/ecoli_labels.csv']], 3, 'methods/mlpack/decision_tree.py', ['csv', 'txt', 'arff'], ['metric'], 'None', ['None'])]}, 'PCA': {'{}': [('mlpack', ['datasets/iris.csv', 'datasets/wine.csv', 'datasets/cities.csv', 'datasets/diabetes_X.csv'], 3, 'methods/mlpack/pca.py', ['csv', 'txt'], ['metric'], 'None', ['None'])]}}
{'general': dict_items([('timeout', 9000), ('databaseHost', 'localhost'), ('port', 3306), ('database', 'benchmarks'), ('driver', 'mysql'), ('keepReports', 20), ('bootstrap', 10), ('libraries', ['mlpack', 'shogun', 'weka', 'scikit', 'mlpy', 'flann', 'ann', 'annoy', 'mrpt', 'dlibml']), ('version', ['HEAD', '3.2.0', '3.6.11', '0.15.1', '3.5.0', '1.8.4', '1.1.2', '1.8.3', '0.1', '19.4'])]), 'DTC': {'{}': [('mlpack', [['datasets/iris_train.csv', 'datasets/iris_test.csv', 'datasets/iris_labels.csv'], ['datasets/oilspill_train.csv', 'datasets/oilspill_test.csv', 'datasets/oilspill_labels.csv'], ['datasets/scene_train.csv', 'datasets/scene_test.csv', 'datasets/scene_labels.csv'], ['datasets/webpage_train.csv', 'datasets/webpage_test.csv', 'datasets/webpage_labels.csv'], ['datasets/isolet_train.csv', 'datasets/isolet_test.csv', 'datasets/isolet_labels.csv'], ['datasets/mammography_train.csv', 'datasets/mammography_test.csv', 'datasets/mammography_labels.csv'], ['datasets/reuters_train.csv', 'datasets/reuters_test.csv', 'datasets/reuters_labels.csv'], ['datasets/abalone19_train.csv', 'datasets/abalone19_test.csv', 'datasets/abalone19_labels.csv'], ['datasets/sickEuthyroid_train.csv', 'datasets/sickEuthyroid_test.csv', 'datasets/sickEuthyroid_labels.csv'], ['datasets/abalone7_train.csv', 'datasets/abalone7_test.csv', 'datasets/abalone7_labels.csv'], ['datasets/satellite_train.csv', 'datasets/satellite_test.csv', 'datasets/satellite_labels.csv'], ['datasets/ecoli_train.csv', 'datasets/ecoli_test.csv', 'datasets/ecoli_labels.csv']], 3, 'methods/mlpack/decision_tree.py', ['csv', 'txt', 'arff'], ['metric'], 'None', ['None'])]}, 'PCA': {'{}': [('mlpack', ['datasets/iris.csv', 'datasets/wine.csv', 'datasets/cities.csv', 'datasets/diabetes_X.csv'], 3, 'methods/mlpack/pca.py', ['csv', 'txt'], ['metric'], 'None', ['None'])]}, 'PERCEPTRON': {'{"max_iterations": 10000}': [('mlpack', [['datasets/iris_train.csv', 'datasets/iris_test.csv', 'datasets/iris_labels.csv'], 
['datasets/oilspill_train.csv', 'datasets/oilspill_test.csv', 'datasets/oilspill_labels.csv'], ['datasets/scene_train.csv', 'datasets/scene_test.csv', 'datasets/scene_labels.csv'], ['datasets/webpage_train.csv', 'datasets/webpage_test.csv', 'datasets/webpage_labels.csv'], ['datasets/isolet_train.csv', 'datasets/isolet_test.csv', 'datasets/isolet_labels.csv'], ['datasets/mammography_train.csv', 'datasets/mammography_test.csv', 'datasets/mammography_labels.csv'], ['datasets/reuters_train.csv', 'datasets/reuters_test.csv', 'datasets/reuters_labels.csv'], ['datasets/abalone19_train.csv', 'datasets/abalone19_test.csv', 'datasets/abalone19_labels.csv'], ['datasets/sickEuthyroid_train.csv', 'datasets/sickEuthyroid_test.csv', 'datasets/sickEuthyroid_labels.csv'], ['datasets/abalone7_train.csv', 'datasets/abalone7_test.csv', 'datasets/abalone7_labels.csv'], ['datasets/satellite_train.csv', 'datasets/satellite_test.csv', 'datasets/satellite_labels.csv'], ['datasets/ecoli_train.csv', 'datasets/ecoli_test.csv', 'datasets/ecoli_labels.csv']], 3, 'methods/mlpack/perceptron.py', ['csv', 'txt', 'arff'], ['metric'], 'None', ['None'])]}}
.... and many more lines like this

After patch output:

/home/uriel/Applications/miniconda3/bin/python3 benchmark/run_benchmark.py -c config.yaml -b shogun -l False -u False -m ALLKNN --f "" --n False -r "" -p ""
[WARN ] No module named simplejson
[INFO ] CPU Model:  Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz
[INFO ] Distribution: debian stretch/sid
[INFO ] Platform: x86_64
[INFO ] Memory: 7.6943359375 GB
[INFO ] CPU Cores: 4
[INFO ] Method: ALLKNN
[INFO ] Options: {'k': 3}
[INFO ] Library: shogun
[INFO ] Dataset: wine
[INFO ] Dataset: cloud
[INFO ] Dataset: wine

mlpack-jenkins commented 6 years ago

Can one of the admins verify this patch?

rcurtin commented 6 years ago

@mlpack-jenkins test this please

rcurtin commented 6 years ago

Thanks! I think you are right that it is unnecessary debugging output. Personally I'd be okay with removing the print entirely, even when verbose is set. @zoq any thoughts on this one?

zoq commented 6 years ago

Agreed, this one was a leftover from a previous commit: https://github.com/mlpack/benchmarks/commit/2e8fd55affc815c0ac4ebc83e02d42352a9ea762#diff-1a3a1a0a9cb19f8ba0a65c89a7a2a016R461 which I think is unnecessary to keep.

rcurtin commented 6 years ago

Oops, you're right. It was my fault :) I should have checked that.

zoq commented 6 years ago

Haven't noticed either :)

geektoni commented 6 years ago

Updated :+1:

rcurtin commented 6 years ago

Perfect! Thanks so much for the contribution. :+1: