ApexAI / performance_test

**This project is deprecated** Go to https://gitlab.com/ApexAI/performance_test

Regression check scripts. #67

Closed MiguelCompany closed 4 years ago

MiguelCompany commented 5 years ago

This is an initial step towards automating performance regression checks.

The ApexComparison.py class provides methods to compare minimum latency, maximum jitter, and maxrss.

The other scripts compare a reference result against another result. apex_compare.py compares two result files, while apex_compare_tree.py compares two result trees generated with run_experiment.py.

This means that lines 33-34 of apex_compare_tree.py should be kept aligned with the experiments run by run_experiment.py.



esteve commented 5 years ago

@MiguelCompany thanks for your contribution. Unfortunately, there appear to be a number of format warnings, and the CI failed.

MiguelCompany commented 5 years ago

@EduPonz Would you mind taking care of this?

EduPonz commented 5 years ago

I'm on it.

EduPonz commented 5 years ago

I just pushed the changes. I see a clean output when running colcon test --packages-select performance_test

dejanpan commented 5 years ago

@esteve please check that it runs, then approve and merge.

esteve commented 5 years ago

@dejanpan there are still changes pending. I'll make those changes and push them here before merging.

esteve commented 5 years ago

For some reason, I can't push to this branch on the command line, so I had to make the changes via the web interface.

esteve commented 5 years ago

Thanks @MiguelCompany and @EduPonz for your patience. I've run run_experiment.py and tested both apex_compare.py and apex_compare_tree.py in this PR with the results. The first works as expected, but the latter fails with the following error:

$ python3 performance_test/helper_scripts/regression_checkers/apex_compare_tree.py rate_1000/subs_1/ rate_1000/subs_3/
Traceback (most recent call last):
  File "performance_test/helper_scripts/regression_checkers/apex_compare_tree.py", line 96, in <module>
    ref_files_list = ApexComparison.get_file_names(ref_dir, sub_dirs)
  File "/home/esteve/Projects/adehome/performance_test/performance_test/helper_scripts/regression_checkers/ApexComparison.py", line 416, in get_file_names
    f for f in listdir(dir_name) if isfile(join(dir_name, f))
FileNotFoundError: [Errno 2] No such file or directory: 'rate_1000/subs_1/rate_20/subs_1'
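
For illustration: the path in the error is the reference directory given on the command line (rate_1000/subs_1/) joined with an expected sub-directory (rate_20/subs_1). A minimal sketch of a pre-check that could report this more clearly follows. The helper name missing_sub_dirs is hypothetical and is not part of ApexComparison.py; it only assumes that the script joins a root directory with a list of expected sub-directories, as the traceback suggests.

```python
from os.path import isdir, join


def missing_sub_dirs(root_dir, sub_dirs):
    """Return the expected sub-directories that do not exist under root_dir.

    Hypothetical helper: it mirrors the join(root, sub_dir) pattern visible
    in the traceback, but it is not taken from the actual scripts.
    """
    return [sub for sub in sub_dirs if not isdir(join(root_dir, sub))]


def check_tree(root_dir, sub_dirs):
    """Raise a descriptive error before any file listing is attempted."""
    missing = missing_sub_dirs(root_dir, sub_dirs)
    if missing:
        raise FileNotFoundError(
            "'{}' is not a complete results tree; missing: {}".format(
                root_dir, ", ".join(missing)
            )
        )
```

Calling check_tree("rate_1000/subs_1/", ["rate_20/subs_1"]) on a directory that is only one branch of a results tree would then fail with a message naming the missing sub-directories.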
daggarwa commented 5 years ago

@esteve can you please let me know what the status is on this? Are we waiting on some resolution here?

EduPonz commented 5 years ago

Sorry for the late reply. The thing is that apex_compare_tree.py is meant to compare two entire sets of results, rather than different parts of a single set of results. The intention is to be able to set a reference, and then check new results against that reference. For the sake of illustration, let's say I have a reference directory and a results directory with the following structure:

test@perf_test_2:~/apex_ws$ ll reference/
total 12
drwxrwxr-x 5 test test 4096 Oct 18 06:23 rate_1000
drwxrwxr-x 5 test test 4096 Oct 18 06:23 rate_20
drwxrwxr-x 5 test test 4096 Oct 18 06:23 rate_50

test@perf_test_2:~/apex_ws$ ll results/2019-09-10_08-10-34/FastRTPS/
total 12
drwxrwxr-x 5 test test 4096 Sep 10 08:20 rate_1000
drwxrwxr-x 5 test test 4096 Sep 10 08:20 rate_20
drwxrwxr-x 5 test test 4096 Sep 10 08:20 rate_50

Then, I can compare them by running

python3 apex_compare_tree.py reference/ results/2019-09-10_08-10-34/FastRTPS/

This should produce output like the following:

[2019-10-18 06:24:08,612] 
[2019-10-18 06:24:08,612] ###################################################
[2019-10-18 06:24:08,612]        RUNNING COMPARISON WITH CONFIGURATION       
[2019-10-18 06:24:08,612] ###################################################
[2019-10-18 06:24:08,612] Reference file: reference/rate_20/subs_1/best_effort_transient_keep_last_Array16k_10-09-2019_08-23-15
[2019-10-18 06:24:08,612] Target file: results/2019-09-10_08-10-34/FastRTPS/rate_20/subs_1/best_effort_transient_keep_last_Array16k_10-09-2019_08-23-15
[2019-10-18 06:24:08,612] Columns of interest: ['latency_min (ms)', 'ru_maxrss', 'latency_max (ms)']
[2019-10-18 06:24:08,612] Latency threshold: 5
[2019-10-18 06:24:08,612] RSS threshold: 5
[2019-10-18 06:24:08,612] Jitter threshold: 5
[2019-10-18 06:24:08,613] Print results: True
[2019-10-18 06:24:08,613] ###################################################
[2019-10-18 06:24:08,615] RESULTS
[2019-10-18 06:24:08,615] ---------------------------------------------------
[2019-10-18 06:24:08,615] REFERENCE FILE:   reference/rate_20/subs_1/best_effort_transient_keep_last_Array16k_10-09-2019_08-23-15
[2019-10-18 06:24:08,615] Min Latency (ms): 0.07144
[2019-10-18 06:24:08,615] Max RSS (KB):     44848.0
[2019-10-18 06:24:08,615] Max Jitter (ms):  44.70930
[2019-10-18 06:24:08,615] ---------------------------------------------------
[2019-10-18 06:24:08,615] TARGET FILE:      results/2019-09-10_08-10-34/FastRTPS/rate_20/subs_1/best_effort_transient_keep_last_Array16k_10-09-2019_08-23-15
[2019-10-18 06:24:08,615] Min Latency (ms): 0.07144
[2019-10-18 06:24:08,615] Max RSS (KB):     44848.0
[2019-10-18 06:24:08,615] Max Jitter (ms):  44.70930
[2019-10-18 06:24:08,615] ---------------------------------------------------
[2019-10-18 06:24:08,616] COMPARISON:
[2019-10-18 06:24:08,616] SIMILAR Latency (ms): 0.00000 (0.00000 %)
[2019-10-18 06:24:08,616] SIMILAR RSS (KB):     0.00000 (0.00000 %)
[2019-10-18 06:24:08,616] SIMILAR Jitter (ms):  0.00000 (0.00000 %)
[2019-10-18 06:24:08,616] ---------------------------------------------------
[2019-10-18 06:24:08,616] FINAL RESULT
[2019-10-18 06:24:08,616] Comparison passed =)
[2019-10-18 06:24:08,616] ---------------------------------------------------
[...]
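
The idea of comparing two whole trees can be sketched as follows. This is only an illustration under the assumption that a reference file and its target share the same path relative to their respective roots (as in the log above); it is not the actual logic of apex_compare_tree.py, and pair_result_files is a hypothetical name.

```python
from pathlib import Path


def pair_result_files(reference_dir, target_dir):
    """Pair every file in the reference tree with its counterpart in the target tree.

    Assumption: counterparts have identical relative paths, for example
    rate_20/subs_1/<file name>, under both roots.
    Returns (pairs, missing), where missing lists reference files
    (as relative paths) that have no counterpart in the target tree.
    """
    reference_root = Path(reference_dir)
    target_root = Path(target_dir)
    pairs = []
    missing = []
    for ref_file in sorted(p for p in reference_root.rglob("*") if p.is_file()):
        relative = ref_file.relative_to(reference_root)
        candidate = target_root / relative
        if candidate.is_file():
            pairs.append((ref_file, candidate))
        else:
            missing.append(relative)
    return pairs, missing
```

Each resulting pair could then be handed to a single-file comparison such as the one performed by apex_compare.py.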
daggarwa commented 5 years ago

@EduPonz Thanks so much for the clarification! I also think that adding a description of this comparison script to a README.md file, along with a usage example, would be very useful. Can that be done? @esteve I believe everything else is good to go in now, right?

esteve commented 5 years ago

@EduPonz perhaps it'd be good to put the example of how to run apex_compare_tree.py in the script itself so that everyone can know how to run it. Thanks!

daggarwa commented 4 years ago

@EduPonz Any update on when you will be able to make these changes?

esteve commented 4 years ago

@EduPonz thanks for being so patient and for making the changes!

esteve commented 4 years ago

@daggarwa the changes look good to me, but I'd like to know your opinion. Thanks.

daggarwa commented 4 years ago

@esteve One thing: I don't think we use this file anymore: https://github.com/ApexAI/performance_test/pull/67/files#diff-f61bfa3354b5fc7f2d857616f6e05211 . So let me try running apex_compare.py on result files I generated locally and see whether things are working.

daggarwa commented 4 years ago

@EduPonz What does "Comparison failed" mean, as in the output I got below?

divya.aggarwal@ade:~/perf_test_ws/src/performance_test/performance_test/helper_scripts/regression_checkers (eProsima-feature/regression-checks %)$ python3 apex_compare.py ~/perf_test_ws/log_PointCloud4m_22-10-2019_20-47-50 ~/perf_test_ws/log_PointCloud4m_22-10-2019_20-49-03 
[2019-10-23 13:03:47,825] 
[2019-10-23 13:03:47,825] ###################################################
[2019-10-23 13:03:47,825]        RUNNING COMPARISON WITH CONFIGURATION       
[2019-10-23 13:03:47,825] ###################################################
[2019-10-23 13:03:47,825] Reference file: /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-47-50
[2019-10-23 13:03:47,825] Target file: /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-49-03
[2019-10-23 13:03:47,825] Columns of interest: ['latency_min (ms)', 'ru_maxrss', 'latency_max (ms)']
[2019-10-23 13:03:47,825] Latency threshold: 5
[2019-10-23 13:03:47,825] RSS threshold: 5
[2019-10-23 13:03:47,825] Jitter threshold: 5
[2019-10-23 13:03:47,825] Print results: True
[2019-10-23 13:03:47,825] ###################################################
[2019-10-23 13:03:47,827] RESULTS
[2019-10-23 13:03:47,827] ---------------------------------------------------
[2019-10-23 13:03:47,827] REFERENCE FILE:   /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-47-50
[2019-10-23 13:03:47,827] Min Latency (ms): 1.65500
[2019-10-23 13:03:47,827] Max RSS (KB):     125480.0
[2019-10-23 13:03:47,827] Max Jitter (ms):  26.02500
[2019-10-23 13:03:47,827] ---------------------------------------------------
[2019-10-23 13:03:47,827] TARGET FILE:      /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-49-03
[2019-10-23 13:03:47,828] Min Latency (ms): 8.86000
[2019-10-23 13:03:47,828] Max RSS (KB):     607868.0
[2019-10-23 13:03:47,828] Max Jitter (ms):  61.85000
[2019-10-23 13:03:47,828] ---------------------------------------------------
[2019-10-23 13:03:47,828] COMPARISON:
[2019-10-23 13:03:47,828] WORSE Latency (ms): 7.20500 (435.34743 %)
[2019-10-23 13:03:47,828] WORSE RSS (KB):     482388.00000 (384.43417 %)
[2019-10-23 13:03:47,828] WORSE Jitter (ms):  35.82500 (137.65610 %)
[2019-10-23 13:03:47,828] ---------------------------------------------------
[2019-10-23 13:03:47,828] FINAL RESULT
[2019-10-23 13:03:47,828] Comparison failed =(
[2019-10-23 13:03:47,828] ---------------------------------------------------
[2019-10-23 13:03:47,828] 
[2019-10-23 13:03:47,828] Script exit value: 1
EduPonz commented 4 years ago

@daggarwa That means that the comparison between /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-47-50 and /home/divya.aggarwal/perf_test_ws/log_PointCloud4m_22-10-2019_20-49-03 resulted in the target not passing the check (thus failing). In this case, as you can see in the comparison section, latency, RSS, and jitter were all worse than the reference by more than the allowed 5%. As for the script, if any of the comparisons fails, the exit code will be 1, signalling that something went wrong.
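
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in ApexComparison.py: the function names are hypothetical, the relative-change formula is inferred from the logged numbers (for example, 7.205 / 1.655 gives 435.35 %), and the BETTER label is an assumption, since the logs above only show SIMILAR and WORSE.

```python
def percent_change(reference, target):
    """Relative change of target with respect to reference, in percent."""
    return (target - reference) / reference * 100.0


def classify(reference, target, threshold_percent=5.0):
    """Classify a metric for which lower values are better."""
    change = percent_change(reference, target)
    if change > threshold_percent:
        return "WORSE"
    if change < -threshold_percent:
        return "BETTER"  # assumed label, not seen in the logs
    return "SIMILAR"


def exit_code(reference, target, threshold_percent=5.0):
    """Return 1 if any metric is worse than allowed, otherwise 0."""
    verdicts = [
        classify(reference[name], target[name], threshold_percent)
        for name in reference
    ]
    return 1 if "WORSE" in verdicts else 0


# Values taken from the log output in this thread.
reference = {"latency_min": 1.655, "ru_maxrss": 125480.0, "jitter_max": 26.025}
target = {"latency_min": 8.86, "ru_maxrss": 607868.0, "jitter_max": 61.85}
print(exit_code(reference, target))
```

With the values from the log, every metric exceeds the 5% threshold, so the sketch reports a failing exit code, matching the "Script exit value: 1" line.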

daggarwa commented 4 years ago

Thank you for the clarification. Looks good to me. We can merge this in.