Closed maria-18-git closed 1 year ago
As an example of a potential solution, we need to update the condition
if os.path.exists( symlink_to ):
(https://github.com/krai/axs2mlperf/blob/master/base_loadgen_program/code_axs.py#L55)
so that it checks power_client_entrydic_path instead of symlink_to.
We need to use last_mlperf_logs because when we run the power command:
time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.66,sut_name=eb6-kilt-qaic
the following client command is generated:
/usr/bin/python3 /data/maria/work_collection/mlperf_power_git_master/ptd_client_server/client.py --run-workload "axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=150,loadgen_buffer_size=8,sut_name=eb6-kilt-qaic,_with_power+,symlink_to=/data/maria/work_collection/generated_by_power_measurement_on_run_a98af0817b7d488185fc48d70389989e/last_mlperf_logs,power_client_entrydic_path=/data/maria/work_collection/generated_by_power_measurement_on_run_a98af0817b7d488185fc48d70389989e/program_output.json,effective_no_ranging-" --loadgen-logs "/data/maria/work_collection/generated_by_power_measurement_on_run_a98af0817b7d488185fc48d70389989e/last_mlperf_logs" --output "/data/maria/work_collection/generated_by_power_measurement_on_run_a98af0817b7d488185fc48d70389989e/power_logs" --addr 192.168.4.3 --port 4949 --ntp time.google.com --no-timestamp-path
So we need to set the input directory with the loadgen logs:
--loadgen-logs "/data/maria/work_collection/generated_by_power_measurement_on_run_a98af0817b7d488185fc48d70389989e/last_mlperf_logs"
At this point we cannot take the experiment name from program_output.json, because that file does not exist yet.
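To make the dependency explicit, here is a minimal sketch of how such a command line could be assembled. It only uses flags and query parameters that appear in the command above; the function name build_client_command and its arguments are hypothetical, and the optional --max-amps and --max-volts flags are left out. The point is that the --loadgen-logs path has to be known before the workload runs, whereas program_output.json is only written afterwards.

```python
from pathlib import Path

def build_client_command(client_py, run_dir, workload_query, addr, port, ntp,
                         no_ranging=False):
    """Assemble a client.py argument list (hypothetical helper, not axs code).

    All paths are derived from run_dir, so they are known up front.
    """
    run_dir = Path(run_dir)
    logs_link = run_dir / "last_mlperf_logs"     # link the workload is asked to create
    entrydic = run_dir / "program_output.json"   # does not exist until the workload has run
    ranging_flag = "effective_no_ranging+" if no_ranging else "effective_no_ranging-"
    workload = (
        f"{workload_query},_with_power+"
        f",symlink_to={logs_link}"
        f",power_client_entrydic_path={entrydic}"
        f",{ranging_flag}"
    )
    return [
        "/usr/bin/python3", str(client_py),
        "--run-workload", workload,
        "--loadgen-logs", str(logs_link),
        "--output", str(run_dir / "power_logs"),
        "--addr", addr,
        "--port", str(port),
        "--ntp", ntp,
        "--no-timestamp-path",
    ]
```

Because the client receives a fixed --loadgen-logs path, something must exist at that path by the time the workload finishes, regardless of the experiment name chosen later.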
So we have two possible solutions:
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.67,sut_name=eb6-kilt-qaic,no_ranging+
['^', 'byname', 'generated_by_power_measurement_on_run_aece90df425c42649b176a442e5295e3']
real 10m44.616s
user 79m20.386s
sys 0m3.840s
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ ll ~/work_collection/generated_by_power_measurement_on_run_aece90df425c42649b176a442e5295e3
total 28
drwxr-xr-x 4 maria krai 4096 Oct 23 21:48 ./
drwxr-xr-x 110 maria krai 8192 Oct 23 21:38 ../
-rw-r--r-- 1 maria krai 2760 Oct 23 21:48 data_axs.json
drwxr-xr-x 4 maria krai 4096 Oct 23 21:48 power_logs/
-rw-r--r-- 1 maria krai 127 Oct 23 21:48 program_output.json
drwxr-xr-x 2 maria krai 4096 Oct 23 21:38 tmp/
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ cat ~/work_collection/generated_by_power_measurement_on_run_aece90df425c42649b176a442e5295e3/program_output.json
{
"testing_entry_name": "generated_by_image_classification_using_onnxrt_loadgen_on_get_1457c2f3f0ae4613ac67609b2f72d6f9"
}
Performance:
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.67,sut_name=eb6-kilt-qaic,no_ranging+ , get performance
VALID : _Early_stopping_90th_percentile_estimate=136.828 (milliseconds)
Power:
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.67,sut_name=eb6-kilt-qaic,no_ranging+ , avg_power
16.18421666666664
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.66,sut_name=eb6-kilt-qaic
...
['^', 'byname', 'generated_by_power_measurement_on_run_09cc474c97ca435cb3fb469d4feb9dc2']
real 21m8.759s
user 158m44.115s
sys 0m6.650s
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ ll ~/work_collection/generated_by_power_measurement_on_run_09cc474c97ca435cb3fb469d4feb9dc2
total 28
drwxr-xr-x 4 maria krai 4096 Oct 23 22:15 ./
drwxr-xr-x 110 maria krai 8192 Oct 23 22:05 ../
-rw-r--r-- 1 maria krai 2683 Oct 23 22:16 data_axs.json
drwxr-xr-x 5 maria krai 4096 Oct 23 22:16 power_logs/
-rw-r--r-- 1 maria krai 251 Oct 23 22:15 program_output.json
drwxr-xr-x 2 maria krai 4096 Oct 23 22:05 tmp/
Performance:
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.66,sut_name=eb6-kilt-qaic , get performance
VALID : _Early_stopping_90th_percentile_estimate=127.267 (milliseconds)
Power:
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.66,sut_name=eb6-kilt-qaic , avg_power
16.595569999999995
Solution 2: use 1 symlink for last_mlperf_logs
Accuracy:
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=250,loadgen_buffer_size=8,sut_name=eb6-kilt-qaic
...
['^', 'byname', 'generated_by_power_measurement_on_run_8a772e749efe45e495be0d1c70cfb75b']
real 2m7.708s
user 7m58.012s
sys 0m1.642s
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ ll ~/work_collection/generated_by_power_measurement_on_run_8a772e749efe45e495be0d1c70cfb75b
total 32
drwxr-xr-x 3 maria krai 4096 Oct 24 10:09 ./
drwxr-xr-x 116 maria krai 12288 Oct 24 10:09 ../
-rw-r--r-- 1 maria krai 2594 Oct 24 10:09 data_axs.json
lrwxrwxrwx 1 maria krai 122 Oct 24 10:09 last_mlperf_logs -> /data/maria/work_collection/generated_by_image_classification_using_onnxrt_loadgen_on_get_e231f38ac80643cc8bfb7b875f84b05f/
drwxr-xr-x 5 maria krai 4096 Oct 24 10:09 power_logs/
-rw-r--r-- 1 maria krai 251 Oct 24 10:09 program_output.json
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ cat ~/work_collection/generated_by_power_measurement_on_run_8a772e749efe45e495be0d1c70cfb75b/program_output.json
{
"ranging_entry_name": "generated_by_image_classification_using_onnxrt_loadgen_on_get_726e50f4f4a3448caafac37cb3a0c786",
"testing_entry_name": "generated_by_image_classification_using_onnxrt_loadgen_on_get_e231f38ac80643cc8bfb7b875f84b05f"
}
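The entry names recorded in program_output.json are enough to locate the underlying experiments without following any filesystem link. A minimal sketch, assuming that the loadgen entries sit next to the power entry in the same collection directory (as they do in the listings here); resolve_entries is a hypothetical helper and does not go through axs itself:

```python
import json
from pathlib import Path

def resolve_entries(power_entry_dir):
    """Map each key of program_output.json to a directory path.

    Assumption: every named entry is a sibling of the power entry,
    i.e. lives in the same collection directory.
    """
    power_entry_dir = Path(power_entry_dir)
    with open(power_entry_dir / "program_output.json") as f:
        names = json.load(f)   # e.g. ranging_entry_name, testing_entry_name
    collection_dir = power_entry_dir.parent
    return {key: collection_dir / name for key, name in names.items()}
```

Since only names are stored, the mapping stays valid after the whole collection is copied to a different location.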
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic,no_ranging+
...
/usr/bin/python3 /data/maria/work_collection/mlperf_power_git_master/ptd_client_server/client.py --run-workload "axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic,no_ranging+,_with_power+,symlink_to=/data/maria/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f/last_mlperf_logs,power_client_entrydic_path=/data/maria/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f/program_output.json,effective_no_ranging+" --loadgen-logs "/data/maria/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f/last_mlperf_logs" --output "/data/maria/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f/power_logs" --addr 192.168.4.3 --port 4949 --ntp time.google.com --no-timestamp-path --max-amps 0.5 --max-volts 300
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
client 2023-10-24 10:15:28,767 [INFO] Creating output directory '/data/maria/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f/power_logs'
client 2023-10-24 10:15:28,768 [INFO] Sending command to the server: 'mlcommons/power client v3'
client 2023-10-24 10:15:28,769 [INFO] Got response: 'mlcommons/power server v3'
client 2023-10-24 10:15:28,769 [INFO] Synchronizing with the server and with time.google.com...
client 2023-10-24 10:15:28,791 [INFO] NTP:offset = 0.000 s, delay = 0.011 s
client 2023-10-24 10:15:28,791 [INFO] Sending command to the server: 'time'
client 2023-10-24 10:15:28,792 [INFO] Got response: '1698138928.7924335'
client 2023-10-24 10:15:28,792 [INFO] The time difference between the client and the server is within range -0.782 ms..0.115 ms
client 2023-10-24 10:15:28,792 [INFO] Sending command to the server: 'new,,94ba4393-a465-4c4e-9e90-e929dc8c1efd'
client 2023-10-24 10:15:28,793 [INFO] Got response: 'OK 2023-10-24_10-15-28,9cece5a5-2e85-44b2-b964-c9170b9a1151'
client 2023-10-24 10:15:28,793 [INFO] Session id is '2023-10-24_10-15-28'
client 2023-10-24 10:15:28,793 [INFO] Sources: {"sources": {"__init__.py": "da39a3ee5e6b4b0d3255bfef95601890afd80709", "client.py": "33ca4f26368777ac06e01f9567b714a4b8063886", "lib/__init__.py": "da39a3ee5e6b4b0d3255bfef95601890afd80709", "lib/client.py": "ac2aa093c8e8bbc9569b9e2a3471bc64e58a2258", "lib/common.py": "611d8b29633d331eb19c9455ea3b5fa3284ed6df", "lib/external/__init__.py": "da39a3ee5e6b4b0d3255bfef95601890afd80709", "lib/external/ntplib.py": "4da8f970656505a40483206ef2b5d3dd5e81711d", "lib/server.py": "c7af63c31bb2fbedea4345f571f6e3507d268ada", "lib/source_hashes.py": "60a2e02193209e8d392803326208d5466342da18", "lib/summary.py": "aa92f0a3f975eecd44d3c0cd0236342ccc9f941d", "lib/time_sync.py": "80894ef2389e540781ff78de94db16aa4203a14e", "server.py": "c3f90f2f7eeb4db30727556d0c815ebc89b3d28b", "tests/unit/__init__.py": "da39a3ee5e6b4b0d3255bfef95601890afd80709", "tests/unit/test_server.py": "948c1995d4008bc2aa6c4046a34ffa3858d6d671", "tests/unit/test_source_hashes.py": "00468a2907583c593e6574a1f6b404e4651c221a"}, "modules": {"ptd_client_server.lib.client": "lib/client.py", "ptd_client_server.lib.common": "lib/common.py", "ptd_client_server.lib.external.ntplib": "lib/external/ntplib.py", "ptd_client_server.lib.source_hashes": "lib/source_hashes.py", "ptd_client_server.lib.summary": "lib/summary.py", "ptd_client_server.lib.time_sync": "lib/time_sync.py"}}
client 2023-10-24 10:15:28,794 [WARNING] Providing manual ranges are only for experimental purpose and the produced results won't be valid for submission
client 2023-10-24 10:15:28,794 [INFO] Running workload in testing mode
client 2023-10-24 10:15:28,794 [INFO] Synchronizing with the server and with time.google.com...
client 2023-10-24 10:15:28,805 [INFO] NTP:offset = -0.000 s, delay = 0.011 s
client 2023-10-24 10:15:28,805 [INFO] Sending command to the server: 'time'
client 2023-10-24 10:15:28,806 [INFO] Got response: '1698138928.8062258'
client 2023-10-24 10:15:28,806 [INFO] The time difference between the client and the server is within range -0.773 ms..0.094 ms
client 2023-10-24 10:15:28,806 [INFO] Sending command to the server: 'session,2023-10-24_10-15-28,start,testing,300.0,0.5'
...
['^', 'byname', 'generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f']
real 10m46.590s
user 79m20.510s
sys 0m3.600s
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ ll ~/work_collection/generated_by_power_measurement_on_run_81c441472ec1495d9010f2b4c6450b5f
total 32
drwxr-xr-x 3 maria krai 4096 Oct 24 10:26 ./
drwxr-xr-x 118 maria krai 12288 Oct 24 10:15 ../
-rw-r--r-- 1 maria krai 2775 Oct 24 10:26 data_axs.json
lrwxrwxrwx 1 maria krai 122 Oct 24 10:26 last_mlperf_logs -> /data/maria/work_collection/generated_by_image_classification_using_onnxrt_loadgen_on_get_a9334b8209234006a39c6ac1e4e725e2/
drwxr-xr-x 4 maria krai 4096 Oct 24 10:26 power_logs/
-rw-r--r-- 1 maria krai 127 Oct 24 10:26 program_output.json
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic,no_ranging+ , get performance
VALID : _Early_stopping_90th_percentile_estimate=132.806 (milliseconds)
maria@eb6 ~/work_collection/axs2mlperf (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic,no_ranging+ , avg_power
16.40086666666667
no_ranging- (by default):
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ time axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic
...
['^', 'byname', 'generated_by_power_measurement_on_run_95ceb14b94e7437f9fa7b1d341a97ca2']
real 21m12.831s
user 158m36.430s
sys 0m6.439s
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ ll ~/work_collection/generated_by_power_measurement_on_run_95ceb14b94e7437f9fa7b1d341a97ca2
total 32
drwxr-xr-x 3 maria krai 4096 Oct 24 11:53 ./
drwxr-xr-x 117 maria krai 12288 Oct 24 11:43 ../
-rw-r--r-- 1 maria krai 2698 Oct 24 11:53 data_axs.json
lrwxrwxrwx 1 maria krai 122 Oct 24 11:53 last_mlperf_logs -> /data/maria/work_collection/generated_by_image_classification_using_onnxrt_loadgen_on_get_cfcaad2eb0be4b1386cf8509a38e1cbe/
drwxr-xr-x 5 maria krai 4096 Oct 24 11:53 power_logs/
-rw-r--r-- 1 maria krai 251 Oct 24 11:53 program_output.json
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic , get performance
VALID : _Early_stopping_90th_percentile_estimate=157.474 (milliseconds)
maria@eb6 ~/work_collection/axs2mlperf/power_measurement (master *=)$ axs byquery power_loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=SingleStream,loadgen_mode=PerformanceOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,loadgen_target_latency=0.64,sut_name=eb6-kilt-qaic , avg_power
15.896006666666665
Summary: Solution 2 (use 1 symlink for last_mlperf_logs) selected.
Status: Done.
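For reference, the single-symlink approach can be sketched as below. This is an illustration under stated assumptions, not the code in code_axs.py: refresh_symlink is a hypothetical name, and the sketch assumes that anything already present at the link path is a link or a plain file (os.unlink would fail on a real directory).

```python
import os

def refresh_symlink(target_dir, link_path):
    """Point link_path at target_dir, replacing any previous link."""
    # lexists() is True even for a dangling link, which exists() would miss
    if os.path.lexists(link_path):
        os.unlink(link_path)
    os.symlink(target_dir, link_path, target_is_directory=True)
```

With ranging enabled the workload runs twice, so the link is expected to end up pointing at whichever loadgen experiment ran last, which matches the listings where last_mlperf_logs points at the testing entry.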
The power experiment now contains links to other experiments. This makes it difficult to move power experiments to other machines, where the links would be broken. We need to switch to a JSON file with the names of the experiments. Example: