mlcommons / power-dev

Dev repo for power measurement for the MLPerf™ benchmarks
https://mlcommons.org/en/groups/best-practices-power
Apache License 2.0

Experimental no-ranging mode does not appear to be working #314

Closed (psyhtest closed this issue 1 year ago)

psyhtest commented 1 year ago

With --max-amps 10 --max-volts 250 passed, only the voltage value is propagated. The current value turns out to be None.

ptd-server 2023-06-15 11:28:42,910 [INFO] Initial range for Amps is Auto for Volts is Auto
ptd-server 2023-06-15 11:28:42,910 [INFO] Sending to ptd: 'SR,V,250.0'
06-15-2023 16:28:42.911: Volt range set to 300.000000V
06-15-2023 16:28:44.412: Response to client sent: Range V changed
ptd-server 2023-06-15 11:28:44,413 [INFO] Reply from ptd: 'Range V changed'
ptd-server 2023-06-15 11:28:44,413 [INFO] Sending to ptd: 'SR,A,None'
06-15-2023 16:28:44.413: ERROR: invalid range 0.000000A requested
06-15-2023 16:28:44.413: Response to client sent: Error setting range
ptd-server 2023-06-15 11:28:44,413 [INFO] Reply from ptd: 'Error setting range'
ptd-server 2023-06-15 11:28:44,414 [ERROR] Error setting current range: None
ptd-server 2023-06-15 11:28:44,414 [INFO] Sending to ptd: 'Stop'
psyhtest commented 1 year ago

It's possible that None comes from the default value of the _desirableCurrentRange variable.

The code to set this variable relies on self._avgWatts. This in turn relies on the max_volts_amps_avg_watts function, which seems to parse a ranging log.

Now, what happens if we don't have one, @arjunsuresh?
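
To illustrate the suspicion above, here is a minimal sketch (not the actual server.py code) of how an unset range could end up in the command string, and how a guard would surface the problem earlier. The helper names naive_command and build_set_range_command are hypothetical; only the 'SR,<V|A>,<value>' command shape is taken from the logs in this thread.

```python
from typing import Optional


def naive_command(kind: str, value: Optional[float]) -> str:
    # Plain string formatting happily renders None as the text "None",
    # which matches the 'SR,A,None' line seen in the server log.
    return f"SR,{kind},{value}"


def build_set_range_command(kind: str, value: Optional[float]) -> str:
    """Format a set-range command, refusing an unset range.

    Hypothetical helper for illustration; not part of power-dev.
    """
    if kind not in ("V", "A"):
        raise ValueError(f"unknown range kind: {kind!r}")
    if value is None:
        # Fail before anything is sent to PTDaemon.
        raise ValueError(f"no range value available for {kind!r}")
    return f"SR,{kind},{value}"


print(naive_command("A", None))
print(build_set_range_command("V", 250.0))
```

With a guard like this, the failure would be reported on the server side as a missing value rather than as PTDaemon's "invalid range 0.000000A requested".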

arjunsuresh commented 1 year ago

Hi @psyhtest, can you please share the full log on the client side? There should be a line similar to this:

client 2023-06-15 21:32:11,775 [INFO] Sending command to the server: 'session,2023-06-15_21-32-11,start,testing,250.0,10.0'

The initialization of _desirableCurrentRange happens here

Testing on a dummy device, I'm getting the expected results:

Selected power meter 'Dummy (testing only)' from dummy.cpp
  ****************************************************************************
                      ***********************************                     
                               SPEC PTDaemon Tool                             
                        Version 1.10.0-ed9a21d2-20220817                     
                      ***********************************                     
                     Licensed Materials - Property of SPEC
     Copyright 2006-2022 Standard Performance Evaluation Corporation (SPEC)
                              All Rights Reserved.
  For use with benchmark products from SPEC and authorized organizations only.
  ****************************************************************************

Redirecting data output to file /tmp/tmpjctu0ok9/ptd_logfile.txt
Calculated PTD CRC: 0xed9a21d2, 7188608
06-15-2023 20:32:11.904: Attempting to connect to measurement device type 0...
06-15-2023 20:32:11.904: Dummy identifies: SPECpower's Dummy Analyzer
06-15-2023 20:32:11.904: Uncertainty checking for Dummy is activated
06-15-2023 20:32:11.904: Connected to Dummy successfully
06-15-2023 20:32:11.904: Establishing the listener on port 8888...
06-15-2023 20:32:11.904: Waiting for a connection...
ptd-server 2023-06-15 21:32:11,980 [INFO] Sending to ptd: 'Hello'
06-15-2023 20:32:11.980: Accepted connection from 127.0.0.1:34026
06-15-2023 20:32:11.980: Response to client sent: Hello, PTDaemon here!
ptd-server 2023-06-15 21:32:11,980 [INFO] Reply from ptd: 'Hello, PTDaemon here!'
ptd-server 2023-06-15 21:32:11,980 [INFO] Sending to ptd: 'Identify'
06-15-2023 20:32:11.981: Response to client sent: Dummy,1000,1,1,1,1,1,1,0,version=1.10.0-ed9a21d2-20220817,OS=Linux 5.19.0-41-generic #42~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 18 17:40:00 UTC 2 x86_64,mode=power,1,1,1,0,0,no_cal_date,SPECpower's Dummy Analyzer v1.0
ptd-server 2023-06-15 21:32:11,981 [INFO] Reply from ptd: "Dummy,1000,1,1,1,1,1,1,0,version=1.10.0-ed9a21d2-20220817,OS=Linux 5.19.0-41-generic #42~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Apr 18 17:40:00 UTC 2 x86_64,mode=power,1,1,1,0,0,no_cal_date,SPECpower's Dummy Analyzer v1.0"
ptd-server 2023-06-15 21:32:11,981 [INFO] Connected to PTDaemon
ptd-server 2023-06-15 21:32:11,981 [INFO] Sending to ptd: 'RR'
06-15-2023 20:32:11.981: Response to client sent: Ranges,-1,-1.000000,-1,-1.000000
ptd-server 2023-06-15 21:32:11,981 [INFO] Reply from ptd: 'Ranges,-1,-1.000000,-1,-1.000000'
ptd-server 2023-06-15 21:32:11,981 [INFO] Initial range for Amps is Auto for Volts is Auto
ptd-server 2023-06-15 21:32:11,981 [INFO] Sending to ptd: 'SR,V,250.0'
06-15-2023 20:32:11.981: Ranges are -1.000000, -1, 250.000000, 0
06-15-2023 20:32:11.981: Response to client sent: Range V changed
ptd-server 2023-06-15 21:32:11,981 [INFO] Reply from ptd: 'Range V changed'
ptd-server 2023-06-15 21:32:11,981 [INFO] Sending to ptd: 'SR,A,10.0'
06-15-2023 20:32:11.982: Ranges are 10.000000, 0, 250.000000, 0
06-15-2023 20:32:11.982: Response to client sent: Range A changed
ptd-server 2023-06-15 21:32:11,982 [INFO] Reply from ptd: 'Range A changed'
ptd-server 2023-06-15 21:32:21,992 [INFO] Starting testing mode
ptd-server 2023-06-15 21:32:21,993 [INFO] maxAmps: 10.0, maxVolts: 250.0
ptd-server 2023-06-15 21:32:21,993 [INFO] Sending to ptd: 'Go,1000,0,2023-06-15_21-32-11_testing'
06-15-2023 20:32:21.993: Go with mark '2023-06-15_21-32-11_testing'
06-15-2023 20:32:21.993: Response to client sent: Starting untimed measurement, maximum 500000 samples at 1000ms with 0 rampup samples
ptd-server 2023-06-15 21:32:21,994 [INFO] Reply from ptd: 'Starting untimed measurement, maximum 500000 samples at 1000ms with 0 rampup samples'
ptd-server 2023-06-15 21:32:21,994 [INFO] Sending reply to client 'OK'
ptd-server 2023-06-15 21:32:21,996 [INFO] Got command from the client 'time'
ptd-server 2023-06-15 21:32:21,996 [INFO] Sending reply to client '1686861141.996324'
ptd-server 2023-06-15 21:32:47,035 [INFO] Got command from the client 'time'
ptd-server 2023-06-15 21:32:47,036 [INFO] Sending reply to client '1686861167.0360458'
ptd-server 2023-06-15 21:32:47,037 [INFO] Got command from the client 'session,2023-06-15_21-32-11,stop,testing'
ptd-server 2023-06-15 21:32:57,047 [INFO] Sending to ptd: 'Watts'
06-15-2023 20:32:57.047: Response to client sent: Watts,20.175000,20.000000,20.350000,36,0,36
ptd-server 2023-06-15 21:32:57,048 [INFO] Reply from ptd: 'Watts,20.175000,20.000000,20.350000,36,0,36'
ptd-server 2023-06-15 21:32:57,048 [INFO] Sending to ptd: 'Uncertainty'
06-15-2023 20:32:57.048: Response to client sent: Uncertainty,0.002018,0.002001,0.002036,36,0,36,0
ptd-server 2023-06-15 21:32:57,049 [INFO] Reply from ptd: 'Uncertainty,0.002018,0.002001,0.002036,36,0,36,0'
ptd-server 2023-06-15 21:32:57,049 [INFO] Sending to ptd: 'Stop'
06-15-2023 20:32:57.049: Response to client sent: Stopping untimed measurement
ptd-server 2023-06-15 21:32:57,049 [INFO] Reply from ptd: 'Stopping untimed measurement'
ptd-server 2023-06-15 21:32:57,050 [INFO] Sending to ptd: 'RL'
06-15-2023 20:32:57.994: Completed test
06-15-2023 20:32:57.994: Avg watts 20.175000, min watts 20.000000, max watts 20.350000, samples 36, errors 0, valid 36
06-15-2023 20:32:57.994: Response to client sent: Last 36 samples
psyhtest commented 1 year ago

@arjunsuresh Unfortunately, I've temporarily lost access to the machine. As soon as it's restored, I'll update you.

psyhtest commented 1 year ago

@arjunsuresh Apologies for the delay. Here's a full log with --max_volts=300 and --max-amps=5:

ptd-server 2023-06-20 17:19:36,805 [INFO] Got command from the client 'session,2023-06-20_17-19-36,start,testing,300.0,5.0'
ptd-server 2023-06-20 17:19:36,806 [INFO] Running PTDaemon: ['/prj/austin/system_mgt/validation/aus/crd/yokogawa/power-main/inference_v1.0/ptd-linux-x86', '-l', '/tmp/tmpaxe0hlgv/ptd_logfile.txt', '-p', '8894', '-c', '1', '49', '/dev/usbtmc-Yoko-13']
Selected power meter 'Yokogawa WT310' from wt310.cpp
  ****************************************************************************
                      ***********************************
                               SPEC PTDaemon Tool
                        Version 1.10.0-ed9a21d2-20220817
                      ***********************************
                     Licensed Materials - Property of SPEC
     Copyright 2006-2022 Standard Performance Evaluation Corporation (SPEC)
                              All Rights Reserved.
  For use with benchmark products from SPEC and authorized organizations only.
  ****************************************************************************

Redirecting data output to file /tmp/tmpaxe0hlgv/ptd_logfile.txt
Calculated PTD CRC: 0xed9a21d2, 7188608
06-20-2023 22:19:36.911: Attempting to connect to measurement device type 49...
06-20-2023 22:19:37.512: Analyzer identity response of 33 bytes: YOKOGAWA,WT310E,C2YK12022V,F1.04

06-20-2023 22:19:37.513: WT3XX enhanced model detected.
06-20-2023 22:19:53.827: Uncertainty checking for YokogawaWT310E is activated
06-20-2023 22:19:53.827: Connected to YokogawaWT310E successfully
06-20-2023 22:19:53.827: Establishing the listener on port 8894...
06-20-2023 22:19:53.827: Waiting for a connection...
ptd-server 2023-06-20 17:19:53,837 [INFO] Sending to ptd: 'Hello'
06-20-2023 22:19:53.837: Accepted connection from 127.0.0.1:58312
ptd-server 2023-06-20 17:19:53,839 [INFO] Reply from ptd: 'Hello, PTDaemon here!'
ptd-server 2023-06-20 17:19:53,839 [INFO] Sending to ptd: 'Identify'
06-20-2023 22:19:53.838: Response to client sent: Hello, PTDaemon here!
06-20-2023 22:19:53.839: Response to client sent: YokogawaWT310E,1000,1,1,1,1,0,1,1,version=1.10.0-ed9a21d2-20220817,OS=Linux 5.4.0-136-generic #153~18.04.1-Ubuntu SMP Wed Nov 30 15:47:57 UTC 2022 x86_64,mode=power,1,1,1,0,0,no_cal_date,YOKOGAWA;WT310E;C2YK12022V;F1.04
ptd-server 2023-06-20 17:19:53,839 [INFO] Reply from ptd: 'YokogawaWT310E,1000,1,1,1,1,0,1,1,version=1.10.0-ed9a21d2-20220817,OS=Linux 5.4.0-136-generic #153~18.04.1-Ubuntu SMP Wed Nov 30 15:47:57 UTC 2022 x86_64,mode=power,1,1,1,0,0,no_cal_date,YOKOGAWA;WT310E;C2YK12022V;F1.04'
ptd-server 2023-06-20 17:19:53,840 [INFO] Connected to PTDaemon
ptd-server 2023-06-20 17:19:53,840 [INFO] Sending to ptd: 'RR'
06-20-2023 22:19:53.840: Response to client sent: Ranges,1,2.000000,1,300.000000
ptd-server 2023-06-20 17:19:53,840 [INFO] Reply from ptd: 'Ranges,1,2.000000,1,300.000000'
ptd-server 2023-06-20 17:19:53,841 [INFO] Initial range for Amps is Auto for Volts is Auto
ptd-server 2023-06-20 17:19:53,841 [INFO] Sending to ptd: 'SR,V,300.0'
06-20-2023 22:19:53.841: Volt range set to 300.000000V
06-20-2023 22:19:55.342: Response to client sent: Range V changed
ptd-server 2023-06-20 17:19:55,343 [INFO] Reply from ptd: 'Range V changed'
ptd-server 2023-06-20 17:19:55,343 [INFO] Sending to ptd: 'SR,A,None'
06-20-2023 22:19:55.343: ERROR: invalid range 0.000000A requested
06-20-2023 22:19:55.343: Response to client sent: Error setting range
ptd-server 2023-06-20 17:19:55,344 [INFO] Reply from ptd: 'Error setting range'
ptd-server 2023-06-20 17:19:55,344 [ERROR] Error setting current range: None
ptd-server 2023-06-20 17:19:55,344 [INFO] Sending to ptd: 'Stop'
06-20-2023 22:19:55.344: Response to client sent: Error: no measurement to stop
ptd-server 2023-06-20 17:19:55,345 [INFO] Reply from ptd: 'Error: no measurement to stop'
ptd-server 2023-06-20 17:19:55,345 [INFO] Sending to ptd: 'SR,V,Auto'
06-20-2023 22:19:55.345: Volt range set to Auto
06-20-2023 22:19:55.848: Response to client sent: Range V changed
ptd-server 2023-06-20 17:19:55,848 [INFO] Reply from ptd: 'Range V changed'
ptd-server 2023-06-20 17:19:55,848 [INFO] Sending to ptd: 'SR,A,Auto'
06-20-2023 22:19:55.849: Ampere range set to Auto
06-20-2023 22:19:56.351: Response to client sent: Range A changed
ptd-server 2023-06-20 17:19:56,351 [INFO] Reply from ptd: 'Range A changed'
ptd-server 2023-06-20 17:19:56,352 [INFO] Set initial values for Amps Auto and Volts Auto
ptd-server 2023-06-20 17:19:56,352 [INFO] Stopping ptd...
06-20-2023 22:19:56.352: No data returned by socket read.
06-20-2023 22:19:56.352: Shutting connection down...
06-20-2023 22:19:56.352: Connection is down.
06-20-2023 22:19:56.352: Waiting for a connection...
ptd-server 2023-06-20 17:19:56,354 [INFO] Sending reply to client 'Error setting current range: None'
ptd-server 2023-06-20 17:19:56,355 [INFO] Connection closed
ptd-server 2023-06-20 17:19:56,356 [WARNING] Client connection closed unexpectedly
ptd-server 2023-06-20 17:19:56,359 [INFO] Done processing
arjunsuresh commented 1 year ago

Thank you @psyhtest for sharing the log. The command from the client side looks fine.

ptd-server 2023-06-20 17:19:36,805 [INFO] Got command from the client 'session,2023-06-20_17-19-36,start,testing,300.0,5.0'

For the above command, the server should be setting the current range to 5.
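
As a rough sketch of that expectation, the command can be split into its fields. The field order (max volts, then max amps) is inferred from the logs in this thread, where '...,testing,250.0,10.0' led to 'SR,V,250.0' and 'SR,A,10.0'. The SessionStart type and the parse_session_start function are hypothetical and are not taken from the power-dev code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionStart:
    session_id: str
    mode: str
    max_volts: Optional[float]  # None when the field is absent
    max_amps: Optional[float]   # None when the field is absent


def parse_session_start(command: str) -> SessionStart:
    """Split a 'session,<id>,start,<mode>[,<volts>,<amps>]' command.

    The layout is an assumption based on the log lines quoted in this issue.
    """
    parts = command.split(",")
    if len(parts) < 4 or parts[0] != "session" or parts[2] != "start":
        raise ValueError(f"not a session start command: {command!r}")
    max_volts = float(parts[4]) if len(parts) > 4 else None
    max_amps = float(parts[5]) if len(parts) > 5 else None
    return SessionStart(parts[1], parts[3], max_volts, max_amps)


parsed = parse_session_start(
    "session,2023-06-20_17-19-36,start,testing,300.0,5.0"
)
print(parsed.max_volts, parsed.max_amps)
```

If the server in use did not read the last field, max_amps would stay unset, which would be consistent with the 'SR,A,None' line in the log.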

Is the issue also happening with device_type=0 (no real power analyzer)? Can you also please share the commit hash of the power-dev repository on the server running the PTD?

psyhtest commented 1 year ago

Can you also please share the commit hash of the power-dev repository on the server running the PTD?

When I reported this, both the server and client were running the following:

alokhmot@squawkbox:/prj/crd/austin/validation/common/yokogawa/power-dev/ptd_client_server$ git log -1
commit 165c0b03be29ec884770b9320ab7b26fc1fcb050 (HEAD -> master, origin/master, origin/HEAD)
Merge: 1bfb6a9 59cc757
Author: Arun Tejusve Raghunath Rajan <74993399+araghun@users.noreply.github.com>
Date:   Tue May 23 15:31:15 2023 -0700

    Merge pull request #309 from arjunsuresh/patch-3

    Update sync.yml
    Discussed in 5/23 PowerWG meeting. This seems to be a bug fix of previous approved GitHub action.

Let me update to the latest and try again.

arjunsuresh commented 1 year ago

Thank you @psyhtest for sharing. I guess there's no major code change since then. Can you please confirm if device_type 0 is seeing this error? I'm not able to test with a real power analyzer for now.

psyhtest commented 1 year ago

I guess there's no major code change since then.

I thought so too, but then I updated to the latest on both sides:

alokhmot@squawkbox:/prj/crd/austin/validation/common/yokogawa/power-dev/ptd_client_server$ git log -1
commit 165c0b03be29ec884770b9320ab7b26fc1fcb050 (HEAD -> master, origin/master, origin/HEAD)
Merge: 1bfb6a9 59cc757
Author: Arun Tejusve Raghunath Rajan <74993399+araghun@users.noreply.github.com>
Date:   Tue May 23 15:31:15 2023 -0700

    Merge pull request #309 from arjunsuresh/patch-3

    Update sync.yml
    Discussed in 5/23 PowerWG meeting. This seems to be a bug fix of previous approved GitHub action.

and it seems to be working fine! Let me give it some more checking over the next couple of days.

Can you please confirm if device_type 0 is seeing this error?

Do I need to change the server.conf file for that and re-run, or is there a magic flag somewhere else?

arjunsuresh commented 1 year ago

Oh, I believe the power server might have been running an older version of the code. There has been no update to server.py since May 23. This issue is probably related as well. This can be confirmed by checking the server.json file in the results directory, as it contains the source checksums.
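
One way to run such a check by hand is sketched below: hash the local source files and compare them with recorded values. The layout of server.json is not shown in this thread, so loading the recorded checksums is left out. The use of SHA-1, the "*.py" pattern, and the function names hash_sources and diff_hashes are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_sources(directory: str, pattern: str = "*.py") -> Dict[str, str]:
    """Map each matching file (relative path) to its SHA-1 hex digest."""
    result = {}
    for path in sorted(Path(directory).rglob(pattern)):
        if path.is_file():
            rel = path.relative_to(directory).as_posix()
            result[rel] = hashlib.sha1(path.read_bytes()).hexdigest()
    return result


def diff_hashes(local: Dict[str, str], recorded: Dict[str, str]) -> List[str]:
    """Return the names whose digests differ or that exist on one side only."""
    names = set(local) | set(recorded)
    return sorted(n for n in names if local.get(n) != recorded.get(n))
```

An empty list from diff_hashes would suggest that the server ran the same sources as the local checkout; any listed file points at a version mismatch.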

"Do I need to change the server.conf file for that and re-run, or is there a magic flag somewhere else?"

MLCommons does not provide a server.conf file, so we generate it dynamically based on the user inputs. If you have a static copy of this file, the device type can be changed there.

arjunsuresh commented 1 year ago

@psyhtest Can you please close both these issues? For peace of mind I would like to unfollow this git repository :)

psyhtest commented 1 year ago

Closed with thanks :)