Closed jlashner closed 6 months ago
I added the 'hold' option to the process_action
.
The combination of on/off has been changed -- Kyohei found more efficient operation to stop the HWP rotation.
https://github.com/simonsobs/socs/blob/61d971fda05564bd84a90787c813efd79ebe9fe2/socs/agents/hwp_pcu/agent.py#L41-L43
Many thanks for updating the pcu agent!
I tested the send_command
function with the pcu instrument and it worked as expected.
However, _get_and_publish_data
does not seem to work. Nothing is published, but there is no error message.
The pcu uses a relay module and I have defined get_status
according to the sample code:
https://github.com/numato/samplecode/blob/master/RelayAndGPIOModules/USBRelayAndGPIOModules/python/usbrelay1_2_4_8/relayread.py
This command does not seem to be very stable and often takes a long time.
The problem with _get_and_publish_data
may be due to get_status
being unstable.
So I recommend to publish the status as the last sent command. I would like to have your opinion.
Hi Junna, no problem!
I do think reading the status from the relays instead of using the last sent command is very important. It will make the data from the agent a lot more trustworthy, especially in cases where:
I suggest we invest some time to understand why the get_status
function is not behaving properly.
I got the importance of reading the status.
I made _get_and_publish_data
work by increasing the sampling rate to every 30 sec.
https://github.com/simonsobs/socs/blob/61d971fda05564bd84a90787c813efd79ebe9fe2/socs/agents/hwp_pcu/agent.py#L130C1-L130C35
This is the published status when I set the operation mode as hold --> off --> on_1 --> hold.
| 2023-12-21 14:25:00 | hold
| 2023-12-21 14:24:00 | hold
| 2023-12-21 14:23:00 | hold
| 2023-12-21 14:22:00 | failed
| 2023-12-21 14:21:00 | failed
| 2023-12-21 14:20:00 | failed
| 2023-12-21 14:19:00 | on_1
| 2023-12-21 14:18:00 | on_1
| 2023-12-21 14:17:00 | failed
| 2023-12-21 14:16:00 | failed
| 2023-12-21 14:15:00 | off
| 2023-12-21 14:14:00 | hold
Although it takes a few minutes to obtain the correct status immediately after changing the operating mode, the response becomes stable thereafter. If you are comfortable with this sample rate, this may be the solution.
Thanks Junna, this is much better! Do you think we can log the "response" in the cases where it's failing so we can get a better idea why? Other than that I think this is fine to merge. It may be worth coming back later and figure out how to make get_status more reliable though.
Actually, I was testting reliable get_status, and I could do it by the following script. It also takes more time to send command but acceptable, I think. For some reason, I needed to switch all 8 relays to get_status correctly.
I'm happy to implement more reliable get_status later. But I would like to use this agent for satp3 observations ASAP.
import serial
import argparse
import time
import numpy as np
patterns = {
'off': [0, 0, 0, 0, 0, 0, 0, 0],
'on_1': [1, 1, 1, 0, 0, 1, 1, 1],
'on_2': [1, 1, 1, 0, 0, 1, 0, 0],
'stop': [0, 1, 1, 0, 0, 1, 0, 0],
'hold': [1, 1, 1, 0, 0, 1, 0, 0],
}
class pcu:
def __init__(self, port="/dev/ttyACM0"):
self.PCU = serial.Serial(
port=port,
baudrate=19200,
timeout=1,
bytesize=serial.EIGHTBITS,
stopbits=serial.STOPBITS_ONE
)
def close(self):
self.PCU.close()
def wait(self):
time.sleep(.1)
def read(self):
msg = self.PCU.read_until(b'\n\r').strip().decode()
out = self.PCU.read_until(b'\n\r').strip().decode()
return msg, out
def get_status(self):
self.PCU.write(str.encode('relay readall\n\r'))
self.wait()
out = int(self.read()[1], 16)
print('pattern', [(out >> i & 1) for i in range(8)])
for k, v in patterns.items():
v = np.sum([vv*(2**i) for i, vv in enumerate(v)])
if out == v:
print('current state is '+k)
return
print('current state is undefined')
def test(self, command):
pattern = patterns[command]
for j, i in enumerate(pattern):
self.PCU.write(str.encode("relay "+{0:'off', 1:'on'}[i]+' '+str(j)+"\n\r"))
self.wait()
print(self.read())
self.wait()
print(command + ' is done')
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('command', type=str,)
args = parser.parse_args()
pcu = pcu()
pcu.test(args.command)
pcu.get_status()
pcu.close()
I guess the reason why the current get_status sometimes fails because there is exception for the
response = self.port.read(25)
response = response.decode('utf-8')
I'm not sure if this is the best way, but following method in my script worked as long as I tested.
def read(self)
msg = self.PCU.read_until(b'\n\r').strip().decode()
out = self.PCU.read_until(b'\n\r').strip().decode()
self.PCU.write(str.encode('relay readall\n\r'))
out = int(self.read()[1], 16)
@jlashner Can we merge this branch now? I do understand the importacne of get_status, but I would like to start using this agent in the satp3 observation over the holidays. Start using this agent is really important step for the observation using HWP now. I will be carefully watching the satp3 observation until we implement the get_status.
Hi kyohei, yep! Totally fine with merging this now and then figuring out remaining issues after vacation. Good luck!
I realize it may not have been clear what I was saying about restructuring the PCU agent to avoid timeout locks since we still don't have any examples, so thought I'd throw it into a branch here for people to see and use if they'd like.
Description
Restructure HWP PCU to use new lockless agent structure. I think this is a easier way to structure the agent, and should solve all of the issues that have to do with lock-acquisition, and generally make the agent more threadsafe.
Motivation and Context
Solves locking issues, and serves as an example for timeout lockless agents
How Has This Been Tested?
Not tested
Types of changes
Checklist: