slaclab / pysmurf

Use new registers which give the state of the configuration process during setup #462

Closed jesusvasquez333 closed 4 years ago

jesusvasquez333 commented 4 years ago

Describe the problem

In recent PRs (#459 and #461) I added two new registers to the tree, under the SmurfApplication device. These registers give information about the state of the configuration process, which is triggered by calling the setDefaults command. These new registers are:

So, I propose using these new registers in the set_defaults_pv method, instead of the current approach of calling the method and then waiting.

Describe the solution you'd like

An example of how these variables can be used could be something like this:

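import time

import epics  # pyepics

# Note: 'epics_prefix' is assumed to be defined elsewhere (for example, 'smurf_server_s2').

def set_defaults():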
    """
    Configure the system.

    This method returns True if the system was configured correctly.
    """

    # Optional, measure how long the process takes
    start_time = time.time()

    # Start by calling the 'setDefaults' command. Set the 'wait' flag
    # to wait for the command to finish, although the server usually 
    # gets unresponsive during setup and the connection is lost.
    epics.caput(f'{epics_prefix}:AMCc:setDefaults', 1, wait=True)

    # Now let's wait until the process is finished. We define a maximum
    # time we will wait, 400 seconds in this case, divided into smaller
    # tries of 10 seconds each.
    max_timeout = 400
    caget_timeout = 10
    num_retries = max_timeout // caget_timeout
    success = False
    for i in range(num_retries):
        # Try to read the status of the "ConfiguringInProgress" flag.
        ret = epics.caget(f'{epics_prefix}:AMCc:SmurfApplication:ConfiguringInProgress', as_string=True, timeout=caget_timeout)

        # We successfully exit the loop when we are able to read
        # the "ConfiguringInProgress" flag and it is set to "False".
        # Otherwise we keep trying.
        if ret == 'False':
            success=True
            break

    # If, after our maximum defined timeout, we weren't able to read the
    # "ConfiguringInProgress" flag as "False", we exit with an error.
    if not success:
        print(f'ERROR. The system configuration did not finish after {max_timeout} seconds.') # This should go to the logs instead.
        return False

    # Optional, measure how long the process takes
    end_time = time.time()

    # At this point, we have determined that the configuration sequence ended in the server via the
    # "ConfiguringInProgress" flag. The final status of the configuration sequence is available
    # in the "SystemConfigured" flag. So, let's read it and use it as our return value.
    status = epics.caget(f'{epics_prefix}:AMCc:SmurfApplication:SystemConfigured', as_string=True)
    print(f'System configuration finished after {int(end_time - start_time)} seconds. The final state was {status}') # Optional and should go to the logs instead.
    # caget returns the string 'True' or 'False' here, so convert it to a real boolean.
    return status == 'True'

Other functions calling this method should then use its return value to determine whether the system was configured correctly. Subsequent function calls shouldn't continue if this one fails.
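
A minimal sketch of how a caller might gate later steps on that return value. The name `run_setup` and the step callables are hypothetical placeholders for illustration, not part of pysmurf.

```python
def run_setup(set_defaults, steps):
    """Run `steps` in order, but only if `set_defaults()` reports success.

    `set_defaults` and every entry of `steps` are plain callables used as
    placeholders here; they are not pysmurf functions.
    """
    if not set_defaults():
        # Configuration failed or timed out: skip everything that depends on it.
        return False

    for step in steps:
        step()
    return True
```

This check is only safe if `set_defaults` returns a real boolean. A string such as `'False'` (which is what a read with `as_string=True` yields) is truthy in Python and would let the later steps run.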

swh76 commented 4 years ago

@jesusvasquez333 to get the version with the new EPICS registers (e.g. ConfiguringInProgress and SystemConfigured), do I just update the release scripts and check out a new v4.0.0 dev_fw? Or do I need to run with a dev_sw version?

swh76 commented 4 years ago

We should make sure the solution to this issue is backwards compatible; I suggest polling one of the new PVs at the start and, if that times out, configuring the way we currently do (by just hitting setDefaults).
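
A rough sketch of that fallback idea, for illustration only. The function names and the `new_method` / `legacy_method` callables are hypothetical. `caget` is passed in so the logic can be exercised without a server; in practice it would presumably be `epics.caget`, which returns `None` when a PV cannot be read within the timeout.

```python
def supports_config_status(caget, epics_prefix, timeout=5):
    """Return True if the server exposes the ConfiguringInProgress PV.

    `caget` is any callable shaped like `epics.caget`: it returns the PV
    value, or None when the PV cannot be read within `timeout` seconds.
    """
    pv = f'{epics_prefix}:AMCc:SmurfApplication:ConfiguringInProgress'
    return caget(pv, as_string=True, timeout=timeout) is not None


def configure(caget, epics_prefix, new_method, legacy_method):
    """Use the register-based path when available, else the legacy path."""
    if supports_config_status(caget, epics_prefix):
        return new_method()
    return legacy_method()
```

The probe has to run before setDefaults is called, because the server may become unresponsive during configuration and a failed read at that point would not mean the PV is missing.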

jesusvasquez333 commented 4 years ago

@swh76 these changes are not in a tagged release yet, so we will need to use a software development mode to test it.

Regarding backward compatibility: an alternative is to read the pysmurf version available in the register SmurfApplication.SmurfVersion. Of course, this will only work once we do a new release. For example, let's say v4.1.0 is the new version, then:
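
One way such a version check could look, as a sketch only. It assumes the register holds a string like `v4.1.0`, possibly with a development suffix. The exact format of SmurfVersion is an assumption, and the helper names are hypothetical.

```python
import re


def parse_version(version):
    """Extract (major, minor, patch) from a string such as 'v4.1.0'.

    Anything after the three numbers (for example a '-3-gabc123' suffix)
    is ignored. Returns None if no version number is found.
    """
    match = re.match(r'v?(\d+)\.(\d+)\.(\d+)', version.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_config_status_registers(version, minimum=(4, 1, 0)):
    """Return True if `version` is at least `minimum` (v4.1.0 by default)."""
    parsed = parse_version(version)
    return parsed is not None and parsed >= minimum
```

Comparing integer tuples avoids the pitfall of plain string comparison, where `'v4.10.0'` would sort before `'v4.2.0'`. A development build made after an older tag would still parse as that older version, so this check only helps once the registers are in a tagged release.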

swh76 commented 4 years ago

Got it, thanks.

swh76 commented 4 years ago

@jesusvasquez333 @ruck314 Mostly implemented in branch issue462, but I ran into a small issue: I see approx. a 5 sec lag between when ConfiguringInProgress clears and when SystemConfigured returns True. I'll add a wait for now, but we might want to have ConfiguringInProgress only go to False after SystemConfigured is updated, if possible.

swh76 commented 4 years ago

@jesusvasquez333 @ruck314 Also, should setting setDefaults=1 clear the SystemConfigured flag?

jesusvasquez333 commented 4 years ago

@swh76 I will take a look. That delay is not expected.

jesusvasquez333 commented 4 years ago

@swh76 I did a quick test using camonitor from bash, and it looks correct:

cryo@smurf-srv19:/$ camonitor  smurf_server_s2:AMCc:SmurfApplication:SystemConfigured smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress
smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:08.566601 True
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:08.567282 False
smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:08.567081 True
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:08.570074 False

<<<< called setDefaults here  >>>

smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:43.568072 *** disconnected
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:43.568120 *** disconnected
CA.Client.Exception...............................................
    Warning: "Virtual circuit unresponsive"
    Context: "smurf-srv19.slac.stanford.edu:38401"
    Source File: ../tcpiiu.cpp line 920
    Current Time: Tue Jul 21 2020 20:10:43.568040572
..................................................................
smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:56.834265 True
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:56.834317 False
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:56.902352 False
smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:56.902432 True
smurf_server_s2:AMCc:SmurfApplication:SystemConfigured 2020-07-21 20:10:57.221892 True
smurf_server_s2:AMCc:SmurfApplication:ConfiguringInProgress 2020-07-21 20:10:57.221961 False

When I call the setDefaults command, the server becomes unresponsive and the PVs get disconnected. But once the process finishes, both SystemConfigured and ConfiguringInProgress are set at almost the same time.
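
The disconnect and reconnect pattern in the trace above suggests that a client should treat a failed read as "still configuring" rather than as an error. A minimal sketch of a polling helper along those lines follows. The name `wait_for_value` and its parameters are hypothetical, and `read` stands in for whatever call fetches the PV.

```python
import time


def wait_for_value(read, expected, timeout, poll_interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll `read()` until it returns `expected` or `timeout` seconds pass.

    `read` may return None (for example while the server is disconnected);
    that is treated like any other non-matching value and polling continues.
    Returns True on a match and False if the deadline is reached first.
    """
    deadline = clock() + timeout
    while True:
        if read() == expected:
            return True
        if clock() >= deadline:
            return False
        # Pause between reads so a connected server is not polled in a tight loop.
        sleep(poll_interval)
```

With pyepics this might be called as `wait_for_value(lambda: epics.caget(pv, as_string=True, timeout=10), 'False', timeout=400)`, where `pv` is the full ConfiguringInProgress PV name. Using a deadline rather than a fixed retry count keeps the total wait bounded by time even when reads return immediately.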

jesusvasquez333 commented 4 years ago

And this is what I see when running the script I added in the description of this issue:

cryo@smurf-srv19:/$ /shared/test.py
Setting defaults...
CA.Client.Exception...............................................
    Warning: "Virtual circuit unresponsive"
    Context: "smurf-srv19.slac.stanford.edu:38401"
    Source File: ../tcpiiu.cpp line 920
    Current Time: Tue Jul 21 2020 20:22:30.125183348
..................................................................
System configuration finished after 42 seconds. The final state was True

So, I don't see the delay you mentioned.

jesusvasquez333 commented 4 years ago

@swh76 and yes, calling setDefaults clears SystemConfigured.

This is what happens when setDefaults is called, as defined here:

swh76 commented 4 years ago

Ah great re: calling setDefaults clears SystemConfigured. I think I just got fooled because the GUI doesn't always update during the setDefaults call. Thanks!