NSLS-II / nslsii

NSLS-II related devices
BSD 3-Clause "New" or "Revised" License
10 stars 22 forks source link

Update eurotherm #46

Closed awalter-bnl closed 5 years ago

awalter-bnl commented 5 years ago

Description of issue

This PR is a response to issue #45. Essentially we can get into the situation where a callback 'hangs' because after a timeout error it is not removed from the readback signal. A second non-critical issue was raised, which is that we should be using threading.Lock objects instead of boolean objects as locks.

Resolution

  1. It adds a threading.Timer object that is used to do the timeout loop (instead of using the timeout in DeviceStatus).
  2. it replaces the previous 'boolean' lock with a threading.lock.

The purpose of (1) is that currently the callback is not removed from readback when the timeout error is raised. So we could get into the situation whereby we never remove the callback. Moving the timeout process to a threading.Timer object associated allows the callback to be removed from readback after the timeout.

Testing

The built in Travis pytest was updated to test these additions, including ensuring that the callback is removed on a timeout error.

codecov-io commented 5 years ago

Codecov Report

Merging #46 into master will increase coverage by 2.35%. The diff coverage is 91.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
+ Coverage   40.96%   43.32%   +2.35%     
==========================================
  Files          10       10              
  Lines         559      584      +25     
==========================================
+ Hits          229      253      +24     
- Misses        330      331       +1
Impacted Files Coverage Δ
nslsii/tests/temperature_controllers_test.py 100% <100%> (ø) :arrow_up:
nslsii/temperature_controllers.py 91.83% <86.95%> (+0.66%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4f1896c...7d6fd9a. Read the comment docs.

awalter-bnl commented 5 years ago

@danielballan thanks a lot for the quick review, it is appreciated. I have now resolved all of your specified issues.

awalter-bnl commented 5 years ago

The Travis error is:

1.36s$ flake8 --max-line-length=79 ./versioneer.py:941:-13592: W605 invalid escape sequence '\s' ./versioneer.py:941:-13408: W605 invalid escape sequence '\s' ./versioneer.py:941:-13228: W605 invalid escape sequence '\s' ./versioneer.py:941:-11196: W605 invalid escape sequence '\d' ./versioneer.py:941:-8162: W605 invalid escape sequence '\d' ./nslsii/ad33.py:92:11: W605 invalid escape sequence '\d' The command "flake8 --max-line-length=79" exited with 1. 0.00s$ set -e

These files where not modifed in this PR, and I don't get the same errors when I ru nflake8 locally (I have updated flake8 locally and that did not resolve the issue). Any ideas on why my local flake8 is different to this one?

I will take a look at these on Monday

tacaswell commented 5 years ago

This is not how I have seen threading locks normally used. Rather than the lock it's self being something we check on, it is used to protect the state we care about, ex

with self.lock:
    if self._is_running:
        raise

We also have a lot of examples of using a locking wrapper on methods.

awalter-bnl commented 5 years ago

@tacaswell , thanks for the feedback. For my own learning process is there a reason that using with self._lock is better than 'checking' the self._lock prior to executing. My understanding is that they are equivalent, with the only major change the fact that I don't have to 'release' the lock at the end. As there is at least two other 'cleanup' items (i.e stopping the timer and removing the callback) to me I don't see it being a big difference.

So is this just an aesthetic thing or is there something I am missing (either way I am happy to make the change, just want to be clear myself on the reason for future work)?

awalter-bnl commented 5 years ago

Thanks @mrakitin for getting Travis to pass !-)

danielballan commented 5 years ago

As there is at least two other 'cleanup' items (i.e stopping the timer and removing the callback) to me I don't see it being a big difference.

Let's think about it. There are two differences I see:

  1. It's true that there is more cleanup to do so this doesn't save much/any code, but a context manager provides stronger guarantees that the release will be ~fun~ run [hilarious typo], akin to a finally block, which is very important. It also makes it extremely clear to human readers what the scope of the lock is, which is arguably as important.
  2. This is a more idiomatic use of a lock. A lock typically protects business-logic state rather than embodies business logic state. "Idomatic" is a bit stronger than "aesthetic": it's not only more elegant, it's easier to for practitioners to interpret and extend (i.e. add more state inside the same lock's context later).
awalter-bnl commented 5 years ago

Thanks for the explanation @danielballan, this is helpful.

Unfortuantely, despite several days and numerous searches, I was not able to get the current code to work using a context manager. The issue as I see it is that the current set method, and the inclusion of the callback that indicates the done status, does not lend itself to the following code layout:

 some_lock.acquire()
 try:
   _do something..._
 finally:
   some_lock.release()

Which I think is functionally equivalent to (based on what I have found to read online):

with some_lock:
    _do something..._

When I attempt either of these and run euro.set('some_value') the lock is immediately released, at the same time that the status object is returned. We require the lock not to be released until the time that the callback is removed.

All of the advice I have found on stack overflow, and other sites, regarding how to resolve this issue indicate that the best solution is the one currently in this PR, i.e.

if some_lock.acquire():
    _define callback, ensuring some_lock.release() is called when callback is removed_
    _subscribe callback..._

@tacaswell or @danielballan do either of you have any thoughts on a way around this problem so that we can use a context manager with the lock being released when a callback is removed?

A second, but less critical, issue related to using a context manager is that I would like to run the lock in non-blocking mode so that I can return an exception when trying to set while a set is in process. This can be resolved by creating a 'non-blocking' context manager to wrap around lock however.

danielballan commented 5 years ago

There might be a way to do it, but in the interest of moving this forward how about we stick with acquire/release, make the lock protect state rather than embody state, and as you say use non-blocking mode.

awalter-bnl commented 5 years ago

either way is fine with me

danielballan commented 5 years ago

OK, let's move ahead then.

danielballan commented 5 years ago

I think this was accidentally forgotten. @tacaswell, are you content with the current state of this?

awalter-bnl commented 5 years ago

Can I get someone (@danielballan, @tacaswell, @mrakitin ?) to take a look at this and merge if it is OK? I think it has been lost in the noise.