pcdshub / pcdsdevices

Collection of Ophyd device subclasses for IOCs unique to LCLS PCDS.
https://pcdshub.github.io/pcdsdevices/

Thread/Memory Leak in State Devices #965

Open ZLLentz opened 2 years ago

ZLLentz commented 2 years ago

Current Behavior

If the subscription status associated with a state move has no timeout, and the state never reaches the requested destination (or the status is never notified that it did, as happens when the state signal is an AttributeSignal that does not push updates), then each such call to move or set creates a new thread that is never cleaned up.

Possible Solution

Some sort of enforced timeout? Some other sort of non-timeout failure state? Special handling for AttributeSignal?

Steps to Reproduce (for bugs)

  1. Create a state device with an AttributeSignal as the state component (or any signal that does not update)
  2. Move it a bunch
  3. Watch the thread count rise
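
The steps above can be imitated without any hardware or pcdsdevices classes. The following is only a rough sketch using the standard library: `ToyStatus` and `leak_demo` are hypothetical names, and the assumption (taken from the description above) is that each status owns one waiter thread that blocks until the status is marked done or its timeout expires.

```python
import threading


class ToyStatus:
    """Hypothetical stand-in for a move status: one waiter thread each."""

    def __init__(self, timeout=None):
        self._event = threading.Event()
        # With timeout=None the waiter blocks until set_finished() is called.
        self.thread = threading.Thread(
            target=self._event.wait, args=(timeout,), daemon=True
        )
        self.thread.start()

    def set_finished(self):
        # Wakes the waiter so its thread can exit.
        self._event.set()


def leak_demo(n_moves):
    """Issue n_moves 'moves' whose state never updates; count new threads."""
    before = threading.active_count()
    statuses = [ToyStatus() for _ in range(n_moves)]
    return threading.active_count() - before, statuses
```

Each call that creates a `ToyStatus` without a timeout leaves one more thread alive, which is the growth pattern described in step 3. Passing a finite `timeout` in this toy lets the waiter exit on its own.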

Context

XPP ran into this while manipulating their state shutter devices (xpp.devices)

Your Environment

xpp3 3/8/2022 (latest pcdsdevices, I think)

klauer commented 2 years ago

I wonder if we should track status objects and clean up stale ones (and their threads) somehow. Not a complete or well-thought-out idea, of course. The concern just arises from long-running hutch-python sessions, and dangling resources are not great for that.

ZLLentz commented 2 years ago

We might be able to check the previous status each time we want to replace it with a new one and mark the old status as failed. This, at the very least, puts a hard cap on the number of threads we can have at once.
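
A minimal sketch of that idea, using only the standard library rather than the real device classes. `ToyStatus`, `ToyStateDevice`, and `count_live_waiters` are hypothetical names for illustration; in ophyd the failing step would presumably map onto marking the old status with an exception, but that mapping is an assumption here.

```python
import threading


class ToyStatus:
    """Hypothetical status with one waiter thread and no timeout."""

    def __init__(self):
        self._event = threading.Event()
        self.error = None
        self.thread = threading.Thread(target=self._event.wait, daemon=True)
        self.thread.start()

    @property
    def done(self):
        return self._event.is_set()

    def fail(self, error):
        # Record the failure and release the waiter thread.
        self.error = error
        self._event.set()


class ToyStateDevice:
    """Hypothetical device that keeps at most one pending status."""

    def __init__(self):
        self._last_status = None

    def set(self, state):
        previous = self._last_status
        if previous is not None and not previous.done:
            # Mark the superseded status as failed before replacing it.
            previous.fail(RuntimeError(f"superseded by move to {state!r}"))
            previous.thread.join(timeout=1.0)
        self._last_status = ToyStatus()
        return self._last_status


def count_live_waiters(n_moves):
    """Return how many waiter threads remain alive after n_moves moves."""
    device = ToyStateDevice()
    statuses = [device.set("IN") for _ in range(n_moves)]
    live = sum(st.thread.is_alive() for st in statuses)
    statuses[-1].fail(RuntimeError("demo cleanup"))
    return live
```

In this toy, only the most recent status can still be pending, so the number of waiter threads per device stays at one no matter how many moves are issued. It does not address the last status, which still waits forever unless it also gets a timeout.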

klauer commented 2 years ago

I think that's a great - and easy - starting point.