Open danielballan opened 4 years ago
At the first glance I have 17 new threads started once I run import ophyd.sim
: 15 of those threads are started by https://github.com/bluesky/ophyd/blob/b572a4caa2d0be29cf74f60218e684c62b0dac04/ophyd/__init__.py#L72
and the other 2 by https://github.com/bluesky/ophyd/blob/b572a4caa2d0be29cf74f60218e684c62b0dac04/ophyd/sim.py#L1434
The two threads from ophyd.sim
itself generate INFO-level log messages periodically, such that logging.basicConfig()
gets very noisy. (A downstream user reported this. We tend not to use basicConfig()
because it's a blunt instrument, but nonetheless I don't think it would be reasonable to say, "Just don't use logging.basicConfig()
.") The original motivation for this issue was to resolve that. But now I'm also curious why we see different numbers of threads.
What output do you get for this, @dmgav?
>>> import threading
>>> [t.name for t in threading.enumerate()]
['MainThread']
>>> import ophyd
^[[A[t.name for t in threading.enumerate()]
['MainThread', 'metadata', 'monitor', 'get_put', 'util0', 'util1', 'util2', 'util3', 'debug_monitor']
>>> import ophyd.sim
>>> [t.name for t in threading.enumerate()]
['MainThread', 'metadata', 'monitor', 'get_put', 'util0', 'util1', 'util2', 'util3', 'debug_monitor', 'Thread-9', 'Thread-10']
In [1]: import threading
In [2]: [t.name for t in threading.enumerate()]
Out[2]: ['MainThread', 'IPythonHistorySavingThread']
In [3]: import ophyd
In [4]: [t.name for t in threading.enumerate()]
Out[4]:
['MainThread',
'IPythonHistorySavingThread',
'selector',
'command',
'check_for_unresponsive_servers',
'retry',
'search',
'activate_subscriptions',
'selector',
'metadata',
'monitor',
'get_put',
'util0',
'util1',
'util2',
'util3',
'debug_monitor']
In [5]: import ophyd.sim
In [6]: [t.name for t in threading.enumerate()]
Out[6]:
['MainThread',
'IPythonHistorySavingThread',
'selector',
'command',
'check_for_unresponsive_servers',
'retry',
'search',
'activate_subscriptions',
'selector',
'metadata',
'monitor',
'get_put',
'util0',
'util1',
'util2',
'util3',
'debug_monitor',
'Thread-246',
'Thread-247']
Now I understand. In my environment, I happen to have pyepics installed, so ophyd creates a pyepics control layer. You do not have pyepics but you do have caproto, so ophyd falls back to that. A caproto context has more Python threads than a pyepics one does. (Pyepics has C threads, via libca, to manage these things.)
It’s unfortunate that import ophyd
unavaoidably starts a bunch of threads. Data Broker imports ophyd to get access to the handler that it ships, and as a side effect starts 10 or 17 threads. Can we defer that somehow without making things too complicated?
Deferring the ones in ophyd.sim
seems to require a small breaking change, as mentioned above: either omitting the two relevant instances from the global namespace, or requiring a new method to explicitly start their thread.
Deferring the ones in the control layer sounds harder, and might better be considered in a separate PR. (It’s possible that we should not change that.)
The objects that are causing trouble are rand
and rand2
(class SynPeriodicSignal
). One of the ways to deal with this is to add a new 'start' method in SynPeriodicSignal
that will start the thread (and start the value updates). But it will require updating all the code that is using rand
, rand2
and SynPeriodicSignal
.
The other way is to override methods of Signal
class in SynPeriodicSignal
(trigger
, get
, put
, set
, read
, value
properties to first check if the thread exists and create the thread if needed).
Good thought! Could we do both? That is, provide a public method but automatically call that public method when one of those methods(get
, put
, ...) is called? I think we'll have to include subscribe
as well.
It might be good to give the method a name that makes it clear to the reader that this is only necessary because it is a simulated detector, like start_simulation()
.
Two of the threads were addressed by #877. The others are related to starting a context in the control layer at import time. Can we instead start these threads the first time the control layer is actually used?
This is less than ideal:
I guess this is due to the various simulated signals that update on a background thread. It is worth considering an API breaking change to move these out of global scope (only create them when
hw()
is called) and/or add a special method to explicitly start their background thread.Edit: As @dmgav points out below, only two of these threads are actually due to
ophyd.sim
. It seems the rest are related to starting the control layer, with its various background threads for handling callbacks and so on. So the leakage isn't as bad as I originally feared, but it's still worth fixing.