Open dstoychev opened 3 years ago
On the title you say "Cannot use functions in device_server config file" does that mean it works if you use a class instead of a function?
Which Python version are you using?
Is this a new system, a new installation of Python, or did it stop working after a recent change in Microscope?
If this is due to a recent change in Microscope, can you find the commit that introduced it?
It may be this python bug https://bugs.python.org/issue38263
I narrowed down the issue to the fact that under Windows DeviceServer
objects, being subclassed from multiprocessing.Process
, need to be pickleable. This doesn't work for functions defined in the config file because functions are pickled by fully qualified name reference rather than value, which is incompatible with the way we load the config. I believe this is also why the traceback says that the module config
could not be imported; config
is the name of the variable that stores the loaded config file. In any case, if the function is defined in one of the files of the microscope package then there are no issues. The PermissionError in the traceback is a red herring, a consequence of the pickling error.
Maybe this restriction (function need to be passed as reference rather than value) should be clarified in the documentation or maybe it could be addressed by custom pickling method (__reduce__
) in DeviceServer
.
[D]oes that mean it works if you use a class instead of a function?
Yes
Which Python version are you using?
3.8.5
Is this a new system, a new installation of Python, or did it stop working after a recent change in Microscope?
First time trying this feature.
If this is due to a recent change in Microscope, can you find the commit that introduced it?
I believe it's always been broken.
I see. This is also because Windows uses the processing start method "spawn" (in Linux the default is "fork"). If I force the processing start method in Linux to be "spawn", I can reproduce the problem.
$ device-server microscope-controllers.py
Traceback (most recent call last):
File "/home/carandraug/.local/bin/device-server", line 11, in <module>
load_entry_point('microscope', 'console_scripts', 'device-server')()
File "/home/carandraug/src/python-microscope/microscope/device_server.py", line 586, in _setuptools_entry_point
return main(sys.argv)
File "/home/carandraug/src/python-microscope/microscope/device_server.py", line 573, in main
serve_devices(devices)
File "/home/carandraug/src/python-microscope/microscope/device_server.py", line 451, in serve_devices
servers[-1].start()
File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/usr/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function make_controller at 0x7fae8e6d9378>: import of module 'config' failed
There is a different implementation of the multiprocessing library that uses dill
instead of pickle
and it may be able to cope with this corner case: https://github.com/uqfoundation/pathos
But I think it's better to modify the DeviceServer
class to be pickleable. The way to do this is to implement the __getstate__
and __setstate__
methods. As long as the class has knowledge of the location of the config file (as you know, further made necessary because of the logging), the following should work:
def __getstate__(self):
state = self.__dict__.copy()
if state["_device_def"]["cls"].__module__ == "config":
# The callable object came from the config file => substitute it
# with a something that could be pickled, such as a tuple
state["_device_def"]["cls"] = (
self._options.config_fpath,
state["_device_def"]["cls"].__name__,
)
return state
def __setstate__(self, state):
if isinstance(state["_device_def"]["cls"], tuple):
# Restore the callable object.
# Tuple format is (config path, callable name)
config = _load_source(state["_device_def"]["cls"][0])
callable_obj = getattr(
config, state["_device_def"]["cls"][1], None
)
state["_device_def"]["cls"] = callable_obj
self.__dict__.update(state)
But I think it's better to modify the DeviceServer class to be pickleable. The way to do this is to implement the getstate and setstate methods. As long as the class has knowledge of the location of the config file [...]
I've looked into this solution and I think it has the following two issues:
a separate code path for functions reloads the config file which makes changes in the global namespace. For example, changing Pyro configuration on the config file will be reloaded for functions but not for classes. Would be better if the behaviour is the same for both (I think we need to reload the config file in both cases)
this change requires that the device is defined in a module named config
. This works as long as we are using the device_server
program because that's where we load it but will not work if anyone is integrating DeviceServer
on their own program because they would just pass a device definition that they constructed we don't know how.
That said, I think your approach is correct in that DeviceServer
needs to read the config file again, it's just that it needs to be done for all devices. So DeviceServer
just takes the path for the config file, or possibly a string, or maybe different DeviceServerX
classes for different options. This does mean that loading the config file multiple times needs to always return the same DEVICES
.
Just to recap (mainly for my benefit): the issue here is that DeviceServer
objects will be ran by sub-processes, which could have different scopes than the main process. Therefore, dynamically created objects, such as the importing of the config file, will be absent in the sub-processes.
I completely agree with the two issues you raise. I think the best solution would be to have some sort of initialisation hook for DeviceServer
objects. It will be called in the __setstate__
method mentioned above. This way users will be able to precisely define the initial state of the DeviceServer
instance. For us, in device_server
this would be the reloading of config file, but other users may want more involved behaviour like modifying the PYRO parameters, as you mention. The implementation of this hook seems quite simple in my head: just add another argument of type Callable
to the DeviceServer
class.
I don't really need this any longer, but it still seems like a cool thing to have and not difficult to add either.
Here is the config file that I pass to device_server:
And here is the error I am getting: