Closed acuthber closed 5 months ago
It seems this is by design. The rule Context uses threading.local(), which prevents pickling, e.g.:
import threading
import pickle

class Obj:
    def __init__(self):
        self.tls = threading.local()

pickle.dumps(Obj())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[115], line 8
      5     def __init__(self):
      6         self.tls = threading.local()
----> 8 pickle.dumps(Obj())

TypeError: cannot pickle '_thread._local' object
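For comparison, the generic Python way to make an object like this picklable is to drop the thread-local in `__getstate__` and create a fresh one in `__setstate__`. The sketch below uses a hypothetical class, `PicklableObj`, and only the standard library. It illustrates the general technique and is not something rule_engine is known to do.

```python
import pickle
import threading


class PicklableObj:
    """Hypothetical class: holds a thread-local but can still be pickled."""

    def __init__(self, name):
        self.name = name
        self.tls = threading.local()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable thread-local.
        state = self.__dict__.copy()
        del state["tls"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and create a fresh thread-local.
        self.__dict__.update(state)
        self.tls = threading.local()


restored = pickle.loads(pickle.dumps(PicklableObj("demo")))
print(restored.name)
```

Anything that was stored on the thread-local is lost in the round trip, so this only fits cases where that state can be rebuilt after unpickling.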
I have no idea if this is a terrible idea, but creating a custom context and overriding self._thread_local seems to work, provided we also use dill rather than pickle:
import dill
import rule_engine
from rule_engine.engine import _ThreadLocalStorage

class MockTls:
    # Plain-object stand-in for the thread-local; its storage is shared across threads.
    def __init__(self):
        self.storage = _ThreadLocalStorage()

class CustomRuleContext(rule_engine.Context):
    def __init__(self, *args, **kwargs):
        predefined_functions = kwargs.pop("predefined_functions", {})
        super().__init__(*args, **kwargs)
        self.builtins = rule_engine.engine.builtins.Builtins.from_defaults(
            predefined_functions,
            timezone=self.default_timezone,
        )
        # Replace the threading.local() instance, which cannot be pickled.
        self._thread_local = MockTls()

c = CustomRuleContext()
rule = rule_engine.Rule("test", context=c)
dill.dumps(rule)
The cost of recreating the context and the rules (fewer than 100 in my case) in every new process that the process pool creates is negligible. I do not think it is worth the hassle to allow rules/contexts to be pickled.
I have a lot of data I need to run through rules and wanted to speed this up with multiprocessing. I am facing an issue when pickling rules, with both pickle and dill.
It looks to be caused by the rule builtins from the rule context. I am unsure if there is a way around it, so I wanted to ask if you had any insight. Thanks!
Using dill I get more info on what is wrong: