bakwc / PySyncObj

A library for replicating your python class between multiple servers, based on raft protocol
MIT License
706 stars 113 forks

Pickle dump issue #147

Open naluka1994-zz opened 3 years ago

naluka1994-zz commented 3 years ago

pickle.dump(obj, file, __protocol) TypeError: can't pickle _thread.RLock objects

This issue is happening with python3.7 version.
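
For reference, the same error can be reproduced without PySyncObj. The sketch below uses only the standard library and assumes nothing about the reporter's code; `Holder` and `try_pickle` are hypothetical names made up for illustration. It shows that pickling the attributes of any object that holds a `threading.RLock` fails with a `TypeError`.

```python
import pickle
import threading


class Holder:
    """Hypothetical stand-in for a class that keeps a lock as an attribute."""

    def __init__(self):
        self.value = 42
        self.lock = threading.RLock()  # locks cannot be pickled


def try_pickle(obj):
    """Return None on success, or the TypeError message on failure."""
    try:
        pickle.dumps(obj.__dict__)
        return None
    except TypeError as exc:
        return str(exc)


error = try_pickle(Holder())
print(error)  # exact wording differs between Python versions
```

The message text varies by interpreter version, but it names the RLock type in each case.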

bakwc commented 3 years ago

What are you trying to achieve? Could you provide more details, please?

naluka1994-zz commented 3 years ago

I have sample code that should run continuously. To make it highly available, my class inherits from PySyncObj's SyncObj. I then connect 3 servers; a leader gets elected, and the leader has to do some job.

I am using Python 3.7.9. The code runs for some time and then I get the following error.

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/name/workspace/envs/genv/lib/python3.7/site-packages/pysyncobj/syncobj.py", line 508, in _autoTickThread
    self._onTick(self.__conf.autoTickPeriod)
  File "/Users/name/workspace/envs/genv/lib/python3.7/site-packages/pysyncobj/syncobj.py", line 614, in _onTick
    self.__tryLogCompaction()
  File "/Users/name/workspace/envs/genv/lib/python3.7/site-packages/pysyncobj/syncobj.py", line 1238, in __tryLogCompaction
    self.__serializer.serialize((data, lastAppliedEntries[1], lastAppliedEntries[0], cluster), lastAppliedEntries[0][1])
  File "/Users/name/workspace/envs/genv/lib/python3.7/site-packages/pysyncobj/serializer.py", line 70, in serialize
    pickle.dump(data, g)
  File "/Users/name/workspace/envs/g

Can you please help me fix this issue?

bakwc commented 3 years ago

You should declare all non-serializable fields after calling super. Could you post your class code?
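
One way to find out which fields are the problem is to try pickling each instance attribute on its own. This is only a diagnostic sketch using the standard library; `unpicklable_fields` and the `Example` class are hypothetical and are not part of PySyncObj.

```python
import pickle
import queue
import threading


def unpicklable_fields(obj):
    """Return the sorted names of instance attributes that pickle rejects."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle may raise TypeError, PicklingError, etc.
            bad.append(name)
    return sorted(bad)


class Example:
    """Hypothetical class mixing plain config values with runtime objects."""

    def __init__(self):
        self.default_exec_timeout = 30
        self._registration_lock = threading.RLock()
        self.message_queue = queue.Queue()


fields = unpicklable_fields(Example())
print(fields)
```

Here both the lock and the queue are reported, because `queue.Queue` holds locks internally, while the plain integer is not.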

naluka1994-zz commented 3 years ago

This is what my class looks like. I see that I do declare fields after calling the superclass constructor. To resolve this, can I initialize the sync obj separately and then pass it to my class? If I pass this sync obj to a method of the class, will its value change when the leader changes?

import os
import queue
import threading

from pysyncobj import SyncObj


class myclass(SyncObj):
    def __init__(self, config, currentHost, partners):
        super(myclass, self).__init__(currentHost, partners)
        # Resolve relative SSH key paths against this file's directory.
        this_dir = os.path.dirname(os.path.realpath(__file__))
        for ssh_config in config["ssh_configs"].values():
            key_filename = ssh_config["key_filename"]
            if not key_filename.startswith("/"):
                key_filename = os.path.join(this_dir, key_filename)
            ssh_config["key_filename"] = os.path.realpath(key_filename)
        self.default_exec_timeout = config["default_exec_timeout"]
        self.default_read_timeout = config["default_read_timeout"]
        self._ssh_configs = config["ssh_configs"]
        self._registration_lock = threading.RLock()  # the type named in the TypeError
        self.message_queue = queue.Queue()
        self.currentHost = currentHost

    def doJob(self, value):
        # Only the current leader performs the job.
        if self._getLeader() == self.currentHost:
            insertdata(value)  # insertdata is not shown in this post


# config, currentHost and partners are not shown in this post.
test = myclass(config, currentHost, partners)
i = 0
while True:
    i = i + 1
    test.doJob(i)

bakwc commented 3 years ago

You'd better make a separate class that contains only the fields you need to synchronize.

import queue

from pysyncobj import SyncObj, replicated


class myclass(SyncObj):
    def __init__(self, config, currentHost, partners):
        super(myclass, self).__init__(currentHost, partners)
        self.message_queue = queue.Queue()

    @replicated
    def enqueue(self, someObject):
        ...

naluka1994-zz commented 3 years ago

I don't want to synchronize anything. I want a class that holds some specific configuration fields with serializable values. How can I do that?

bakwc commented 3 years ago

Sorry, I don't understand what you are trying to achieve. You can't use SyncObj the way you are trying to. SyncObj is a library for replicating state machines described as a Python class. It cannot replicate threading.RLock(). Also, you should not pass in any values from local configs, except through replicated methods. You need to redesign your architecture to make it fault-tolerant, or just use the existing batteries. There is already a replicated queue; you can use it if you need a queue.

naluka1994-zz commented 3 years ago

The local config values are used throughout the code. They are constants read from a config file. Instead of mixing the sync object into my class, I now pass the sync obj to the myclass method that needs it. This way I don't see the previous error. Is it OK to pass the sync obj to the myclass method like this?
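
The approach described above (keeping config and runtime objects in a plain class and handing the sync object to the method) can be sketched as follows. This is an illustration only: `FakeSync` is a hypothetical stand-in that merely mimics `_getLeader()` so the example runs without a cluster, and it assumes, as the earlier code does, that the leader can be compared directly with the host string.

```python
import threading


class FakeSync:
    """Hypothetical stand-in for a SyncObj instance; only mimics _getLeader()."""

    def __init__(self, leader):
        self._leader = leader

    def _getLeader(self):
        return self._leader


class Worker:
    """Holds local config and runtime objects; not a SyncObj subclass."""

    def __init__(self, config, current_host):
        self.default_exec_timeout = config["default_exec_timeout"]
        self.current_host = current_host
        self._registration_lock = threading.RLock()  # never pickled here
        self.done = []

    def do_job(self, sync_obj, value):
        """Run the job only when this host is the current leader."""
        if sync_obj._getLeader() != self.current_host:
            return False
        with self._registration_lock:
            self.done.append(value)  # placeholder for the real work
        return True


worker = Worker({"default_exec_timeout": 30}, "host1:5000")
ran_as_leader = worker.do_job(FakeSync("host1:5000"), 1)
ran_as_follower = worker.do_job(FakeSync("host2:5000"), 2)
```

Because the leader is queried on every call rather than stored, the check follows leadership changes as long as the same sync object is passed in each time.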

naluka1994-zz commented 3 years ago

@bakwc I hosted the code on three servers, each with its own hostname and port 5000. Passing localhost:5000 works fine, but passing remotehost:5000 does not work. Can you please tell me whether this is the correct configuration to pass?

bakwc commented 3 years ago

Does your external IP address match the network adapter address (ifconfig output)? If not, you should use your external addresses as partnerAddresses and specify bindAddress in the config like this:

from pysyncobj import SyncObjConf

cfg = SyncObjConf()
cfg.bindAddress = '172.31.91.119:9999'

naluka1994-zz commented 3 years ago

@bakwc

Can you give an example for host1, host2, and host3, assuming port 5000 is open on all hosts?

My external IP address matches the ifconfig output.

class MyCounter(pysyncobj.SyncObj):
    def __init__(self, selfAddr, otherAddrs, **kwargs):
        super(MyCounter, self).__init__(selfAddr, otherAddrs, **kwargs)
        self._counter = 0

Can you please give an example of how to do this with the hosts host1, host2, and host3? What do selfAddr and otherAddrs look like? What is bindAddress, and how can I specify it dynamically if I need to?
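
As a sketch of what the two arguments could look like: given the constructor shown above, selfAddr is a single 'host:port' string for the local node and otherAddrs is a list of the same form for the remaining nodes. The helper `split_addresses` below is hypothetical (not part of PySyncObj) and only builds those strings; the host names and port 5000 are taken from the question.

```python
def split_addresses(hosts, self_host, port=5000):
    """Return (selfAddr, otherAddrs) as 'host:port' strings for one node."""
    if self_host not in hosts:
        raise ValueError(f"{self_host!r} is not in the cluster list")
    self_addr = f"{self_host}:{port}"
    other_addrs = [f"{h}:{port}" for h in hosts if h != self_host]
    return self_addr, other_addrs


hosts = ["host1", "host2", "host3"]
self_addr, other_addrs = split_addresses(hosts, "host1")
print(self_addr, other_addrs)

# On host1 these values would then be passed to the class from the post:
#   counter = MyCounter(self_addr, other_addrs)
```

On host2 and host3 the same call would be made with that node's own name as `self_host`. Where that name comes from (a command-line argument, an environment variable, or the machine's hostname) is a deployment choice and an assumption here.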