Closed simon-at-spire closed 1 year ago
Platform: x86_64
$ pip list | grep cyclonedds
cyclonedds 0.10.2
$ python3 --version
Python 3.8.10
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
You might want to give this a try. It doesn't win any beauty contest and it is done without sufficient knowledge about memory ordering guarantees in Python to feel confident about using double-checked locking ...
It works for me though 🙂
diff --git a/cyclonedds/idl/_main.py b/cyclonedds/idl/_main.py
index ffc004a..decd25d 100644
--- a/cyclonedds/idl/_main.py
+++ b/cyclonedds/idl/_main.py
@@ -16,6 +16,7 @@ from enum import EnumMeta, Enum
 from inspect import isclass
 from struct import unpack
 from hashlib import md5
+import threading
 from ._support import Buffer, Endianness, CdrKeyVmNamedJumpOp, KeyScanner, KeyScanResult, SerializeKind, DeserializeKind
 from ._type_helper import get_origin, get_args, Annotated
@@ -52,6 +53,8 @@ class IDLNamespaceScope:
 class IDL:
     def __init__(self, datatype):
         self._populated: bool = False
+        self._lock = threading.RLock()
+        self._populating: bool = False
         self.buffer: Buffer = Buffer()
         self.datatype: type = datatype
         self.keyless: bool = None
@@ -67,9 +70,9 @@ class IDL:
         self._xt_bytedata: Tuple[Optional[bytes], Optional[bytes]] = (None, None)
         self.member_ids: Dict[str, int] = None
-    def populate(self):
-        if not self._populated:
-            self._populated = True
+    def populate_locked(self):
+        if not self._populating:
+            self._populating = True
             annotations = get_idl_annotations(self.datatype)
             field_annotations = get_idl_field_annotations(self.datatype)
@@ -119,6 +122,13 @@ class IDL:
             else:
                 self.v2_key_max_size = 17 # or bigger ;)
+    def populate(self):
+        with self._lock:
+            self.populate_locked()
+            # hopefully the memory order guarantees of Python are strong enough to make it
+            # impossible for another thread to observe a partially populated self
+            self._populated = True
+
     def serialize(self, object, use_version_2: bool = None, buffer=None, endianness=None) -> bytes:
         if not self._populated:
             self.populate()
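For readers who want to try the idea outside of cyclonedds, here is a minimal, stdlib-only sketch of the same double-checked locking shape as the diff above. `LazyInfo`, `fields`, `populate_calls`, `use` and `run_threads` are hypothetical names made up for illustration; they are not part of cyclonedds, and the placeholder list stands in for the real population work.

```python
import threading


class LazyInfo:
    """Hypothetical stand-in for a lazily populated helper (not cyclonedds code)."""

    def __init__(self):
        self._populated = False         # read without the lock on the fast path
        self._lock = threading.RLock()  # re-entrant lock, as in the patch
        self._populating = False        # ensures the work runs only once
        self.fields = None
        self.populate_calls = 0

    def _populate_locked(self):
        # Must only be called with self._lock held.
        if not self._populating:
            self._populating = True
            self.populate_calls += 1
            self.fields = ["a", "b", "c"]  # placeholder for the real work

    def populate(self):
        with self._lock:
            self._populate_locked()
            # Publish the flag only after all state has been written.
            self._populated = True

    def use(self):
        if not self._populated:  # first check, no lock taken
            self.populate()      # second check happens under the lock
        return len(self.fields)


def run_threads(n=8):
    """Start n threads that all hit use() as close together as possible."""
    info = LazyInfo()
    results = []
    barrier = threading.Barrier(n)

    def worker():
        barrier.wait()
        results.append(info.use())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return info, results
```

In this sketch a thread that loses the race blocks on the lock until the winner has finished, so it never reads `fields` before it has been assigned. Whether the unlocked read of `_populated` is safe on every Python implementation is the same open question the patch comment raises.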
Thank you, it looks like it would solve the issue, but we have been starting the services one after the other so far and it is working, so this was just a bug report. Do you want me to test the diff on our platform to validate it?
If it is not too much trouble to check it on your platform, then that would be great. I have only had a chance to try it on macOS, and there it simply failed 100% of the time, so clearly it is less "interesting" than your platform!
Yep, I can confirm that it worked on my platform (10 successes out of 10 runs)! Thank you for taking the time to investigate!
As discussed on Discord, if I instantiate 2 DDS services in the same process at the same time (in my case that was during my earlier attempts at testing), there is a race condition that makes the initialisation fail some of the time (~50%).
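One way a race like this can arise, judging from the `populate` lines removed in the diff, is a "populated" flag that is set before the work is done, so a second thread skips population and sees half-initialised state. The stdlib-only sketch below makes that window deterministic with two `threading.Event` hooks. It is not the reporter's reproduction and does not use cyclonedds; `RacyLazy`, `fields`, `use` and `demo` are hypothetical names, and the events exist only to hold the window open for the demonstration.

```python
import threading


class RacyLazy:
    """Hypothetical class that marks itself populated before the work is done."""

    def __init__(self, started, proceed):
        self._populated = False
        self.fields = None
        self._started = started  # demo hook: set once population has begun
        self._proceed = proceed  # demo hook: population waits on this

    def populate(self):
        if not self._populated:
            self._populated = True          # flag is set too early
            self._started.set()
            self._proceed.wait(timeout=5)   # keep the race window open
            self.fields = ["a", "b"]        # placeholder for the real work

    def use(self):
        if not self._populated:
            self.populate()
        return self.fields                  # may still be None here


def demo():
    started, proceed = threading.Event(), threading.Event()
    obj = RacyLazy(started, proceed)
    first = threading.Thread(target=obj.use)
    first.start()
    started.wait(timeout=5)     # the first thread is now mid-population
    seen_by_second = obj.use()  # flag is already True, so populate() is skipped
    proceed.set()               # let the first thread finish
    first.join()
    return seen_by_second, obj.fields
```

Here the second caller gets `None` back while the first thread is still populating. In a real program the window is much shorter, which would be consistent with a failure that shows up only some of the time.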
Minimal code that reproduces the issue: spin up 2 threads that start at the same time and each initialise their own DomainParticipant/Topic/Data{Reader/Writer}:
Might work:
or get this error: