The issue was there all the time, but became critical in 9950dd05 with version 0.3.3.
class _S3ConfigurationMap is not thread safe because both set/get_configuration() ask for self.arguments and self.resources being None. However, since the configuration map is a singleton, when using the library in a multithreaded environment, it can happen that half the part of _initial_setup has run and self.arguments is indeed not None, but self.resources is still None because in another thread, creation of the boto3 resource is still ongoing. Then, our thread would not call _initial_setup(), but still die on self.resources being None when trying to access it, leading to the error described in #97.
I could not make the bug appear without introducing the IO delay by creating the resource in _initial_setup(). Maybe it is a CPython implementation detail that makes _initial_setup() atomic or incredibly rare to not be atomic without the creation of the boto resource.
Anyways, the setup is now synchronised with a lock, which is required, again, given that _S3ConfigurationMap is a singleton.
Here is the code that reliably triggers the bug without the change:
from multiprocessing.pool import ThreadPool
from s3path import S3Path
def do(i):
dst = S3Path.from_uri('s3://something') / f'hello_{i}.txt'
dst.write_bytes(b'hello')
ThreadPool(processes=20).map(do, range(100))
The issue was there all the time, but became critical in 9950dd05 with version 0.3.3.
class _S3ConfigurationMap
is not thread safe because bothset/get_configuration()
ask forself.arguments
andself.resources
beingNone
. However, since the configuration map is a singleton, when using the library in a multithreaded environment, it can happen that half the part of_initial_setup
has run andself.arguments
is indeed notNone
, butself.resources
is stillNone
because in another thread, creation of the boto3 resource is still ongoing. Then, our thread would not call_initial_setup()
, but still die onself.resources
beingNone
when trying to access it, leading to the error described in #97.I could not make the bug appear without introducing the IO delay by creating the resource in
_initial_setup()
. Maybe it is a CPython implementation detail that makes_initial_setup()
atomic or incredibly rare to not be atomic without the creation of the boto resource.Anyways, the setup is now synchronised with a lock, which is required, again, given that
_S3ConfigurationMap
is a singleton.Here is the code that reliably triggers the bug without the change: