While the AWS SDK documentation states that the S3 client is thread safe, initialization of the S3 client doesn't seem to be. In particular, there's some global error variable that gets accessed when parsing JSON, so if you initialize two separate VFS instances from different threads with TSAN enabled, you will end up with a data race report and stack trace.
The TSAN output also reports "Location is global 'global_error' of size 16 at 0x7fe93658a1b0 (libexternal_Saws_Slibaws.so+0x0000007dc1b0)", which suggests that this global is to blame.
I haven't looked into this too deeply, but if there is a global variable, then initialization of the client itself needs to be protected by a global mutex, i.e. a lock on a process-wide mutex needs to be added to init_client(). The other mutex becomes redundant in this case. The performance impact should be negligible, since S3 is slow as molasses anyway.
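For illustration, here is a minimal sketch of the global-mutex idea. It does not use the real AWS SDK: unsafe_sdk_init(), g_fake_sdk_global and init_from_threads() are hypothetical stand-ins I made up to mimic a non-thread-safe initialization that touches a global, and only the name init_client() comes from the report above. The actual fix would wrap whatever init_client() really does.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the SDK's process-wide state (TSAN names a
// global called 'global_error'); this is NOT the real AWS SDK symbol.
static int g_fake_sdk_global = 0;

// Hypothetical stand-in for the non-thread-safe part of client construction.
static void unsafe_sdk_init() {
  ++g_fake_sdk_global;  // unsynchronized read-modify-write on a global
}

// Sketch of init_client(): a single process-wide mutex serializes
// initialization across all filesystem instances, not just within one.
void init_client() {
  static std::mutex init_mutex;  // one mutex shared by every caller
  std::lock_guard<std::mutex> lock(init_mutex);
  unsafe_sdk_init();
}

// Initializes n "clients" from n threads and returns how many
// initializations the fake global recorded.
int init_from_threads(int n) {
  assert(n >= 0);
  g_fake_sdk_global = 0;
  std::vector<std::thread> threads;
  threads.reserve(n);
  for (int i = 0; i < n; ++i) threads.emplace_back(init_client);
  for (auto& t : threads) t.join();
  return g_fake_sdk_global;
}
```

Because the mutex is a function-local static, it is shared by every instance in the process, which is the point: a per-instance mutex would not stop two different VFS instances from racing on the same global.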