fsspec / s3fs

S3 Filesystem
http://s3fs.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
857 stars 271 forks source link

set_session does not seem to be thread / jobs safe #854

Open GuichardVictor opened 5 months ago

GuichardVictor commented 5 months ago

Most functions call _s3_call which calls set_session. The function goal is to "connect" to aws.

In the context of multiple jobs (or batch_size > 1), each jobs will race to create the session despite the check if the session was already created.

Using process authentification, assume role and source_profile in the aws config, loading the credentials will fail with InfiniteLoopConfigError which can be replicated with the following example:

import botocore.session
import botocore.credentials

# this part will be done once by s3fs
session = botocore.session.Session(profile="profile")
resolver = botocore.credentials.create_credential_resolver(session, region_name=None)

# this part might be runned multiple times due to a potential race condition
credentials = resolver.load_credentials() # <--- works
credentials = resolver.load_credentials() # <--- fails

where profile depends on an other profile.

One way to fix this is to use asyncio.Lock when creating the session to ensure that only one job will create the session.

I'm open to contribute if this is something we want to fix.

martindurant commented 5 months ago

Since you are talking about "jobs", do you mean you have threads running?

Agree, an syncio lock would be reasonable. The only await is at the call for s3creator, but load_credentials() happens earlier. I wonder if there's an asyncio clean way to say: "I have to await this, but I don't really want to yield control". Otherwise, we need to pass this new Lock object around.

GuichardVictor commented 5 months ago

On my experiment it was only multiple asyncio futures that was running (using the batch size argument of the async file system of fsspec). I have added a asyncio.Lock and it seemed to fixed the problem. That's why I opened the issue.

martindurant commented 5 months ago

You are welcome to propose that change as a PR

GuichardVictor commented 5 months ago

Sure, I will open a PR this weekend.