Open: Mahdi-Hosseinali opened this issue 2 years ago
Sorry, what does "cross-environment" mean?
Sorry for the bad terminology; it is a cross-account permission: the destination account has a role that allows the other account to put objects on S3 but not to create buckets. I updated the issue.
In our example, then, does the target bucket already exist?
Yes, the target bucket already exists.
Having the same issue
OK, so certainly s3fs's mkdirs should not fail when the bucket exists. The problem is that there is no foolproof way to know whether a bucket exists, because that check requires yet another permission you might not have. However, we can know whether it has been listed before (`ls("")` populates `fs.dircache[""]`). So we could certainly check the first two cached locations, and we could also ignore errors in mkdirs when it is called as a precursor to a write operation.
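The dircache check described above can be sketched as a standalone helper (a minimal illustration, not s3fs's actual code; `dircache` here is a plain dict standing in for `fs.dircache`):

```python
def bucket_in_dircache(dircache, bucket):
    """True if any previously listed path belongs to `bucket`.

    The top-level component of each cached key is the bucket name, so a
    cache hit proves the bucket exists even when the caller lacks the
    permission needed to probe it directly (e.g. HeadBucket).
    """
    return any(name.split("/", 1)[0] == bucket for name in dircache)


# Simulated cache state after a prior ls() on the bucket
cache = {"my-bucket": [], "my-bucket/prefix": []}
print(bucket_in_dircache(cache, "my-bucket"))  # True
print(bucket_in_dircache(cache, "other"))      # False
```

This only ever avoids a permission check; a cache miss proves nothing, so the real code would still fall back to an existence probe.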
Just want to gently bump this issue—it's what's preventing us from adopting s3fs across our app.
Would you care to try with this change?

```diff
--- a/s3fs/core.py
+++ b/s3fs/core.py
@@ -900,7 +900,7 @@ class S3FileSystem(AsyncFileSystem):
         if not path:
             raise ValueError
         bucket, key, _ = self.split_path(path)
-        if await self._exists(bucket):
+        if any(name.split("/", 1)[0] == bucket for name in self.dircache) or await self._exists(bucket):
             if not key:
                 # requested to create bucket, but bucket already exist
                 raise FileExistsError
@@ -935,7 +935,9 @@ class S3FileSystem(AsyncFileSystem):
     async def _makedirs(self, path, exist_ok=False):
         try:
             await self._mkdir(path, create_parents=True)
-        except FileExistsError:
+        except (FileExistsError, PermissionError):
+            # PermissionError here allows for cases where user can only see some
+            # of the bucket in question
             if exist_ok:
                 pass
```
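The effect of the second hunk can be illustrated with a synchronous stand-in (hypothetical names; the real method is the async `S3FileSystem._makedirs`, and `denied` here simulates a forbidden CreateBucket call):

```python
def makedirs(path, mkdir, exist_ok=False):
    """Stand-in for the patched _makedirs: treat PermissionError like
    FileExistsError, since a least-privilege caller may not even be
    allowed to check whether the bucket already exists."""
    try:
        mkdir(path)
    except (FileExistsError, PermissionError):
        if not exist_ok:
            raise


def denied(path):
    # Simulates CreateBucket being refused on a cross-account bucket.
    raise PermissionError("Access Denied")


makedirs("bucket/key", denied, exist_ok=True)  # swallowed; the write can proceed
try:
    makedirs("bucket/key", denied, exist_ok=False)
except PermissionError:
    print("still raised when exist_ok=False")
```

The trade-off is that with `exist_ok=True` a genuine permission problem is silently deferred until the actual write fails.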
Ping on the thread: I have a small change above that might alleviate the situation. I am not easily able to test it, so I would appreciate it if the other participants would give it a go.
Similar to #451, writing to a cross-account bucket tries to create the bucket, which leads to an `Access Denied` error when the process doesn't have the `CreateBucket` permission. Because least privilege is the best practice, this can happen in production for most cross-account S3 access. Here's the stack trace: