Closed: EmanuelaBoros closed this issue 8 months ago
It seems that it works locally, but not on a RunAI node. I am looking into it.
FAILED tests/test_path_s3.py::test_s3_iter_bucket - AttributeError: 'NoneType' object has no attribute 'name'
FAILED tests/test_path_s3.py::test_s3_filter_archives - AttributeError: 'NoneType' object has no attribute 'name'
FAILED tests/test_path_s3.py::test_s3_filter_archives_timebucket - AttributeError: 'NoneType' object has no attribute 'name'
FAILED tests/utils/test_kube.py::test_dask_cluster - kubernetes.config.config_exception.ConfigException: Invalid kube-config file. No configuration found.
FAILED tests/utils/test_s3.py::test_get_s3_versions - AttributeError: 'NoneType' object has no attribute 'get_all_keys'
FAILED tests/utils/test_s3.py::test_read_jsonlines - AttributeError: 'NoneType' object has no attribute 'name'
FAILED tests/utils/test_s3.py::test_load_config - TypeError: argument of type 'NoneType' is not iterable
I found the issue. I do not have the rights to list all buckets.
def get_bucket(name, create=False, versioning=True):
    """Create a boto s3 connection and return the requested bucket.

    It is possible to ask for the creation of a new bucket
    with the specified name (in case it does not exist), and (optionally)
    to turn on versioning on the newly created bucket.

    >>> b = get_bucket('testb', create=False)
    >>> b = get_bucket('testb', create=True)
    >>> b = get_bucket('testb', create=True, versioning=False)

    :param name: the bucket's name
    :type name: string
    :param create: creates the bucket if not yet existing
    :type create: boolean
    :param versioning: whether the new bucket should be versioned
    :type versioning: boolean
    :return: an s3 bucket
    :rtype: `boto.s3.bucket.Bucket`

    .. TODO:: avoid importing both `boto` and `boto3`
    """
    conn = get_s3_connection()
    # try to fetch the specified bucket -- may return an empty list
    bucket = [b for b in conn.get_all_buckets() if b.name == name]
This method assumes one has that right. However, I would propose changing it to connect directly to the specified bucket instead of looking the bucket up with get_all_buckets().
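A minimal sketch of that change, with my own assumptions: the helper name get_bucket_direct is hypothetical, and the connection is duck-typed (boto-style get_bucket/create_bucket methods) so it can be exercised without real credentials. The idea is that boto's conn.get_bucket(name, validate=True) issues a request against the named bucket itself, so it only needs rights on that bucket rather than on the full bucket listing.

```python
def get_bucket_direct(conn, name, create=False, versioning=True):
    """Fetch `name` directly instead of scanning get_all_buckets().

    `conn` is duck-typed (boto-style get_bucket/create_bucket), so a
    stub connection can stand in during tests.
    """
    try:
        # validate=True issues a request on the bucket itself, which
        # needs rights on that bucket only, not s3:ListAllMyBuckets.
        return conn.get_bucket(name, validate=True)
    except Exception:
        if not create:
            raise
        bucket = conn.create_bucket(name)
        if versioning:
            bucket.configure_versioning(True)
        return bucket
```

This keeps the signature of the current get_bucket, so call sites would not need to change.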
This works on my side:

    import boto3

    s3 = boto3.client('s3')

    # List the contents of the bucket
    response = s3.list_objects(Bucket='rebuilt-data')
    for content in response.get('Contents', []):
        print(content['Key'])
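As a side note, list_objects returns at most 1000 keys per call, so a complete listing has to follow the pagination tokens. A sketch under my own assumptions (the helper name is hypothetical; the client is duck-typed to boto3's list_objects_v2 response shape):

```python
def iter_bucket_keys(client, bucket):
    """Yield every key in `bucket`, following pagination.

    `client` only needs a boto3-style list_objects_v2 method, so a
    real boto3 client or a test stub both work.
    """
    kwargs = {"Bucket": bucket}
    while True:
        response = client.list_objects_v2(**kwargs)
        for content in response.get("Contents", []):
            yield content["Key"]
        # IsTruncated is False on the last page of results
        if not response.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```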
I propose we discuss this, in case I am wrong.
There are two problems: the buckets cannot be retrieved, and an import error. The import can be fixed (it seems the module was renamed), and I will open a PR for that. However, it is unclear why return_bucket returns None (I can connect with s3cmd with no issues).
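Until the root cause is found, the None could at least be surfaced at its origin rather than as a later AttributeError. A sketch (get_bucket_strict is a hypothetical name, and the connection is duck-typed so the behaviour can be tested without S3 access):

```python
def get_bucket_strict(conn, name):
    """Look the bucket up the way get_bucket does, but fail loudly
    instead of letting None propagate into a later AttributeError."""
    matches = [b for b in conn.get_all_buckets() if b.name == name]
    if not matches:
        raise LookupError(
            "bucket %r not found -- the credentials may lack the "
            "s3:ListAllMyBuckets right" % name
        )
    return matches[0]
```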