sphuber / aiida-s3

AiiDA plugin that provides various storage backends that allow using cloud data storage services, such as AWS S3 and Azure Blob Storage.
MIT License

`S3RepositoryBackend` has incorrect implementation for `list_objects` #15

Closed sphuber closed 1 year ago

sphuber commented 1 year ago

The `list_objects` method uses the `Client.list_objects` method, which according to the documentation returns at most 1000 objects by default. There is the `MaxKeys` argument to increase this number, but it is not clear whether we could always get all the keys that way. Perhaps all keys should be retrieved in chunks. Note, however, that this operation is typically very expensive for an object storage server. But AiiDA currently requires this method for archive imports and exports, as well as storage backend maintenance operations, so we are forced to implement it.
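A minimal sketch of chunked retrieval, assuming a boto3 S3 client is passed in. The helper name `iter_object_keys` and its parameters are hypothetical and not part of the plugin. It relies on the paginator that boto3 provides for `list_objects`, so each request still returns one page, but the caller sees every key:

```python
from typing import Iterator


def iter_object_keys(client, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key in ``bucket``, fetching one page of results at a time.

    ``client`` is assumed to be a boto3 S3 client (hypothetical helper).
    """
    # boto3 exposes a paginator for ``list_objects`` that issues follow-up
    # requests until the listing is no longer truncated.
    paginator = client.get_paginator("list_objects")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # An empty bucket (or unmatched prefix) gives a page without "Contents".
        for entry in page.get("Contents", []):
            yield entry["Key"]
```

With a real client this could be called as `list(iter_object_keys(boto3.client("s3"), "my-bucket"))`, where the bucket name is a placeholder. Yielding keys lazily avoids holding the full listing in memory, although the server still has to do the expensive enumeration.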

sphuber commented 1 year ago

We should probably also drop the `list_objects` method in favor of `list_objects_v2`. The former is deprecated and only kept for backwards compatibility. The latter also improves pagination, making it easier to retrieve all objects using the continuation token. (See https://stackoverflow.com/questions/37534077/what-is-the-difference-between-boto3-list-objects-and-list-objects-v2)
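To illustrate the continuation mechanism, here is a rough sketch of a manual loop over `list_objects_v2`. The function name `iter_object_keys_v2` is hypothetical, and `client` is assumed to be a boto3 S3 client. It is not necessarily how the plugin ended up implementing the fix:

```python
from typing import Iterator


def iter_object_keys_v2(client, bucket: str) -> Iterator[str]:
    """Yield every key in ``bucket`` using ``list_objects_v2`` (hypothetical helper)."""
    kwargs = {"Bucket": bucket}
    while True:
        response = client.list_objects_v2(**kwargs)
        # "Contents" is absent when the page holds no objects.
        for entry in response.get("Contents", []):
            yield entry["Key"]
        # "IsTruncated" signals that more keys remain on the server.
        if not response.get("IsTruncated"):
            break
        # Pass the token from this response to resume where it stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

In practice `client.get_paginator("list_objects_v2")` wraps the same loop, so the manual version is mainly useful for seeing what the token does.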