Provide pagination metadata so that we can perform retrievals in parallel. For example, with async listObjectsV2 we never know how many pages there are, and we cannot fetch a page until we have the page before it and its dynamic 'next-token'.
Increase the maxKeys limit from 1000 to something closer to 10K or more.
Use Case
A highly complex system writes streaming data to S3. On disaster recovery we need to scan the S3 objects, identify key patterns, and determine where we left off in order to restart at the correct location, across something like 10 million keys.
Today a single request returns at most 1000 keys, so 10 million keys take at least 10,000 requests, and there is no way to issue them in parallel.
Proposed Solution
Alternatively, suggest strategies we could use today. Some thoughts:
Ability to set listObjectsV2Request.maxKeys to a value much greater than 1000.
A read-ahead parameter pushed down to the service, so that subsequent pages are retrieved before we actually request the next one. We would like this to be much faster than one page at a time, and we cannot see any workaround at the moment.
Ability to retrieve only the commonPrefixes when a delimiter is set, without loading the actual keys. This would let us get a handle on our key patterns and then issue parallel listObjectsV2 requests wherever we find a match.
Wildcard support in the prefix. We know the start (prefix) and some of the middle portions of the key from our partition strategy, so we would not have to list all 10 million keys and could skip a great number of them.
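To make the commonPrefixes idea concrete, here is a minimal sketch of the fan-out we have in mind: once the partition prefixes are known, each prefix is paged through sequentially on its own thread, and the prefixes run in parallel. Everything named here (`PrefixFanOut`, `Lister`, `Page`, `listAll`, `inMemory`) is hypothetical and uses only the JDK. The in-memory lister stands in for S3. With the real SDK, `listPage` would presumably wrap a `ListObjectsV2Request` carrying `prefix` and `continuationToken`, but that wiring is an assumption and is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class PrefixFanOut {

    /** Hypothetical page type: the keys of one page plus the token for the next (null when done). */
    public static final class Page {
        final List<String> keys;
        final String nextToken;

        Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    /** Hypothetical stand-in for a paginated list call scoped to one prefix. */
    public interface Lister {
        Page listPage(String prefix, String token);
    }

    /** Pages through one prefix sequentially; each request needs the previous page's token. */
    static List<String> listPrefix(Lister lister, String prefix) {
        List<String> keys = new ArrayList<>();
        String token = null;
        do {
            Page page = lister.listPage(prefix, token);
            keys.addAll(page.keys);
            token = page.nextToken;
        } while (token != null);
        return keys;
    }

    /** Lists all prefixes concurrently and merges the results in the order the prefixes were given. */
    public static List<String> listAll(Lister lister, List<String> prefixes, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (String prefix : prefixes) {
                futures.add(CompletableFuture.supplyAsync(() -> listPrefix(lister, prefix), pool));
            }
            List<String> all = new ArrayList<>();
            for (CompletableFuture<List<String>> future : futures) {
                all.addAll(future.join());
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    /** In-memory lister for demonstration only: pageSize keys per page, token is the next offset. */
    public static Lister inMemory(List<String> sortedKeys, int pageSize) {
        return (prefix, token) -> {
            List<String> matching = sortedKeys.stream()
                    .filter(key -> key.startsWith(prefix))
                    .collect(Collectors.toList());
            int start = token == null ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, matching.size());
            String next = end < matching.size() ? String.valueOf(end) : null;
            return new Page(new ArrayList<>(matching.subList(start, end)), next);
        };
    }
}
```

This only helps when the prefixes can be enumerated cheaply, which is why retrieving commonPrefixes without loading the keys matters to us. Within a single prefix the paging is still one request at a time.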
Other Information
No response
Acknowledgements
[ ] I may be able to implement this feature request
AWS Java SDK version used
2.20.79
JDK version used
17
Operating System and version
AMD64/ARM64 architectures