Open ravindk89 opened 1 year ago
@balamurugana your insight here would be helpful.
The DeleteObjects S3 API deletes multiple objects in one request, up to a maximum of 1000 objects. In minio-py, remove_objects() is a higher-level method that supports removing more than 1000 objects. It is implemented by sending multiple DeleteObjects S3 API requests sequentially, each carrying up to 1000 objects. Any errors the S3 server returns while executing these requests are yielded. A consumer of these yielded errors who prefers to stop the removal of the next batch can simply stop iterating.
Because the errors are yielded, execution is lazy: the returned iterator must be read for the internal DeleteObjects S3 API requests to keep firing. If you need more control than remove_objects() provides, you can use the low-level _delete_objects() method, either by inheriting from the Minio class or by bypassing the warning about using a private method.
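To make the lazy behaviour concrete, here is a minimal stand-alone sketch of a generator that batches deletions in the way described above. It is not minio-py's implementation: `lazy_remove`, `send_batch` and `fake_send` are hypothetical names, and the batch size of 1000 simply mirrors the DeleteObjects limit.

```python
from itertools import islice


def lazy_remove(names, send_batch, batch_size=1000):
    """Yield errors from deleting `names` in batches (hypothetical stand-in).

    `send_batch` is an assumed callable that takes a list of names, performs
    one delete request, and returns a list of errors for that batch.
    """
    iterator = iter(names)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # This function is a generator, so no batch is sent until the
        # caller starts iterating over the returned object.
        yield from send_batch(batch)


sent = []  # records the size of each batch that was actually "sent"


def fake_send(batch):
    sent.append(len(batch))
    return []  # pretend the server reported no errors


names = [f"obj-{i}" for i in range(2500)]

result = lazy_remove(names, fake_send)
calls_before_iteration = len(sent)  # still zero: nothing has been sent yet

errors = list(result)  # draining the iterator sends every batch
```

In this sketch, calling `lazy_remove` alone sends nothing. The batches are only sent once the result is iterated, which matches the behaviour described for remove_objects().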
After remove_objects was called, no errors were returned and the object was not removed as expected. My solution was to call remove_object for each object in delete_object_list:
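from minio.deleteobjects import DeleteObject

# client, MLFLOW_BUCKET, obj and DRY_RUN are defined elsewhere in the script.
delete_object_list = map(
    lambda x: DeleteObject(x.object_name),
    client.list_objects(MLFLOW_BUCKET, obj.object_name, recursive=True),
)
for del_obj in delete_object_list:
    print(del_obj._name)  # _name is a private attribute of DeleteObject
    if not DRY_RUN:
        client.remove_object(MLFLOW_BUCKET, del_obj._name)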
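An alternative to per-object deletion is to keep the single remove_objects call and consume its result, since the errors are yielded lazily. This is a minimal sketch, assuming `client` is a `Minio` instance and the inputs are `DeleteObject` items as in the snippet above. `remove_all` is a hypothetical helper name, not part of minio-py.

```python
def remove_all(client, bucket_name, delete_objects):
    """Call client.remove_objects() and drain the returned iterator.

    Draining is what triggers the underlying delete requests; the collected
    errors are returned so the caller can inspect them.
    """
    return list(client.remove_objects(bucket_name, delete_objects))


# Usage (assumes the names from the snippet above):
#     errors = remove_all(client, MLFLOW_BUCKET, delete_object_list)
#     for error in errors:
#         print("error occurred when deleting object", error)
```

Wrapping the call in `list()` trades laziness for simplicity: every batch is sent, and the caller loses the option of stopping early after the first error.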
https://github.com/minio/minio-py/blob/master/docs/API.md?plain=1#L1473-L1475
As per internal discussion, the docs do not make it clear that this API is "lazy", in that it only fires if the user iterates over the returned errors.
It also opens a few questions up:
Based on that, we should update the docs for this API and any others that require iterating over the response in order to fire (i.e. any "lazy API").