Open sdarwin opened 9 months ago
Hi @sdarwin, thanks for reaching out. Here is the delete-objects documentation for reference: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/delete-objects.html.
The 1000 limit you referenced is mentioned there:
The request contains a list of up to 1000 keys that you want to delete
I could reproduce the MalformedXML error using a file containing 1001 keys generated by this script:
import json

data = {
    "Objects": [
        {
            "Key": f"Key_{i}",
        } for i in range(1, 1002)
    ],
    "Quiet": True
}
with open('data.json', 'w') as outfile:
    json.dump(data, outfile, indent=4)
aws s3api delete-objects --bucket $BUCKET --delete file://data.json
resulted in:
An error occurred (MalformedXML) when calling the DeleteObjects operation: The XML you provided was not well-formed or did not validate against our published schema
(The SSL validation failed... error could have occurred for various reasons. Those errors are often caused by certificate issues, as noted in the troubleshooting guide. But if it was something that you just encountered once, then it may have been some kind of transient network/proxy issue.)
I also think that the error message could be improved. That error message is returned by the S3 DeleteObjects API, so we'll need to reach out to the S3 team to request a better error message. I'll go ahead and transfer this to our cross-SDK repository for further tracking as service APIs like this are used across SDKs. I'll also note a few related issues I found:
P99874682
The SSL validation failed... if it was something that you just encountered once
Just retesting now.
Ok, there is a different reason for the SSL error. It isn't the 1000 item limit, which I had thought.
SSL validation failed for https://s3.amazonaws.com/_bucket_?delete EOF occurred in violation of protocol (_ssl.c:2427)
That is caused by not specifying the region. Notice how https://s3.amazonaws.com/ seems to be a generic endpoint, without mentioning a region.
If I add --region us-east-2 to the command, it's fixed.
Well, that is another one to consider then? The user does not see a message such as "You should set the region", simply an unknown SSL error.
The SSL validation error doesn't seem directly related to S3. A few other issues have reported that error and it could be caused by various things. If you add --debug to your command then it may provide more insight, but if it's not a certificate or proxy issue then it could be an issue with whichever version of urllib3 or OpenSSL you have installed. I can't reproduce the error just by not providing a region.
This is going off track now. :-) I will post the info anyway, just to be complete.
Collecting a list of items, that will later be deleted:
aws s3api list-object-versions --max-items 1002 --profile $PROFILE \
--bucket $BUCKET \
--output=json \
--query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --prefix $PREFIX > list.json
Because that is over 1000 items, deleting them causes an error. In this case, MalformedXML.
What if the number of items is larger, 10,000 instead?
aws s3api list-object-versions --max-items 10000 --profile $PROFILE \
--bucket $BUCKET \
--output=json \
--query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --prefix $PREFIX > list.json
Now, it seems that a combination of all these factors:
Gets this error:
SSL validation failed for https://s3.amazonaws.com/_bucket_?delete EOF occurred in violation of protocol (_ssl.c:2427)
It can be resolved by reducing the number of items or adding the --region. Either one. That at least gets back to "MalformedXML".
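"Reducing the number of items" can be done mechanically by splitting list.json into several smaller files. The following is only a minimal sketch, assuming list.json has the {"Objects": [...]} shape produced by the --query above; the function names and the batch_NNNN.json file naming are hypothetical, not part of the AWS CLI.

```python
import json
from pathlib import Path

MAX_KEYS = 1000  # per-request limit stated in the delete-objects documentation


def split_delete_payload(payload, size=MAX_KEYS):
    """Yield copies of `payload` that each hold at most `size` objects."""
    # The --query above yields null for "Objects" when nothing matches.
    objects = payload.get("Objects") or []
    for start in range(0, len(objects), size):
        batch = dict(payload)  # keep other keys such as "Quiet" unchanged
        batch["Objects"] = objects[start:start + size]
        yield batch


def write_batches(source="list.json", prefix="batch"):
    """Write batch_0000.json, batch_0001.json, ... next to `source`."""
    source_path = Path(source)
    payload = json.loads(source_path.read_text())
    written = []
    for index, batch in enumerate(split_delete_payload(payload)):
        target = source_path.with_name(f"{prefix}_{index:04d}.json")
        target.write_text(json.dumps(batch))
        written.append(target)
    return written
```

Each resulting file stays within the limit and can be passed to the same command as before, for example: aws s3api delete-objects --bucket $BUCKET --delete file://batch_0000.json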
Describe the feature
Improve the accuracy of the error messages during certain failures.
Use Case
Clear error messages assist in debugging.
Here is the problem I encountered. When running this command to delete objects in S3:
It supposedly has a limit of 1000 items. If you provide a million items to delete, the error messages are mysterious:
or
As evidence for the MalformedXML error, consider this Stack Exchange article:
"+1 for number of keys. I did not check the go code but for us in python boto3 over 1000 keys in the request caused the error. You would think they would document this is the documentation for MalformedXML on the sdk client, but you did find api documentation. We chunked the keys and it works perfect – vfrank66"
Proposed Solution
The problem is that neither message explains the actual cause of the problem, which is that the input is over 1000 items in length. The errors should be more specific about the real reason.
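To illustrate the kind of message being requested, here is a rough sketch of a check a caller could run before sending the request. It is not how the AWS CLI or the S3 service behaves today; TooManyKeysError and check_delete_payload are hypothetical names, and the wording of the message is only a suggestion.

```python
MAX_KEYS = 1000  # per-request limit stated in the delete-objects documentation


class TooManyKeysError(ValueError):
    """Hypothetical error raised when a delete payload exceeds the limit."""


def check_delete_payload(payload, limit=MAX_KEYS):
    """Return the number of objects, or raise if it exceeds `limit`."""
    count = len(payload.get("Objects") or [])
    if count > limit:
        raise TooManyKeysError(
            f"DeleteObjects accepts at most {limit} keys per request, "
            f"but {count} were supplied; split the request into batches."
        )
    return count
```

A message along these lines names the real cause, rather than leaving the user to infer it from MalformedXML or an SSL failure.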
Other Information
No response
Acknowledgements
CLI version used
aws-cli/2.13.17 Python/3.11.5 Linux/6.2.0-1013-gcp exe/x86_64.ubuntu.22 prompt/off
Environment details (OS name and version, etc.)
Ubuntu 22.04