Can you please confirm if the output option check_object=true requires the s3:GetObject IAM permission?
Yes
Is there any other way for the output plugin to check for the existence of an S3 key before incrementing the index?
I'm not sure... if you have a better idea, please let me know.
Hi @repeatedly,
Have you considered Aws::S3::Client#list_objects_v2?
You could set the Prefix parameter to the destination filename, such as 'path/test_log_2018/11/09/0845_0.gz', and it returns only metadata for the matching key(s). This will significantly reduce the network payload compared to s3:GetObject, especially when the upstream log file is large.
Alternatively, try s3:GetObjectAcl. It will fail if the upstream file doesn't exist. https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGETacl.html
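For illustration, here's roughly what both checks look like with aws-sdk-s3 v3; the bucket name and region are made up, and the key is the example path from this thread:

```ruby
require 'aws-sdk-s3'

s3     = Aws::S3::Client.new(region: 'ap-southeast-2')  # illustrative region
bucket = 'my-log-bucket'                                 # illustrative bucket
key    = 'path/test_log_2018/11/09/0845_0.gz'            # example key from this thread

# Check via ListObjectsV2: returns only key metadata, never the object body.
# Requires the s3:ListBucket permission on the bucket.
resp = s3.list_objects_v2(bucket: bucket, prefix: key, max_keys: 1)
exists_via_list = resp.contents.any? { |o| o.key == key }

# Check via GetObjectAcl: succeeds only when the object exists.
# Requires the s3:GetObjectAcl permission on the object.
exists_via_acl =
  begin
    s3.get_object_acl(bucket: bucket, key: key)
    true
  rescue Aws::S3::Errors::NoSuchKey
    false
  end

puts [exists_via_list, exists_via_acl].inspect
```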
This will significantly reduce the network payload compared to s3:GetObject
Really? We use the head operation and it gets only metadata.
https://github.com/aws/aws-sdk-ruby/blob/cce86987c17a7593cbc6546b32c83dd405c245ab/gems/aws-sdk-s3/lib/aws-sdk-s3/object.rb#L257
https://github.com/aws/aws-sdk-ruby/blob/cce86987c17a7593cbc6546b32c83dd405c245ab/gems/aws-sdk-s3/lib/aws-sdk-s3/waiters.rb#L114
Is head slower than list or GetObjectAcl?
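For reference, a minimal sketch of the head-based check those two links point to, again with an illustrative bucket name and region:

```ruby
require 'aws-sdk-s3'

s3  = Aws::S3::Client.new(region: 'ap-southeast-2')              # illustrative region
obj = Aws::S3::Object.new('my-log-bucket',                       # illustrative bucket
                          'path/test_log_2018/11/09/0845_0.gz',  # example key from this thread
                          client: s3)

# Object#exists? issues a HEAD request through the ObjectExists waiter, so only
# the object's metadata travels over the wire, never the object body.
# Without s3:GetObject the HEAD request is rejected and exists? raises, which
# lines up with the Aws::S3::Errors::Forbidden error reported in this issue.
puts obj.exists?
```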
Sorry, I didn't realise you were only fetching the metadata. Would you consider changing the check implementation to get the object ACL, as that would limit the required permission to s3:GetObjectAcl?
Requiring s3:GetObject for the head check could potentially allow malicious access to the upstream S3 log files.
Thanks
Maybe. The default behaviour can't be changed because that would break existing environments. It would need a new option for it, like use_acl_for_object_check.
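A rough sketch of how such an opt-in check could behave; the flag and helper below are hypothetical and are not fluent-plugin-s3 code:

```ruby
require 'aws-sdk-s3'

# Hypothetical helper mirroring the proposed use_acl_for_object_check option:
# keep the head-based check as the default and switch to an ACL-based check
# only when the flag is enabled.
def object_exists?(client, bucket, key, use_acl_for_object_check: false)
  if use_acl_for_object_check
    client.get_object_acl(bucket: bucket, key: key)  # needs only s3:GetObjectAcl
  else
    client.head_object(bucket: bucket, key: key)     # needs s3:GetObject
  end
  true
rescue Aws::S3::Errors::NoSuchKey, Aws::S3::Errors::NotFound
  false
end

client = Aws::S3::Client.new(region: 'ap-southeast-2')  # illustrative region
object_exists?(client, 'my-log-bucket', 'path/test_log_2018/11/09/0845_0.gz',
               use_acl_for_object_check: true)
```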
I have the same problem with the check_object=true setting. It would be better if the documentation clarified that check_object=true requires an s3:GetObject allow policy.
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove the stale label or comment, or this issue will be closed in 30 days.
This issue was automatically closed because it remained stale for 30 days.
Hi
We are using the latest stable td-agent 2.3.6 release (fluent-plugin-s3-0.8.5, fluentd-0.12.40). When the s3 plugin generates an s3 object key that already exists, it throws an 'Aws::S3::Errors::Forbidden' error.
We have granted the following IAM permissions for our S3 Log bucket:
Here are the s3 output plugin configuration details to replicate the issue. It creates a minute-based S3 key filename:
If only one log line is generated during the minute, fluentd successfully uploads a single S3 file:
If one log line is generated in the first 30 seconds of the minute and another in the second half, you'd expect the plugin to upload two files:
However, with our IAM permissions above, the s3 output plugin throws an 'Aws::S3::Errors::Forbidden' error. I'm guessing it tries an s3:GetObject on the first file (test_log_2018/11/09/0845_0.gz) to check whether it exists, and if it does, increments the index by 1, i.e. test_log_2018/11/09/0845_1.gz. This prevents existing log files from being overwritten in S3.
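If that guess is right, the behaviour would look roughly like the sketch below; the bucket name and region are illustrative, a plain HEAD request stands in for the existence probe, and this is not the plugin's actual code:

```ruby
require 'aws-sdk-s3'

s3     = Aws::S3::Client.new(region: 'ap-southeast-2')  # illustrative region
bucket = 'my-log-bucket'                                 # illustrative bucket

# Probe candidate keys and bump the index until an unused key is found, so an
# existing chunk such as test_log_2018/11/09/0845_0.gz is never overwritten.
index = 0
loop do
  key = format('path/test_log_2018/11/09/0845_%d.gz', index)
  begin
    s3.head_object(bucket: bucket, key: key)  # raises Forbidden without s3:GetObject
    index += 1                                # key already exists, try the next index
  rescue Aws::S3::Errors::NotFound
    break                                     # key is free, upload the chunk here
  end
end
```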
Once we added the s3:GetObject permission to our IAM policy, the failed S3 upload was retried successfully, and both test_log_2018/11/09/0845_0.gz and test_log_2018/11/09/0845_1.gz were present in our S3 bucket.
Can you please confirm if the output option check_object=true requires the s3:GetObject IAM permission? Is there any other way for the output plugin to check for the existence of an S3 key before incrementing the index?
We would prefer that the client forwarding the logs to S3 has write-only access, with a separate IAM role for reading.
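For illustration only, the kind of write-only policy meant here might look like the snippet below (the bucket ARN is hypothetical); note that, as confirmed earlier in this thread, check_object=true additionally needs s3:GetObject for its existence check:

```ruby
require 'json'

# Hypothetical write-only policy for the forwarding client: it can upload log
# chunks but cannot read them back.
write_only_policy = {
  'Version' => '2012-10-17',
  'Statement' => [
    {
      'Effect'   => 'Allow',
      'Action'   => ['s3:PutObject'],
      'Resource' => ['arn:aws:s3:::my-log-bucket/*']
    }
  ]
}

puts JSON.pretty_generate(write_only_policy)
```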