Open binarylogic opened 4 years ago
@bruceg could you investigate doing this for the `aws_s3` sink first? I assume solving that will also solve the other AWS based sinks?
@binarylogic thanks for adding this!
The custom http header approach makes a lot of sense for the `aws_s3` sink. We use at least a couple of them, and it will likely future-proof against any headers AWS adds in the future.
With regard to the other AWS sinks, from what I can see the non-S3 APIs (e.g. Kinesis streams) may not use optional HTTP headers at all. It might be worth surveying the other sinks just to see how many of them could make use of custom headers.
For the AWS S3 sink, at least, adding arbitrary headers is going to require a lot of duplication of external code. The `S3Client::put_object` method we use creates and finishes the request object before returning. So to add our own request headers we will need to either duplicate the method or add support to the crate and use our local/custom copy until it gets upstreamed.
The `PutObjectRequest` does have an option for a canned ACL, as well as for the specific `x-amz-grant-full-control` header being requested. Can we scope this issue down to just those extra bits (and anything else already part of the `PutObjectRequest`), since it will satisfy the ownership request?
:( That's unfortunate. Would you mind opening an issue in Rusoto requesting this? And yes, in the interim let's just map the `grant_*` options to our own. There are some other good options in here. Do you think we should open a separate issue for them?
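As a rough illustration of mapping the `grant_*` options to our own, here is a minimal sketch. It is not the actual Vector or Rusoto code: `S3Options` and `build_request` are hypothetical names, and the local `PutObjectRequest` struct is only a stand-in for the handful of relevant fields on `rusoto_s3::PutObjectRequest`, so the example compiles without the crate.

```rust
/// Hypothetical subset of the sink's user-facing options (names are assumptions).
#[derive(Debug, Default, Clone)]
pub struct S3Options {
    pub acl: Option<String>,
    pub grant_full_control: Option<String>,
    pub grant_read: Option<String>,
}

/// Local stand-in for a few fields of `rusoto_s3::PutObjectRequest`.
/// The real struct has many more fields; this one exists only for the sketch.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PutObjectRequest {
    pub bucket: String,
    pub key: String,
    pub acl: Option<String>,
    pub grant_full_control: Option<String>,
    pub grant_read: Option<String>,
}

/// Copy each configured option onto the matching request field.
/// Options left unset stay `None`, so no header is implied for them.
pub fn build_request(bucket: &str, key: &str, options: &S3Options) -> PutObjectRequest {
    PutObjectRequest {
        bucket: bucket.to_string(),
        key: key.to_string(),
        acl: options.acl.clone(),
        grant_full_control: options.grant_full_control.clone(),
        grant_read: options.grant_read.clone(),
    }
}
```

The point of the one-to-one mapping is that nothing in `put_object` has to be duplicated: the options ride along on fields the request type already has.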
I'm not sure if it will be of much value to the rusoto crate. AFAICT all of the options supported when creating S3 objects (S3 `PutObject` API) are exposed through the `rusoto_s3::PutObjectRequest`.
For any HTTP based sink, we should allow users to set custom headers. This is particularly important for services that support special headers (like S3). With S3, you can set all kinds of special headers that set the ACL, encryption mechanism, etc. This is already provided in the `http` sink and I would like the exact same option available to the following sinks:

- `aws_s3` <---- do this first since users are waiting on this
- `aws_s3` docs addressing how to set ACLs, encryption, etc.
- `aws_cloudwatch_logs`
- `aws_cloudwatch_metrics`
- `aws_kinesis_firehose`
- `aws_kinesis_streams`
- `clickhouse` ?
- `database_metrics`
- `gcp_pubsub`
- `gcp_stackdriver_logging`
- `new_relic_logs`
- `sematext`
- `splunk_hec`

I don't see any downside to providing this option.
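To make the custom-header idea concrete, here is a minimal sketch of how user-configured headers might be layered over a sink's defaults. `merge_headers` is a hypothetical helper, not an existing Vector function, and the rule that user values win over defaults is an assumption.

```rust
use std::collections::BTreeMap;

/// Merge user-configured headers over a sink's default headers.
/// Header names are lowercased because HTTP header names are
/// case-insensitive; on a name clash the user's value replaces the default.
pub fn merge_headers(
    defaults: &BTreeMap<String, String>,
    custom: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = BTreeMap::new();
    // Defaults are inserted first so that custom entries overwrite them.
    for (name, value) in defaults.iter().chain(custom.iter()) {
        merged.insert(name.to_ascii_lowercase(), value.clone());
    }
    merged
}
```

A sink that already builds a header map per request could call something like this once at startup and reuse the result.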
For further context, this issue came out of a meeting with @zcapper. They're using the `aws_s3` sink to write objects across accounts. This scenario is perfectly described in this AWS tutorial. To summarize, I'll walk through a simple example.

Given two AWS accounts `A` and `B`:

- `A` is where Vector is deployed.
- `B` owns the S3 bucket.
- `A` is granted cross-account access to account `B`'s S3 bucket via S3's cross-account bucket permissions.
- When `A` writes to the bucket, the S3 object ownership remains under account `A`.
- `B` cannot modify the object as a result.

This can be easily solved by supplying the `x-amz-grant-full-control` header when writing the object.
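For reference, the value of `x-amz-grant-full-control` is a comma-separated list of grantees written as `id="..."`, `emailAddress="..."`, or `uri="..."`. The sketch below shows one way to assemble it; the `Grantee` enum and `grant_header_value` function are hypothetical names introduced only for illustration.

```rust
/// Name of the header discussed above.
pub const GRANT_FULL_CONTROL: &str = "x-amz-grant-full-control";

/// Hypothetical representation of one grantee in a grant header.
pub enum Grantee {
    /// Canonical user ID of an AWS account (account `B` in the example above).
    Id(String),
    /// Email address associated with an AWS account.
    EmailAddress(String),
    /// URI of a predefined group.
    Uri(String),
}

/// Render grantees as a header value such as `id="abc", uri="..."`.
pub fn grant_header_value(grantees: &[Grantee]) -> String {
    grantees
        .iter()
        .map(|grantee| match grantee {
            Grantee::Id(value) => format!("id=\"{}\"", value),
            Grantee::EmailAddress(value) => format!("emailAddress=\"{}\"", value),
            Grantee::Uri(value) => format!("uri=\"{}\"", value),
        })
        .collect::<Vec<_>>()
        .join(", ")
}
```

In the cross-account scenario, account `A` would pass the canonical user ID of account `B` as a `Grantee::Id` so that `B` gets full control of each object `A` writes.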