treasure-data / digdag

Workload Automation System
https://www.digdag.io/
Apache License 2.0
1.31k stars 221 forks source link

Add multiple options to make a connection with ECS Cluster. #1759

Closed rariyama closed 1 year ago

rariyama commented 2 years ago

What I did

I added more options to access to ECS Cluster. There is currently only one option to make a connection with ECS Cluster which is the way to connect with AWS AccessKeyId and SecretAccessKey. I made a change that DefaultAWSCredentialsProviderChain is called if agent.command_executor.ecs.\<name>.access_key_id is not defined, and then the other types of credentials such as environment variables or Instance Profile are used. I referred to the process of authorizing S3 Bucket and made the implementation closer to it as possible.

What I confirmed

I confirmed the items below.

This is the first PR I create on this repository so I'm sorry in advance if there is any inconvenience.

hiroyuki-sato commented 1 year ago

I'm sorry, I actually commented the wrong PR. this comment was for #1762.

This PR author writes an article about this change. https://tech.wealthnavi.com/entry/20221125/1669353895 written in Japanese.

@yoyama, Could you look at this PR when you get a chance?

I think this PR does not change current behavior by default.

yoyama commented 1 year ago

Basically it looks good. But I hope adding tests for https://github.com/treasure-data/digdag/pull/1759/files#diff-4094a97e13055fa1f84ffe3297bb4c703cea9506ad50ae581ae9bd80a0f3504bR31

rariyama commented 1 year ago

Thanks for your reply. I'm sorry if I misunderstood but your comment means that createClient is needed to be tested? DefaultAWSCredentialsProviderChain is a method of AWS SDK so it should have been tested by others and it doesn't need to be tested in digdag.

szyn commented 1 year ago

Thank you for creating this PR. Since this PR change will change the behavior when system properties are not set, it would be a good idea to add a test to see if the expected authentication method (AWSStaticCredentialsProvider/DefaultAW SCredentialsProviderChain) is used depending on the existence of the system property.

IMO, I understand that there is basically no problem since the behavior is the same as before in cases where system property (agent.command_executor.ecs.<name>.access_key_id) is set. Maybe writing an integration test as a follow-up PR would be a good idea.

I have one request. I would appreciate it if you leave a note on this page describing the new behavior, as in the PR, and please mention that it is supported in v0.10.5 and above. https://github.com/treasure-data/digdag/blob/011d848689ae3d31a8624d59bc8ece5c38e3ae5e/digdag-docs/src/command_executor.md?plain=1#L58

rariyama commented 1 year ago

@szyn Thank you for your comments.

Thank you for creating this PR. Since this PR change will change the behavior when system properties are not set, it would be a good idea to add a test to see if the expected authentication method (AWSStaticCredentialsProvider/DefaultAW SCredentialsProviderChain) is used depending on the existence of the system property.

IMO, I understand that there is basically no problem since the behavior is the same as before in cases where system property (agent.command_executor.ecs..access_key_id) is set. Maybe writing an integration test as a follow-up PR would be a good idea.

I agreed with the idea to add integration test for confirming that digdag server can connect with ECS with the expected way. If possible, If possible, I'll try to add the test.

I have one request. I would appreciate it if you leave a note on this page describing the new behavior, as in the PR, and please mention that it is supported in v0.10.5 and above.

I added the explanation on this commit https://github.com/treasure-data/digdag/pull/1759/commits/e47c6fa885f051578760b01d87cf3f1594a09059, so please review this if you are available.

By the way, We can choose credentials provided by DefaultAWSCredentialsProviderChain other than AWS access key when connecting with S3 temporal storage because similar method has already been implemented in S3StorageFactory and TemporalProjectArchiveStorage uses the function. Thus, the description I added on this PR can be applied to S3 access key and secret, so I think it'll also be better to mention about that. How do you think about my idea?

szyn commented 1 year ago

Sounds good, thank you.

By the way, We can choose credentials provided by DefaultAWSCredentialsProviderChain other than AWS access key when connecting with S3 temporal storage because similar method has already been implemented in S3StorageFactory and TemporalProjectArchiveStorage uses the function. Thus, the description I added on this PR can be applied to S3 access key and secret, so I think it'll also be better to mention about that. How do you think about my idea?

Thank you for sharing your opinion. That's a good point 👍 I agree with mentioning it. May I ask you to add a note to the document on this point? Of course, this can be done as a separate PR.