hazelcast / hazelcast-aws

AWS EC2 discovery plugin for hazelcast

AWS discovery does not support AWS ECS service (?) #11

Closed gsaslis closed 7 years ago

gsaslis commented 7 years ago

I've been struggling to get AWS discovery to work within docker containers (the official hazelcast docker containers) deployed on AWS ECS.

I've narrowed this down to the fact that this library only supports EC2 IAM roles, and does not support the credentials scheme defined by task definition roles in AWS ECS.

The problem is that the getKeysFromIamRole method in the DescribeInstances class - defined here - does not support looking up IAM roles for AWS ECS tasks. These use a slightly different scheme, as is documented here.

I am happy to help with a PR for this, but, as an entirely new contributor, I would like to know whether you think this should come in the form of a different configuration in the hazelcast.xml, etc., or whether we should simply extend the DescribeInstances class to also attempt looking up an IAM task role when it can't find the regular EC2 IAM role.

mesutcelik commented 7 years ago

Hi @gsaslis ,

Thanks for the offer. We always appreciate community contributions. 👏

So you want to have a new config where the hazelcast-aws module directly calls /credential_provider_version/credentials?id=task_UUID.

My question is who is providing task_UUID? Can you shed some light on that?

gsaslis commented 7 years ago

Hey @mesutcelik!

Thanks - and yeah, sure!

So, according to the AWS docs about IAM Task Roles, this whole URL (including the task_UUID) is made available as an env var by the ECS agent (in case it's unfamiliar, this is the app that links regular EC2 instances to an ECS cluster).

The env var is named AWS_CONTAINER_CREDENTIALS_RELATIVE_URI, so I guess we could use the existence (or not) of this env var as a way to tell whether we're running inside ECS or inside EC2... ?

If we go by this, we don't really need extra configuration to be added to hazelcast.xml, etc.
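A minimal sketch of that detection in Java, for illustration only. The class and method names are hypothetical and not part of hazelcast-aws. It assumes, following the AWS docs on task roles, that the ECS agent serves credentials from the link-local address 169.254.170.2 and that the env var holds only the path portion of the URL. The environment is passed in as a map so the check can be exercised without a real ECS host.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not the plugin's actual code.
class EcsCredentialsEndpoint {
    // Assumption: link-local address where the ECS agent serves task role credentials.
    static final String ECS_CREDENTIALS_HOST = "http://169.254.170.2";
    static final String RELATIVE_URI_ENV = "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI";

    /**
     * Returns the full credentials URL when the env var is set,
     * or an empty Optional when we are (presumably) not running inside ECS.
     */
    static Optional<String> credentialsUrl(Map<String, String> env) {
        String relativeUri = env.get(RELATIVE_URI_ENV);
        if (relativeUri == null || relativeUri.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(ECS_CREDENTIALS_HOST + relativeUri);
    }
}
```

Calling `EcsCredentialsEndpoint.credentialsUrl(System.getenv())` would then tell us both whether we are inside ECS and where to fetch the keys from, with no role name involved.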

In fact, I'm not even sure we need the <iam-role>...</iam-role> element at all in this ECS case, since the URL does not include the role name.

I like the "self-discovery" side of this approach, if I may call it that, as it ties in very nicely with the containers / microservices / etc. world, but I'm not sure if I'm missing some other downside here...

mesutcelik commented 7 years ago

I just included <iam-role/> because if rolename is not provided then hazelcast-aws has some default behavior.

Let me summarize what is needed...

If none of the following parameters is defined in hazelcast.xml, there should be a chain of actions to get security credentials.

hazelcast-aws should probably try the following actions to get the credentials:

  1. check if you are in ECS, i.e. whether AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is set
  2. check if you have a default iamrole, which is done by sending no iamrole while fetching security credentials
  3. fail fast with an error log like "no credentials or iamrole provided"

Is this what you have in mind?
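The three steps above can be sketched as an ordered fallback chain. This is only an illustration under stated assumptions: the class name is hypothetical, credentials are reduced to a plain String placeholder, and each lookup (ECS task role, default IAM role) is modelled as a supplier that returns an empty Optional when it finds nothing.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the fallback order, not the plugin's actual code.
class CredentialsChain {
    /**
     * Tries each credentials source in the given order and returns the first hit.
     * Throws if every source comes back empty (step 3: fail fast).
     */
    static String resolve(List<Supplier<Optional<String>>> sources) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> credentials = source.get();
            if (credentials.isPresent()) {
                return credentials.get();
            }
        }
        throw new IllegalStateException("no credentials or iamrole provided");
    }
}
```

With the ECS lookup first and the default-role lookup second in the list, the order of the suppliers is the order of the chain, so adding another source later would not change the resolution logic.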

gsaslis commented 7 years ago

@mesutcelik yeah I guess this sounds about right...

If I understand correctly (2) is already supported, so I'll get working on a PR for (1) and (3), if that's ok?

mesutcelik commented 7 years ago

Number 2 is only supported if you provide <iam-role> with no rolename, i.e. <iam-role/>

gsaslis commented 7 years ago

@mesutcelik got it - thanks!

gsaslis commented 7 years ago

@mesutcelik looking at https://github.com/hazelcast/hazelcast-aws/blob/master/src/main/java/com/hazelcast/aws/impl/DescribeInstances.java#L78, it seems that (2) is supported when someone declares <iam-role>DEFAULT</iam-role>.

Am I missing something else?

mesutcelik commented 7 years ago

Right, that is supported, but only when iam-role is defined.

I assume you are gonna implement the logic where none of the following is defined in the config.

gsaslis commented 7 years ago

yep! so, if none are defined, we also check for ECS support before failing with the 'no credentials' error. Otherwise, to maintain backwards-compatibility, we (should) check for the default role only when <iam-role>DEFAULT</iam-role> is declared. Anyway, please let me submit the PR shortly, then we can discuss the code itself in more detail ; )
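As a rough sketch of that decision, assuming the config values in question are an access key, a secret key and the iam-role element (the thread does not list them explicitly), and using hypothetical class, enum and method names:

```java
import java.util.Map;

// Hypothetical sketch of the backwards-compatible selection, not the code from the PR.
class CredentialSourceSelector {
    enum Source { STATIC_KEYS, NAMED_IAM_ROLE, DEFAULT_IAM_ROLE, ECS_TASK_ROLE }

    static Source select(String accessKey, String secretKey, String iamRole,
                         Map<String, String> env) {
        // Assumption: explicitly configured keys take precedence over everything else.
        if (notBlank(accessKey) && notBlank(secretKey)) {
            return Source.STATIC_KEYS;
        }
        // Existing behaviour is kept: the default role is used only when declared as DEFAULT.
        if (notBlank(iamRole)) {
            return "DEFAULT".equals(iamRole) ? Source.DEFAULT_IAM_ROLE : Source.NAMED_IAM_ROLE;
        }
        // Nothing configured: check for ECS before giving up.
        if (notBlank(env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"))) {
            return Source.ECS_TASK_ROLE;
        }
        throw new IllegalStateException("no credentials");
    }

    private static boolean notBlank(String value) {
        return value != null && !value.trim().isEmpty();
    }
}
```

Because the ECS check only runs when nothing is configured, existing hazelcast.xml files would keep resolving credentials exactly as before.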

Thanks for the feedback!!

mesutcelik commented 7 years ago

fixed by https://github.com/hazelcast/hazelcast-aws/pull/14

matthurne commented 7 years ago

Is it possible to cluster multiple members hosted in containers on the same ECS container instance? Regardless, are there particular steps required to properly configure ECS/the tasks to support Hazelcast (e.g. port mapping)?

gsaslis commented 7 years ago

@mhurne according to https://github.com/hazelcast/hazelcast-aws/issues/18#issuecomment-293551971 this is not possible atm.

mpataki commented 6 years ago

This thread and your PR, @gsaslis, have been interesting reads as I dig into this issue myself. Regarding the limitation that multiple cluster nodes can't run on the same ECS host, I'm wondering if anyone has attempted to use the (new since this thread) awsvpc task networking mode, which assigns an ENI to the task itself. This feature promises to provide tasks with the same networking properties as EC2. I'm curious if the experts here have any thoughts on this approach.

matthurne commented 6 years ago

@mpataki You may be interested in https://github.com/commercehub-oss/hazelcast-discovery-amazon-ecs, though there are no published releases.

mpataki commented 6 years ago

Very cool - I'll keep an eye on this. Thanks!

leszko commented 6 years ago

@mhurne

  1. From version 2.2, we'll support multiple containers per EC2 Instance in ECS. Until then, you're limited to one container per instance.
  2. The configuration description is here: https://github.com/hazelcast/hazelcast-aws#configuring-hazelcast-members-for-aws-ecs. Meaning you need to set the network as "host", so no port mapping is supposed to be set.

matthurne commented 6 years ago

Thanks for that information, @leszko . We actually built our own solution to the problem in the form of a custom discovery strategy, the source code of which is available at https://github.com/commercehub-oss/hazelcast-discovery-amazon-ecs . It's been working well for us. It doesn't require use of "host" networking (though that should work too); we use it with "bridge" networking.

We previously discussed contributing the solution to Hazelcast with @mesutcelik , @googlielmo . We got as far as open sourcing the code at the GitHub project linked above.

laurocesar commented 2 years ago

Hi @mhurne and @leszko !

Do we have any news about multiple containers per EC2 Instance in ECS?

I saw that https://github.com/commercehub-oss/hazelcast-discovery-amazon-ecs is not open.

How can we run multiple hazelcast containers per EC2 Instance in ECS?

Thanks!

matthurne commented 2 years ago

Hi @laurocesar, I've moved on from working with Hazelcast, so unfortunately I'm not in a position to lend you a hand. But best of luck!