hazelcast / hazelcast-aws

AWS EC2 discovery plugin for hazelcast

Support for AWS ElasticBeanstalk EC2 Instances. #69

Closed pkgonan closed 3 years ago

pkgonan commented 6 years ago

Hi, I don't think this is working. The plugin cannot detect the other EC2 instances on AWS Elastic Beanstalk, so the members never form a cluster.

// Log message

Members {size:1, ver:1} [
Member [10.0.0.234]:5701 - 8fb81705-4afa-4753-bda6-5a2d97f9ee0f this
]

Members {size:1, ver:1} [
Member [10.0.0.234]:5702 - f01ca206-a534-40a0-aff0-830c68d13d4a this
]

Dropped: SplitBrainJoinMessage{packetVersion=4, buildNumber=20180424, memberVersion=3.10.0, clusterVersion=3.10, address=[10.0.0.203]:5702, uuid='b6aa90a0-b6e7-4c59-82d6-c920ba55d9c8', liteMember=false, memberCount=1, dataMemberCount=1, memberListVersion=1}

// My Settings..

import com.hazelcast.aws.AwsDiscoveryStrategyFactory;
import com.hazelcast.config.*;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.instance.HazelcastInstanceFactory;
import com.hazelcast.spring.cache.HazelcastCacheManager;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

@EnableCaching
@Configuration
public class HazelcastConfig {

private static final String TAG_KEY = "elasticbeanstalk:environment-name";

@Value("${aws.environment-name}")
private String environmentName;

@Value("${aws.region}")
private String region;

@Value("${aws.access-key}")
private String accessKey;

@Value("${aws.secret-key}")
private String secretKey;

@Bean
public Config config() {
    Config config = new Config();

    config.setInstanceName("Hazelcast-Instance");

    // EC2  Discovery true
    config.getProperties().setProperty("hazelcast.discovery.enabled", "true");

    // ZONE
    PartitionGroupConfig partitionGroupConfig = config.getPartitionGroupConfig();
    partitionGroupConfig
            .setEnabled(true)
            .setGroupType(PartitionGroupConfig.MemberGroupType.ZONE_AWARE);
    config.setPartitionGroupConfig(partitionGroupConfig);

    JoinConfig joinConfig = config.getNetworkConfig().getJoin();
    joinConfig.getTcpIpConfig().setEnabled(false);
    joinConfig.getMulticastConfig().setEnabled(false);
    joinConfig.getAwsConfig().setEnabled(false);

    // EC2  Discovery 
    AwsDiscoveryStrategyFactory awsDiscoveryStrategyFactory = new AwsDiscoveryStrategyFactory();
    Map<String, Comparable> properties = new HashMap<>();
    properties.put("access-key", accessKey);
    properties.put("secret-key", secretKey);
    properties.put("region", region);
    properties.put("host-header", "ec2.amazonaws.com");
    properties.put("tag-key", TAG_KEY);
    properties.put("tag-value", environmentName);
    properties.put("connection-timeout-seconds", "5");
    properties.put("hz-port","5701");

    DiscoveryStrategyConfig discoveryStrategyConfig =
            new DiscoveryStrategyConfig(awsDiscoveryStrategyFactory, properties);
    joinConfig.getDiscoveryConfig().addDiscoveryStrategyConfig(discoveryStrategyConfig);

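    // The setter replaces the whole strategy list, so the list passed in must contain
    // the AWS strategy; passing an empty list would discard the strategy added above.
    ArrayList<DiscoveryStrategyConfig> discoveryStrategyConfigs = new ArrayList<>();
    discoveryStrategyConfigs.add(discoveryStrategyConfig);
    joinConfig.getDiscoveryConfig().setDiscoveryStrategyConfigs(discoveryStrategyConfigs);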

    // Cache Map
    MapConfig mapConfig = new MapConfig()
            .setName("cache")
            .setMaxSizeConfig(new MaxSizeConfig(200, MaxSizeConfig.MaxSizePolicy.FREE_HEAP_SIZE))
            .setEvictionPolicy(EvictionPolicy.LRU)
            .setTimeToLiveSeconds(20)
            .setNearCacheConfig(new NearCacheConfig());
    config.addMapConfig(mapConfig);

    // Cache ManagementCenter
    ManagementCenterConfig managementCenterConfig = new ManagementCenterConfig()
            .setEnabled(true)
            .setUrl("http://localhost:8080/mancenter")
            .setUpdateInterval(3);
    config.setManagementCenterConfig(managementCenterConfig);

    return config;
    }

    @Bean
    public HazelcastInstance hazelcastInstance() {
        return HazelcastInstanceFactory.newHazelcastInstance(config());
    }

    @Bean
    public CacheManager cacheManager() {
        return new HazelcastCacheManager(hazelcastInstance());
    }
}

Please let me know whether it works with Elastic Beanstalk. I want to use Spring Boot Data JPA + AWS Elastic Beanstalk + Hazelcast as the Hibernate second-level cache.

-----------------------------------------
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-spring</artifactId>
  <version>3.10</version>
</dependency>
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-hibernate52</artifactId>
  <version>1.2.3</version>
</dependency>
<dependency>
  <groupId>com.hazelcast</groupId>
  <artifactId>hazelcast-aws</artifactId>
  <version>2.1.1</version>
</dependency>

 -----------------------------------------

properties:
hibernate.format_sql: true
hibernate.dialect: org.hibernate.dialect.MariaDBDialect
hibernate.cache.use_second_level_cache: true
hibernate.cache.use_query_cache: true
hibernate.cache.use_minimal_puts: true
hibernate.cache.region.factory_class: com.hazelcast.hibernate.HazelcastCacheRegionFactory

leszko commented 6 years ago

Hi @pkgonan,

Thanks for reporting this. We'll look into it.

mesutcelik commented 6 years ago

Hi @pkgonan, we have not actually tested hazelcast-aws in an AWS Elastic Beanstalk environment, so I can't estimate the effort required. However, your contributions would be very welcome.

pkgonan commented 6 years ago

@leszko @mesutcelik

Hi, I would like to use Hazelcast in an AWS Elastic Beanstalk environment. Is there an alternative I could use?

mesutcelik commented 6 years ago

Hazelcast has not been tested in an AWS Elastic Beanstalk environment. Please feel free to test it and provide feedback.

You can also run Hazelcast in a Kubernetes environment; for more information, see https://github.com/hazelcast/hazelcast-kubernetes

feanor07 commented 4 years ago

I tested this by deploying a relatively simple application to AWS Elastic Beanstalk, and it worked fine. One thing to mention: make sure the security groups you use allow inbound traffic on the ports Hazelcast relies on.
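A quick way to check the security-group point is to probe a peer's Hazelcast port from another instance. The following is only a minimal sketch using the JDK's `java.net.Socket`; the `PortCheck` class and `isReachable` method are hypothetical names, not part of hazelcast-aws, and the default port 5701 is taken from the configurations posted in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of hazelcast-aws): probes a TCP port on a peer. */
final class PortCheck {

    private PortCheck() {
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable/routable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: localhost and port 5701 (the hz-port used in this thread).
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

Run it on one instance with the private IP of another instance as the first argument. A `false` result while the other member is running suggests the port is blocked between the two, although it cannot tell a security-group rule apart from other network problems.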

mesutcelik commented 4 years ago

@feanor07 Thanks!

It would be useful if you shared your code sample on GitHub so that other people can benefit from it. The other option is to write a guide hosted by Hazelcast, though I'm not sure how we would automate integration testing with Elastic Beanstalk to check that the guide stays valid over time.

https://github.com/hazelcast-guides/base-guide/wiki/How-to-write-a-guide

feanor07 commented 4 years ago

@mesutcelik I'm planning to write a small how-to article if I find some time soon. I will post it here when I am done 🤞

jklingsporn commented 4 years ago

@feanor07 If you have the time, could you please explain how you did it (just some bullet points), including any deviations from the default configuration? I am trying to configure it by defining an IAM role and using tags. I've also set up the security groups to allow any TCP communication on any port and given the IAM role the necessary EC2 permissions, but without luck. Your help would be really appreciated.

jklingsporn commented 4 years ago

Hazelcast version 4.0.2 / hazelcast-aws version 3.12. This is my ElasticBeanstalk configuration:

I am using vertx 4.0.0.Beta1 with the following module: https://vertx-web-site.github.io/docs/vertx-hazelcast/java/

This is my hazelcast-configuration:

<hazelcast xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.hazelcast.com/schema/config
           http://www.hazelcast.com/schema/config/hazelcast-config-4.0.xsd">

    <properties>
        <property name="hazelcast.wait.seconds.before.join">0</property>
        <property name="hazelcast.jmx">true</property>
        <property name="hazelcast.member.list.publish.interval.seconds">5</property>
    </properties>

    <cluster-name>dev</cluster-name>

    <network>
        <port auto-increment="false">5701</port>
        <join>
            <multicast enabled="false"/>
            <aws enabled="true">
                <tag-key>cluster</tag-key>
                <tag-value>dev</tag-value>
                <region>eu-central-1</region>
                <iam-role>aws-elasticbeanstalk-ec2-role</iam-role>
                <hz-port>5701</hz-port>
                <connection-timeout-seconds>10</connection-timeout-seconds>
            </aws>
        </join>
    </network>

    <multimap name="__vertx.subs">
        <backup-count>1</backup-count>
        <value-collection-type>SET</value-collection-type>
    </multimap>

    <map name="__vertx.haInfo">
        <backup-count>1</backup-count>
    </map>

    <map name="__vertx.nodeInfo">
        <backup-count>1</backup-count>
    </map>

    <cp-subsystem>
        <cp-member-count>0</cp-member-count>
        <semaphores>
            <semaphore>
                <name>__vertx.*</name>
                <jdk-compatible>false</jdk-compatible>
            </semaphore>
        </semaphores>
    </cp-subsystem>

</hazelcast>

The observed behavior is as follows: on application start, the nodes do not detect each other. This can be demonstrated by configuring one application as a lite member; the boot then fails with the following message:

Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.267 [vert.x-worker-thread-0] INFO  c.hazelcast.aws.AwsDiscoveryStrategy - Availability zone found: 'eu-central-1c'
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.267 [vert.x-worker-thread-0] DEBUG c.h.internal.cluster.ClusterService - [172.31.8.247]:5701 [dev] [4.0.2] Setting master address to null
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.268 [vert.x-worker-thread-0] DEBUG c.h.i.cluster.impl.DiscoveryJoiner - [172.31.8.247]:5701 [dev] [4.0.2] This node will assume master role since none of the possible members accepted join request.
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.268 [vert.x-worker-thread-0] TRACE c.h.i.c.impl.ClusterJoinManager - [172.31.8.247]:5701 [dev] [4.0.2] This node is being set as the master
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.268 [vert.x-worker-thread-0] DEBUG c.h.internal.cluster.ClusterService - [172.31.8.247]:5701 [dev] [4.0.2] Setting master address to [172.31.8.247]:5701
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.269 [vert.x-worker-thread-0] DEBUG c.h.i.cluster.impl.MembershipManager - [172.31.8.247]:5701 [dev] [4.0.2] Local member list join version is set to 1
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.269 [vert.x-worker-thread-0] DEBUG c.h.i.cluster.impl.DiscoveryJoiner - [172.31.8.247]:5701 [dev] [4.0.2] PostJoin master: [172.31.8.247]:5701, isMaster: true
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.270 [vert.x-worker-thread-0] INFO  c.h.internal.cluster.ClusterService - [172.31.8.247]:5701 [dev] [4.0.2]
Aug 21 15:05:08 ip-172-31-8-247 web: Members {size:1, ver:1} [
Aug 21 15:05:08 ip-172-31-8-247 web: Member [172.31.8.247]:5701 - f96cf760-7a1b-4da1-9812-093c6164179d this lite
Aug 21 15:05:08 ip-172-31-8-247 web: ]
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.278 [vert.x-worker-thread-0] TRACE c.h.i.m.ManagementCenterService - [172.31.8.247]:5701 [dev] [4.0.2] Creating new executor for Management Center service tasks with threadCount=2
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.278 [vert.x-worker-thread-0] TRACE com.hazelcast.config.Config - No configuration found for PhoneHome, using default config!
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.279 [vert.x-worker-thread-0] TRACE com.hazelcast.config.Config - No configuration found for PhoneHome, using default config!
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.287 [vert.x-worker-thread-0] TRACE c.h.i.diagnostics.HealthMonitor - [172.31.8.247]:5701 [dev] [4.0.2] HealthMonitor started
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.288 [vert.x-worker-thread-0] INFO  c.h.internal.jmx.ManagementService - [172.31.8.247]:5701 [dev] [4.0.2] Hazelcast JMX agent enabled.
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.309 [vert.x-worker-thread-0] INFO  com.hazelcast.core.LifecycleService - [172.31.8.247]:5701 [dev] [4.0.2] [172.31.8.247]:5701 is STARTED
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.330 [vert.x-worker-thread-0] TRACE c.h.internal.metrics.MetricsRegistry - [172.31.8.247]:5701 [dev] [4.0.2] Registered probeInstance [service=hz:impl:multiMapService,unit=count,metric=event.listenerCount]
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.330 [vert.x-worker-thread-0] TRACE c.h.internal.metrics.MetricsRegistry - [172.31.8.247]:5701 [dev] [4.0.2] Registered probeInstance [service=hz:impl:multiMapService,unit=count,metric=event.publicationCount]
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.453 [vert.x-acceptor-thread-0] TRACE io.vertx.core.net.impl.NetServerImpl - Net server listening on 172.31.8.247:/172.31.8.247:42189
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.465 [vert.x-worker-thread-2] WARN  c.h.i.p.impl.PartitionStateManager - [172.31.8.247]:5701 [dev] [4.0.2] No member group is available to assign partition ownership...
Aug 21 15:05:08 ip-172-31-8-247 web: 15:05:08.469 [vert.x-eventloop-thread-0] ERROR io.vertx.core.impl.VertxImpl - Failed to initialize clustered Vert.x
Aug 21 15:05:08 ip-172-31-8-247 web: com.hazelcast.partition.NoDataMemberInClusterException: Target of invocation cannot be found! Partition owner is null but partitions can't be assigned since all nodes in the cluster are lite members.
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.PartitionInvocation.newTargetNullException(PartitionInvocation.java:90)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.Invocation.initInvocationTarget(Invocation.java:270)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:562)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:540)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:237)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.spi.impl.operationservice.impl.InvocationBuilderImpl.invoke(InvocationBuilderImpl.java:59)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.map.impl.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:468)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.map.impl.proxy.MapProxySupport.putInternal(MapProxySupport.java:407)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.map.impl.proxy.MapProxyImpl.put(MapProxyImpl.java:121)
Aug 21 15:05:08 ip-172-31-8-247 web: at com.hazelcast.map.impl.proxy.MapProxyImpl.put(MapProxyImpl.java:111)
Aug 21 15:05:08 ip-172-31-8-247 web: at io.vertx.spi.cluster.hazelcast.HazelcastClusterManager.lambda$setNodeInfo$2(HazelcastClusterManager.java:173)
Aug 21 15:05:08 ip-172-31-8-247 web: at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:179)
Aug 21 15:05:08 ip-172-31-8-247 web: at io.vertx.core.impl.AbstractContext.emit(AbstractContext.java:181)
Aug 21 15:05:08 ip-172-31-8-247 web: at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:177)
Aug 21 15:05:08 ip-172-31-8-247 web: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Aug 21 15:05:08 ip-172-31-8-247 web: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Aug 21 15:05:08 ip-172-31-8-247 web: at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
Aug 21 15:05:08 ip-172-31-8-247 web: at java.lang.Thread.run(Thread.java:748)

When both environments have the lite-member setting set to false, the nodes discover each other only after several minutes.
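To see when the members actually find each other, one option is to watch the `Members {size:N, ver:M}` lines in the application log and wait for the size to exceed 1. The sketch below assumes the log format shown in the output above; `MemberListParser` and `parseSize` are hypothetical names introduced for illustration, not Hazelcast APIs.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the cluster size from a Hazelcast member-list log line. */
final class MemberListParser {

    // Matches e.g. "Members {size:1, ver:1} [" anywhere in the line (syslog prefix allowed).
    private static final Pattern MEMBERS =
            Pattern.compile("Members \\{size:(\\d+), ver:(\\d+)\\}");

    private MemberListParser() {
    }

    /** Returns the reported member count, or empty if the line is not a member-list header. */
    static OptionalInt parseSize(String logLine) {
        Matcher matcher = MEMBERS.matcher(logLine);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Sample line copied from the log output posted in this thread.
        String line = "Aug 21 15:05:08 ip-172-31-8-247 web: Members {size:1, ver:1} [";
        OptionalInt size = parseSize(line);
        boolean clustered = size.isPresent() && size.getAsInt() > 1;
        System.out.println(clustered ? "cluster formed" : "standalone or no member list");
    }
}
```

A size that stays at 1 on every instance means each node has elected itself master, which matches the "This node will assume master role" message in the log above.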

feanor07 commented 4 years ago

@jklingsporn I have been quite busy these past couple of weeks; I hope to find some time to respond with my configuration soon.

jklingsporn commented 4 years ago

Just a quick update: when I disable AWS discovery for the nodes that are lite members and enable tcp-ip with a fixed IP, everything works like a charm.

jklingsporn commented 4 years ago

@feanor07 our configuration has some resilience issues. If AWS decides to replace our only known instance with another machine, our whole cluster will die. Also, when we have to reboot the known cluster member, we have to reboot all other nodes as well to get a working cluster again.

leszko commented 3 years ago

@jklingsporn I think you need to change the Deployment policy to Rolling. Otherwise, you kill all Hazelcast members at once, and therefore the cluster is down and loses data.

I tested Hazelcast on AWS Elastic Beanstalk and it works correctly. I added a description in this PR: https://github.com/hazelcast/hazelcast-aws/pull/207. It should close this issue. I suggest opening new GH issues with more precise steps to reproduce if anyone encounters any problem with the Beanstalk environment.