Unicast Cluster on Azure Virtual Machines not working (0.90)

dynamicdeploy commented 11 years ago

I am trying to build a unicast cluster on Azure Virtual Machines, and it does not seem to work. I don't see any errors. I have enabled debug logging, and the logs are below. Both machines can talk to each other. Looking at the source and the debug output, the hosts array appears to be empty during initialization ("using initial hosts []") even though I specify two hosts in the config.

Node 1: Config

#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
 node.name: "xES1"

################################## Discovery ##################################

# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.

# Set to ensure a node sees N other master eligible nodes to be considered
# operational within the cluster. Set this option to a higher value (2-4)
# for large clusters (>3 nodes):
#
# discovery.zen.minimum_master_nodes: 1

# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
 discovery.zen.ping.timeout: 10s

# See <http://elasticsearch.org/guide/reference/modules/discovery/zen.html>
# for more information.

# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
#
#1. Disable multicast discovery (enabled by default):
#
 discovery.zen.ping.multicast.enabled: false
#
#2. Configure an initial list of master nodes in the cluster
#    to perform discovery when new nodes (master or data) are started:
#discovery.zen.ping.unicast.hosts: 
discovery.zen.ping.unicast.hosts:["elasticsearch3","rnynjpxyhcfhxdm"]
#discovery.zen.ping.unicast.hosts:["10.78.76.39:9300","10.78.26.64:9300"]

Node 1 logs:

[2013-05-23 19:35:12,433][INFO ][node                     ] [xES1] {0.90.0}[3136]: initializing ...
[2013-05-23 19:35:12,435][DEBUG][node                     ] [xES1] using home [C:\ddapplications\elasticsearch-0.90.0], config [C:\ddapplications\elasticsearch-0.90.0\config], data [[C:\ddapplications\elasticsearch-0.90.0\data]], logs [C:\ddapplications\elasticsearch-0.90.0\logs], work [C:\ddapplications\elasticsearch-0.90.0\work], plugins [C:\ddapplications\elasticsearch-0.90.0\plugins]
[2013-05-23 19:35:12,454][INFO ][plugins                  ] [xES1] loaded [], sites [head]
[2013-05-23 19:35:12,524][DEBUG][common.compress.lzf      ] using [UnsafeChunkDecoder] decoder
[2013-05-23 19:35:12,567][DEBUG][env                      ] [xES1] using node location [[C:\ddapplications\elasticsearch-0.90.0\data\elasticsearch\nodes\0]], local_node_id [0]
[2013-05-23 19:35:15,035][DEBUG][threadpool               ] [xES1] creating thread_pool [generic], type [cached], keep_alive [30s]
[2013-05-23 19:35:15,057][DEBUG][threadpool               ] [xES1] creating thread_pool [index], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:35:15,059][DEBUG][threadpool               ] [xES1] creating thread_pool [bulk], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:35:15,060][DEBUG][threadpool               ] [xES1] creating thread_pool [get], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:35:15,065][DEBUG][threadpool               ] [xES1] creating thread_pool [search], type [fixed], size [2], queue_size [1k], reject_policy [abort], queue_type [linked]
[2013-05-23 19:35:15,066][DEBUG][threadpool               ] [xES1] creating thread_pool [percolate], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:35:15,067][DEBUG][threadpool               ] [xES1] creating thread_pool [management], type [scaling], min [1], size [5], keep_alive [5m]
[2013-05-23 19:35:15,069][DEBUG][threadpool               ] [xES1] creating thread_pool [flush], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:35:15,070][DEBUG][threadpool               ] [xES1] creating thread_pool [merge], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:35:15,071][DEBUG][threadpool               ] [xES1] creating thread_pool [refresh], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:35:15,072][DEBUG][threadpool               ] [xES1] creating thread_pool [warmer], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:35:15,073][DEBUG][threadpool               ] [xES1] creating thread_pool [snapshot], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:35:15,147][DEBUG][transport.netty          ] [xES1] using worker_count[2], port[9300-9400], bind_host[null], publish_host[null], compress[false], connect_timeout[30s], connections_per_node[2/6/1], receive_predictor[512kb->512kb]
[2013-05-23 19:35:15,166][DEBUG][discovery.zen.ping.unicast] [xES1] using initial hosts [], with concurrent_connects [10]
[2013-05-23 19:35:15,169][DEBUG][discovery.zen            ] [xES1] using ping.timeout [10s], master_election.filter_client [true], master_election.filter_data [false]
[2013-05-23 19:35:15,171][DEBUG][discovery.zen.elect      ] [xES1] using minimum_master_nodes [-1]
[2013-05-23 19:35:15,174][DEBUG][discovery.zen.fd         ] [xES1] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2013-05-23 19:35:15,185][DEBUG][discovery.zen.fd         ] [xES1] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2013-05-23 19:35:15,264][DEBUG][monitor.jvm              ] [xES1] enabled [true], last_gc_enabled [false], interval [1s], gc_threshold [{default=GcThreshold{name='default', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, ParNew=GcThreshold{name='ParNew', warnThreshold=1000, infoThreshold=700, debugThreshold=400}, ConcurrentMarkSweep=GcThreshold{name='ConcurrentMarkSweep', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}}]
[2013-05-23 19:35:15,782][DEBUG][monitor.os               ] [xES1] Using probe [org.elasticsearch.monitor.os.SigarOsProbe@4ec48e7] with refresh_interval [1s]
[2013-05-23 19:35:15,805][DEBUG][monitor.process          ] [xES1] Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@26d6221b] with refresh_interval [1s]
[2013-05-23 19:35:15,825][DEBUG][monitor.jvm              ] [xES1] Using refresh_interval [1s]
[2013-05-23 19:35:15,827][DEBUG][monitor.network          ] [xES1] Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@2d923a8f] with refresh_interval [5s]
[2013-05-23 19:35:16,019][DEBUG][monitor.network          ] [xES1] net_info
host [rnynjpxyhcfhxdm]
lo  display_name [Software Loopback Interface 1]
        address [/127.0.0.1] [/0:0:0:0:0:0:0:1] 
        mtu [-1] multicast [true] ptp [false] loopback [true] up [true] virtual [false]
net0    display_name [WAN Miniport (L2TP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net1    display_name [WAN Miniport (SSTP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net2    display_name [WAN Miniport (IKEv2)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net3    display_name [WAN Miniport (PPTP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
ppp0    display_name [WAN Miniport (PPPOE)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth0    display_name [WAN Miniport (IP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth1    display_name [WAN Miniport (IPv6)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth2    display_name [WAN Miniport (Network Monitor)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth3    display_name [Microsoft Kernel Debug Network Adapter]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
ppp1    display_name [RAS Async Adapter]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth4    display_name [Microsoft Hyper-V Network Adapter]
        address [/10.78.76.39] [/fe80:0:0:0:4ccd:6139:c38f:e027%12] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [true] virtual [false]
eth5    display_name [WAN Miniport (IP)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth6    display_name [WAN Miniport (IPv6)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth7    display_name [WAN Miniport (Network Monitor)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth8    display_name [Microsoft Hyper-V Network Adapter-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth9    display_name [Microsoft Hyper-V Network Adapter-WFP 802.3 MAC Layer LightWeight Filter-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth10   display_name [Microsoft Hyper-V Network Adapter-WFP Native MAC Layer LightWeight Filter-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net4    display_name [Microsoft ISATAP Adapter]
        address [/fe80:0:0:0:0:5efe:a4e:4c27%19] 
        mtu [1280] multicast [false] ptp [true] loopback [false] up [false] virtual [false]

[2013-05-23 19:35:16,105][DEBUG][monitor.fs               ] [xES1] Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@e1b054c] with refresh_interval [1s]
[2013-05-23 19:35:16,703][DEBUG][indices.store            ] [xES1] using indices.store.throttle.type [none], with index.store.throttle.max_bytes_per_sec [0b]
[2013-05-23 19:35:16,718][DEBUG][cache.memory             ] [xES1] using bytebuffer cache with small_buffer_size [1kb], large_buffer_size [1mb], small_cache_size [10mb], large_cache_size [500mb], direct [true]
[2013-05-23 19:35:16,744][DEBUG][script                   ] [xES1] using script cache with max_size [500], expire [null]
[2013-05-23 19:35:16,836][DEBUG][cluster.routing.allocation.decider] [xES1] using node_concurrent_recoveries [2], node_initial_primaries_recoveries [4]
[2013-05-23 19:35:16,840][DEBUG][cluster.routing.allocation.decider] [xES1] using [cluster.routing.allocation.allow_rebalance] with [indices_all_active]
[2013-05-23 19:35:16,841][DEBUG][cluster.routing.allocation.decider] [xES1] using [cluster_concurrent_rebalance] with [2]
[2013-05-23 19:35:16,846][DEBUG][gateway.local            ] [xES1] using initial_shards [quorum], list_timeout [30s]
[2013-05-23 19:35:17,100][DEBUG][indices.recovery         ] [xES1] using max_size_per_sec[0b], concurrent_streams [3], file_chunk_size [512kb], translog_size [512kb], translog_ops [1000], and compress [true]
[2013-05-23 19:35:17,254][DEBUG][http.netty               ] [xES1] using max_chunk_size[8kb], max_header_size[8kb], max_initial_line_length[4kb], max_content_length[100mb], receive_predictor[512kb->512kb]
[2013-05-23 19:35:17,263][DEBUG][indices.memory           ] [xES1] using index_buffer_size [101.5mb], with min_shard_index_buffer_size [4mb], max_shard_index_buffer_size [512mb], shard_inactive_time [30m]
[2013-05-23 19:35:17,279][DEBUG][indices.cache.filter     ] [xES1] using [node] weighted filter cache with size [20%], actual_size [203.1mb], expire [null], clean_interval [1m]
[2013-05-23 19:35:17,282][DEBUG][indices.fielddata.cache  ] [xES1] using size [-1] [-1b], expire [null]
[2013-05-23 19:35:17,297][DEBUG][gateway.local.state.meta ] [xES1] using gateway.local.auto_import_dangled [YES], with gateway.local.dangling_timeout [2h]
[2013-05-23 19:35:17,358][DEBUG][gateway.local.state.meta ] [xES1] took 60ms to load state
[2013-05-23 19:35:17,359][DEBUG][gateway.local.state.shards] [xES1] took 0s to load started shards state
[2013-05-23 19:35:17,367][DEBUG][bulk.udp                 ] [xES1] using enabled [false], host [null], port [9700-9800], bulk_actions [1000], bulk_size [5mb], flush_interval [5s], concurrent_requests [4]
[2013-05-23 19:35:17,370][INFO ][node                     ] [xES1] {0.90.0}[3136]: initialized
[2013-05-23 19:35:17,371][INFO ][node                     ] [xES1] {0.90.0}[3136]: starting ...
[2013-05-23 19:35:17,484][DEBUG][netty.channel.socket.nio.SelectorUtil] Using select timeout of 500
[2013-05-23 19:35:17,485][DEBUG][netty.channel.socket.nio.SelectorUtil] Epoll-bug workaround enabled = false
[2013-05-23 19:35:17,588][DEBUG][transport.netty          ] [xES1] Bound to address [/0:0:0:0:0:0:0:0:9300]
[2013-05-23 19:35:17,659][INFO ][transport                ] [xES1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.78.76.39:9300]}
[2013-05-23 19:35:27,756][DEBUG][discovery.zen            ] [xES1] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2013-05-23 19:35:27,767][DEBUG][cluster.service          ] [xES1] processing [zen-disco-join (elected_as_master)]: execute
[2013-05-23 19:35:27,769][DEBUG][cluster.service          ] [xES1] cluster state updated, version [1], source [zen-disco-join (elected_as_master)]
[2013-05-23 19:35:27,772][INFO ][cluster.service          ] [xES1] new_master [xES1][QVFMXbh8Tt6A25sY8eZauA][inet[/10.78.76.39:9300]], reason: zen-disco-join (elected_as_master)
[2013-05-23 19:35:27,845][DEBUG][transport.netty          ] [xES1] connected to node [[xES1][QVFMXbh8Tt6A25sY8eZauA][inet[/10.78.76.39:9300]]]
[2013-05-23 19:35:27,851][DEBUG][cluster.service          ] [xES1] processing [zen-disco-join (elected_as_master)]: done applying updated cluster_state
[2013-05-23 19:35:27,852][INFO ][discovery                ] [xES1] elasticsearch/QVFMXbh8Tt6A25sY8eZauA
[2013-05-23 19:35:27,875][DEBUG][cluster.service          ] [xES1] processing [local-gateway-elected-state]: execute
[2013-05-23 19:35:27,897][DEBUG][cluster.service          ] [xES1] cluster state updated, version [2], source [local-gateway-elected-state]
[2013-05-23 19:35:27,982][INFO ][gateway                  ] [xES1] recovered [0] indices into cluster_state
[2013-05-23 19:35:27,984][DEBUG][cluster.service          ] [xES1] processing [local-gateway-elected-state]: done applying updated cluster_state
[2013-05-23 19:35:27,985][DEBUG][river.cluster            ] [xES1] processing [reroute_rivers_node_changed]: execute
[2013-05-23 19:35:27,985][DEBUG][river.cluster            ] [xES1] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-05-23 19:35:27,987][DEBUG][river.cluster            ] [xES1] processing [reroute_rivers_node_changed]: execute
[2013-05-23 19:35:27,988][DEBUG][river.cluster            ] [xES1] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-05-23 19:35:28,059][INFO ][http                     ] [xES1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.78.76.39:9200]}
[2013-05-23 19:35:28,061][INFO ][node                     ] [xES1] {0.90.0}[3136]: started
[2013-05-23 19:35:37,849][DEBUG][cluster.service          ] [xES1] processing [routing-table-updater]: execute
[2013-05-23 19:35:37,851][DEBUG][cluster.service          ] [xES1] processing [routing-table-updater]: no change in cluster_state
[2013-05-23 19:40:51,172][TRACE][action.admin.cluster.health] [xES1] Calculating health based on state version [2]

Node2 Config

#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
 node.name: "xES2"

################################## Discovery ##################################

# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.

# Set to ensure a node sees N other master eligible nodes to be considered
# operational within the cluster. Set this option to a higher value (2-4)
# for large clusters (>3 nodes):
#
# discovery.zen.minimum_master_nodes: 1

# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
 discovery.zen.ping.timeout: 10s

# See <http://elasticsearch.org/guide/reference/modules/discovery/zen.html>
# for more information.

# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
#
#1. Disable multicast discovery (enabled by default):
#
 discovery.zen.ping.multicast.enabled: false
#
#2. Configure an initial list of master nodes in the cluster
#    to perform discovery when new nodes (master or data) are started:
#
# discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]
discovery.zen.ping.unicast.hosts:["rnynjpxyhcfhxdm","elasticsearch3"]

Node 2 Logs

====================================================================
[2013-05-23 19:58:28,732][INFO ][node                     ] [xES2] {0.90.0}[3604]: initializing ...
[2013-05-23 19:58:28,734][DEBUG][node                     ] [xES2] using home [C:\ddapplications\elasticsearch-0.90.0], config [C:\ddapplications\elasticsearch-0.90.0\config], data [[C:\ddapplications\elasticsearch-0.90.0\data]], logs [C:\ddapplications\elasticsearch-0.90.0\logs], work [C:\ddapplications\elasticsearch-0.90.0\work], plugins [C:\ddapplications\elasticsearch-0.90.0\plugins]
[2013-05-23 19:58:28,751][INFO ][plugins                  ] [xES2] loaded [], sites [head]
[2013-05-23 19:58:28,821][DEBUG][common.compress.lzf      ] using [UnsafeChunkDecoder] decoder
[2013-05-23 19:58:28,863][DEBUG][env                      ] [xES2] using node location [[C:\ddapplications\elasticsearch-0.90.0\data\elasticsearch\nodes\0]], local_node_id [0]
[2013-05-23 19:58:31,712][DEBUG][threadpool               ] [xES2] creating thread_pool [generic], type [cached], keep_alive [30s]
[2013-05-23 19:58:31,742][DEBUG][threadpool               ] [xES2] creating thread_pool [index], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:58:31,743][DEBUG][threadpool               ] [xES2] creating thread_pool [bulk], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:58:31,744][DEBUG][threadpool               ] [xES2] creating thread_pool [get], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:58:31,749][DEBUG][threadpool               ] [xES2] creating thread_pool [search], type [fixed], size [2], queue_size [1k], reject_policy [abort], queue_type [linked]
[2013-05-23 19:58:31,750][DEBUG][threadpool               ] [xES2] creating thread_pool [percolate], type [fixed], size [1], queue_size [null], reject_policy [abort], queue_type [linked]
[2013-05-23 19:58:31,751][DEBUG][threadpool               ] [xES2] creating thread_pool [management], type [scaling], min [1], size [5], keep_alive [5m]
[2013-05-23 19:58:31,753][DEBUG][threadpool               ] [xES2] creating thread_pool [flush], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:58:31,754][DEBUG][threadpool               ] [xES2] creating thread_pool [merge], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:58:31,755][DEBUG][threadpool               ] [xES2] creating thread_pool [refresh], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:58:31,756][DEBUG][threadpool               ] [xES2] creating thread_pool [warmer], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:58:31,757][DEBUG][threadpool               ] [xES2] creating thread_pool [snapshot], type [scaling], min [1], size [1], keep_alive [5m]
[2013-05-23 19:58:31,827][DEBUG][transport.netty          ] [xES2] using worker_count[2], port[9300-9400], bind_host[null], publish_host[null], compress[false], connect_timeout[30s], connections_per_node[2/6/1], receive_predictor[512kb->512kb]
[2013-05-23 19:58:31,852][DEBUG][discovery.zen.ping.unicast] [xES2] using initial hosts [], with concurrent_connects [10]
[2013-05-23 19:58:31,855][DEBUG][discovery.zen            ] [xES2] using ping.timeout [10s], master_election.filter_client [true], master_election.filter_data [false]
[2013-05-23 19:58:31,857][DEBUG][discovery.zen.elect      ] [xES2] using minimum_master_nodes [-1]
[2013-05-23 19:58:31,860][DEBUG][discovery.zen.fd         ] [xES2] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2013-05-23 19:58:31,871][DEBUG][discovery.zen.fd         ] [xES2] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2013-05-23 19:58:31,962][DEBUG][monitor.jvm              ] [xES2] enabled [true], last_gc_enabled [false], interval [1s], gc_threshold [{default=GcThreshold{name='default', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, ParNew=GcThreshold{name='ParNew', warnThreshold=1000, infoThreshold=700, debugThreshold=400}, ConcurrentMarkSweep=GcThreshold{name='ConcurrentMarkSweep', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}}]
[2013-05-23 19:58:32,483][DEBUG][monitor.os               ] [xES2] Using probe [org.elasticsearch.monitor.os.SigarOsProbe@4ec48e7] with refresh_interval [1s]
[2013-05-23 19:58:32,495][DEBUG][monitor.process          ] [xES2] Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@26d6221b] with refresh_interval [1s]
[2013-05-23 19:58:32,515][DEBUG][monitor.jvm              ] [xES2] Using refresh_interval [1s]
[2013-05-23 19:58:32,526][DEBUG][monitor.network          ] [xES2] Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@2d923a8f] with refresh_interval [5s]
[2013-05-23 19:58:32,711][DEBUG][monitor.network          ] [xES2] net_info
host [elasticsearch3]
lo  display_name [Software Loopback Interface 1]
        address [/127.0.0.1] [/0:0:0:0:0:0:0:1] 
        mtu [-1] multicast [true] ptp [false] loopback [true] up [true] virtual [false]
net0    display_name [WAN Miniport (L2TP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net1    display_name [WAN Miniport (SSTP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net2    display_name [WAN Miniport (IKEv2)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net3    display_name [WAN Miniport (PPTP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
ppp0    display_name [WAN Miniport (PPPOE)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth0    display_name [WAN Miniport (IP)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth1    display_name [WAN Miniport (IPv6)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth2    display_name [WAN Miniport (Network Monitor)]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth3    display_name [Microsoft Kernel Debug Network Adapter]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
ppp1    display_name [RAS Async Adapter]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth4    display_name [Microsoft Hyper-V Network Adapter]
        address [/10.78.26.64] [/fe80:0:0:0:e024:1b29:9f61:eda%12] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [true] virtual [false]
eth5    display_name [WAN Miniport (IP)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth6    display_name [WAN Miniport (IPv6)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth7    display_name [WAN Miniport (Network Monitor)-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth8    display_name [Microsoft Hyper-V Network Adapter-QoS Packet Scheduler-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth9    display_name [Microsoft Hyper-V Network Adapter-WFP 802.3 MAC Layer LightWeight Filter-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
eth10   display_name [Microsoft Hyper-V Network Adapter-WFP Native MAC Layer LightWeight Filter-0000]
        address 
        mtu [-1] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
net4    display_name [Microsoft ISATAP Adapter]
        address [/fe80:0:0:0:0:5efe:a4e:1a40%19] 
        mtu [1280] multicast [false] ptp [true] loopback [false] up [false] virtual [false]

[2013-05-23 19:58:32,786][DEBUG][monitor.fs               ] [xES2] Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@e1b054c] with refresh_interval [1s]
[2013-05-23 19:58:33,377][DEBUG][indices.store            ] [xES2] using indices.store.throttle.type [none], with index.store.throttle.max_bytes_per_sec [0b]
[2013-05-23 19:58:33,391][DEBUG][cache.memory             ] [xES2] using bytebuffer cache with small_buffer_size [1kb], large_buffer_size [1mb], small_cache_size [10mb], large_cache_size [500mb], direct [true]
[2013-05-23 19:58:33,418][DEBUG][script                   ] [xES2] using script cache with max_size [500], expire [null]
[2013-05-23 19:58:33,515][DEBUG][cluster.routing.allocation.decider] [xES2] using node_concurrent_recoveries [2], node_initial_primaries_recoveries [4]
[2013-05-23 19:58:33,517][DEBUG][cluster.routing.allocation.decider] [xES2] using [cluster.routing.allocation.allow_rebalance] with [indices_all_active]
[2013-05-23 19:58:33,518][DEBUG][cluster.routing.allocation.decider] [xES2] using [cluster_concurrent_rebalance] with [2]
[2013-05-23 19:58:33,523][DEBUG][gateway.local            ] [xES2] using initial_shards [quorum], list_timeout [30s]
[2013-05-23 19:58:33,775][DEBUG][indices.recovery         ] [xES2] using max_size_per_sec[0b], concurrent_streams [3], file_chunk_size [512kb], translog_size [512kb], translog_ops [1000], and compress [true]
[2013-05-23 19:58:33,931][DEBUG][http.netty               ] [xES2] using max_chunk_size[8kb], max_header_size[8kb], max_initial_line_length[4kb], max_content_length[100mb], receive_predictor[512kb->512kb]
[2013-05-23 19:58:33,941][DEBUG][indices.memory           ] [xES2] using index_buffer_size [101.5mb], with min_shard_index_buffer_size [4mb], max_shard_index_buffer_size [512mb], shard_inactive_time [30m]
[2013-05-23 19:58:33,954][DEBUG][indices.cache.filter     ] [xES2] using [node] weighted filter cache with size [20%], actual_size [203.1mb], expire [null], clean_interval [1m]
[2013-05-23 19:58:33,957][DEBUG][indices.fielddata.cache  ] [xES2] using size [-1] [-1b], expire [null]
[2013-05-23 19:58:33,973][DEBUG][gateway.local.state.meta ] [xES2] using gateway.local.auto_import_dangled [YES], with gateway.local.dangling_timeout [2h]
[2013-05-23 19:58:34,033][DEBUG][gateway.local.state.meta ] [xES2] took 59ms to load state
[2013-05-23 19:58:34,034][DEBUG][gateway.local.state.shards] [xES2] took 0s to load started shards state
[2013-05-23 19:58:34,049][DEBUG][bulk.udp                 ] [xES2] using enabled [false], host [null], port [9700-9800], bulk_actions [1000], bulk_size [5mb], flush_interval [5s], concurrent_requests [4]
[2013-05-23 19:58:34,051][INFO ][node                     ] [xES2] {0.90.0}[3604]: initialized
[2013-05-23 19:58:34,052][INFO ][node                     ] [xES2] {0.90.0}[3604]: starting ...
[2013-05-23 19:58:34,259][DEBUG][netty.channel.socket.nio.SelectorUtil] Using select timeout of 500
[2013-05-23 19:58:34,260][DEBUG][netty.channel.socket.nio.SelectorUtil] Epoll-bug workaround enabled = false
[2013-05-23 19:58:34,263][DEBUG][transport.netty          ] [xES2] Bound to address [/0:0:0:0:0:0:0:0:9300]
[2013-05-23 19:58:34,340][INFO ][transport                ] [xES2] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.78.26.64:9300]}
[2013-05-23 19:58:44,430][DEBUG][discovery.zen            ] [xES2] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2013-05-23 19:58:44,439][DEBUG][cluster.service          ] [xES2] processing [zen-disco-join (elected_as_master)]: execute
[2013-05-23 19:58:44,440][DEBUG][cluster.service          ] [xES2] cluster state updated, version [1], source [zen-disco-join (elected_as_master)]
[2013-05-23 19:58:44,442][INFO ][cluster.service          ] [xES2] new_master [xES2][Nss1G-bwT0aU3UYeF-A0Lg][inet[/10.78.26.64:9300]], reason: zen-disco-join (elected_as_master)
[2013-05-23 19:58:44,592][DEBUG][transport.netty          ] [xES2] connected to node [[xES2][Nss1G-bwT0aU3UYeF-A0Lg][inet[/10.78.26.64:9300]]]
[2013-05-23 19:58:44,599][DEBUG][cluster.service          ] [xES2] processing [zen-disco-join (elected_as_master)]: done applying updated cluster_state
[2013-05-23 19:58:44,600][INFO ][discovery                ] [xES2] elasticsearch/Nss1G-bwT0aU3UYeF-A0Lg
[2013-05-23 19:58:44,629][DEBUG][cluster.service          ] [xES2] processing [local-gateway-elected-state]: execute
[2013-05-23 19:58:44,640][DEBUG][cluster.service          ] [xES2] cluster state updated, version [2], source [local-gateway-elected-state]
[2013-05-23 19:58:44,722][INFO ][gateway                  ] [xES2] recovered [0] indices into cluster_state
[2013-05-23 19:58:44,723][DEBUG][cluster.service          ] [xES2] processing [local-gateway-elected-state]: done applying updated cluster_state
[2013-05-23 19:58:44,727][DEBUG][river.cluster            ] [xES2] processing [reroute_rivers_node_changed]: execute
[2013-05-23 19:58:44,728][DEBUG][river.cluster            ] [xES2] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-05-23 19:58:44,729][DEBUG][river.cluster            ] [xES2] processing [reroute_rivers_node_changed]: execute
[2013-05-23 19:58:44,730][DEBUG][river.cluster            ] [xES2] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-05-23 19:58:44,796][INFO ][http                     ] [xES2] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.78.26.64:9200]}
[2013-05-23 19:58:44,797][INFO ][node                     ] [xES2] {0.90.0}[3604]: started
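Both debug logs above contain a "using initial hosts [...]" line from discovery.zen.ping.unicast, which shows what each node actually picked up from its config. A minimal sketch for pulling that value out of a saved log follows. The function name is hypothetical, and the comma-separated format assumed for a non-empty list is a guess, since only the empty case appears in this thread.

```python
import re

# Matches the unicast discovery debug line seen in the logs above, e.g.
# "...[discovery.zen.ping.unicast] [xES1] using initial hosts [], with concurrent_connects [10]"
_INITIAL_HOSTS = re.compile(
    r"\[discovery\.zen\.ping\.unicast\]\s*\[(?P<node>[^\]]+)\] "
    r"using initial hosts \[(?P<hosts>.*?)\], with concurrent_connects"
)


def initial_hosts_by_node(log_text):
    """Map each node name to the list of initial unicast hosts it logged."""
    hosts_by_node = {}
    for line in log_text.splitlines():
        match = _INITIAL_HOSTS.search(line)
        if not match:
            continue
        raw = match.group("hosts").strip()
        # Assumption: a non-empty list is comma-separated; an empty one is "[]".
        hosts = [h.strip() for h in raw.split(",")] if raw else []
        hosts_by_node[match.group("node")] = hosts
    return hosts_by_node
```

An empty list for a node means it had no unicast targets to ping at startup, which is consistent with each node electing itself master after the 10s ping timeout.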

renaudboutet commented 11 years ago

Hi,

Are you deploying your VMs in the same virtual network? I use both Azure and ES (obviously :)), and in my experience, if you use only DNS names, traffic has to go through Azure's load balancer, which can intercept and cut your connections.

This works for us because all our machines are deployed in virtual networks.

dynamicdeploy commented 11 years ago

Hello Renaud,

I am deploying both VMs in the same cloud service, so they should be able to talk to each other. The cloud service creates a network boundary (similar to a virtual network). After deploying, I can reach http://[name of the other machine]:9200 directly without going through the load balancer. From the debug logs, it looks as if the node is not even trying to ping the machines listed in the config file.

If you are interested, I can set up a quick environment for you to try it out.

Thanks for your response.

Tejaswi

dynamicdeploy commented 11 years ago

I also tried using IP addresses instead of host names, with the same result.

dynamicdeploy commented 11 years ago

If I hit port 9300 in the browser, I get the following error on the other machine. That means the connection gets through. So why is the other node not detected during startup?

[2013-05-24 16:15:34,176][WARN ][transport.netty ] [Celestial Madonna] exception caught on transport layer [[id: 0xa028c0e0, /10.78.26.64:50376 :> /10.78.76.39:9300]], closing connection
java.io.StreamCorruptedException: invalid internal transport message format
    at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:27)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:482)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:365)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireChannelDisconnected(Channels.java:396)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:336)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:81)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
    at org.elasticsearch.common.netty.channel.Channels.close(Channels.java:812)
    at org.elasticsearch.common.netty.channel.AbstractChannel.close(AbstractChannel.java:197)
    at org.elasticsearch.transport.netty.NettyTransport.exceptionCaught(NettyTransport.java:505)
    at org.elasticsearch.transport.netty.MessageChannelHandler.exceptionCaught(MessageChannelHandler.java:227)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
    at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:566)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
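
The trace above is most likely just the transport port rejecting the browser's HTTP request, since port 9300 speaks Elasticsearch's own binary protocol rather than HTTP. A minimal sketch of a quieter reachability check, which opens a TCP connection and sends nothing, might look like this. The function name `can_connect` and the default timeout are illustrative assumptions, not part of Elasticsearch.

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened.

    Hypothetical helper: it only tests reachability and sends no payload,
    so it says nothing about whether discovery itself is configured correctly.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `can_connect("elasticsearch3")` from the other VM would confirm the same thing the browser test did. A `True` result only shows that the network path is open, which is why the nodes here could reach each other and still not form a cluster.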

dadoonet commented 11 years ago

Could you try to add a space just before:

 discovery.zen.ping.unicast.hosts:["elasticsearch3","rnynjpxyhcfhxdm"]

The hosts list in this log entry should not be empty, AFAIK:

[2013-05-23 19:58:31,852][DEBUG][discovery.zen.ping.unicast] [xES2] using initial hosts [], with concurrent_connects [10]

BTW, we are currently building an Azure discovery plugin that will make Elasticsearch discovery on Azure easier to set up.
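
The thread does not spell out which space was missing. One plausible reading is the space YAML expects between the colon and the value, which the posted `discovery.zen.ping.unicast.hosts:[...]` line lacks. Under that assumption, a rough sketch of a pre-start check for `elasticsearch.yml` could look like this. The function name is hypothetical, and the regular expression is a heuristic, not a YAML parser.

```python
import re
from typing import List

# Heuristic: a setting name followed by a colon with no whitespace after it.
_MISSING_SPACE = re.compile(r"^\s*[\w.\-]+:(?=\S)")


def find_missing_colon_space(text: str) -> List[int]:
    """Return 1-based line numbers of uncommented lines written as key:value."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip commented-out settings
        if _MISSING_SPACE.match(line):
            hits.append(number)
    return hits
```

Run against the config posted at the top of this issue, the check would flag the uncommented hosts line and leave settings such as `node.name: "xES1"` alone.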

dynamicdeploy commented 11 years ago

Thanks David. That worked. I would love to test the Azure Discovery plugin.

mikelazell commented 11 years ago

I had exactly the same issue: a missing space. Thanks, guys.
