elastic / elasticsearch-hadoop

:elephant: Elasticsearch real-time search and analytics natively integrated with Hadoop
https://www.elastic.co/products/hadoop
Apache License 2.0

Importing data to ElasticSearch using Pig #1022

Closed splikhita closed 7 years ago

splikhita commented 7 years ago

What kind of an issue is this?

Issue description

I'm importing data to elasticsearch using apache pig. First I installed apache pig, then downloaded ES-Hadoop jar file and imported it to HDFS. While running the commands from the grunt console I get errors.

Steps to reproduce

From the Grunt console, I registered the ES-Hadoop jar file with Pig:

grunt> REGISTER hdfs://localhost:9000/lib/elasticsearch-hadoop-2.1.1.jar;

Then I loaded the CSV data file as a relation using the following command:

grunt> SOURCE = load '/lib/ch07/crime_data.csv' using PigStorage(',') as (id:chararray, caseNumber:chararray, date:datetime, block:chararray, iucr:chararray, primaryType:chararray, description:chararray, location:chararray, arrest:boolean, domestic:boolean, lat:double,lon:double);

Next I generated the target Pig relation with the following structure:

grunt> TARGET = foreach SOURCE generate id, caseNumber, date, block, iucr, primaryType, description, location, arrest, domestic, TOTUPLE(lon, lat) AS geoLocation;

Finally, I stored the TARGET relation into the Elasticsearch index, as shown in the following command:

grunt> STORE TARGET INTO 'esh_pig/crimes' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.timeout = 5m', 'es.index.auto.create = true', 'es.mapping.names=arrest:isArrest, domestic:isDomestic', 'es.mapping.id=id');

This gives me the following errors:

java.lang.Exception: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot determine write shards for [esh_pig/crimes]; likely its format is incorrect (maybe it contains illegal characters?)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot determine write shards for [esh_pig/crimes]; likely its format is incorrect (maybe it contains illegal characters?)
    at org.elasticsearch.hadoop.util.Assert.isTrue(Assert.java:50)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:429)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:394)
    at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
    at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:149)
    at org.elasticsearch.hadoop.pig.EsStorage.putNext(EsStorage.java:192)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:136)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:95)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:281)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
2017-07-20 10:39:04,246 [communication thread] INFO  org.apache.hadoop.mapred.LocalJobRunner - map > map
2017-07-20 10:39:04,577 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2017-07-20 10:39:04,577 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local845900398_0005 has failed! Stop running all dependent jobs
2017-07-20 10:39:04,577 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2017-07-20 10:39:04,578 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-07-20 10:39:04,580 [main] INFO  org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2017-07-20 10:39:04,580 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2017-07-20 10:39:04,581 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.7.3   0.15.0  eshadoop    2017-07-20 10:38:48 2017-07-20 10:39:04 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_local845900398_0005 SOURCE,TARGET   MAP_ONLY    Message: Job failed!    esh_pig/crimes,

Input(s):
Failed to read data from "/lib/ch07/crime_data.csv"

Output(s):
Failed to produce result in "esh_pig/crimes"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local845900398_0005

2017-07-20 10:39:04,581 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

Version Info

OS: : CentOS 7 JVM : 1.8.0_131 Hadoop/Spark: Hadoop 2.7.3 ES : 1.7.1

Can someone please help me?

jbaiera commented 7 years ago

@splikhita I would download and try the latest ES-Hadoop version to see if the problem you are seeing is already resolved. The current 5.5.0 release of ES-Hadoop supports Elasticsearch all the way back to the 1.7 line.
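For example, upgrading would look roughly like the following (a sketch only; the 5.5.0 version number and the `/lib/` HDFS path are assumptions based on your original commands, so adjust them to whatever release and layout you actually use):

```shell
# Download and unpack the ES-Hadoop distribution (version assumed here)
wget https://artifacts.elastic.co/downloads/elasticsearch-hadoop/elasticsearch-hadoop-5.5.0.zip
unzip elasticsearch-hadoop-5.5.0.zip

# Upload the jar to the same HDFS location your REGISTER statement points at
hdfs dfs -put elasticsearch-hadoop-5.5.0/dist/elasticsearch-hadoop-5.5.0.jar /lib/
```

Then update the `REGISTER hdfs://localhost:9000/lib/...` statement in your Grunt session to reference the new jar name.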

Second, could you increase the logging level to TRACE on the org.elasticsearch.hadoop package and post the resulting logs here?
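The usual place to do that is the log4j configuration that Pig picks up (a sketch; the exact file location depends on your installation, e.g. `$PIG_HOME/conf/log4j.properties`):

```properties
# Enable TRACE logging for the ES-Hadoop connector classes
log4j.logger.org.elasticsearch.hadoop=TRACE
```

You can also point Pig at a specific log4j configuration file when launching it with the `-4` option (e.g. `pig -4 /path/to/log4j.properties`). Note these should be the client-side connector logs from the Pig job, not the Elasticsearch server logs.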

splikhita commented 7 years ago

Thank you so much for your response, sir. I am new to this area, so if you don't mind, could you please let me know where to change the logging level?

splikhita commented 7 years ago

I got it. I've changed all of them to TRACE, and here are the logs:

[2017-07-21 10:20:57,512][TRACE][monitor.sigar            ] [myapp_es_data_001] sigar loaded successfully
[2017-07-21 10:20:57,869][TRACE][indices.breaker          ] [myapp_es_data_001] parent circuit breaker with settings [PARENT,type=PARENT,limit=726571417/692.9mb,overhead=1.0]
[2017-07-21 10:20:57,870][TRACE][indices.breaker          ] [myapp_es_data_001] creating ChildCircuitBreaker with settings [FIELDDATA,type=MEMORY,limit=622775500/593.9mb,overhead=1.03]
[2017-07-21 10:20:57,870][TRACE][indices.breaker          ] [myapp_es_data_001] creating ChildCircuitBreaker with settings [REQUEST,type=MEMORY,limit=415183667/395.9mb,overhead=1.0]
[2017-07-21 10:20:57,872][DEBUG][monitor.jvm              ] [myapp_es_data_001] enabled [true], last_gc_enabled [false], interval [1s], gc_threshold [{default=GcThreshold{name='default', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, young=GcThreshold{name='young', warnThreshold=1000, infoThreshold=700, debugThreshold=400}, old=GcThreshold{name='old', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}}]
[2017-07-21 10:20:58,382][DEBUG][monitor.os               ] [myapp_es_data_001] Using probe [org.elasticsearch.monitor.os.SigarOsProbe@2c6ee758] with refresh_interval [1s]
[2017-07-21 10:20:58,386][DEBUG][monitor.process          ] [myapp_es_data_001] Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@7cb2651f] with refresh_interval [1s]
[2017-07-21 10:20:58,392][DEBUG][monitor.jvm              ] [myapp_es_data_001] Using refresh_interval [1s]
[2017-07-21 10:20:58,392][DEBUG][monitor.network          ] [myapp_es_data_001] Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@14f3c6fc] with refresh_interval [5s]
[2017-07-21 10:20:58,397][DEBUG][monitor.network          ] [myapp_es_data_001] net_info
host [localhost.localdomain]
virbr0  display_name [virbr0]
        address [/192.168.122.1] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
em3 display_name [em3]
        address [/fe80:0:0:0:a36c:b7e7:5ddb:1158%em3] [/10.180.6.133] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [true] virtual [false]
lo  display_name [lo]
        address [/0:0:0:0:0:0:0:1%lo] [/127.0.0.1] 
        mtu [65536] multicast [false] ptp [false] loopback [true] up [true] virtual [false]

[2017-07-21 10:20:58,401][TRACE][monitor.network          ] [myapp_es_data_001] ifconfig

lo  Link encap:Local Loopback
    inet addr:127.0.0.1  Mask:255.0.0.0
    UP LOOPBACK RUNNING  MTU:65536  Metric:1
    RX packets:2318251 errors:0 dropped:0 overruns:0 frame:0
    TX packets:2318251 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:830054505 (792M)  TX bytes:830054505 (792M)
em3 Link encap:Ethernet HWaddr 24:6E:96:36:21:C4
    inet addr:10.180.6.133  Bcast:10.180.6.255  Mask:255.255.255.0
    UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    RX packets:2964676 errors:0 dropped:0 overruns:0 frame:0
    TX packets:702668 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:682306994 (651M)  TX bytes:109784965 (105M)
virbr0  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p1    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p2    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
virbr0-nic  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em2 Link encap:Ethernet HWaddr 24:6E:96:36:21:C2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em1 Link encap:Ethernet HWaddr 24:6E:96:36:21:C0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em4 Link encap:Ethernet HWaddr 24:6E:96:36:21:C5
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )

[2017-07-21 10:20:58,403][DEBUG][monitor.fs               ] [myapp_es_data_001] Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@7144655b] with refresh_interval [1s]
[2017-07-21 10:20:58,408][DEBUG][common.netty             ] using gathering [true]
[2017-07-21 10:20:58,582][DEBUG][script                   ] [myapp_es_data_001] using script cache with max_size [100], expire [null]
[2017-07-21 10:20:58,585][TRACE][script                   ] [myapp_es_data_001] Using scripts directory [/usr/local/elasticsearch/config/scripts] 
[2017-07-21 10:20:58,587][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001] using node_concurrent_recoveries [2], node_initial_primaries_recoveries [4]
[2017-07-21 10:20:58,588][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001] using [cluster.routing.allocation.allow_rebalance] with [indices_all_active]
[2017-07-21 10:20:58,588][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001] using [cluster_concurrent_rebalance] with [2]
[2017-07-21 10:20:58,590][DEBUG][indices.recovery         ] [myapp_es_data_001] using max_bytes_per_sec[40mb], concurrent_streams [3], file_chunk_size [512kb], translog_size [512kb], translog_ops [1000], and compress [true]
[2017-07-21 10:20:58,639][DEBUG][gateway.local            ] [myapp_es_data_001] using initial_shards [quorum]
[2017-07-21 10:20:58,671][DEBUG][http.netty               ] [myapp_es_data_001] using max_chunk_size[8kb], max_header_size[8kb], max_initial_line_length[4kb], max_content_length[100mb], receive_predictor[512kb->512kb], pipelining[true], pipelining_max_events[10000]
[2017-07-21 10:20:58,678][DEBUG][indices.store            ] [myapp_es_data_001] using indices.store.throttle.type [MERGE], with index.store.throttle.max_bytes_per_sec [20mb]
[2017-07-21 10:20:58,679][DEBUG][indices.memory           ] [myapp_es_data_001] using index_buffer_size [98.9mb], with min_shard_index_buffer_size [4mb], max_shard_index_buffer_size [512mb], shard_inactive_time [5m]
[2017-07-21 10:20:58,679][DEBUG][indices.cache.filter     ] [myapp_es_data_001] using [node] weighted filter cache with size [10%], actual_size [98.9mb], expire [null], clean_interval [1m]
[2017-07-21 10:20:58,680][DEBUG][indices.fielddata.cache  ] [myapp_es_data_001] using size [-1] [-1b], expire [null]
[2017-07-21 10:20:58,687][DEBUG][bulk.udp                 ] [myapp_es_data_001] using enabled [false], host [null], port [9700-9800], bulk_actions [1000], bulk_size [5mb], flush_interval [5s], concurrent_requests [4]
[2017-07-21 10:20:58,689][INFO ][node                     ] [myapp_es_data_001/cluster1] version[1.7.1], pid[93270], build[b88f43f/2015-07-29T09:54:16Z]
[2017-07-21 10:20:58,689][INFO ][node                     ] [myapp_es_data_001/cluster1] initializing ...
[2017-07-21 10:20:58,689][DEBUG][node                     ] [myapp_es_data_001/cluster1] using home [/usr/local/elasticsearch], config [/usr/local/elasticsearch/config], data [[/usr/local/elasticsearch/data]], logs [/usr/local/elasticsearch/logs], work [/usr/local/elasticsearch/work], plugins [/usr/local/elasticsearch/plugins]
[2017-07-21 10:20:58,689][TRACE][plugins                  ] [myapp_es_data_001/cluster1] --- adding plugin [/usr/local/elasticsearch/plugins/head]
[2017-07-21 10:20:58,689][TRACE][plugins                  ] [myapp_es_data_001/cluster1] --- adding plugin [/usr/local/elasticsearch/plugins/marvel]
[2017-07-21 10:20:58,691][DEBUG][plugins                  ] [myapp_es_data_001/cluster1] lucene property is not set in plugin es-plugin.properties file. Skipping test.
[2017-07-21 10:20:58,691][TRACE][plugins                  ] [myapp_es_data_001/cluster1] found a jvm plugin [marvel], [Elasticsearch Management & Monitoring]: with _site structure
[2017-07-21 10:20:58,692][TRACE][plugins                  ] [myapp_es_data_001/cluster1] found a site plugin name [head], version [NA], description [No description found.]
[2017-07-21 10:20:58,692][INFO ][plugins                  ] [myapp_es_data_001/cluster1] loaded [marvel], sites [marvel, head]
[2017-07-21 10:20:58,693][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [generic], type [cached], keep_alive [30s]
[2017-07-21 10:20:58,693][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [index], type [fixed], size [32], queue_size [200]
[2017-07-21 10:20:58,693][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [bulk], type [fixed], size [32], queue_size [50]
[2017-07-21 10:20:58,693][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [get], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:58,693][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [search], type [fixed], size [49], queue_size [1k]
[2017-07-21 10:20:58,694][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [suggest], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:58,694][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [percolate], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:58,694][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [management], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [listener], type [fixed], size [10], queue_size [null]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [flush], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [merge], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [refresh], type [scaling], min [1], size [10], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [warmer], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [snapshot], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [optimize], type [fixed], size [1], queue_size [null]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [fetch_shard_started], type [scaling], min [1], size [64], keep_alive [5m]
[2017-07-21 10:20:58,695][DEBUG][threadpool               ] [myapp_es_data_001/cluster1] creating thread_pool [fetch_shard_store], type [scaling], min [1], size [64], keep_alive [5m]
[2017-07-21 10:20:58,696][TRACE][marvel.agent             ] agent disabled due to a tribe sub node [cluster1]
[2017-07-21 10:20:58,844][TRACE][monitor.sigar            ] [myapp_es_data_001/cluster1] sigar loaded successfully
[2017-07-21 10:20:58,883][TRACE][indices.breaker          ] [myapp_es_data_001/cluster1] parent circuit breaker with settings [PARENT,type=PARENT,limit=726571417/692.9mb,overhead=1.0]
[2017-07-21 10:20:58,883][TRACE][indices.breaker          ] [myapp_es_data_001/cluster1] creating ChildCircuitBreaker with settings [FIELDDATA,type=MEMORY,limit=622775500/593.9mb,overhead=1.03]
[2017-07-21 10:20:58,883][TRACE][indices.breaker          ] [myapp_es_data_001/cluster1] creating ChildCircuitBreaker with settings [REQUEST,type=MEMORY,limit=415183667/395.9mb,overhead=1.0]
[2017-07-21 10:20:58,884][DEBUG][monitor.jvm              ] [myapp_es_data_001/cluster1] enabled [true], last_gc_enabled [false], interval [1s], gc_threshold [{default=GcThreshold{name='default', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, young=GcThreshold{name='young', warnThreshold=1000, infoThreshold=700, debugThreshold=400}, old=GcThreshold{name='old', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}}]
[2017-07-21 10:20:59,387][DEBUG][monitor.os               ] [myapp_es_data_001/cluster1] Using probe [org.elasticsearch.monitor.os.SigarOsProbe@32f96bba] with refresh_interval [1s]
[2017-07-21 10:20:59,388][DEBUG][monitor.process          ] [myapp_es_data_001/cluster1] Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@58f4b31a] with refresh_interval [1s]
[2017-07-21 10:20:59,388][DEBUG][monitor.jvm              ] [myapp_es_data_001/cluster1] Using refresh_interval [1s]
[2017-07-21 10:20:59,389][DEBUG][monitor.network          ] [myapp_es_data_001/cluster1] Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@7affc159] with refresh_interval [5s]
[2017-07-21 10:20:59,390][DEBUG][monitor.network          ] [myapp_es_data_001/cluster1] net_info
host [localhost.localdomain]
virbr0  display_name [virbr0]
        address [/192.168.122.1] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
em3 display_name [em3]
        address [/fe80:0:0:0:a36c:b7e7:5ddb:1158%em3] [/10.180.6.133] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [true] virtual [false]
lo  display_name [lo]
        address [/0:0:0:0:0:0:0:1%lo] [/127.0.0.1] 
        mtu [65536] multicast [false] ptp [false] loopback [true] up [true] virtual [false]

[2017-07-21 10:20:59,391][TRACE][monitor.network          ] [myapp_es_data_001/cluster1] ifconfig

lo  Link encap:Local Loopback
    inet addr:127.0.0.1  Mask:255.0.0.0
    UP LOOPBACK RUNNING  MTU:65536  Metric:1
    RX packets:2318254 errors:0 dropped:0 overruns:0 frame:0
    TX packets:2318254 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:830054878 (792M)  TX bytes:830054878 (792M)
em3 Link encap:Ethernet HWaddr 24:6E:96:36:21:C4
    inet addr:10.180.6.133  Bcast:10.180.6.255  Mask:255.255.255.0
    UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    RX packets:2964685 errors:0 dropped:0 overruns:0 frame:0
    TX packets:702668 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:682307534 (651M)  TX bytes:109784965 (105M)
virbr0  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p1    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p2    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
virbr0-nic  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em2 Link encap:Ethernet HWaddr 24:6E:96:36:21:C2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em1 Link encap:Ethernet HWaddr 24:6E:96:36:21:C0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em4 Link encap:Ethernet HWaddr 24:6E:96:36:21:C5
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )

[2017-07-21 10:20:59,392][DEBUG][monitor.fs               ] [myapp_es_data_001/cluster1] Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@4eb9f2af] with refresh_interval [1s]
[2017-07-21 10:20:59,393][DEBUG][discovery.zen.elect      ] [myapp_es_data_001/cluster1] using minimum_master_nodes [-1]
[2017-07-21 10:20:59,395][DEBUG][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2017-07-21 10:20:59,398][DEBUG][discovery.zen.ping.unicast] [myapp_es_data_001/cluster1] using initial hosts [], with concurrent_connects [10]
[2017-07-21 10:20:59,399][DEBUG][discovery.zen            ] [myapp_es_data_001/cluster1] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2017-07-21 10:20:59,400][DEBUG][discovery.zen.fd         ] [myapp_es_data_001/cluster1] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2017-07-21 10:20:59,402][DEBUG][discovery.zen.fd         ] [myapp_es_data_001/cluster1] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2017-07-21 10:20:59,407][DEBUG][script                   ] [myapp_es_data_001/cluster1] using script cache with max_size [100], expire [null]
[2017-07-21 10:20:59,407][TRACE][script                   ] [myapp_es_data_001/cluster1] Using scripts directory [/usr/local/elasticsearch/config/scripts] 
[2017-07-21 10:20:59,408][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster1] using node_concurrent_recoveries [2], node_initial_primaries_recoveries [4]
[2017-07-21 10:20:59,408][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster1] using [cluster.routing.allocation.allow_rebalance] with [indices_all_active]
[2017-07-21 10:20:59,408][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster1] using [cluster_concurrent_rebalance] with [2]
[2017-07-21 10:20:59,409][DEBUG][indices.recovery         ] [myapp_es_data_001/cluster1] using max_bytes_per_sec[40mb], concurrent_streams [3], file_chunk_size [512kb], translog_size [512kb], translog_ops [1000], and compress [true]
[2017-07-21 10:20:59,409][DEBUG][gateway.local            ] [myapp_es_data_001/cluster1] using initial_shards [quorum]
[2017-07-21 10:20:59,418][DEBUG][indices.store            ] [myapp_es_data_001/cluster1] using indices.store.throttle.type [MERGE], with index.store.throttle.max_bytes_per_sec [20mb]
[2017-07-21 10:20:59,419][DEBUG][indices.memory           ] [myapp_es_data_001/cluster1] using index_buffer_size [98.9mb], with min_shard_index_buffer_size [4mb], max_shard_index_buffer_size [512mb], shard_inactive_time [5m]
[2017-07-21 10:20:59,419][DEBUG][indices.cache.filter     ] [myapp_es_data_001/cluster1] using [node] weighted filter cache with size [10%], actual_size [98.9mb], expire [null], clean_interval [1m]
[2017-07-21 10:20:59,419][DEBUG][indices.fielddata.cache  ] [myapp_es_data_001/cluster1] using size [-1] [-1b], expire [null]
[2017-07-21 10:20:59,425][DEBUG][gateway.local.state.meta ] [myapp_es_data_001/cluster1] using gateway.local.auto_import_dangled [YES], gateway.local.delete_timeout [30s], with gateway.local.dangling_timeout [2h]
[2017-07-21 10:20:59,426][DEBUG][bulk.udp                 ] [myapp_es_data_001/cluster1] using enabled [false], host [null], port [9700-9800], bulk_actions [1000], bulk_size [5mb], flush_interval [5s], concurrent_requests [4]
[2017-07-21 10:20:59,430][INFO ][node                     ] [myapp_es_data_001/cluster1] initialized
[2017-07-21 10:20:59,430][INFO ][node                     ] [myapp_es_data_001/cluster2] version[1.7.1], pid[93270], build[b88f43f/2015-07-29T09:54:16Z]
[2017-07-21 10:20:59,430][INFO ][node                     ] [myapp_es_data_001/cluster2] initializing ...
[2017-07-21 10:20:59,430][DEBUG][node                     ] [myapp_es_data_001/cluster2] using home [/usr/local/elasticsearch], config [/usr/local/elasticsearch/config], data [[/usr/local/elasticsearch/data]], logs [/usr/local/elasticsearch/logs], work [/usr/local/elasticsearch/work], plugins [/usr/local/elasticsearch/plugins]
[2017-07-21 10:20:59,430][TRACE][plugins                  ] [myapp_es_data_001/cluster2] --- adding plugin [/usr/local/elasticsearch/plugins/head]
[2017-07-21 10:20:59,430][TRACE][plugins                  ] [myapp_es_data_001/cluster2] --- adding plugin [/usr/local/elasticsearch/plugins/marvel]
[2017-07-21 10:20:59,432][DEBUG][plugins                  ] [myapp_es_data_001/cluster2] lucene property is not set in plugin es-plugin.properties file. Skipping test.
[2017-07-21 10:20:59,432][TRACE][plugins                  ] [myapp_es_data_001/cluster2] found a jvm plugin [marvel], [Elasticsearch Management & Monitoring]: with _site structure
[2017-07-21 10:20:59,432][TRACE][plugins                  ] [myapp_es_data_001/cluster2] found a site plugin name [head], version [NA], description [No description found.]
[2017-07-21 10:20:59,432][INFO ][plugins                  ] [myapp_es_data_001/cluster2] loaded [marvel], sites [marvel, head]
[2017-07-21 10:20:59,432][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [generic], type [cached], keep_alive [30s]
[2017-07-21 10:20:59,432][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [index], type [fixed], size [32], queue_size [200]
[2017-07-21 10:20:59,432][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [bulk], type [fixed], size [32], queue_size [50]
[2017-07-21 10:20:59,432][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [get], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [search], type [fixed], size [49], queue_size [1k]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [suggest], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [percolate], type [fixed], size [32], queue_size [1k]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [management], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [listener], type [fixed], size [10], queue_size [null]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [flush], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:59,433][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [merge], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [refresh], type [scaling], min [1], size [10], keep_alive [5m]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [warmer], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [snapshot], type [scaling], min [1], size [5], keep_alive [5m]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [optimize], type [fixed], size [1], queue_size [null]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [fetch_shard_started], type [scaling], min [1], size [64], keep_alive [5m]
[2017-07-21 10:20:59,434][DEBUG][threadpool               ] [myapp_es_data_001/cluster2] creating thread_pool [fetch_shard_store], type [scaling], min [1], size [64], keep_alive [5m]
[2017-07-21 10:20:59,435][TRACE][marvel.agent             ] agent disabled due to a tribe sub node [cluster2]
[2017-07-21 10:20:59,474][TRACE][monitor.sigar            ] [myapp_es_data_001/cluster2] sigar loaded successfully
[2017-07-21 10:20:59,492][TRACE][indices.breaker          ] [myapp_es_data_001/cluster2] parent circuit breaker with settings [PARENT,type=PARENT,limit=726571417/692.9mb,overhead=1.0]
[2017-07-21 10:20:59,492][TRACE][indices.breaker          ] [myapp_es_data_001/cluster2] creating ChildCircuitBreaker with settings [FIELDDATA,type=MEMORY,limit=622775500/593.9mb,overhead=1.03]
[2017-07-21 10:20:59,492][TRACE][indices.breaker          ] [myapp_es_data_001/cluster2] creating ChildCircuitBreaker with settings [REQUEST,type=MEMORY,limit=415183667/395.9mb,overhead=1.0]
[2017-07-21 10:20:59,492][DEBUG][monitor.jvm              ] [myapp_es_data_001/cluster2] enabled [true], last_gc_enabled [false], interval [1s], gc_threshold [{default=GcThreshold{name='default', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}, young=GcThreshold{name='young', warnThreshold=1000, infoThreshold=700, debugThreshold=400}, old=GcThreshold{name='old', warnThreshold=10000, infoThreshold=5000, debugThreshold=2000}}]
[2017-07-21 10:20:59,996][DEBUG][monitor.os               ] [myapp_es_data_001/cluster2] Using probe [org.elasticsearch.monitor.os.SigarOsProbe@7fb8bad0] with refresh_interval [1s]
[2017-07-21 10:20:59,996][DEBUG][monitor.process          ] [myapp_es_data_001/cluster2] Using probe [org.elasticsearch.monitor.process.SigarProcessProbe@236ae13d] with refresh_interval [1s]
[2017-07-21 10:20:59,996][DEBUG][monitor.jvm              ] [myapp_es_data_001/cluster2] Using refresh_interval [1s]
[2017-07-21 10:20:59,996][DEBUG][monitor.network          ] [myapp_es_data_001/cluster2] Using probe [org.elasticsearch.monitor.network.SigarNetworkProbe@32256e68] with refresh_interval [5s]
[2017-07-21 10:20:59,997][DEBUG][monitor.network          ] [myapp_es_data_001/cluster2] net_info
host [localhost.localdomain]
virbr0  display_name [virbr0]
        address [/192.168.122.1] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [false] virtual [false]
em3 display_name [em3]
        address [/fe80:0:0:0:a36c:b7e7:5ddb:1158%em3] [/10.180.6.133] 
        mtu [1500] multicast [true] ptp [false] loopback [false] up [true] virtual [false]
lo  display_name [lo]
        address [/0:0:0:0:0:0:0:1%lo] [/127.0.0.1] 
        mtu [65536] multicast [false] ptp [false] loopback [true] up [true] virtual [false]

[2017-07-21 10:20:59,999][TRACE][monitor.network          ] [myapp_es_data_001/cluster2] ifconfig

lo  Link encap:Local Loopback
    inet addr:127.0.0.1  Mask:255.0.0.0
    UP LOOPBACK RUNNING  MTU:65536  Metric:1
    RX packets:2318257 errors:0 dropped:0 overruns:0 frame:0
    TX packets:2318257 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:830055251 (792M)  TX bytes:830055251 (792M)
em3 Link encap:Ethernet HWaddr 24:6E:96:36:21:C4
    inet addr:10.180.6.133  Bcast:10.180.6.255  Mask:255.255.255.0
    UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    RX packets:2964691 errors:0 dropped:0 overruns:0 frame:0
    TX packets:702668 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:682307894 (651M)  TX bytes:109784965 (105M)
virbr0  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p1    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
p4p2    Link encap:Ethernet HWaddr F4:E9:D4:AD:F2:B2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
virbr0-nic  Link encap:Ethernet HWaddr 52:54:00:DF:D1:32
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em2 Link encap:Ethernet HWaddr 24:6E:96:36:21:C2
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em1 Link encap:Ethernet HWaddr 24:6E:96:36:21:C0
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )
em4 Link encap:Ethernet HWaddr 24:6E:96:36:21:C5
    inet addr:0.0.0.0  Bcast:0.0.0.0  Mask:0.0.0.0
    UP BROADCAST MULTICAST  MTU:1500  Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0
    RX bytes:0 (  0 )  TX bytes:0 (  0 )

[2017-07-21 10:21:00,000][DEBUG][monitor.fs               ] [myapp_es_data_001/cluster2] Using probe [org.elasticsearch.monitor.fs.SigarFsProbe@63bfdbcb] with refresh_interval [1s]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen.elect      ] [myapp_es_data_001/cluster2] using minimum_master_nodes [-1]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen.ping.multicast] [myapp_es_data_001/cluster2] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen.ping.unicast] [myapp_es_data_001/cluster2] using initial hosts [], with concurrent_connects [10]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen            ] [myapp_es_data_001/cluster2] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen.fd         ] [myapp_es_data_001/cluster2] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2017-07-21 10:21:00,001][DEBUG][discovery.zen.fd         ] [myapp_es_data_001/cluster2] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2017-07-21 10:21:00,002][DEBUG][script                   ] [myapp_es_data_001/cluster2] using script cache with max_size [100], expire [null]
[2017-07-21 10:21:00,003][TRACE][script                   ] [myapp_es_data_001/cluster2] Using scripts directory [/usr/local/elasticsearch/config/scripts] 
[2017-07-21 10:21:00,003][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster2] using node_concurrent_recoveries [2], node_initial_primaries_recoveries [4]
[2017-07-21 10:21:00,003][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster2] using [cluster.routing.allocation.allow_rebalance] with [indices_all_active]
[2017-07-21 10:21:00,003][DEBUG][cluster.routing.allocation.decider] [myapp_es_data_001/cluster2] using [cluster_concurrent_rebalance] with [2]
[2017-07-21 10:21:00,003][DEBUG][indices.recovery         ] [myapp_es_data_001/cluster2] using max_bytes_per_sec[40mb], concurrent_streams [3], file_chunk_size [512kb], translog_size [512kb], translog_ops [1000], and compress [true]
[2017-07-21 10:21:00,004][DEBUG][gateway.local            ] [myapp_es_data_001/cluster2] using initial_shards [quorum]
[2017-07-21 10:21:00,011][DEBUG][indices.store            ] [myapp_es_data_001/cluster2] using indices.store.throttle.type [MERGE], with index.store.throttle.max_bytes_per_sec [20mb]
[2017-07-21 10:21:00,011][DEBUG][indices.memory           ] [myapp_es_data_001/cluster2] using index_buffer_size [98.9mb], with min_shard_index_buffer_size [4mb], max_shard_index_buffer_size [512mb], shard_inactive_time [5m]
[2017-07-21 10:21:00,011][DEBUG][indices.cache.filter     ] [myapp_es_data_001/cluster2] using [node] weighted filter cache with size [10%], actual_size [98.9mb], expire [null], clean_interval [1m]
[2017-07-21 10:21:00,011][DEBUG][indices.fielddata.cache  ] [myapp_es_data_001/cluster2] using size [-1] [-1b], expire [null]
[2017-07-21 10:21:00,013][DEBUG][gateway.local.state.meta ] [myapp_es_data_001/cluster2] using gateway.local.auto_import_dangled [YES], gateway.local.delete_timeout [30s], with gateway.local.dangling_timeout [2h]
[2017-07-21 10:21:00,014][DEBUG][bulk.udp                 ] [myapp_es_data_001/cluster2] using enabled [false], host [null], port [9700-9800], bulk_actions [1000], bulk_size [5mb], flush_interval [5s], concurrent_requests [4]
[2017-07-21 10:21:00,015][INFO ][node                     ] [myapp_es_data_001/cluster2] initialized
[2017-07-21 10:21:00,020][INFO ][node                     ] [myapp_es_data_001] initialized
[2017-07-21 10:21:00,020][INFO ][node                     ] [myapp_es_data_001] starting ...
[2017-07-21 10:21:00,029][DEBUG][netty.channel.socket.nio.SelectorUtil] Using select timeout of 500
[2017-07-21 10:21:00,029][DEBUG][netty.channel.socket.nio.SelectorUtil] Epoll-bug workaround enabled = false
[2017-07-21 10:21:00,052][DEBUG][transport.netty          ] [myapp_es_data_001] using profile[default], worker_count[64], port[9300-9400], bind_host[null], publish_host[null], compress[false], connect_timeout[30s], connections_per_node[2/3/6/1/1], receive_predictor[512kb->512kb]
[2017-07-21 10:21:00,085][DEBUG][transport.netty          ] [myapp_es_data_001] Bound profile [default] to address [/0:0:0:0:0:0:0:0:9300]
[2017-07-21 10:21:00,086][INFO ][transport                ] [myapp_es_data_001] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.180.6.133:9300]}
[2017-07-21 10:21:00,094][DEBUG][discovery.local          ] [myapp_es_data_001] Connected to cluster [Cluster [myapp_prod]]
[2017-07-21 10:21:00,096][INFO ][discovery                ] [myapp_es_data_001] myapp_prod/ooDMjZ8wQFG8ew71Ff5x1w
[2017-07-21 10:21:00,096][WARN ][discovery                ] [myapp_es_data_001] waited for 0s and no initial state was set by the discovery
[2017-07-21 10:21:00,096][DEBUG][gateway                  ] [myapp_es_data_001] can't wait on start for (possibly) reading state from gateway, will do it asynchronously
[2017-07-21 10:21:00,097][DEBUG][cluster.service          ] [myapp_es_data_001] processing [local-disco-initial_connect(master)]: execute
[2017-07-21 10:21:00,102][TRACE][cluster.service          ] [myapp_es_data_001] cluster state updated, source [local-disco-initial_connect(master)]
version: 1
meta data version: 0
nodes: 
   [myapp_es_data_001][ooDMjZ8wQFG8ew71Ff5x1w][localhost.localdomain][inet[/10.180.6.133:9300]]{client=true, data=false, master=true}, local, master
routing_table (version 0):
routing_nodes:
---- unassigned

[2017-07-21 10:21:00,150][INFO ][cluster.service          ] [myapp_es_data_001] master {new [myapp_es_data_001][ooDMjZ8wQFG8ew71Ff5x1w][localhost.localdomain][inet[/10.180.6.133:9300]]{client=true, data=false, master=true}}, removed {[myapp_es_data_001][8MOGZbGrQUCPAejhOJDcCg][localhost.localdomain][inet[/10.180.6.133:9300]]{client=true, data=false, master=true},}, reason: local-disco-initial_connect(master)
[2017-07-21 10:21:00,150][DEBUG][cluster.service          ] [myapp_es_data_001] publishing cluster state version 1
[2017-07-21 10:21:00,156][INFO ][http                     ] [myapp_es_data_001] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.180.6.133:9200]}
[2017-07-21 10:21:00,156][INFO ][node                     ] [myapp_es_data_001/cluster1] starting ...
[2017-07-21 10:21:00,158][DEBUG][cluster.service          ] [myapp_es_data_001] set local cluster state to version 1
[2017-07-21 10:21:00,160][TRACE][cluster.routing          ] [myapp_es_data_001] no need to schedule reroute due to delayed unassigned, next_delay_setting [0], registered [9223372036854775807]
[2017-07-21 10:21:00,162][DEBUG][river.cluster            ] [myapp_es_data_001] processing [reroute_rivers_node_changed]: execute
[2017-07-21 10:21:00,162][DEBUG][river.cluster            ] [myapp_es_data_001] processing [reroute_rivers_node_changed]: no change in cluster_state
[2017-07-21 10:21:00,163][TRACE][cluster                  ] [myapp_es_data_001] I have been elected master, scheduling a ClusterInfoUpdateJob
[2017-07-21 10:21:00,163][DEBUG][cluster.service          ] [myapp_es_data_001] processing [local-disco-initial_connect(master)]: took 65ms done applying updated cluster_state (version: 1)
[2017-07-21 10:21:00,172][DEBUG][transport.netty          ] [myapp_es_data_001/cluster1] using profile[default], worker_count[64], port[9300-9400], bind_host[null], publish_host[null], compress[false], connect_timeout[30s], connections_per_node[2/3/6/1/1], receive_predictor[512kb->512kb]
[2017-07-21 10:21:00,189][DEBUG][transport.netty          ] [myapp_es_data_001/cluster1] Bound profile [default] to address [/0:0:0:0:0:0:0:0:9301]
[2017-07-21 10:21:00,189][INFO ][transport                ] [myapp_es_data_001/cluster1] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/10.180.6.133:9301]}
[2017-07-21 10:21:00,195][INFO ][discovery                ] [myapp_es_data_001/cluster1] elasticsearch/hDR-MWl1SyGPzFjMFPNbZw
[2017-07-21 10:21:00,196][DEBUG][cluster.service          ] [myapp_es_data_001/cluster1] processing [initial_join]: execute
[2017-07-21 10:21:00,197][DEBUG][cluster.service          ] [myapp_es_data_001/cluster1] processing [initial_join]: took 0s no change in cluster_state
[2017-07-21 10:21:00,197][TRACE][discovery.zen            ] [myapp_es_data_001/cluster1] starting to ping
[2017-07-21 10:21:00,207][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [1] sending ping request
[2017-07-21 10:21:01,708][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [1] sending ping request
[2017-07-21 10:21:03,210][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [1] sending last pings
[2017-07-21 10:21:03,211][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [1] sending ping request
[2017-07-21 10:21:03,962][TRACE][discovery.zen            ] [myapp_es_data_001/cluster1] full ping responses: {none}
[2017-07-21 10:21:03,962][DEBUG][discovery.zen            ] [myapp_es_data_001/cluster1] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2017-07-21 10:21:03,962][TRACE][discovery.zen            ] [myapp_es_data_001/cluster1] starting to ping
[2017-07-21 10:21:03,963][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [2] sending ping request
[2017-07-21 10:21:05,463][TRACE][discovery.zen.ping.multicast] [myapp_es_data_001/cluster1] [2] sending ping request
jbaiera commented 7 years ago

@splikhita Those look like Elasticsearch server logs. Could you attach the TRACE-level logs for the org.elasticsearch.hadoop package from your Pig job?

splikhita commented 7 years ago

I tried to find them but couldn't. Are they in the elasticsearch-hadoop jar file?

jbaiera commented 7 years ago

@splikhita You'll have to modify the log4j properties file for Pig to allow the org.elasticsearch.hadoop package to log at the TRACE level. That file is often located in the home directory where Pig is installed.
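For example, a minimal sketch of that change, assuming a log4j 1.x properties file (the exact file name and location vary by Pig installation, e.g. `conf/log4j.properties` or a file passed via `pig -4 <file>`):

```properties
# Hypothetical log4j.properties fragment — adjust the file location to your install.
# Raise the connector's log level to TRACE so the es-hadoop requests are visible.
log4j.logger.org.elasticsearch.hadoop=TRACE
```

The TRACE output will then appear in the same place Pig writes its other job logs (the console or the `pig_*.log` file in the working directory, depending on your setup).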

splikhita commented 7 years ago

Yes, I did change that file, but my problem is that I still couldn't find the logs.

splikhita commented 7 years ago

Hello @jbaiera, I figured out that a few changes I had made in Elasticsearch were causing the errors. Now I am able to run the commands to load data into Elasticsearch using Pig, but I get an error just for the datetime field. Here is what happens:

From the grunt console, I registered the ES-Hadoop jar file with Pig:

grunt> REGISTER hdfs://localhost:9000/lib/elasticsearch-hadoop-2.1.1.jar;

Then I loaded the CSV data file as a relation using the following command:

grunt> SOURCE = load '/lib/crimes_dataset.csv' using PigStorage(',') as (id:chararray, caseNumber:chararray, date:datetime, block:chararray, iucr:chararray, primaryType:chararray, description:chararray, location:chararray, arrest:boolean, domestic:boolean, lat:double,lon:double);

Next, I generated the target Pig relation with the following structure:

grunt> TARGET = foreach SOURCE generate id, caseNumber, date, block, iucr, primaryType, description, location, arrest, domestic, TOTUPLE(lon, lat) AS geoLocation;

Finally, I stored the TARGET relation to the Elasticsearch index with the following command:

grunt> STORE TARGET INTO 'esh_pig/crimes' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.timeout = 5m', 'es.index.auto.create = true', 'es.mapping.names=arrest:isArrest, domestic:isDomestic', 'es.mapping.id=id');

It gives me the following:

2017-07-24 16:50:58,943 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 52, 55] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:47" is malformed at "/01/15 19:47"> field discarded
2017-07-24 16:50:58,943 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 52, 53] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:45" is malformed at "/01/15 19:45"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 52, 52] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:44" is malformed at "/01/15 19:44"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 52, 50] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:42" is malformed at "/01/15 19:42"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 51, 57] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:39" is malformed at "/01/15 19:39"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 51, 48] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:30" is malformed at "/01/15 19:30"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 50, 53] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:25" is malformed at "/01/15 19:25"> field discarded
2017-07-24 16:50:58,944 [LocalJobRunner Map Task Executor #0] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.builtin.Utf8StorageConverter(FIELD_DISCARDED_TYPE_CONVERSION_FAILED): Unable to interpret value [48, 56, 47, 48, 49, 47, 49, 53, 32, 49, 57, 58, 50, 52] in field being converted to datetime, caught IllegalArgumentException <Invalid format: "08/01/15 19:24" is malformed at "/01/15 19:24"> field discarded
jbaiera commented 7 years ago

The issue here seems to be with the format of the date not being recognized by Pig. Try loading the date as a string, and converting it to a Date using ToDate(date, '<some format string>') AS date within a GENERATE statement.
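A sketch of that approach, assuming the `MM/dd/yy HH:mm` Joda-Time pattern matches values like "08/01/15 19:47" from the warnings above (i.e. month/day/year ordering — verify against your CSV):

```pig
-- Load the date column as chararray instead of datetime so Pig does not
-- attempt (and fail) the implicit conversion at load time.
SOURCE = load '/lib/crimes_dataset.csv' using PigStorage(',')
    as (id:chararray, caseNumber:chararray, date:chararray, block:chararray,
        iucr:chararray, primaryType:chararray, description:chararray,
        location:chararray, arrest:boolean, domestic:boolean,
        lat:double, lon:double);

-- Convert the string to a datetime inside the GENERATE, using an explicit
-- format pattern that matches the source data.
TARGET = foreach SOURCE generate id, caseNumber,
    ToDate(date, 'MM/dd/yy HH:mm') AS date, block, iucr, primaryType,
    description, location, arrest, domestic,
    TOTUPLE(lon, lat) AS geoLocation;
```

If the conversion still fails, the format string is the first thing to re-check; `ToDate` uses Joda-Time patterns, so `yy` vs `yyyy` and `HH` vs `hh` matter.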

Since the original problem seems to be solved, I'm going to go ahead and close this issue.