elastic / elasticsearch-hadoop

:elephant: Elasticsearch real-time search and analytics natively integrated with Hadoop
https://www.elastic.co/products/hadoop
Apache License 2.0

EsTap is not working #162

Closed mbaig closed 10 years ago

mbaig commented 10 years ago

Please note EsTap was/is working as expected in 1.3.0.M2; however, it seems to have been broken in the last ~2 weeks of nightly builds. Also note that our usage pattern and code did not change between the release of 1.3.0.M2 and today (2014-03-05).

costin commented 10 years ago

Can you expand on that - what exception do you encounter?

mbaig commented 10 years ago

Hello Costin: thanks for the quick response! Ok, so the exception I'm seeing is below, I pared down everything to keep things simple for debug purposes, but, basically I'm creating an EsTap to an index/type together with an array of fields of interest, then simply outputting all data from the tap to stdout, no queries to complicate matters, no "es.query" config param either since this also doesn't seem to be working any longer. Note, if I use 1.3.0.M2, everything works as expected, but, not so with the snapshots.

14/03/05 15:19:11 ERROR stream.TrapHandler: caught Throwable, no trap available, rethrowing
cascading.tuple.TupleException: unable to read from input identifier: 'unknown'
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
    at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
    at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
    at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.IllegalStateException: Cannot build scroll [adevents-2014-02-12/click/_search?search_type=scan&scroll=5&size=50&_source=ri,bot_act,psn,sts,ptz,pv_lo,cid&preference=_shards:1;_only_node:s6gZjGaBT6KFEpn23r5vgA]
    at org.elasticsearch.hadoop.rest.QueryBuilder.build(QueryBuilder.java:201)
    at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.next(EsInputFormat.java:286)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:61)
    at org.elasticsearch.hadoop.cascading.EsHadoopScheme.source(EsHadoopScheme.java:154)
    at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:140)
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:120)
    ... 6 more
Caused by: java.io.IOException: Out of nodes and retries; caught exception
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:98)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:250)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:246)
    at org.elasticsearch.hadoop.rest.RestClient.scan(RestClient.java:274)
    at org.elasticsearch.hadoop.rest.RestRepository.scan(RestRepository.java:97)
    at org.elasticsearch.hadoop.rest.QueryBuilder.build(QueryBuilder.java:199)
    ... 13 more

mbaig commented 10 years ago

Sorry, hit Close by accident, please ignore.

costin commented 10 years ago

The error indicates a network error, that is, es-hadoop cannot connect to your host. I have pushed a new nightly build (20140305.224939-329); can you please try it out and let me know how it goes? It seems that es-hadoop does connect to ES initially but then starts losing the connection for some reason.

mbaig commented 10 years ago

Just tried that nightly, sorry it didn't work, same exception stack(s). Btw, es-hadoop is able to connect to our ES host if I use 1.3.0.M2, so I think we can rule out poor connectivity issues for us, although, there may still be other programmatic connectivity issues in the es-hadoop client.

costin commented 10 years ago

Can you turn on logging (TRACE level) in log4j.properties on org.elasticsearch.hadoop package and report back your findings (upload to a gist somewhere). Ideally try it on a small data set since there will be a lot of output.

Thanks!
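
For readers following along, the logging change requested above might look like the fragment below. It is a minimal sketch that assumes a log4j 1.x properties file (which the mention of log4j.properties suggests) and that a root logger and appender are already configured elsewhere in that file.

```properties
# Assumption: log4j 1.x syntax; appenders and the root logger are defined elsewhere.
# Raise only the es-hadoop package to TRACE so the rest of the job stays quiet.
log4j.logger.org.elasticsearch.hadoop=TRACE
```

As the comment notes, TRACE output is verbose, so running against a small data set keeps the log manageable.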

mbaig commented 10 years ago

Hi Costin: here is the trace level log output you requested (see gist below), I had to redact some parts of the logs, so if you see something like, Received [200-OK] [], the empty [] actually was populated correctly, hope you understand. Thanks again for all your help with this and the great work you guys are doing with ES in general! https://gist.github.com/mbaig/9397119

costin commented 10 years ago

Hi,

Several things in the logs look suspicious. There's the network error, but there's also the NoSuchMethodError at the end (this one caused by some incompatible library).

There were several improvements made to cascading so I've pushed a nightly build [1] - can you please check it out once it completes. Then if possible, please update the gist of the current build and M2 - I've checked the differences between the two but nothing stands out.

Are you available on IRC? This would make things a lot easier to debug - I'm 'costin' on #elasticsearch. Let's connect in 30' or so if that works for you.

Thanks!


costin commented 10 years ago

By the way, the #221 build has been published.


mbaig commented 10 years ago

Costin, is that build #331 or #221? Also, I'll be available on IRC, 'mbaig'.

costin commented 10 years ago

The IDs of the build plan and Maven are not synchronized. #331 is the Maven number, #221 the number of the build plan. Basically, try the latest available snapshot.


costin commented 10 years ago

As for IRC, give me a ping once you get online - I'll be available for the next 1.5h or so.

Cheers!


costin commented 10 years ago

Hi,

I've pushed some changes on a new branch - cfg-refactor. Can you try it out? You can easily build the package using: gradlew -x test build. Still interested in the logs on M2.

Cheers!


costin commented 10 years ago

Hi,

Can you please try the latest build, #333? Also, please post the updated logs just in case.

Thanks,


mbaig commented 10 years ago

Costin: not sure what you changed, but, it looks like build #333 is working. That is, reading from ES looks good. Haven't tried writing, will do that next.

costin commented 10 years ago

That's good to know. Getting some logs between M2 and current master would still be useful - we can chat on IRC more if you'd like. thanks!

mbaig commented 10 years ago

Yeah, I'll definitely get you those logs. I was trying to filter the dataset for the logs using the es.query job config param, which incidentally didn't work; however, passing the filter query to the EsTap constructor did work, so that should get me over that obstacle. I'm going to deploy #333 now to our cluster for a larger dataset test, fingers crossed.
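
To make the "es.query job config param" route concrete, here is a minimal sketch of how such a filter could be expressed as job properties. It uses only java.util.Properties; the class name, method name, and the query value `?q=cid:42` are hypothetical, while the index/type is the one from the stack trace earlier in this thread. Whether a given snapshot picks the value up is exactly what is in question here, so this shows the shape of the configuration, not a fix.

```java
import java.util.Properties;

class EsQuerySettings {
    // Builds job properties carrying the resource and the filter query.
    // "es.resource" and "es.query" are es-hadoop setting names; the values
    // passed in are supplied by the caller and are illustrative only.
    static Properties build(String resource, String query) {
        Properties props = new Properties();
        props.setProperty("es.resource", resource);
        props.setProperty("es.query", query);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical filter on the "cid" field seen in the scroll URL above.
        Properties props = build("adevents-2014-02-12/click", "?q=cid:42");
        System.out.println("es.query=" + props.getProperty("es.query"));
    }
}
```

In a Cascading job these properties would typically be handed to the flow connector when the flow is created; the alternative that worked in this thread is passing the query straight to the EsTap constructor.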

mbaig commented 10 years ago

I'm deploying the new jar to our cluster, but, I just realized we upgraded our ES to 1.0.1 (successfully) last night. Will this be a problem for the es-hadoop client?

costin commented 10 years ago

It's not a problem. Since M2, es-hadoop supports both ES 1.0 and 0.90.x.


costin commented 10 years ago

The Configuration option should work in master just like on M2. Along with the logs, can you please post a simple code snippet that reproduces the issue.

Thanks!


mbaig commented 10 years ago

Hey: sorry, didn't mean to disappear on you like that, I was busy firefighting issues with our ES upgrade to 1.0.1. One of the issues I ran into was that es-hadoop nightly stopped working again, albeit due to a different issue. Let me know if I should open another issue for it; meantime I'll try to describe it here. Our hadoop cluster connects to our prod ES cluster via ssh tunnels on localhost:9200. It seems the es-hadoop client, instead of connecting to localhost, connects to the resolved IPs of the ES shards (more accurately, it connects to localhost the 1st time but switches to IPs for all subsequent calls). Since those IPs aren't visible I'm seeing ConnectExceptions, see here for logs: https://gist.github.com/mbaig/9397119. I tried setting the config param "es.nodes.discovery" to false to force the client to only use localhost, but this doesn't seem to be doing what I hoped. Btw, I'm using the snapshot of es-hadoop for this.

Thanks again Costin. Oh and I still owe you those M2 logs...
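
For reference, the tunnel setup described above could be written as job properties along these lines. This is only a sketch of the configuration that was attempted: the class and method names are hypothetical, the host and port come from the ssh-tunnel description, and, as reported in this comment, disabling discovery did not stop the client from switching to the shard IPs.

```java
import java.util.Properties;

class EsTunnelSettings {
    // Points es-hadoop at the local end of the ssh tunnel and turns node
    // discovery off. The keys are es-hadoop setting names; the values
    // mirror the setup described in the comment above.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("es.nodes", "localhost");
        props.setProperty("es.port", "9200");
        props.setProperty("es.nodes.discovery", "false");
        return props;
    }

    public static void main(String[] args) {
        // Print each setting so the effective configuration can be checked.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```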

costin commented 10 years ago

@mbaig Best to open another issue. Tunnelling is not supported by es-hadoop and I'm not sure whether it ever will be. Without a direct network connection, parallel reads/writes don't make sense, since there's no direct connection to each shard and thus performance goes down the drain. I'll look into this but it's not a priority at the moment - using a VPN is probably a better solution long term since it hides the tunnelling much better than an actual tunnel.

costin commented 10 years ago

@mbaig Those M2 logs would still be nice ....

costin commented 10 years ago

By the way, you could try setting up the JDK property for proxies, in particular the SOCKS one: http://docs.oracle.com/javase/7/docs/api/java/net/doc-files/net-properties.html
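
A minimal sketch of the SOCKS suggestion, using the standard JDK networking properties `socksProxyHost` and `socksProxyPort` from the page linked above. The class name, helper method, and the proxy address `localhost:1080` are assumptions for illustration (for example, a dynamic ssh forward listening locally); the thread does not confirm that es-hadoop's connections go through the proxy once these are set.

```java
class SocksProxySetup {
    // Sets the JDK-wide SOCKS proxy properties. They must be in place
    // before the first socket is opened in the JVM that talks to ES.
    static void useSocksProxy(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        // Hypothetical local SOCKS endpoint.
        useSocksProxy("localhost", 1080);
        System.out.println(System.getProperty("socksProxyHost")
                + ":" + System.getProperty("socksProxyPort"));
    }
}
```

The same properties can be passed on the command line as `-DsocksProxyHost=... -DsocksProxyPort=...`. On a cluster they would need to reach every task JVM, not just the client that submits the job.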