Can you expand on that - what exception do you encounter?
Hello Costin: thanks for the quick response! Ok, so the exception I'm seeing is below, I pared down everything to keep things simple for debug purposes, but, basically I'm creating an EsTap to an index/type together with an array of fields of interest, then simply outputting all data from the tap to stdout, no queries to complicate matters, no "es.query" config param either since this also doesn't seem to be working any longer. Note, if I use 1.3.0.M2, everything works as expected, but, not so with the snapshots.
14/03/05 15:19:11 ERROR stream.TrapHandler: caught Throwable, no trap available, rethrowing
cascading.tuple.TupleException: unable to read from input identifier: 'unknown'
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
    at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
    at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
    at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:127)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.IllegalStateException: Cannot build scroll [adevents-2014-02-12/click/_search?search_type=scan&scroll=5&size=50&_source=ri,bot_act,psn,sts,ptz,pv_lo,cid&preference=_shards:1;_only_node:s6gZjGaBT6KFEpn23r5vgA]
    at org.elasticsearch.hadoop.rest.QueryBuilder.build(QueryBuilder.java:201)
    at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.next(EsInputFormat.java:286)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at cascading.tap.hadoop.util.MeasuredRecordReader.next(MeasuredRecordReader.java:61)
    at org.elasticsearch.hadoop.cascading.EsHadoopScheme.source(EsHadoopScheme.java:154)
    at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:140)
    at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:120)
    ... 6 more
Caused by: java.io.IOException: Out of nodes and retries; caught exception
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:98)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:250)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:246)
    at org.elasticsearch.hadoop.rest.RestClient.scan(RestClient.java:274)
    at org.elasticsearch.hadoop.rest.RestRepository.scan(RestRepository.java:97)
    at org.elasticsearch.hadoop.rest.QueryBuilder.build(QueryBuilder.java:199)
    ... 13 more
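For readers following the trace, the scroll request in the "Cannot build scroll" message can be taken apart to see what was being asked of ES. The sketch below is only an illustration: ScrollRequestSketch and queryParams are hypothetical names, not part of es-hadoop, and the code does nothing more than split the URI copied from the exception. The preference value suggests the request was pinned to one shard on one specific node, which matters later in this thread when connectivity to individual nodes comes up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading the trace; not part of es-hadoop.
final class ScrollRequestSketch {

    // URI copied verbatim from the "Cannot build scroll [...]" message.
    static final String SCROLL_URI =
            "adevents-2014-02-12/click/_search?search_type=scan&scroll=5&size=50"
            + "&_source=ri,bot_act,psn,sts,ptz,pv_lo,cid"
            + "&preference=_shards:1;_only_node:s6gZjGaBT6KFEpn23r5vgA";

    // Splits "path?k1=v1&k2=v2" into its query parameters, keeping their order.
    static Map<String, String> queryParams(String uri) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = uri.indexOf('?');
        if (q < 0) {
            return params; // no query string at all
        }
        for (String pair : uri.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2" or a trailing '?'
            }
            String[] kv = pair.split("=", 2); // split on the first '=' only
            params.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return params;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : queryParams(SCROLL_URI).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```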
Sorry, hit Close by accident, please ignore.
The error indicates a network problem, that is, es-hadoop cannot connect to your host. I have pushed a new nightly build (20140305.224939-329); can you please try it out and let me know how it goes? It seems that es-hadoop does connect to ES initially but then starts losing the connection for some reason.
Just tried that nightly, sorry it didn't work, same exception stack(s). Btw, es-hadoop is able to connect to our ES host if I use 1.3.0.M2, so I think we can rule out poor connectivity issues for us, although, there may still be other programmatic connectivity issues in the es-hadoop client.
Can you turn on logging (TRACE level) in log4j.properties on the org.elasticsearch.hadoop package and report back your findings (upload to a gist somewhere)? Ideally try it on a small data set since there will be a lot of output.
Thanks!
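A minimal sketch of the logging change requested above, assuming the job uses log4j 1.x and reads a log4j.properties file. The class and method names (EsHadoopTraceLogging, enableTrace) and the default file path are hypothetical; only the log4j.logger.&lt;package&gt;=&lt;LEVEL&gt; syntax and the package name come from log4j and this thread. The helper simply appends the one-line logger override to the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper; not part of es-hadoop or log4j.
final class EsHadoopTraceLogging {

    // log4j 1.x properties syntax: log4j.logger.<package>=<LEVEL>
    static final String TRACE_LINE = "log4j.logger.org.elasticsearch.hadoop=TRACE";

    // Appends the logger override to the given file, creating the file if needed.
    // The leading empty string emits a line break first, so the override never
    // ends up glued to an existing last line that lacks a trailing newline.
    static void enableTrace(Path log4jProperties) throws IOException {
        Files.write(log4jProperties,
                Arrays.asList("", TRACE_LINE),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real path as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : "log4j.properties");
        enableTrace(target);
        System.out.println("Appended to " + target + ": " + TRACE_LINE);
    }
}
```

Editing the file by hand and adding the same single line has the same effect; the file has to be the one actually on the job's classpath.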
Hi Costin: here is the trace level log output you requested (see gist below), I had to redact some parts of the logs, so if you see something like, Received [200-OK] [], the empty [] actually was populated correctly, hope you understand. Thanks again for all your help with this and the great work you guys are doing with ES in general! https://gist.github.com/mbaig/9397119
Hi,
There are several things suspicious in the logs. There's the network error but there's also the NoSuchMethodError at the end (this one caused by some incompatible library).
There were several improvements made to cascading, so I've pushed a nightly build [1] - can you please check it out once it completes? Then, if possible, please update the gist with logs from both the current build and M2 - I've checked the differences between the two but nothing stands out.
Are you available on IRC? This would make things a lot easier to debug - I'm 'costin' on #elasticsearch. Let's connect in 30 minutes or so if that works for you.
Thanks!
Costin
By the way, the #221 build has been published.
Costin
Costin, is that build #331 or #221? Also, I'll be available on IRC, 'mbaig'.
The IDs of the build plan and Maven are not synchronized: #331 is the Maven number, #221 the build plan number. Basically, try the latest available snapshot.
Costin
As for IRC, give me a ping once you get online - I'll be available for the next 1.5h or so.
Cheers!
Costin
Hi,
I've pushed some changes on a new branch - cfg-refactor.
Can you try it out? You can easily build the package using: gradlew -x test build. Still interested in the logs on M2.
Cheers!
Costin
Hi,
Can you please try the latest build, #333? Also please post the updated logs just in case.
Thanks,
Costin
Costin: not sure what you changed, but, it looks like build #333 is working. That is, reading from ES looks good. Haven't tried writing, will do that next.
That's good to know. Getting some logs between M2 and current master would still be useful - we can chat on IRC more if you'd like. thanks!
Yeah, I'll definitely get you those logs. I was trying to filter the dataset for the logs using the es.query job config param, which incidentally didn't work. However, passing the filter query to the EsTap constructor did work, so that should get me over that obstacle. I'm going to deploy #333 to our cluster now for a larger dataset test, fingers crossed.
I'm deploying the new jar to our cluster, but, I just realized we upgraded our ES to 1.0.1 (successfully) last night. Will this be a problem for the es-hadoop client?
It's not a problem. es-hadoop since M2 supports both ES 1.0 and 0.90.x.
Costin
The Configuration option should work in master just like on M2. Along with the logs, can you please post a simple code snippet that reproduces the issue.
Thanks!
Costin
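For reference, the configuration route discussed above amounts to putting es.query into the job properties instead of passing the query to the EsTap constructor. The sketch below is not the reporter's code: EsQueryConfigSketch and jobProperties are hypothetical names, the query value "?q=*" is a placeholder, and a plain java.util.Properties stands in for the job configuration so the example runs without Hadoop or Cascading on the classpath. Only the es.query key and the index/type resource come from this thread; es.resource is the es-hadoop setting assumed here for the resource.

```java
import java.util.Properties;

// Hypothetical stand-in for the job configuration; not the reporter's code.
final class EsQueryConfigSketch {

    // Builds the properties that would be handed to the job / flow connector.
    static Properties jobProperties(String resource, String query) {
        Properties props = new Properties();
        props.setProperty("es.resource", resource); // index/type to read from
        props.setProperty("es.query", query);       // filter applied by es-hadoop
        return props;
    }

    public static void main(String[] args) {
        // Resource taken from the stack trace; the query is a placeholder.
        Properties props = jobProperties("adevents-2014-02-12/click", "?q=*");
        System.out.println("es.resource = " + props.getProperty("es.resource"));
        System.out.println("es.query    = " + props.getProperty("es.query"));
    }
}
```

A reproduction snippet for the report would pair properties like these with the tap and flow setup that reads from ES, which is left out here because it needs the actual libraries.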
Hey: sorry, didn't mean to disappear on you like that, I was busy firefighting issues with our ES upgrade to 1.0.1. One of the issues I ran into was that the es-hadoop nightly stopped working again, albeit due to a different issue. Let me know if I should open another issue for it; in the meantime I'll try to describe it here.
Our Hadoop cluster connects to our prod ES cluster via ssh tunnels on localhost:9200. It seems the es-hadoop client, instead of connecting to localhost, connects to the resolved IPs of the ES shards (more accurately, it connects to localhost the first time but switches to the IPs for all subsequent calls). Since those IPs aren't visible, I'm seeing ConnectExceptions; see here for logs: https://gist.github.com/mbaig/9397119
I tried setting the config param "es.nodes.discovery" to false to force the client to only use localhost, but this doesn't seem to be doing what I hoped. Btw, I'm using the snapshot of es-hadoop for this.
Thanks again Costin. Oh and I still owe you those M2 logs...
@mbaig Best to open another issue. Tunnelling is not supported by es-hadoop and I'm not sure whether it ever will be. Without a direct network connection, parallel reads/writes don't make sense, since there's no direct connection to each shard and performance goes down the drain. I'll look into this but it's not a priority at the moment. Using a VPN is probably a better long-term solution since it hides the tunnelling much better than an actual tunnel.
@mbaig Those M2 logs would still be nice ....
By the way, you could try setting up the JDK property for proxies, in particular the SOCKS one: http://docs.oracle.com/javase/7/docs/api/java/net/doc-files/net-properties.html
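A minimal sketch of the SOCKS suggestion, assuming a SOCKS proxy is already listening somewhere reachable. socksProxyHost and socksProxyPort are the JDK networking properties described on the linked page; the class and method names, and the localhost:1080 endpoint, are placeholders. Whether es-hadoop's HTTP layer actually routes through the proxy is not confirmed anywhere in this thread, so treat this as something to try rather than a fix.

```java
// Hypothetical helper; only the two property names come from the JDK docs.
final class SocksProxySketch {

    // Sets the JVM-wide SOCKS proxy used by java.net sockets.
    static void useSocksProxy(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        // Placeholder endpoint; replace with the real proxy host and port.
        useSocksProxy("localhost", 1080);
        System.out.println("socksProxyHost = " + System.getProperty("socksProxyHost"));
        System.out.println("socksProxyPort = " + System.getProperty("socksProxyPort"));
    }
}
```

The same properties can be passed as -DsocksProxyHost=... and -DsocksProxyPort=... on the command line. In a Hadoop job they would have to reach every task JVM, not just the client that submits the job.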
Please note EsTap was/is working as expected in 1.3.0.M2, however, it seems to be broken in the last ~2 weeks of nightly builds. Also note, our usage pattern or code did not change between the release of 1.3.0.M2 and today (2014-03-05).